2005-01-27 17:53:00

by Shailabh Nagar

[permalink] [raw]
Subject: ckrm-e17

Version e17 of the Class-based Kernel Resource Management
is now available for download from

http://sourceforge.net/project/showfiles.php?group_id=85838&package_id=94608

The major updates since the previous version include:
1. Numerous bugfixes
2. Control over rate of process forks through the numtasks controller.
The rate of forking is a single systemwide parameter affecting all
classes. Existing share-based control over total number of forks allowed
per class remains the same.
3. Interface change: The "target" file has been removed from the RCFS
interface. The same functionality can now be obtained by writing to the
"members" file of any class.

Files released:

ckrm-e17.2610.patch
Combined patch against 2.6.10. Includes the numtasks and
listenaq controllers.
e17-incr.tar.bz2
Tarball of broken down patches. First 10 patches constitute
the e16 release and subsequent ones contain the updates since
then.
cpu.ckrm-e17.v10.patch
CPU controller.


Still to come:

memory controller
I/O controller
test packages


Please note that updates to CKRM based on the feedback from lkml on
the previous release (http://lkml.org/lkml/2004/11/29/152) are in
progress and will be included in the next release.

Testing and feedback welcome.

--Shailabh







2005-01-27 18:33:51

by Dave Hansen

[permalink] [raw]
Subject: Re: ckrm-e17

On Thu, 2005-01-27 at 12:52 -0500, Shailabh Nagar wrote:
> Version e17 of the Class-based Kernel Resource Management
> is now available for download from
>
> http://sourceforge.net/project/showfiles.php?group_id=85838&package_id=94608

If you want comments on these, please post them inline.

-- Dave

2005-01-27 19:56:59

by Shailabh Nagar

[permalink] [raw]
Subject: Re: ckrm-e17

Dave Hansen wrote:
> On Thu, 2005-01-27 at 12:52 -0500, Shailabh Nagar wrote:
>
>>Version e17 of the Class-based Kernel Resource Management
>>is now available for download from
>>
>>http://sourceforge.net/project/showfiles.php?group_id=85838&package_id=94608
>
>
> If you want comments on these, please post them inline.
>
> -- Dave
>
>

The combined patch is too large (377KB) to be inlined. Reviewing it
will be made much easier if its broken down as was the case in the
previous submission to lkml (http://lkml.org/lkml/2004/11/29/152)


We should be posting a readable breakdown of the next version (which
will also incorporate feedback sent on the earlier submission) fairly
soon and those will be inline.


-- Shailabh



2005-02-12 05:49:42

by Peter Williams

[permalink] [raw]
Subject: Re: ckrm-e17

Shailabh Nagar wrote:
> Version e17 of the Class-based Kernel Resource Management
> is now available for download from
>
> http://sourceforge.net/project/showfiles.php?group_id=85838&package_id=94608
>
>
> The major updates since the previous version include:
> 1. Numerous bugfixes
> 2. Control over rate of process forks through the numtasks controller.
> The rate of forking is a single systemwide parameter affecting all
> classes. Existing share-based control over total number of forks allowed
> per class remains the same.
> 3. Interface change: The "target" file has been removed from the RCFS
> interface. The same functionality can now be obtained by writing to the
> "members" file of any class.
>
> Files released:
>
> ckrm-e17.2610.patch
> Combined patch against 2.6.10. Includes the numtasks and
> listenaq controllers.
> e17-incr.tar.bz2
> Tarball of broken down patches. First 10 patches constitute
> the e16 release and subsequent ones contain the updates since
> then.
> cpu.ckrm-e17.v10.patch
> CPU controller.
>
>
> Still to come:
>
> memory controller
> I/O controller
> test packages
>
>
> Please note that updates to CKRM based on the feedback from lkml on
> the previous release (http://lkml.org/lkml/2004/11/29/152) are in
> progress and will be included in the next release.
>
> Testing and feedback welcome.

At line 3887 of cpu.ckrm-e17.v10.patch you add the line:

set_task_cpu(p,this_cpu);

to the middle of the function wake_up_new_task() resulting in the
following code:

} else {
this_rq = cpu_rq(this_cpu);

/*
* Not the local CPU - must adjust timestamp. This should
* get optimised away in the !CONFIG_SMP case.
*/
p->sdu.ingosched.timestamp = (p->sdu.ingosched.timestamp -
this_rq->timestamp_last_tick)
+ rq->timestamp_last_tick;
set_task_cpu(p,this_cpu);
__activate_task(p, rq);
if (TASK_PREEMPTS_CURR(p, rq))
resched_task(rq->curr);

schedstat_inc(rq, wunt_moved);
/*
* Parent and child are on different CPUs, now get the
* parent runqueue to update the parent's ->sdu.ingosched.sleep_avg:
*/
task_rq_unlock(rq, &flags);
this_rq = task_rq_lock(current, &flags);
}

where "rq" has been set by the return value of "task_rq_lock(p,
&flags)", and the test "(cpu == this_cpu)" has failed with "cpu" set to
"task_cpu(p)". The result of this when the CKRM CPU code is not
configured into the build is that "p" will be queued on a runqueue that
is not in agreement with "p->thread_info->cpu" which in turn will lead
to future use of "task_rq_lock()" locking the wrong run queue and
eventually triggering some form of race condition.

If CKRM CPU is configured into the build the results are less drastic as
they only result in "nr_running" being incremented for the wrong run
queue. However, even this will have adverse scheduling effects as it
will probably confuse the load balancing code. Another potentially
confusing thing with this code (when CKRM CPU is configured in) is that
__activate_task() does NOT queue "p" on "rq" but on the queue found by
the call "get_task_lrq(p)".

The recommended fix for this problem would be to withdraw the:

set_task_cpu(p,this_cpu);

Peter
PS I reported this to the ckrm-tech list 5 days ago but it was ignored.
--
Peter Williams [email protected]

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce

2005-02-12 17:47:13

by Shailabh Nagar

[permalink] [raw]
Subject: Re: ckrm-e17

Peter Williams wrote:
> Shailabh Nagar wrote:
>
>
> At line 3887 of cpu.ckrm-e17.v10.patch you add the line:
>
> set_task_cpu(p,this_cpu);
>
> to the middle of the function wake_up_new_task() resulting in the
> following code:
>
> } else {
> this_rq = cpu_rq(this_cpu);
>
> /*
> * Not the local CPU - must adjust timestamp. This should
> * get optimised away in the !CONFIG_SMP case.
> */
> p->sdu.ingosched.timestamp = (p->sdu.ingosched.timestamp -
> this_rq->timestamp_last_tick)
> + rq->timestamp_last_tick;
> set_task_cpu(p,this_cpu);
> __activate_task(p, rq);
> if (TASK_PREEMPTS_CURR(p, rq))
> resched_task(rq->curr);
>
> schedstat_inc(rq, wunt_moved);
> /*
> * Parent and child are on different CPUs, now get the
> * parent runqueue to update the parent's
> ->sdu.ingosched.sleep_avg:
> */
> task_rq_unlock(rq, &flags);
> this_rq = task_rq_lock(current, &flags);
> }
>
> where "rq" has been set by the return value of "task_rq_lock(p,
> &flags)", and the test "(cpu == this_cpu)" has failed with "cpu" set to
> "task_cpu(p)". The result of this when the CKRM CPU code is not
> configured into the build is that "p" will be queued on a runqueue that
> is not in agreement with "p->thread_info->cpu" which in turn will lead
> to future use of "task_rq_lock()" locking the wrong run queue and
> eventually triggering some form of race condition.
>
> If CKRM CPU is configured into the build the results are less drastic as
> they only result in "nr_running" being incremented for the wrong run
> queue. However, even this will have adverse scheduling effects as it
> will probably confuse the load balancing code. Another potentially
> confusing thing with this code (when CKRM CPU is configured in) is that
> __activate_task() does NOT queue "p" on "rq" but on the queue found by
> the call "get_task_lrq(p)".
>
> The recommended fix for this problem would be to withdraw the:
>
> set_task_cpu(p,this_cpu);
>
> Peter

Thanks for finding that out. Will confirm and fix in the next release.

> PS I reported this to the ckrm-tech list 5 days ago but it was ignored.

Other project priorities prevented us from responding sooner. There's no
call to jump to conclusions.

-- Shailabh