Applies on 2.6.28-rc5.
With CONFIG_RT_GROUP_SCHED not set, don't allow a task's priority switch
to realtime if the task isn't part of init_task_group.
A task belonging to a fair group could use sched_setscheduler/sched_setparam
to become a realtime task. If such a task belongs to one of the
child groups of init_task_group and if CONFIG_RT_GROUP_SCHED is not set,
then it ends up getting queued in init_task_group's runqueue.
So we have a situation where, a task belongs to one group (child)
but ends in the runqueue of another group (init_task_group).
This does not look correct.
Fix this by failing such priority change requests in sched_setscheduler()
and sched_setparam().
Signed-off-by: Bharata B Rao <[email protected]>
---
kernel/sched.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -5206,6 +5206,13 @@ recheck:
if (rt_bandwidth_enabled() && rt_policy(policy) &&
task_group(p)->rt_bandwidth.rt_runtime == 0)
return -EPERM;
+#elif defined(CONFIG_FAIR_GROUP_SCHED)
+ /*
+ * If the task doesn't belong to init_task_group, don't
+ * allow priority switch to realtime. (!CONFIG_RT_GROUP_SCHED)
+ */
+ if (rt_policy(policy) && (task_group(p) != &init_task_group))
+ return -EPERM;
#endif
retval = security_task_setscheduler(p, policy, param);
Regards,
Bharata.
* Bharata B Rao <[email protected]> wrote:
> Applies on 2.6.28-rc5.
>
> With CONFIG_RT_GROUP_SCHED not set, don't allow a task's priority
> switch to realtime if the task isn't part of init_task_group.
>
> A task belonging to a fair group could use
> sched_setscheduler/sched_setparam to become a realtime task. If such
> a task belongs to one of the child groups of init_task_group and if
> CONFIG_RT_GROUP_SCHED is not set, then it ends up getting queued in
> init_task_group's runqueue. So we have a situation where, a task
> belongs to one group (child) but ends in the runqueue of another
> group (init_task_group). This does not look correct.
>
> Fix this by failing such priority change requests in
> sched_setscheduler() and sched_setparam().
>
> Signed-off-by: Bharata B Rao <[email protected]>
> ---
> kernel/sched.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -5206,6 +5206,13 @@ recheck:
> if (rt_bandwidth_enabled() && rt_policy(policy) &&
> task_group(p)->rt_bandwidth.rt_runtime == 0)
> return -EPERM;
> +#elif defined(CONFIG_FAIR_GROUP_SCHED)
> + /*
> + * If the task doesn't belong to init_task_group, don't
> + * allow priority switch to realtime. (!CONFIG_RT_GROUP_SCHED)
> + */
> + if (rt_policy(policy) && (task_group(p) != &init_task_group))
> + return -EPERM;
> #endif
>
> retval = security_task_setscheduler(p, policy, param);
hm, another option would be, instead of denying something (which
denial might not even be noticed by the app) that the app clearly has
enough privilege to request - to just act upon it and move the task to
the init_task_group?
the app cannot expect fair scheduling for this task anyway. And if we
want to forbid tasks from doing so - do not give them privilege to go
to RT priorities.
Ingo
On Thu, Nov 20, 2008 at 08:58:29AM +0100, Ingo Molnar wrote:
>
> * Bharata B Rao <[email protected]> wrote:
>
> > Applies on 2.6.28-rc5.
> >
> > With CONFIG_RT_GROUP_SCHED not set, don't allow a task's priority
> > switch to realtime if the task isn't part of init_task_group.
> >
> > A task belonging to a fair group could use
> > sched_setscheduler/sched_setparam to become a realtime task. If such
> > a task belongs to one of the child groups of init_task_group and if
> > CONFIG_RT_GROUP_SCHED is not set, then it ends up getting queued in
> > init_task_group's runqueue. So we have a situation where, a task
> > belongs to one group (child) but ends in the runqueue of another
> > group (init_task_group). This does not look correct.
> >
> > Fix this by failing such priority change requests in
> > sched_setscheduler() and sched_setparam().
> >
> > Signed-off-by: Bharata B Rao <[email protected]>
> > ---
> > kernel/sched.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -5206,6 +5206,13 @@ recheck:
> > if (rt_bandwidth_enabled() && rt_policy(policy) &&
> > task_group(p)->rt_bandwidth.rt_runtime == 0)
> > return -EPERM;
> > +#elif defined(CONFIG_FAIR_GROUP_SCHED)
> > + /*
> > + * If the task doesn't belong to init_task_group, don't
> > + * allow priority switch to realtime. (!CONFIG_RT_GROUP_SCHED)
> > + */
> > + if (rt_policy(policy) && (task_group(p) != &init_task_group))
> > + return -EPERM;
> > #endif
> >
> > retval = security_task_setscheduler(p, policy, param);
>
> hm, another option would be, instead of denying something (which
> denial might not even be noticed by the app) that the app clearly has
> enough privilege to request - to just act upon it and move the task to
> the init_task_group?
>
> the app cannot expect fair scheduling for this task anyway. And if we
> want to forbid tasks from doing so - do not give them privilege to go
> to RT priorities.
>
I am wondering what would the right action then be if the task drops
back to CFS.
--
regards,
Dhaval
* Dhaval Giani <[email protected]> wrote:
> On Thu, Nov 20, 2008 at 08:58:29AM +0100, Ingo Molnar wrote:
> >
> > * Bharata B Rao <[email protected]> wrote:
> >
> > > Applies on 2.6.28-rc5.
> > >
> > > With CONFIG_RT_GROUP_SCHED not set, don't allow a task's priority
> > > switch to realtime if the task isn't part of init_task_group.
> > >
> > > A task belonging to a fair group could use
> > > sched_setscheduler/sched_setparam to become a realtime task. If such
> > > a task belongs to one of the child groups of init_task_group and if
> > > CONFIG_RT_GROUP_SCHED is not set, then it ends up getting queued in
> > > init_task_group's runqueue. So we have a situation where, a task
> > > belongs to one group (child) but ends in the runqueue of another
> > > group (init_task_group). This does not look correct.
> > >
> > > Fix this by failing such priority change requests in
> > > sched_setscheduler() and sched_setparam().
> > >
> > > Signed-off-by: Bharata B Rao <[email protected]>
> > > ---
> > > kernel/sched.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > --- a/kernel/sched.c
> > > +++ b/kernel/sched.c
> > > @@ -5206,6 +5206,13 @@ recheck:
> > > if (rt_bandwidth_enabled() && rt_policy(policy) &&
> > > task_group(p)->rt_bandwidth.rt_runtime == 0)
> > > return -EPERM;
> > > +#elif defined(CONFIG_FAIR_GROUP_SCHED)
> > > + /*
> > > + * If the task doesn't belong to init_task_group, don't
> > > + * allow priority switch to realtime. (!CONFIG_RT_GROUP_SCHED)
> > > + */
> > > + if (rt_policy(policy) && (task_group(p) != &init_task_group))
> > > + return -EPERM;
> > > #endif
> > >
> > > retval = security_task_setscheduler(p, policy, param);
> >
> > hm, another option would be, instead of denying something (which
> > denial might not even be noticed by the app) that the app clearly has
> > enough privilege to request - to just act upon it and move the task to
> > the init_task_group?
> >
> > the app cannot expect fair scheduling for this task anyway. And if we
> > want to forbid tasks from doing so - do not give them privilege to go
> > to RT priorities.
> >
>
> I am wondering what would the right action then be if the task drops
> back to CFS.
yeah. If the integration artifacts around the edges get too awkward,
then the best would be to consolidate fair-group and rt-group into the
same group-sched config option and _eliminate_ such artifacts at their
root. rt-group was started as a separate option mostly because it was
new and experimental code - that splitup is not cast into stone.
Ingo
On Thu, Nov 20, 2008 at 08:58:29AM +0100, Ingo Molnar wrote:
> <snip>
> hm, another option would be, instead of denying something (which
> denial might not even be noticed by the app) that the app clearly has
> enough privilege to request - to just act upon it and move the task to
> the init_task_group?
Thought about that option, but decided against it because:
- The task was started in a group which is not "equipped" to handle
rt tasks (no rt_rq for the group because CONFIG_RT_GROUP_SCHED=n). So
the task shouldn't have been started in such a group in the first
place.
- We could move the task silently to init_task_group (which is kind
of done now with the task being placed in init_task_group's rq), but
that's not what a user would have expected when he started a task
under a group.
Also, silently moving the task to init_task_group would cause confusion
later when the task drops the RT privilege. As Dhaval is asking in the
other thread, which group do we move it to now ?
Regards,
Bharata.
On Thu, 2008-11-20 at 11:48 +0530, Bharata B Rao wrote:
> Applies on 2.6.28-rc5.
>
> With CONFIG_RT_GROUP_SCHED not set, don't allow a task's priority switch
> to realtime if the task isn't part of init_task_group.
>
> A task belonging to a fair group could use sched_setscheduler/sched_setparam
> to become a realtime task. If such a task belongs to one of the
> child groups of init_task_group and if CONFIG_RT_GROUP_SCHED is not set,
> then it ends up getting queued in init_task_group's runqueue.
> So we have a situation where, a task belongs to one group (child)
> but ends in the runqueue of another group (init_task_group).
> This does not look correct.
>
> Fix this by failing such priority change requests in sched_setscheduler()
> and sched_setparam().
NAK
!RT_GROUP means the RT tasks should be fully invariant to any grouping
configuration. This patch breaks that.
Furthermore your justification for this is plain wrong, you write as if
an RT tasks can belong to any grouping (in the !RT_GROUP case), by the
above this is untrue and shows a fundamental mis-understanding of the
concepts involved.
On Sun, Nov 23, 2008 at 02:11:16AM +0100, Peter Zijlstra wrote:
> On Thu, 2008-11-20 at 11:48 +0530, Bharata B Rao wrote:
> > Applies on 2.6.28-rc5.
> >
> > With CONFIG_RT_GROUP_SCHED not set, don't allow a task's priority switch
> > to realtime if the task isn't part of init_task_group.
> >
> > A task belonging to a fair group could use sched_setscheduler/sched_setparam
> > to become a realtime task. If such a task belongs to one of the
> > child groups of init_task_group and if CONFIG_RT_GROUP_SCHED is not set,
> > then it ends up getting queued in init_task_group's runqueue.
> > So we have a situation where, a task belongs to one group (child)
> > but ends in the runqueue of another group (init_task_group).
> > This does not look correct.
> >
> > Fix this by failing such priority change requests in sched_setscheduler()
> > and sched_setparam().
>
> NAK
>
> !RT_GROUP means the RT tasks should be fully invariant to any grouping
> configuration. This patch breaks that.
>
> Furthermore your justification for this is plain wrong, you write as if
> an RT tasks can belong to any grouping (in the !RT_GROUP case), by the
Hmm... but that's I the impression I got when I started an RT
task in a subgroup and saw it listed in the tasks list of the cgroup.
So if RT tasks don't care about grouping (!RT_GROUP_SCHED), may be then
such tasks shoudn't be shown in the cgroup tasks list. And may be as Ingo
suggested they should be moved to init_task_group. Without that it makes one
think that a task is contributing to the cpu time of a group in which
it was started, but it in reality it is not.
Regards,
Bharata.
On Mon, 2008-11-24 at 09:28 +0530, Bharata B Rao wrote:
> On Sun, Nov 23, 2008 at 02:11:16AM +0100, Peter Zijlstra wrote:
> > On Thu, 2008-11-20 at 11:48 +0530, Bharata B Rao wrote:
> > > Applies on 2.6.28-rc5.
> > >
> > > With CONFIG_RT_GROUP_SCHED not set, don't allow a task's priority switch
> > > to realtime if the task isn't part of init_task_group.
> > >
> > > A task belonging to a fair group could use sched_setscheduler/sched_setparam
> > > to become a realtime task. If such a task belongs to one of the
> > > child groups of init_task_group and if CONFIG_RT_GROUP_SCHED is not set,
> > > then it ends up getting queued in init_task_group's runqueue.
> > > So we have a situation where, a task belongs to one group (child)
> > > but ends in the runqueue of another group (init_task_group).
> > > This does not look correct.
> > >
> > > Fix this by failing such priority change requests in sched_setscheduler()
> > > and sched_setparam().
> >
> > NAK
> >
> > !RT_GROUP means the RT tasks should be fully invariant to any grouping
> > configuration. This patch breaks that.
> >
> > Furthermore your justification for this is plain wrong, you write as if
> > an RT tasks can belong to any grouping (in the !RT_GROUP case), by the
>
> Hmm... but that's I the impression I got when I started an RT
> task in a subgroup and saw it listed in the tasks list of the cgroup.
> So if RT tasks don't care about grouping (!RT_GROUP_SCHED), may be then
> such tasks shoudn't be shown in the cgroup tasks list.
Sure, do as you like upon the cgroup interface
> And may be as Ingo
> suggested they should be moved to init_task_group.
Bzzt, wrong! They should not be moved to any group, they cannot be moved
to any group, as they are group invariant.
> Without that it makes one
> think that a task is contributing to the cpu time of a group in which
> it was started, but it in reality it is not.
Until one thinks more and realizes RT tasks don't contribute anything to
groups as they are not part of them.
On Mon, Nov 24, 2008 at 09:32:06AM +0100, Peter Zijlstra wrote:
> On Mon, 2008-11-24 at 09:28 +0530, Bharata B Rao wrote:
> > On Sun, Nov 23, 2008 at 02:11:16AM +0100, Peter Zijlstra wrote:
> > > On Thu, 2008-11-20 at 11:48 +0530, Bharata B Rao wrote:
> > > > Applies on 2.6.28-rc5.
> > > >
> > > > With CONFIG_RT_GROUP_SCHED not set, don't allow a task's priority switch
> > > > to realtime if the task isn't part of init_task_group.
> > > >
> > > > A task belonging to a fair group could use sched_setscheduler/sched_setparam
> > > > to become a realtime task. If such a task belongs to one of the
> > > > child groups of init_task_group and if CONFIG_RT_GROUP_SCHED is not set,
> > > > then it ends up getting queued in init_task_group's runqueue.
> > > > So we have a situation where, a task belongs to one group (child)
> > > > but ends in the runqueue of another group (init_task_group).
> > > > This does not look correct.
> > > >
> > > > Fix this by failing such priority change requests in sched_setscheduler()
> > > > and sched_setparam().
> > >
> > > NAK
> > >
> > > !RT_GROUP means the RT tasks should be fully invariant to any grouping
> > > configuration. This patch breaks that.
> > >
> > > Furthermore your justification for this is plain wrong, you write as if
> > > an RT tasks can belong to any grouping (in the !RT_GROUP case), by the
> >
> > Hmm... but that's I the impression I got when I started an RT
> > task in a subgroup and saw it listed in the tasks list of the cgroup.
> > So if RT tasks don't care about grouping (!RT_GROUP_SCHED), may be then
> > such tasks shoudn't be shown in the cgroup tasks list.
>
> Sure, do as you like upon the cgroup interface
>
> > And may be as Ingo
> > suggested they should be moved to init_task_group.
>
> Bzzt, wrong! They should not be moved to any group, they cannot be moved
> to any group, as they are group invariant.
>
Which would mean the init_task_group becase it contains those tasks which are
not grouped.
> > Without that it makes one
> > think that a task is contributing to the cpu time of a group in which
> > it was started, but it in reality it is not.
>
> Until one thinks more and realizes RT tasks don't contribute anything to
> groups as they are not part of them.
>
I guess this is where education/documentation would help.
--
regards,
Dhaval
On Mon, 2008-11-24 at 14:16 +0530, Dhaval Giani wrote:
> Which would mean the init_task_group becase it contains those tasks which are
> not grouped.
Only because of implementation details (we implement the !group case by
having them all part of a single group), conceptually they don't belong
to any group, hence talking about moving it to some group is just wrong.
Furthermore your statement shows another misconception, a group of
ungrouped tasks doesn't make sense. Either we support grouping, in which
case any task belongs to a group, or we don't and they all belong to no
(or the same) group.
Peter Zijlstra wrote:
> On Mon, 2008-11-24 at 14:16 +0530, Dhaval Giani wrote:
>
>
>>Which would mean the init_task_group becase it contains those tasks which are
>>not grouped.
>
>
> Only because of implementation details (we implement the !group case by
> having them all part of a single group), conceptually they don't belong
> to any group, hence talking about moving it to some group is just wrong.
> Furthermore your statement shows another misconception, a group of
> ungrouped tasks doesn't make sense.
Arguably there is such a group, which is "the set of all RT tasks".
Whether or not they should map to the top-level cgroup is a different
question. Maybe in the !group case there should be a second top-level
"rt" cgroup? We could even make the RT sched tuning knobs available there.
Chris
On Mon, 2008-11-24 at 09:24 -0600, Chris Friesen wrote:
> Peter Zijlstra wrote:
> > On Mon, 2008-11-24 at 14:16 +0530, Dhaval Giani wrote:
> >
> >
> >>Which would mean the init_task_group becase it contains those tasks which are
> >>not grouped.
> >
> >
> > Only because of implementation details (we implement the !group case by
> > having them all part of a single group), conceptually they don't belong
> > to any group, hence talking about moving it to some group is just wrong.
>
> > Furthermore your statement shows another misconception, a group of
> > ungrouped tasks doesn't make sense.
>
> Arguably there is such a group, which is "the set of all RT tasks".
Sure, I understand that, and in fact that's how its implemented, no
group is still one group (which is how you can bootstrap math from
group/set theory).
But its not a manageable group in the cgroup sense, its just the
collection of RT tasks.
> Whether or not they should map to the top-level cgroup is a different
> question. Maybe in the !group case there should be a second top-level
> "rt" cgroup? We could even make the RT sched tuning knobs available there.
I'd rather just not display all that. We don't go make such 'unmanaged'
groups for other not configured controllers either.
On Mon, Nov 24, 2008 at 09:32:06AM +0100, Peter Zijlstra wrote:
> > And may be as Ingo
> > suggested they should be moved to init_task_group.
>
> Bzzt, wrong! They should not be moved to any group, they cannot be moved
> to any group, as they are group invariant.
Though what you say makes sense, the confusion arises from existing
cgroup<->scheduler interface, which can end up showing the above
single-set of RT-tasks to be split into multiple sets.
RT Tasks -> {RT0, RT1}
can be shown as:
/a/tasks {RT0, ...}
/b/tasks {RT1, ...}
Does this cause any problems? Perhaps no, just seems odd ..
Fixing this oddity of representing single RT-tasks set as multiple is not a
cgroup issue IMHO.
P.S :- If eventually RT_GROUP_SCHED will be merged with FAIR_GROUP_SCHED, then
it may make sense for us to just ignore this oddity for timebeing and look
forward to the options being merged.
- vatsa
On Mon, 2008-11-24 at 22:35 +0530, Srivatsa Vaddagiri wrote:
> On Mon, Nov 24, 2008 at 09:32:06AM +0100, Peter Zijlstra wrote:
> > > And may be as Ingo
> > > suggested they should be moved to init_task_group.
> >
> > Bzzt, wrong! They should not be moved to any group, they cannot be moved
> > to any group, as they are group invariant.
>
> Though what you say makes sense, the confusion arises from existing
> cgroup<->scheduler interface, which can end up showing the above
> single-set of RT-tasks to be split into multiple sets.
>
> RT Tasks -> {RT0, RT1}
>
> can be shown as:
>
> /a/tasks {RT0, ...}
> /b/tasks {RT1, ...}
>
> Does this cause any problems? Perhaps no, just seems odd ..
>
> Fixing this oddity of representing single RT-tasks set as multiple is not a
> cgroup issue IMHO.
I'm fine with just not showing RT tasks in cgroup:cpu/tasks.
> P.S :- If eventually RT_GROUP_SCHED will be merged with FAIR_GROUP_SCHED, then
> it may make sense for us to just ignore this oddity for timebeing and look
> forward to the options being merged.
Right, that might take a bit of time still as RT_GROUP needs a deadline
scheduler to be complete.
Peter Zijlstra wrote:
> On Mon, 2008-11-24 at 09:24 -0600, Chris Friesen wrote:
>
>>Peter Zijlstra wrote:
>>>Furthermore your statement shows another misconception, a group of
>>>ungrouped tasks doesn't make sense.
>>
>>Arguably there is such a group, which is "the set of all RT tasks".
>
>
> Sure, I understand that, and in fact that's how its implemented, no
> group is still one group (which is how you can bootstrap math from
> group/set theory).
>
> But its not a manageable group in the cgroup sense, its just the
> collection of RT tasks.
True, but it would be valuable to have statistics for how much cpu time
the RT tasks are consuming.
Chris
On Tue, 2008-11-25 at 12:51 -0600, Chris Friesen wrote:
> Peter Zijlstra wrote:
> > On Mon, 2008-11-24 at 09:24 -0600, Chris Friesen wrote:
> >
> >>Peter Zijlstra wrote:
>
> >>>Furthermore your statement shows another misconception, a group of
> >>>ungrouped tasks doesn't make sense.
> >>
> >>Arguably there is such a group, which is "the set of all RT tasks".
> >
> >
> > Sure, I understand that, and in fact that's how its implemented, no
> > group is still one group (which is how you can bootstrap math from
> > group/set theory).
> >
> > But its not a manageable group in the cgroup sense, its just the
> > collection of RT tasks.
>
> True, but it would be valuable to have statistics for how much cpu time
> the RT tasks are consuming.
True, I have patches for that though, and this is unrelated to cgroups.
The more interesting thing is using that time to scale the sched_other
balancing.