2008-01-06 16:24:48

by Peter Zijlstra

Subject: [PATCH 00/11] another rt group sched update

this time compile tested on all 16 combinations of:

CONFIG_SMP
CONFIG_FAIR_GROUP_SCHED
CONFIG_HIGH_RES_TIMERS
CONFIG_NO_HZ

ran some but not all combinations
--


2008-01-07 10:51:36

by Peter Zijlstra

Subject: Re: [PATCH 00/11] another rt group sched update


The list of open points and issues for this work:

- review/testing

- handle the PI case better

The only thing I can come up with currently is to basically have two
priority arrays one for boosted and one for non boosted tasks, and
normally run the highest of either array, but in the case of a
throttled group, only pick from the boosted array.

Not sure I like that for its space overhead, Steven?

- I occasionally see a weird lockup on iterating the task_groups list
on smp machines. - I failed to see anything wrong, but hey, this
stack of used brown paper bags is steadily growing.

- figure out what to do for UID based group scheduling, the current
implementation leaves it impossible for !root users to execute
real time tasks by setting rt_runtime_us to 0, and it has no way
to change it.

Srivatsa, what happened to the per uid weight patches?, Perhaps we
can extend that interface to allow changing this.

- I guess documentation needs to be written ;-)



2008-01-07 11:17:47

by Ingo Molnar

Subject: Re: [PATCH 00/11] another rt group sched update


* Peter Zijlstra <[email protected]> wrote:

> this time compile tested on all 16 combinations of:
>
> CONFIG_SMP
> CONFIG_FAIR_GROUP_SCHED
> CONFIG_HIGH_RES_TIMERS
> CONFIG_NO_HZ
>
> ran some but not all combinations

thanks, applied. This is a really big step forwards in terms of making
RT task CPU usage more flexible and more manageable.

Ingo

2008-01-07 11:26:57

by Peter Zijlstra

Subject: Re: [PATCH 00/11] another rt group sched update


On Mon, 2008-01-07 at 11:51 +0100, Peter Zijlstra wrote:

> - I occasionally see a weird lockup on iterating the task_groups list
> on smp machines. - I failed to see anything wrong, but hey, this
> stack of used brown paper bags is steadily growing.

D'oh

---

Don't add a task_group to the task_groups list for each cpu, there is
only a single task_group.


Signed-off-by: Peter Zijlstra <[email protected]>
---

Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -7175,6 +7175,10 @@ void __init sched_init(void)
init_defrootdomain();
#endif

+#ifdef CONFIG_FAIR_GROUP_SCHED
+ list_add(&init_task_group.list, &task_groups);
+#endif
+
for_each_possible_cpu(i) {
struct rq *rq;

@@ -7201,8 +7205,6 @@ void __init sched_init(void)
init_tg_rt_entry(rq, &init_task_group,
&per_cpu(init_rt_rq, i),
&per_cpu(init_sched_rt_entity, i), i, 1);
-
- list_add(&init_task_group.list, &task_groups);
#endif

for (j = 0; j < CPU_LOAD_IDX_MAX; j++)


2008-01-07 12:10:24

by Srivatsa Vaddagiri

Subject: Re: [PATCH 00/11] another rt group sched update

On Mon, Jan 07, 2008 at 11:51:20AM +0100, Peter Zijlstra wrote:
> - figure out what to do for UID based group scheduling, the current
> implementation leaves it impossible for !root users to execute
> real time tasks by setting rt_runtime_us to 0, and it has no way
> to change it.
>
> Srivatsa, what happened to the per uid weight patches?, Perhaps we
> can extend that interface to allow changing this.

Hi Peter,
The sysfs interface for tweaking each user's share should be in
mainline already (sysfs_create_file() in user_kobject_create()). This
could be extended for your purpose, hopefully in a straightforward
manner (you never know that with sysfs :(

--
Regards,
vatsa

2008-01-07 12:28:34

by Peter Zijlstra

Subject: Re: [PATCH 00/11] another rt group sched update


On Mon, 2008-01-07 at 17:53 +0530, Srivatsa Vaddagiri wrote:
> On Mon, Jan 07, 2008 at 11:51:20AM +0100, Peter Zijlstra wrote:
> > - figure out what to do for UID based group scheduling, the current
> > implementation leaves it impossible for !root users to execute
> > real time tasks by setting rt_runtime_us to 0, and it has no way
> > to change it.
> >
> > Srivatsa, what happened to the per uid weight patches?, Perhaps we
> > can extend that interface to allow changing this.
>
> Hi Peter,
> The sysfs interface for tweaking each user's share should be in
> mainline already (sysfs_create_file() in user_kobject_create()). This
> could be extended for your purpose, hopefully in a straightforward
> manner (you never know that with sysfs :(

Ah, I missed that going in. Thanks, I'll give it a go.



2008-01-07 16:58:53

by Peter Zijlstra

Subject: [PATCH 12/11] sched: rt-group: uid-group interface


Subject: sched: rt-group: add uid-group interface

Extend the /sys/kernel/uids/<uid>/ interface to allow setting
the group's rt_period and rt_runtime.

Signed-off-by: Peter Zijlstra <[email protected]>
---
include/linux/sched.h | 4 +-
kernel/user.c | 93 +++++++++++++++++++++++++++++++++++++++++++-------
2 files changed, 84 insertions(+), 13 deletions(-)

Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -559,7 +559,9 @@ struct user_struct {
struct task_group *tg;
#ifdef CONFIG_SYSFS
struct kset kset;
- struct subsys_attribute user_attr;
+ struct subsys_attribute share_attr;
+ struct subsys_attribute rt_period_attr;
+ struct subsys_attribute rt_runtime_attr;
struct work_struct work;
#endif
#endif
Index: linux-2.6/kernel/user.c
===================================================================
--- linux-2.6.orig/kernel/user.c
+++ linux-2.6/kernel/user.c
@@ -129,7 +129,7 @@ static inline void uids_mutex_unlock(voi
}

/* return cpu shares held by the user */
-static ssize_t cpu_shares_show(struct kset *kset, char *buffer)
+static ssize_t cpu_share_show(struct kset *kset, char *buffer)
{
struct user_struct *up = container_of(kset, struct user_struct, kset);

@@ -137,8 +137,8 @@ static ssize_t cpu_shares_show(struct ks
}

/* modify cpu shares held by the user */
-static ssize_t cpu_shares_store(struct kset *kset, const char *buffer,
- size_t size)
+static ssize_t cpu_share_store(struct kset *kset, const char *buffer,
+ size_t size)
{
struct user_struct *up = container_of(kset, struct user_struct, kset);
unsigned long shares;
@@ -151,12 +151,67 @@ static ssize_t cpu_shares_store(struct k
return (rc ? rc : size);
}

-static void user_attr_init(struct subsys_attribute *sa, char *name, int mode)
+static ssize_t cpu_rt_period_show(struct kset *kset, char *buffer)
{
- sa->attr.name = name;
- sa->attr.mode = mode;
- sa->show = cpu_shares_show;
- sa->store = cpu_shares_store;
+ struct user_struct *up = container_of(kset, struct user_struct, kset);
+
+ return sprintf(buffer, "%lu\n", sched_group_rt_period(up->tg));
+}
+
+static ssize_t cpu_rt_period_store(struct kset *kset, const char *buffer,
+ size_t size)
+{
+ struct user_struct *up = container_of(kset, struct user_struct, kset);
+ unsigned long rt_period_us;
+ int rc;
+
+ sscanf(buffer, "%lu", &rt_period_us);
+ rc = sched_group_set_rt_period(up->tg, rt_period_us);
+
+ return (rc ?: size);
+}
+
+static ssize_t cpu_rt_runtime_show(struct kset *kset, char *buffer)
+{
+ struct user_struct *up = container_of(kset, struct user_struct, kset);
+
+ return sprintf(buffer, "%lu\n", sched_group_rt_runtime(up->tg));
+}
+
+static ssize_t cpu_rt_runtime_store(struct kset *kset, const char *buffer,
+ size_t size)
+{
+ struct user_struct *up = container_of(kset, struct user_struct, kset);
+ unsigned long rt_runtime_us;
+ int rc;
+
+ sscanf(buffer, "%lu", &rt_runtime_us);
+ rc = sched_group_set_rt_runtime(up->tg, rt_runtime_us);
+
+ return (rc ?: size);
+}
+
+static void user_attr_init(struct user_struct *up)
+{
+ struct subsys_attribute *sa;
+
+ sa = &up->share_attr;
+ sa->attr.name = "cpu_share";
+ sa->attr.mode = 0644;
+ sa->show = cpu_share_show;
+ sa->store = cpu_share_store;
+
+ sa = &up->rt_period_attr;
+ sa->attr.name = "cpu_rt_period_us";
+ sa->attr.mode = 0644;
+ sa->show = cpu_rt_period_show;
+ sa->store = cpu_rt_period_store;
+
+ sa = &up->rt_runtime_attr;
+ sa->attr.name = "cpu_rt_runtime_us";
+ sa->attr.mode = 0644;
+ sa->show = cpu_rt_runtime_show;
+ sa->store = cpu_rt_runtime_store;
}

/* Create "/sys/kernel/uids/<uid>" directory and
@@ -172,15 +227,27 @@ static int user_kobject_create(struct us
kobj->parent = &uids_kobject; /* create under /sys/kernel/uids dir */
kobject_set_name(kobj, "%d", up->uid);
kset_init(kset);
- user_attr_init(&up->user_attr, "cpu_share", 0644);
+ user_attr_init(up);

error = kobject_add(kobj);
if (error)
goto done;

- error = sysfs_create_file(kobj, &up->user_attr.attr);
+ error = sysfs_create_file(kobj, &up->share_attr.attr);
+ if (error)
+ goto error1;
+ error = sysfs_create_file(kobj, &up->rt_period_attr.attr);
if (error)
- kobject_del(kobj);
+ goto error2;
+ error = sysfs_create_file(kobj, &up->rt_runtime_attr.attr);
+ if (error)
+ goto error3;
+
+ if (0) {
+error3: sysfs_remove_file(kobj, &up->rt_period_attr.attr);
+error2: sysfs_remove_file(kobj, &up->share_attr.attr);
+error1: kobject_del(kobj);
+ }

kobject_uevent(kobj, KOBJ_ADD);

@@ -238,7 +305,9 @@ static void remove_user_sysfs_dir(struct
if (!remove_user)
goto done;

- sysfs_remove_file(kobj, &up->user_attr.attr);
+ sysfs_remove_file(kobj, &up->share_attr.attr);
+ sysfs_remove_file(kobj, &up->rt_period_attr.attr);
+ sysfs_remove_file(kobj, &up->rt_runtime_attr.attr);
kobject_uevent(kobj, KOBJ_REMOVE);
kobject_del(kobj);


2008-01-08 10:34:14

by Ingo Molnar

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface


* Peter Zijlstra <[email protected]> wrote:

> Subject: sched: rt-group: add uid-group interface
>
> Extend the /sys/kernel/uids/<uid>/ interface to allow setting the
> group's rt_period and rt_runtime.

thanks, applied.

Ingo

2008-01-08 10:58:22

by Dhaval Giani

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface

On Mon, Jan 07, 2008 at 05:57:42PM +0100, Peter Zijlstra wrote:
>
> Subject: sched: rt-group: add uid-group interface
>
> Extend the /sys/kernel/uids/<uid>/ interface to allow setting
> the group's rt_period and rt_runtime.
>

Hi Peter,

Cool stuff! I will try out these patches and try to give you some
feedback.

One request though, could you please add some documentation to
Documentation/ABI/testing/sysfs-kernel-uids?

Thanks,
--
regards,
Dhaval

2008-01-08 11:02:44

by Peter Zijlstra

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface


On Tue, 2008-01-08 at 16:27 +0530, Dhaval Giani wrote:
> On Mon, Jan 07, 2008 at 05:57:42PM +0100, Peter Zijlstra wrote:
> >
> > Subject: sched: rt-group: add uid-group interface
> >
> > Extend the /sys/kernel/uids/<uid>/ interface to allow setting
> > the group's rt_period and rt_runtime.
> >
>
> Hi Peter,
>
> Cool stuff! I will try out these patches and try to give you some
> feedback.

Thanks, much appreciated!

> One request though, could you please add some documentation to
> Documentation/ABI/testing/sysfs-kernel-uids?

I already have documentation on the todo list, I'll add this file to
that list :-)



2008-01-08 14:31:22

by Kay Sievers

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface

On Jan 8, 2008 12:02 PM, Peter Zijlstra <[email protected]> wrote:
>
> On Tue, 2008-01-08 at 16:27 +0530, Dhaval Giani wrote:
> > On Mon, Jan 07, 2008 at 05:57:42PM +0100, Peter Zijlstra wrote:
> > >
> > > Subject: sched: rt-group: add uid-group interface
> > >
> > > Extend the /sys/kernel/uids/<uid>/ interface to allow setting
> > > the group's rt_period and rt_runtime.
> > >
> >
> > Hi Peter,
> >
> > Cool stuff! I will try out these patches and try to give you some
> > feedback.
>
> Thanks, much appreciated!
>
> > One request though, could you please add some documentation to
> > Documentation/ABI/testing/sysfs-kernel-uids?
>
> I already have documentation on the todo list, I'll add this file to
> that list :-)

Care to rebase the patch against -mm, we fixed the mixed-up usage
of ksets and kobjects, and this can not apply anymore:
http://git.kernel.org/?p=linux/kernel/git/gregkh/patches.git;a=blob;f=driver/struct-user_info-sysfs.patch;hb=HEAD

There is also an attribute group now which makes it much easier to add
new files.

Thanks,
Kay

2008-01-08 23:27:40

by Peter Zijlstra

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface


On Tue, 2008-01-08 at 16:27 +0530, Dhaval Giani wrote:
> On Mon, Jan 07, 2008 at 05:57:42PM +0100, Peter Zijlstra wrote:
> >
> > Subject: sched: rt-group: add uid-group interface
> >
> > Extend the /sys/kernel/uids/<uid>/ interface to allow setting
> > the group's rt_period and rt_runtime.
> >
>
> Hi Peter,
>
> Cool stuff! I will try out these patches and try to give you some
> feedback.
>
> One request though, could you please add some documentation to
> Documentation/ABI/testing/sysfs-kernel-uids?

compile tested only attempt at finalizing the interface

Signed-off-by: Peter Zijlstra <[email protected]>
---
Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -1519,8 +1519,6 @@ extern unsigned int sysctl_sched_child_r
extern unsigned int sysctl_sched_features;
extern unsigned int sysctl_sched_migration_cost;
extern unsigned int sysctl_sched_nr_migrate;
-extern unsigned int sysctl_sched_rt_period;
-extern unsigned int sysctl_sched_rt_runtime;
#if defined(CONFIG_FAIR_GROUP_SCHED) && defined(CONFIG_SMP)
extern unsigned int sysctl_sched_min_bal_int_shares;
extern unsigned int sysctl_sched_max_bal_int_shares;
@@ -1530,6 +1528,8 @@ int sched_nr_latency_handler(struct ctl_
struct file *file, void __user *buffer, size_t *length,
loff_t *ppos);
#endif
+extern unsigned int sysctl_sched_rt_period;
+extern int sysctl_sched_rt_runtime;

extern unsigned int sysctl_sched_compat_yield;

@@ -2017,8 +2017,8 @@ extern void sched_move_task(struct task_
extern int sched_group_set_shares(struct task_group *tg, unsigned long shares);
extern unsigned long sched_group_shares(struct task_group *tg);
extern int sched_group_set_rt_runtime(struct task_group *tg,
- unsigned long rt_runtime_us);
-extern unsigned long sched_group_rt_runtime(struct task_group *tg);
+ long rt_runtime_us);
+extern long sched_group_rt_runtime(struct task_group *tg);
extern int sched_group_set_rt_period(struct task_group *tg,
unsigned long rt_runtime_us);
extern unsigned long sched_group_rt_period(struct task_group *tg);
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -649,13 +649,18 @@ const_debug unsigned int sysctl_sched_nr
* period over which we measure rt task cpu usage in us.
* default: 1s
*/
-const_debug unsigned int sysctl_sched_rt_period = 1000000;
+unsigned int sysctl_sched_rt_period = 1000000;

/*
* part of the period that we allow rt tasks to run in us.
* default: 0.95s
*/
-const_debug unsigned int sysctl_sched_rt_runtime = 950000;
+int sysctl_sched_rt_runtime = 950000;
+
+/*
+ * single value that denotes runtime == period, ie unlimited time.
+ */
+#define RUNTIME_INF ((u64)~0ULL)

/*
* For kernel-internal use: high-speed (but slightly incorrect) per-cpu
@@ -7751,7 +7756,7 @@ struct task_group *sched_create_group(vo
goto err;

tg->shares = NICE_0_LOAD;
- tg->rt_runtime = 0; /* XXX */
+ tg->rt_runtime = 0;
tg->rt_period = ns_to_ktime(sysctl_sched_rt_period * NSEC_PER_USEC);

for_each_possible_cpu(i) {
@@ -7956,9 +7961,12 @@ static DEFINE_MUTEX(rt_constraints_mutex

static unsigned long to_ratio(u64 period, u64 runtime)
{
- u64 r = runtime * (1ULL << 16);
- do_div(r, period);
- return r;
+ if (runtime == RUNTIME_INF)
+ return 1ULL << 16;
+
+ runtime *= (1ULL << 16);
+ do_div(runtime, period);
+ return runtime;
}

static int __rt_schedulable(struct task_group *tg, u64 period, u64 runtime)
@@ -7980,12 +7988,15 @@ static int __rt_schedulable(struct task_
return total + to_ratio(period, runtime) < global_ratio;
}

-int sched_group_set_rt_runtime(struct task_group *tg,
- unsigned long rt_runtime_us)
+int sched_group_set_rt_runtime(struct task_group *tg, long rt_runtime_us)
{
- u64 rt_runtime = (u64)rt_runtime_us * NSEC_PER_USEC;
+ u64 rt_runtime;
int err = 0;

+ rt_runtime = (u64)rt_runtime_us * NSEC_PER_USEC;
+ if (rt_runtime_us == -1)
+ rt_runtime = RUNTIME_INF;
+
mutex_lock(&rt_constraints_mutex);
if (!__rt_schedulable(tg, ktime_to_ns(tg->rt_period), rt_runtime)) {
err = -EINVAL;
@@ -7999,10 +8010,14 @@ int sched_group_set_rt_runtime(struct ta
return err;
}

-unsigned long sched_group_rt_runtime(struct task_group *tg)
+long sched_group_rt_runtime(struct task_group *tg)
{
- u64 rt_runtime_us = tg->rt_runtime;
+ u64 rt_runtime_us;

+ if (tg->rt_runtime == RUNTIME_INF)
+ return -1;
+
+ rt_runtime_us = tg->rt_runtime;
do_div(rt_runtime_us, NSEC_PER_USEC);
return rt_runtime_us;
}
@@ -8108,15 +8123,49 @@ static u64 cpu_shares_read_uint(struct c
return (u64) tg->shares;
}

-static int cpu_rt_runtime_write_uint(struct cgroup *cgrp, struct cftype *cftype,
- u64 rt_runtime_val)
-{
- return sched_group_set_rt_runtime(cgroup_tg(cgrp), rt_runtime_val);
+static int cpu_rt_runtime_write(struct cgroup *cgrp, struct cftype *cft,
+ struct file *file,
+ const char __user *userbuf,
+ size_t nbytes, loff_t *unused_ppos)
+{
+ char buffer[64];
+ int retval = 0;
+ s64 val;
+ char *end;
+
+ if (!nbytes)
+ return -EINVAL;
+ if (nbytes >= sizeof(buffer))
+ return -E2BIG;
+ if (copy_from_user(buffer, userbuf, nbytes))
+ return -EFAULT;
+
+ buffer[nbytes] = 0; /* nul-terminate */
+
+ /* strip newline if necessary */
+ if (nbytes && (buffer[nbytes-1] == '\n'))
+ buffer[nbytes-1] = 0;
+ val = simple_strtoll(buffer, &end, 0);
+ if (*end)
+ return -EINVAL;
+
+ /* Pass to subsystem */
+ retval = sched_group_set_rt_runtime(cgroup_tg(cgrp), val);
+ if (!retval)
+ retval = nbytes;
+ return retval;
}

-static u64 cpu_rt_runtime_read_uint(struct cgroup *cgrp, struct cftype *cft)
-{
- return sched_group_rt_runtime(cgroup_tg(cgrp));
+static ssize_t cpu_rt_runtime_read(struct cgroup *cgrp, struct cftype *cft,
+ struct file *file,
+ char __user *buf, size_t nbytes,
+ loff_t *ppos)
+{
+ char tmp[64];
+ long val = sched_group_rt_runtime(cgroup_tg(cgrp));
+ int len = sprintf(tmp, "%ld\n", val);
+
+ return simple_read_from_buffer(buf, nbytes, ppos, tmp, len);
}

static int cpu_rt_period_write_uint(struct cgroup *cgrp, struct cftype *cftype,
@@ -8138,8 +8187,8 @@ static struct cftype cpu_files[] = {
},
{
.name = "rt_runtime_us",
- .read_uint = cpu_rt_runtime_read_uint,
- .write_uint = cpu_rt_runtime_write_uint,
+ .read = cpu_rt_runtime_read,
+ .write = cpu_rt_runtime_write,
},
{
.name = "rt_period_us",
Index: linux-2.6/kernel/sched_rt.c
===================================================================
--- linux-2.6.orig/kernel/sched_rt.c
+++ linux-2.6/kernel/sched_rt.c
@@ -60,7 +60,7 @@ static inline int on_rt_rq(struct sched_
static inline u64 sched_rt_runtime(struct rt_rq *rt_rq)
{
if (!rt_rq->tg)
- return 0;
+ return RUNTIME_INF;

return rt_rq->tg->rt_runtime;
}
@@ -220,6 +220,9 @@ static struct sched_rt_entity *next_rt_d

static inline u64 sched_rt_runtime(struct rt_rq *rt_rq)
{
+ if (sysctl_sched_rt_runtime == -1)
+ return RUNTIME_INF;
+
return (u64)sysctl_sched_rt_runtime * NSEC_PER_USEC;
}

@@ -304,7 +307,7 @@ static int sched_rt_runtime_exceeded(str
{
u64 runtime = sched_rt_runtime(rt_rq);

- if (!runtime)
+ if (runtime == RUNTIME_INF)
goto out;

if (rt_rq->rt_throttled)
Index: linux-2.6/kernel/sysctl.c
===================================================================
--- linux-2.6.orig/kernel/sysctl.c
+++ linux-2.6/kernel/sysctl.c
@@ -309,22 +309,6 @@ static struct ctl_table kern_table[] = {
.mode = 0644,
.proc_handler = &proc_dointvec,
},
- {
- .ctl_name = CTL_UNNUMBERED,
- .procname = "sched_rt_period_us",
- .data = &sysctl_sched_rt_period,
- .maxlen = sizeof(unsigned int),
- .mode = 0644,
- .proc_handler = &proc_dointvec,
- },
- {
- .ctl_name = CTL_UNNUMBERED,
- .procname = "sched_rt_runtime_us",
- .data = &sysctl_sched_rt_runtime,
- .maxlen = sizeof(unsigned int),
- .mode = 0644,
- .proc_handler = &proc_dointvec,
- },
#if defined(CONFIG_FAIR_GROUP_SCHED) && defined(CONFIG_SMP)
{
.ctl_name = CTL_UNNUMBERED,
@@ -346,6 +330,22 @@ static struct ctl_table kern_table[] = {
#endif
{
.ctl_name = CTL_UNNUMBERED,
+ .procname = "sched_rt_period_us",
+ .data = &sysctl_sched_rt_period,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec,
+ },
+ {
+ .ctl_name = CTL_UNNUMBERED,
+ .procname = "sched_rt_runtime_us",
+ .data = &sysctl_sched_rt_runtime,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec,
+ },
+ {
+ .ctl_name = CTL_UNNUMBERED,
.procname = "sched_compat_yield",
.data = &sysctl_sched_compat_yield,
.maxlen = sizeof(unsigned int),
Index: linux-2.6/kernel/user.c
===================================================================
--- linux-2.6.orig/kernel/user.c
+++ linux-2.6/kernel/user.c
@@ -175,17 +175,17 @@ static ssize_t cpu_rt_runtime_show(struc
{
struct user_struct *up = container_of(kset, struct user_struct, kset);

- return sprintf(buffer, "%lu\n", sched_group_rt_runtime(up->tg));
+ return sprintf(buffer, "%ld\n", sched_group_rt_runtime(up->tg));
}

static ssize_t cpu_rt_runtime_store(struct kset *kset, const char *buffer,
size_t size)
{
struct user_struct *up = container_of(kset, struct user_struct, kset);
- unsigned long rt_runtime_us;
+ long rt_runtime_us;
int rc;

- sscanf(buffer, "%lu", &rt_runtime_us);
+ sscanf(buffer, "%ld", &rt_runtime_us);
rc = sched_group_set_rt_runtime(up->tg, rt_runtime_us);

return (rc ?: size);
Index: linux-2.6/Documentation/ABI/testing/sysfs-kernel-uids
===================================================================
--- linux-2.6.orig/Documentation/ABI/testing/sysfs-kernel-uids
+++ linux-2.6/Documentation/ABI/testing/sysfs-kernel-uids
@@ -12,3 +12,14 @@ Description:
B has shares = 2048, User B will get twice the CPU
bandwidth user A will. For more details refer
Documentation/sched-design-CFS.txt
+
+What: /sys/kernel/uids/<uid>/cpu_rt_period_us
+Date: January 2008
+Contact: Peter Zijlstra <[email protected]>
+Description: See Documentation/sched-rt-group.txt
+
+What: /sys/kernel/uids/<uid>/cpu_rt_runtime_us
+Date: January 2008
+Contact: Peter Zijlstra <[email protected]>
+Description: See Documentation/sched-rt-group.txt
+
Index: linux-2.6/Documentation/sched-rt-group.txt
===================================================================
--- /dev/null
+++ linux-2.6/Documentation/sched-rt-group.txt
@@ -0,0 +1,69 @@
+
+
+Real-Time group scheduling.
+
+The problem space:
+
+In order to schedule multiple groups of realtime tasks each group must
+be assigned a fixed portion of the cpu time available. Without a minimum
+guarantee a realtime group can obviously fall short. A fuzzy upper limit
+is of no use since it cannot be relied upon. Which leaves us with just
+the single fixed portion.
+
+CPU time is divided by means of specifying how much time can be spent
+running in a given period. Say a frame fixed realtime renderer must
+deliver 25 frames a second, which yields a period of 0.04s. Now say
+it will also have to play some music and respond to input, leaving it
+with around 80% for the graphics. We can then give this group a runtime
+of 0.8 * 0.04s = 0.032s.
+
+This way the graphics group will have a 0.04s period with a 0.032s runtime
+limit.
+
+Now if the audio thread needs to refill the dma buffer every 0.005s, but
+needs only about 3% cpu time to do so, it can do with a 0.03 * 0.005s
+= 0.00015s runtime.
+
+If it so happens that the graphics group runs at a higher priority than
+the audio group it might be that the audio group will not get CPU time
+in time to meet its deadline. Whereas the graphics group will still easily
+make its deadline if it were delayed for the amount of time the audio
+group needs.
+
+This problem is solved using Earliest Deadline First (EDF) scheduling of the
+realtime groups.
+
+The Interface:
+
+system wide:
+
+/proc/sys/kernel/sched_rt_period_us
+/proc/sys/kernel/sched_rt_runtime_us
+
+CONFIG_FAIR_USER_SCHED
+
+/sys/kernel/uids/<uid>/cpu_rt_period_us
+/sys/kernel/uids/<uid>/cpu_rt_runtime_us
+
+or
+
+CONFIG_FAIR_CGROUP_SCHED
+
+/cgroup/<cgroup>/cpu.rt_period_us
+/cgroup/<cgroup>/cpu.rt_runtime_us
+
+[ time is specified in us because the interface is s32, this gives an
+ operating range of ~35m to 1us ]
+
+The period takes values in [ 1, INT_MAX ], runtime in [ -1, INT_MAX - 1 ].
+
+A runtime of -1 specifies runtime == period, ie. no limit.
+
+New groups get the period from /proc/sys/kernel/sched_rt_period_us and
+a runtime of 0.
+
+Settings are constrained to:
+
+ \Sum_{i} runtime_{i} / period_{i} <= global_runtime / global_period
+
+in order to keep the configuration schedulable.

2008-01-08 23:36:23

by Peter Zijlstra

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface


On Tue, 2008-01-08 at 15:31 +0100, Kay Sievers wrote:
> On Jan 8, 2008 12:02 PM, Peter Zijlstra <[email protected]> wrote:
> >
> > On Tue, 2008-01-08 at 16:27 +0530, Dhaval Giani wrote:
> > > On Mon, Jan 07, 2008 at 05:57:42PM +0100, Peter Zijlstra wrote:
> > > >
> > > > Subject: sched: rt-group: add uid-group interface
> > > >
> > > > Extend the /sys/kernel/uids/<uid>/ interface to allow setting
> > > > the group's rt_period and rt_runtime.
> > > >
> > >
> > > Hi Peter,
> > >
> > > Cool stuff! I will try out these patches and try to give you some
> > > feedback.
> >
> > Thanks, much appreciated!
> >
> > > One request though, could you please add some documentation to
> > > Documentation/ABI/testing/sysfs-kernel-uids?
> >
> > I already have documentation on the todo list, I'll add this file to
> > that list :-)
>
> Care to rebase the patch against -mm, we fixed the mixed-up usage
> of ksets and kobjects, and this can not apply anymore:
> http://git.kernel.org/?p=linux/kernel/git/gregkh/patches.git;a=blob;f=driver/struct-user_info-sysfs.patch;hb=HEAD
>
> There is also an attribute group now which makes it much easier to add
> new files.

Ingo, Greg,

What would be the easiest way to carry this forward? sched-devel and
greg's tree would intersect at this point and leave poor akpm with the
resulting mess. Should I just make an incremental patch akpm can carry
and push? Or can we base one tree off the other?


2008-01-08 23:55:22

by Greg KH

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface

On Wed, Jan 09, 2008 at 12:35:32AM +0100, Peter Zijlstra wrote:
>
> On Tue, 2008-01-08 at 15:31 +0100, Kay Sievers wrote:
> > On Jan 8, 2008 12:02 PM, Peter Zijlstra <[email protected]> wrote:
> > >
> > > On Tue, 2008-01-08 at 16:27 +0530, Dhaval Giani wrote:
> > > > On Mon, Jan 07, 2008 at 05:57:42PM +0100, Peter Zijlstra wrote:
> > > > >
> > > > > Subject: sched: rt-group: add uid-group interface
> > > > >
> > > > > Extend the /sys/kernel/uids/<uid>/ interface to allow setting
> > > > > the group's rt_period and rt_runtime.
> > > > >
> > > >
> > > > Hi Peter,
> > > >
> > > > Cool stuff! I will try out these patches and try to give you some
> > > > feedback.
> > >
> > > Thanks, much appreciated!
> > >
> > > > One request though, could you please add some documentation to
> > > > Documentation/ABI/testing/sysfs-kernel-uids?
> > >
> > > I already have documentation on the todo list, I'll add this file to
> > > that list :-)
> >
> > Care to rebase the patch against -mm, we fixed the mixed-up usage
> > of ksets and kobjects, and this can not apply anymore:
> > http://git.kernel.org/?p=linux/kernel/git/gregkh/patches.git;a=blob;f=driver/struct-user_info-sysfs.patch;hb=HEAD
> >
> > There is also an attribute group now which makes it much easier to add
> > new files.
>
> Ingo, Greg,
>
> What would be the easiest way to carry this forward? sched-devel and
> greg's tree would intersect at this point and leave poor akpm with the
> resulting mess. Should I just make an incremental patch akpm can carry
> and push? Or can we base one tree off the other?

If it's just a single patch for this, I'd be glad to take it. But by
looking at the [11/12] above, I doubt this is so...

If it's not that rough (12 patches is not a big deal), I'd be glad to
take these through my tree, after you fix up Kay's requests above :)

thanks,

greg k-h

2008-01-08 23:58:32

by Ingo Molnar

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface


* Greg KH <[email protected]> wrote:

> On Wed, Jan 09, 2008 at 12:35:32AM +0100, Peter Zijlstra wrote:
> >
> > On Tue, 2008-01-08 at 15:31 +0100, Kay Sievers wrote:
> > > On Jan 8, 2008 12:02 PM, Peter Zijlstra <[email protected]> wrote:
> > > >
> > > > On Tue, 2008-01-08 at 16:27 +0530, Dhaval Giani wrote:
> > > > > On Mon, Jan 07, 2008 at 05:57:42PM +0100, Peter Zijlstra wrote:
> > > > > >
> > > > > > Subject: sched: rt-group: add uid-group interface
> > > > > >
> > > > > > Extend the /sys/kernel/uids/<uid>/ interface to allow setting
> > > > > > the group's rt_period and rt_runtime.
> > > > > >
> > > > >
> > > > > Hi Peter,
> > > > >
> > > > > Cool stuff! I will try out these patches and try to give you some
> > > > > feedback.
> > > >
> > > > Thanks, much appreciated!
> > > >
> > > > > One request though, could you please add some documentation to
> > > > > Documentation/ABI/testing/sysfs-kernel-uids?
> > > >
> > > > I already have documentation on the todo list, I'll add this file to
> > > > that list :-)
> > >
> > > Care to rebase the patch against -mm, we fixed the mixed-up usage
> > > of ksets and kobjects, and this can not apply anymore:
> > > http://git.kernel.org/?p=linux/kernel/git/gregkh/patches.git;a=blob;f=driver/struct-user_info-sysfs.patch;hb=HEAD
> > >
> > > There is also an attribute group now which makes it much easier to add
> > > new files.
> >
> > Ingo, Greg,
> >
> > What would be the easiest way to carry this forward? sched-devel and
> > greg's tree would intersect at this point and leave poor akpm with the
> > resulting mess. Should I just make an incremental patch akpm can carry
> > and push? Or can we base one tree off the other?
>
> If it's just a single patch for this, I'd be glad to take it. But by
> looking at the [11/12] above, I doubt this is so...
>
> If it's not that rough (12 patches is not a big deal), I'd be glad to
> take these through my tree, after you fix up Kay's requests above :)

hm, i'd really like to see this tested and go through sched.git. It's
only the few sysfs bits which interfere, right?

Ingo

2008-01-10 03:17:57

by Greg KH

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface

On Wed, Jan 09, 2008 at 12:57:50AM +0100, Ingo Molnar wrote:
>
> * Greg KH <[email protected]> wrote:
>
> > On Wed, Jan 09, 2008 at 12:35:32AM +0100, Peter Zijlstra wrote:
> > >
> > > On Tue, 2008-01-08 at 15:31 +0100, Kay Sievers wrote:
> > > > On Jan 8, 2008 12:02 PM, Peter Zijlstra <[email protected]> wrote:
> > > > >
> > > > > On Tue, 2008-01-08 at 16:27 +0530, Dhaval Giani wrote:
> > > > > > On Mon, Jan 07, 2008 at 05:57:42PM +0100, Peter Zijlstra wrote:
> > > > > > >
> > > > > > > Subject: sched: rt-group: add uid-group interface
> > > > > > >
> > > > > > > Extend the /sys/kernel/uids/<uid>/ interface to allow setting
> > > > > > > the group's rt_period and rt_runtime.
> > > > > > >
> > > > > >
> > > > > > Hi Peter,
> > > > > >
> > > > > > Cool stuff! I will try out these patches and try to give you some
> > > > > > feedback.
> > > > >
> > > > > Thanks, much appreciated!
> > > > >
> > > > > > One request though, could you please add some documentation to
> > > > > > Documentation/ABI/testing/sysfs-kernel-uids?
> > > > >
> > > > > I already have documentation on the todo list, I'll add this file to
> > > > > that list :-)
> > > >
> > > > Care to rebase the patch against -mm, we fixed the mixed-up usage
> > > > of ksets and kobjects, and this can not apply anymore:
> > > > http://git.kernel.org/?p=linux/kernel/git/gregkh/patches.git;a=blob;f=driver/struct-user_info-sysfs.patch;hb=HEAD
> > > >
> > > > There is also an attribute group now which makes it much easier to add
> > > > new files.
> > >
> > > Ingo, Greg,
> > >
> > > What would be the easiest way to carry this forward? sched-devel and
> > > greg's tree would intersect at this point and leave poor akpm with the
> > > resulting mess. Should I just make an incremental patch akpm can carry
> > > and push? Or can we base one tree off the other?
> >
> > If it's just a single patch for this, I'd be glad to take it. But by
> > looking at the [11/12] above, I doubt this is so...
> >
> > If it's not that rough (12 patches is not a big deal), I'd be glad to
> > take these through my tree, after you fix up Kay's requests above :)
>
> hm, i'd really like to see this tested and go through sched.git. It's
> only the few sysfs bits which interfere, right?

Yes, that should be it.

So why not put the majority of this through sched.git, then when my
sysfs changes go in at the beginning of the .25 merge cycle, you can
then add the sysfs changes through your tree or anywhere else.

Unless you are relying on the sysfs changes for this whole feature, and
without them it just doesn't make any sense at all?

thanks,

greg k-h

2008-02-07 04:17:55

by Dhaval Giani

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface

On Wed, Jan 09, 2008 at 04:05:31PM -0800, Greg KH wrote:
> > > > Ingo, Greg,
> > > >
> > > > What would be the easiest way to carry this forward? sched-devel and
> > > > greg's tree would intersect at this point and leave poor akpm with the
> > > > resulting mess. Should I just make an incremental patch akpm can carry
> > > > and push? Or can we base one tree off the other?
> > >
> > > If it's just a single patch for this, I'd be glad to take it. But by
> > > looking at the [11/12] above, I doubt this is so...
> > >
> > > If it's not that rough (12 patches is not a big deal), I'd be glad to
> > > take these through my tree, after you fix up Kay's requests above :)
> >
> > hm, i'd really like to see this tested and go through sched.git. It's
> > only the few sysfs bits which interfere, right?
>
> Yes, that should be it.
>
> So why not put the majority of this through sched.git, then when my
> sysfs changes go in at the beginning of the .25 merge cycle, you can
> then add the sysfs changes through your tree or anywhere else.
>

Hi,

I was wondering where these changes are right now. I don't see the sysfs
interface for rt-group-sched in mainline right now.

Thanks,
--
regards,
Dhaval

2008-02-07 05:43:28

by Greg KH

Subject: Re: [PATCH 12/11] sched: rt-group: uid-group interface

On Thu, Feb 07, 2008 at 09:47:22AM +0530, Dhaval Giani wrote:
> On Wed, Jan 09, 2008 at 04:05:31PM -0800, Greg KH wrote:
> > > > > Ingo, Greg,
> > > > >
> > > > > What would be the easiest way to carry this forward? sched-devel and
> > > > > greg's tree would intersect at this point and leave poor akpm with the
> > > > > resulting mess. Should I just make an incremental patch akpm can carry
> > > > > and push? Or can we base one tree off the other?
> > > >
> > > > If it's just a single patch for this, I'd be glad to take it. But by
> > > > looking at the [11/12] above, I doubt this is so...
> > > >
> > > > If it's not that rough (12 patches is not a big deal), I'd be glad to
> > > > take these through my tree, after you fix up Kay's requests above :)
> > >
> > > hm, i'd really like to see this tested and go through sched.git. It's
> > > only the few sysfs bits which interfere, right?
> >
> > Yes, that should be it.
> >
> > So why not put the majority of this through sched.git, then when my
> > sysfs changes go in at the beginning of the .25 merge cycle, you can
> > then add the sysfs changes through your tree or anywhere else.
> >
>
> Hi,
>
> I was wondering where these changes are right now. I don't see the sysfs
> interface for rt-group-sched in mainline right now.

All of the sysfs changes I had are in Linus's tree, so you don't need me
anymore :)

thanks,

greg k-h