2011-03-02 16:40:07

by Kirill A. Shutemov

[permalink] [raw]
Subject: [PATCH, v7] Introduce timer slack controller

From: Kirill A. Shutemov <[email protected]>

Changelog:

v7:
- totally reworked interface and rewritten from scratch
(See Documentation/cgroups/timer_slack.txt for more information)
v6:
- add documentation
- use notifier_call_chain() instead of check hook
- fix validate_change()
- cleanup
v5:
- -EBUSY on writing to timer_slack.min_slack_ns/max_slack_ns if a child has
wider min-max range
v4:
- hierarchy support
- drop dummy_timer_slack_check()
- workaround lockdep false (?) positive
- allow 0 as timer slack value
v3:
- rework interface
- s/EXPORT_SYMBOL/EXPORT_SYMBOL_GPL/
v2:
- fixed with CONFIG_CGROUP_TIMER_SLACK=y
v1:
- initial revision

Kirill A. Shutemov (1):
cgroups: introduce timer slack controller

Documentation/cgroups/timer_slack.txt | 66 ++++++++++++++++++++
fs/select.c | 7 +--
include/linux/cgroup_subsys.h | 6 ++
include/linux/sched.h | 9 +++
init/Kconfig | 8 +++
kernel/Makefile | 1 +
kernel/cgroup_timer_slack.c | 107 +++++++++++++++++++++++++++++++++
kernel/futex.c | 4 +-
kernel/hrtimer.c | 2 +-
9 files changed, 202 insertions(+), 8 deletions(-)
create mode 100644 Documentation/cgroups/timer_slack.txt
create mode 100644 kernel/cgroup_timer_slack.c

--
1.7.4.1


2011-03-02 16:40:08

by Kirill A. Shutemov

[permalink] [raw]
Subject: [PATCH, v7] cgroups: introduce timer slack controller

From: Kirill A. Shutemov <[email protected]>

Every task_struct has timer_slack_ns value. This value uses to round up
poll() and select() timeout values. This feature can be useful in
mobile environment where combined wakeups are desired.

cgroup subsys "timer_slack" implement timer slack controller. It
provides a way to set minimal timer slack value for a group of tasks.
If a task belongs to a cgroup with minimal timer slack value higher than
task's value, cgroup's value will be applied.

Idea-by: Jacob Pan <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
---
Documentation/cgroups/timer_slack.txt | 66 ++++++++++++++++++++
fs/select.c | 7 +--
include/linux/cgroup_subsys.h | 6 ++
include/linux/sched.h | 9 +++
init/Kconfig | 8 +++
kernel/Makefile | 1 +
kernel/cgroup_timer_slack.c | 107 +++++++++++++++++++++++++++++++++
kernel/futex.c | 4 +-
kernel/hrtimer.c | 2 +-
9 files changed, 202 insertions(+), 8 deletions(-)
create mode 100644 Documentation/cgroups/timer_slack.txt
create mode 100644 kernel/cgroup_timer_slack.c

diff --git a/Documentation/cgroups/timer_slack.txt b/Documentation/cgroups/timer_slack.txt
new file mode 100644
index 0000000..16ac066
--- /dev/null
+++ b/Documentation/cgroups/timer_slack.txt
@@ -0,0 +1,66 @@
+Timer Slack Controller
+=====================
+
+Overview
+--------
+
+Every task_struct has timer_slack_ns value. This value uses to round up
+poll() and select() timeout values. This feature can be useful in
+mobile environment where combined wakeups are desired.
+
+cgroup subsys "timer_slack" implement timer slack controller. It
+provides a way to set minimal timer slack value for a group of tasks.
+If a task belongs to a cgroup with minimal timer slack value higher than
+task's value, cgroup's value will be applied.
+
+User interface
+--------------
+
+To get timer slack controller functionality you need to enable it in
+kernel configuration:
+
+CONFIG_CGROUP_TIMER_SLACK=y
+
+The controller provides only one file:
+
+# mount -t cgroup -o timer_slack none /sys/fs/cgroup
+# ls /sys/fs/cgroup/timer_slack.*
+/sys/fs/cgroup/timer_slack.min_slack_ns
+
+By defeault it's 0:
+
+# cat /sys/fs/cgroup/timer_slack.min_slack_ns
+0
+
+You can set it to some value:
+
+# echo 50000 > /sys/fs/cgroup/timer_slack.min_slack_ns
+# cat /sys/fs/cgroup/timer_slack.min_slack_ns
+50000
+
+Tasks still can set task's value below 50000 using prctl(), but in this
+case cgroup's value will be applied.
+
+Timer slack controller supports hierarchical groups. The only rule:
+parent's minimal timer slack value should be less or equal to child's.
+
+# mkdir /sys/fs/cgroup/a
+# cat /sys/fs/cgroup/a/timer_slack.min_slack_ns
+50000
+# echo 70000 > /sys/fs/cgroup/a/timer_slack.min_slack_ns
+# cat /sys/fs/cgroup/a/timer_slack.min_slack_ns
+70000
+
+You'll get -EPERM, if you try to set child's timer_slack.min_slack_ns >
+parent's timer_slack.min_slack_ns:
+
+# /bin/echo 40000 > /sys/fs/cgroup/a/timer_slack.min_slack_ns
+/bin/echo: write error: Operation not permitted
+
+Child's value will be adjusted if necessary on parent's value update:
+
+# echo 100000 > /sys/fs/cgroup/timer_slack.min_slack_ns
+# cat /sys/fs/cgroup/timer_slack.min_slack_ns
+100000
+# cat /sys/fs/cgroup/a/timer_slack.min_slack_ns
+100000
diff --git a/fs/select.c b/fs/select.c
index e56560d..a189e4d 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -69,7 +69,6 @@ static long __estimate_accuracy(struct timespec *tv)

long select_estimate_accuracy(struct timespec *tv)
{
- unsigned long ret;
struct timespec now;

/*
@@ -81,10 +80,8 @@ long select_estimate_accuracy(struct timespec *tv)

ktime_get_ts(&now);
now = timespec_sub(*tv, now);
- ret = __estimate_accuracy(&now);
- if (ret < current->timer_slack_ns)
- return current->timer_slack_ns;
- return ret;
+ return clamp(__estimate_accuracy(&now),
+ get_task_timer_slack(current), LONG_MAX);
}


diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index ccefff0..e399228 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -66,3 +66,9 @@ SUBSYS(blkio)
#endif

/* */
+
+#ifdef CONFIG_CGROUP_TIMER_SLACK
+SUBSYS(timer_slack)
+#endif
+
+/* */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 777d8a5..3751aaa 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2620,6 +2620,15 @@ static inline unsigned long rlimit_max(unsigned int limit)
return task_rlimit_max(current, limit);
}

+#ifdef CONFIG_CGROUP_TIMER_SLACK
+extern unsigned long get_task_timer_slack(struct task_struct *tsk);
+#else
+static inline unsigned long get_task_timer_slack(struct task_struct *tsk)
+{
+ return tsk->timer_slack_ns;
+}
+#endif
+
#endif /* __KERNEL__ */

#endif
diff --git a/init/Kconfig b/init/Kconfig
index be788c0..bbc4d9c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -596,6 +596,14 @@ config CGROUP_FREEZER
Provides a way to freeze and unfreeze all tasks in a
cgroup.

+config CGROUP_TIMER_SLACK
+ bool "Timer slack cgroup controller"
+ help
+ Provides a way to set minimal timer slack value for tasks in
+ a cgroup.
+ It's useful in mobile devices where certain background apps
+ are attached to a cgroup and combined wakeups are desired.
+
config CGROUP_DEVICE
bool "Device controller for cgroups"
help
diff --git a/kernel/Makefile b/kernel/Makefile
index 353d3fe..0b60239 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
obj-$(CONFIG_COMPAT) += compat.o
obj-$(CONFIG_CGROUPS) += cgroup.o
obj-$(CONFIG_CGROUP_FREEZER) += cgroup_freezer.o
+obj-$(CONFIG_CGROUP_TIMER_SLACK) += cgroup_timer_slack.o
obj-$(CONFIG_CPUSETS) += cpuset.o
obj-$(CONFIG_CGROUP_NS) += ns_cgroup.o
obj-$(CONFIG_UTS_NS) += utsname.o
diff --git a/kernel/cgroup_timer_slack.c b/kernel/cgroup_timer_slack.c
new file mode 100644
index 0000000..c300125
--- /dev/null
+++ b/kernel/cgroup_timer_slack.c
@@ -0,0 +1,107 @@
+#include <linux/cgroup.h>
+#include <linux/slab.h>
+
+struct cgroup_subsys timer_slack_subsys;
+struct tslack_cgroup {
+ struct cgroup_subsys_state css;
+ unsigned long min_slack_ns;
+};
+
+static struct tslack_cgroup *cgroup_to_tslack(struct cgroup *cgroup)
+{
+ struct cgroup_subsys_state *css;
+
+ css = cgroup_subsys_state(cgroup, timer_slack_subsys.subsys_id);
+ return container_of(css, struct tslack_cgroup, css);
+}
+
+static struct cgroup_subsys_state *tslack_create(struct cgroup_subsys *subsys,
+ struct cgroup *cgroup)
+{
+ struct tslack_cgroup *tslack_cgroup;
+
+ tslack_cgroup = kmalloc(sizeof(*tslack_cgroup), GFP_KERNEL);
+ if (!tslack_cgroup)
+ return ERR_PTR(-ENOMEM);
+
+ if (cgroup->parent) {
+ struct tslack_cgroup *parent = cgroup_to_tslack(cgroup->parent);
+ tslack_cgroup->min_slack_ns = parent->min_slack_ns;
+ } else
+ tslack_cgroup->min_slack_ns = 0UL;
+
+ return &tslack_cgroup->css;
+}
+
+static void tslack_destroy(struct cgroup_subsys *tslack_cgroup,
+ struct cgroup *cgroup)
+{
+ kfree(cgroup_to_tslack(cgroup));
+}
+
+static u64 tslack_read_min(struct cgroup *cgroup, struct cftype *cft)
+{
+ return cgroup_to_tslack(cgroup)->min_slack_ns;
+}
+
+static int tslack_write_min(struct cgroup *cgroup, struct cftype *cft, u64 val)
+{
+ struct cgroup *cur;
+
+ if (val > ULONG_MAX)
+ return -EINVAL;
+
+ /* the min timer slack value should be more or equal than parent's */
+ if (cgroup->parent) {
+ struct tslack_cgroup *parent = cgroup_to_tslack(cgroup->parent);
+ if (parent->min_slack_ns > val)
+ return -EPERM;
+ }
+
+ cgroup_to_tslack(cgroup)->min_slack_ns = val;
+
+ /* update children's min slack value if needed */
+ list_for_each_entry(cur, &cgroup->children, sibling) {
+ struct tslack_cgroup *child = cgroup_to_tslack(cur);
+ if (val > child->min_slack_ns)
+ tslack_write_min(cur, cft, val);
+ }
+
+ return 0;
+}
+
+static struct cftype files[] = {
+ {
+ .name = "min_slack_ns",
+ .read_u64 = tslack_read_min,
+ .write_u64 = tslack_write_min,
+ }
+};
+
+static int tslack_populate(struct cgroup_subsys *subsys, struct cgroup *cgroup)
+{
+ return cgroup_add_files(cgroup, subsys, files, ARRAY_SIZE(files));
+}
+
+struct cgroup_subsys timer_slack_subsys = {
+ .name = "timer_slack",
+ .subsys_id = timer_slack_subsys_id,
+ .create = tslack_create,
+ .destroy = tslack_destroy,
+ .populate = tslack_populate,
+};
+
+unsigned long get_task_timer_slack(struct task_struct *tsk)
+{
+ struct cgroup_subsys_state *css;
+ struct tslack_cgroup *tslack_cgroup;
+ unsigned long ret;
+
+ rcu_read_lock();
+ css = task_subsys_state(tsk, timer_slack_subsys.subsys_id);
+ tslack_cgroup = container_of(css, struct tslack_cgroup, css);
+ ret = max(tsk->timer_slack_ns, tslack_cgroup->min_slack_ns);
+ rcu_read_unlock();
+
+ return ret;
+}
diff --git a/kernel/futex.c b/kernel/futex.c
index b766d28..eca8773 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1845,7 +1845,7 @@ static int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
HRTIMER_MODE_ABS);
hrtimer_init_sleeper(to, current);
hrtimer_set_expires_range_ns(&to->timer, *abs_time,
- current->timer_slack_ns);
+ get_task_timer_slack(current));
}

retry:
@@ -2242,7 +2242,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
HRTIMER_MODE_ABS);
hrtimer_init_sleeper(to, current);
hrtimer_set_expires_range_ns(&to->timer, *abs_time,
- current->timer_slack_ns);
+ get_task_timer_slack(current));
}

/*
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 0c8d7c0..cdf47ba 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1542,7 +1542,7 @@ long hrtimer_nanosleep(struct timespec *rqtp, struct timespec __user *rmtp,
int ret = 0;
unsigned long slack;

- slack = current->timer_slack_ns;
+ slack = get_task_timer_slack(current);
if (rt_task(current))
slack = 0;

--
1.7.4.1

2011-03-02 18:41:34

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH, v7] cgroups: introduce timer slack controller

On Wed, 2 Mar 2011, Kirill A. Shutsemov wrote:

Not CC'ing me does not avoid another review :)

> diff --git a/fs/select.c b/fs/select.c
> index e56560d..a189e4d 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -69,7 +69,6 @@ static long __estimate_accuracy(struct timespec *tv)
>
> long select_estimate_accuracy(struct timespec *tv)
> {
> - unsigned long ret;
> struct timespec now;
>
> /*
> @@ -81,10 +80,8 @@ long select_estimate_accuracy(struct timespec *tv)
>
> ktime_get_ts(&now);
> now = timespec_sub(*tv, now);
> - ret = __estimate_accuracy(&now);
> - if (ret < current->timer_slack_ns)
> - return current->timer_slack_ns;
> - return ret;
> + return clamp(__estimate_accuracy(&now),
> + get_task_timer_slack(current), LONG_MAX);
> }

Can you please split out the get_task_timer_slack() change into a
separate patch?

Also the function wants to be named different.

task_get_effective_timer_slack() or such, so it becomes clear, that
it's not just a wrapper around tsk->timer_slack_ns

And the places which access tsk->timer_slack_ns directly should be
updated with comments, why they are not using the wrapper.

> +static int tslack_write_min(struct cgroup *cgroup, struct cftype *cft, u64 val)
> +{
> + struct cgroup *cur;
> +
> + if (val > ULONG_MAX)
> + return -EINVAL;
> +
> + /* the min timer slack value should be more or equal than parent's */

s/should/must/

> + if (cgroup->parent) {
> + struct tslack_cgroup *parent = cgroup_to_tslack(cgroup->parent);

New line between variables and code please

> + if (parent->min_slack_ns > val)
> + return -EPERM;
> + }
> +
> + cgroup_to_tslack(cgroup)->min_slack_ns = val;
> +
> + /* update children's min slack value if needed */
> + list_for_each_entry(cur, &cgroup->children, sibling) {
> + struct tslack_cgroup *child = cgroup_to_tslack(cur);

Ditto

> + if (val > child->min_slack_ns)
> + tslack_write_min(cur, cft, val);
> + }

So, we can increase the value and propagate it through the groups
children, but decreasing it requires to go through all child groups
seperately.

When I'm trying to optimize something during runtime and chose a too
high value I need to go through hoops and loops to make it smaller
again.

Asymetric behaviour sucks always.

> +unsigned long get_task_timer_slack(struct task_struct *tsk)
> +{
> + struct cgroup_subsys_state *css;
> + struct tslack_cgroup *tslack_cgroup;
> + unsigned long ret;
> +
> + rcu_read_lock();

Did you just remove the odd comment or actually figure out why you
need rcu_read_lock() here ?

> + css = task_subsys_state(tsk, timer_slack_subsys.subsys_id);
> + tslack_cgroup = container_of(css, struct tslack_cgroup, css);
> + ret = max(tsk->timer_slack_ns, tslack_cgroup->min_slack_ns);
> + rcu_read_unlock();
> +
> + return ret;
> +}

Otherwise, it's way more palatable than the last one.

Thanks,

tglx

2011-03-03 06:57:34

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH, v7] cgroups: introduce timer slack controller

>> +unsigned long get_task_timer_slack(struct task_struct *tsk)
>> +{
>> + struct cgroup_subsys_state *css;
>> + struct tslack_cgroup *tslack_cgroup;
>> + unsigned long ret;
>> +
>> + rcu_read_lock();
>
> Did you just remove the odd comment or actually figure out why you
> need rcu_read_lock() here ?
>

It's necessary to protect against task exiting or task moving between cgroups.

>> + css = task_subsys_state(tsk, timer_slack_subsys.subsys_id);
>> + tslack_cgroup = container_of(css, struct tslack_cgroup, css);
>> + ret = max(tsk->timer_slack_ns, tslack_cgroup->min_slack_ns);
>> + rcu_read_unlock();
>> +
>> + return ret;
>> +}
>
> Otherwise, it's way more palatable than the last one.
>
> Thanks,
>
> tglx
>

2011-03-03 07:35:52

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH, v7] cgroups: introduce timer slack controller

On Thu, 3 Mar 2011, Li Zefan wrote:

> >> +unsigned long get_task_timer_slack(struct task_struct *tsk)
> >> +{
> >> + struct cgroup_subsys_state *css;
> >> + struct tslack_cgroup *tslack_cgroup;
> >> + unsigned long ret;
> >> +
> >> + rcu_read_lock();
> >
> > Did you just remove the odd comment or actually figure out why you
> > need rcu_read_lock() here ?
> >
>
> It's necessary to protect against task exiting or task moving between cgroups.

I know, though after the last review I wanted to make sure, that the
author understands it as well and not just removed the odd comment
just because I ranted about it :)

Thanks,

tglx

2011-03-03 07:46:18

by Matt Helsley

[permalink] [raw]
Subject: Re: [PATCH, v7] cgroups: introduce timer slack controller

On Wed, Mar 02, 2011 at 07:40:01PM +0100, Thomas Gleixner wrote:
> On Wed, 2 Mar 2011, Kirill A. Shutsemov wrote:

<snip>

> > + if (val > child->min_slack_ns)
> > + tslack_write_min(cur, cft, val);
> > + }
>
> So, we can increase the value and propagate it through the groups
> children, but decreasing it requires to go through all child groups
> seperately.

One goal is to restrict shrinking timer slacks so that they
cannot be less than the parent cgroup's.

>
> When I'm trying to optimize something during runtime and chose a too
> high value I need to go through hoops and loops to make it smaller
> again.
>
> Asymetric behaviour sucks always.

Well, in this case I think the asymmetry is less sucky because I don't
see any point in imposing the same restrictions on raising the timer
slack *except* this symmetry argument. But I haven't played much with
timer slacks so I don't know: Is there any case where raising the timer
slack would be harmful?

Would making the values additive solve the symmetry problem? In other words,
the "minimum" you see in a cgroups min_slack_ns file is the minimum in
addition to the minimum timer slack of its parent. Then, so long as negative
values are disallowed, you can't possibly write values that violate this
restriction. We could re-evaluate the resulting minimum timer slack internally
during writes to the file so functions using timer slack won't have to walk
the cgroup parents to calculate it but it couldn't result in EPERM...

Cheers,
-Matt Helsley

2011-03-03 08:30:24

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH, v7] cgroups: introduce timer slack controller

On Wed, 2 Mar 2011, Matt Helsley wrote:

> On Wed, Mar 02, 2011 at 07:40:01PM +0100, Thomas Gleixner wrote:
> > On Wed, 2 Mar 2011, Kirill A. Shutsemov wrote:
>
> <snip>
>
> > > + if (val > child->min_slack_ns)
> > > + tslack_write_min(cur, cft, val);
> > > + }
> >
> > So, we can increase the value and propagate it through the groups
> > children, but decreasing it requires to go through all child groups
> > seperately.
>
> One goal is to restrict shrinking timer slacks so that they
> cannot be less than the parent cgroup's.

Right, but that's not what the code is doing.

> >
> > When I'm trying to optimize something during runtime and chose a too
> > high value I need to go through hoops and loops to make it smaller
> > again.
> >
> > Asymetric behaviour sucks always.
>
> Well, in this case I think the asymmetry is less sucky because I don't
> see any point in imposing the same restrictions on raising the timer
> slack *except* this symmetry argument. But I haven't played much with
> timer slacks so I don't know: Is there any case where raising the timer
> slack would be harmful?

Well, the slack can delay your timer expiry by slack ns. So chosing a
too large slack value can affect the functionality of an application.

Now imagine you are tuning your system and typo the slack value by an
order of magnitude, which makes your apps unhappy. Then you definitely
want to be able to undo that w/o going through loops and circles.

> Would making the values additive solve the symmetry problem? In other words,
> the "minimum" you see in a cgroups min_slack_ns file is the minimum in
> addition to the minimum timer slack of its parent. Then, so long as negative
> values are disallowed, you can't possibly write values that violate this
> restriction. We could re-evaluate the resulting minimum timer slack internally
> during writes to the file so functions using timer slack won't have to walk
> the cgroup parents to calculate it but it couldn't result in EPERM...

That would work, but that wants some interface to read out the
effective slack value for a group.

Thanks,

tglx

2011-03-03 08:33:48

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH, v7] cgroups: introduce timer slack controller

On Wed, Mar 02, 2011 at 07:40:01PM +0100, Thomas Gleixner wrote:
> On Wed, 2 Mar 2011, Kirill A. Shutsemov wrote:
>
> Not CC'ing me does not avoid another review :)

Sorry for that. It's by mistake.

> > diff --git a/fs/select.c b/fs/select.c
> > index e56560d..a189e4d 100644
> > --- a/fs/select.c
> > +++ b/fs/select.c
> > @@ -69,7 +69,6 @@ static long __estimate_accuracy(struct timespec *tv)
> >
> > long select_estimate_accuracy(struct timespec *tv)
> > {
> > - unsigned long ret;
> > struct timespec now;
> >
> > /*
> > @@ -81,10 +80,8 @@ long select_estimate_accuracy(struct timespec *tv)
> >
> > ktime_get_ts(&now);
> > now = timespec_sub(*tv, now);
> > - ret = __estimate_accuracy(&now);
> > - if (ret < current->timer_slack_ns)
> > - return current->timer_slack_ns;
> > - return ret;
> > + return clamp(__estimate_accuracy(&now),
> > + get_task_timer_slack(current), LONG_MAX);
> > }
>
> Can you please split out the get_task_timer_slack() change into a
> separate patch?
>
> Also the function wants to be named different.
>
> task_get_effective_timer_slack() or such, so it becomes clear, that
> it's not just a wrapper around tsk->timer_slack_ns
>
> And the places which access tsk->timer_slack_ns directly should be
> updated with comments, why they are not using the wrapper.

Ok, I'll fix it.

> > +static int tslack_write_min(struct cgroup *cgroup, struct cftype *cft, u64 val)
> > +{
> > + struct cgroup *cur;
> > +
> > + if (val > ULONG_MAX)
> > + return -EINVAL;
> > +
> > + /* the min timer slack value should be more or equal than parent's */
>
> s/should/must/

Ok.

> > + if (cgroup->parent) {
> > + struct tslack_cgroup *parent = cgroup_to_tslack(cgroup->parent);
>
> New line between variables and code please

Ok.

> > + if (parent->min_slack_ns > val)
> > + return -EPERM;
> > + }
> > +
> > + cgroup_to_tslack(cgroup)->min_slack_ns = val;
> > +
> > + /* update children's min slack value if needed */
> > + list_for_each_entry(cur, &cgroup->children, sibling) {
> > + struct tslack_cgroup *child = cgroup_to_tslack(cur);
>
> Ditto
>
> > + if (val > child->min_slack_ns)
> > + tslack_write_min(cur, cft, val);
> > + }
>
> So, we can increase the value and propagate it through the groups
> children, but decreasing it requires to go through all child groups
> seperately.
>
> When I'm trying to optimize something during runtime and chose a too
> high value I need to go through hoops and loops to make it smaller
> again.
>
> Asymetric behaviour sucks always.

Other option is to remove restriction on the min_slack_ns value and
modify get_task_timer_slack() to use the most strict value up by
hierarchy.
Do you prefer this approach?

> > +unsigned long get_task_timer_slack(struct task_struct *tsk)
> > +{
> > + struct cgroup_subsys_state *css;
> > + struct tslack_cgroup *tslack_cgroup;
> > + unsigned long ret;
> > +
> > + rcu_read_lock();
>
> Did you just remove the odd comment or actually figure out why you
> need rcu_read_lock() here ?

The second ;)

> > + css = task_subsys_state(tsk, timer_slack_subsys.subsys_id);
> > + tslack_cgroup = container_of(css, struct tslack_cgroup, css);
> > + ret = max(tsk->timer_slack_ns, tslack_cgroup->min_slack_ns);
> > + rcu_read_unlock();
> > +
> > + return ret;
> > +}
>
> Otherwise, it's way more palatable than the last one.

Thanks for reviewing.

--
Kirill A. Shutemov

2011-03-04 19:02:15

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH, v7] cgroups: introduce timer slack controller

On Wed, 2011-03-02 at 18:40 +0200, Kirill A. Shutsemov wrote:
> - if (ret < current->timer_slack_ns)
> - return current->timer_slack_ns;
> - return ret;
> + return clamp(__estimate_accuracy(&now),
> + get_task_timer_slack(current), LONG_MAX);

That actually makes the code worse, how about:

min(__estimate_accuracy(), get_task_timer_slack()) ?


2011-03-04 23:22:18

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [PATCH, v7] cgroups: introduce timer slack controller

On Fri, Mar 04, 2011 at 08:03:42PM +0100, Peter Zijlstra wrote:
> On Wed, 2011-03-02 at 18:40 +0200, Kirill A. Shutsemov wrote:
> > - if (ret < current->timer_slack_ns)
> > - return current->timer_slack_ns;
> > - return ret;
> > + return clamp(__estimate_accuracy(&now),
> > + get_task_timer_slack(current), LONG_MAX);
>
> That actually makes the code worse, how about:
>
> min(__estimate_accuracy(), get_task_timer_slack()) ?

It's better. Thanks.

--
Kirill A. Shutemov