2009-12-21 20:33:20

by Ben Blum

[permalink] [raw]
Subject: [PATCH 0/4] cgroups: support for module-loadable subsystems

[This is a revision of http://lkml.org/lkml/2009/12/4/53 ]

This patch series implements support for building, loading, and
unloading subsystems as modules, both within and outside the kernel
source tree. It provides an interface cgroup_load_subsys() and
cgroup_unload_subsys() which modular subsystems can use to register and
depart during runtime. The net_cls classifier subsystem serves as the
example for a subsystem which can be converted into a module using these
changes.

Patch #1 sets up the subsys[] array so its contents can be dynamic as
modules appear and (eventually) disappear. Iterations over the array are
modified to handle when subsystems are absent, and the dynamic section
of the array is protected by cgroup_mutex.

Patch #2 implements an interface for modules to load subsystems, called
cgroup_load_subsys, similar to cgroup_init_subsys, and adds a module
pointer in struct cgroup_subsys.

Patch #3 modifies the net_cls subsystem, which already had some module
declarations, to be configurable as a module, which also serves as a
simple proof-of-concept.

Patch #4 adds a mechanism for unloading modular subsystems, which
includes a more advanced rework of the rudimentary reference counting
introduced in patch 2. It also modifies the exitcall of net_cls to use
this interface.

Part of implementing patches 2 and 4 involved updating css pointers in
each css_set when the module appears or leaves. In doing this, it was
discovered that css_sets always remain linked to the dummy cgroup,
regardless of whether or not any subsystems are actually bound to it
(i.e., not mounted on an actual hierarchy). The subsystem loading and
unloading code therefore should keep in mind the special cases where the
added subsystem is the only one in the dummy cgroup (and therefore all
css_sets need to be linked back into it) and where the removed subsys
was the only one in the dummy cgroup (and therefore all css_sets should
be unlinked from it) - however, as all css_sets always stay attached to
the dummy cgroup anyway, these cases are ignored. Any fix that addresses
this issue should also make sure these cases are addressed in the
subsystem loading and unloading code.

-- bblum

---

Documentation/cgroups/cgroups.txt | 6
include/linux/cgroup.h | 16 +
kernel/cgroup.c | 337 +++++++++++++++++++++++++++++++++-----
net/sched/Kconfig | 5
net/sched/cls_cgroup.c | 40 +++-
5 files changed, 350 insertions(+), 54 deletions(-)


2009-12-21 20:36:37

by Ben Blum

[permalink] [raw]
Subject: [PATCH 1/4] cgroups: revamp subsys array

Make subsys[] able to be dynamically populated to support modular subsystems

From: Ben Blum <[email protected]>

This patch reworks the way the subsys[] array is used so that subsystems can
register themselves after boot time, and enables the internals of cgroups to
be able to handle when subsystems are not present or may appear/disappear.

Signed-off-by: Ben Blum <[email protected]>
---

include/linux/cgroup.h | 8 ++++-
kernel/cgroup.c | 77 +++++++++++++++++++++++++++++++++++++++---------
2 files changed, 68 insertions(+), 17 deletions(-)


diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 642a47f..d7f1545 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -41,13 +41,17 @@ extern int cgroupstats_build(struct cgroupstats *stats,

extern struct file_operations proc_cgroup_operations;

-/* Define the enumeration of all cgroup subsystems */
+/* Define the enumeration of all builtin cgroup subsystems */
#define SUBSYS(_x) _x ## _subsys_id,
enum cgroup_subsys_id {
#include <linux/cgroup_subsys.h>
- CGROUP_SUBSYS_COUNT
+ CGROUP_BUILTIN_SUBSYS_COUNT
};
#undef SUBSYS
+/* This define indicates the maximum number of subsystems that can be loaded
+ * at once. We limit to this many since cgroupfs_root has subsys_bits to keep
+ * track of all of them. */
+#define CGROUP_SUBSYS_COUNT (BITS_PER_BYTE*sizeof(unsigned long))

/* Per-subsystem/per-cgroup state maintained by the system. */
struct cgroup_subsys_state {
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 7af6168..ece9321 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -56,10 +56,12 @@

static DEFINE_MUTEX(cgroup_mutex);

-/* Generate an array of cgroup subsystem pointers */
+/* Generate an array of cgroup subsystem pointers. At boot time, this is
+ * populated up to CGROUP_BUILTIN_SUBSYS_COUNT, and modular subsystems are
+ * registered after that. The mutable section of this array is protected by
+ * cgroup_mutex. */
#define SUBSYS(_x) &_x ## _subsys,
-
-static struct cgroup_subsys *subsys[] = {
+static struct cgroup_subsys *subsys[CGROUP_SUBSYS_COUNT] = {
#include <linux/cgroup_subsys.h>
};

@@ -433,8 +435,9 @@ static struct css_set *find_existing_css_set(
struct hlist_node *node;
struct css_set *cg;

- /* Built the set of subsystem state objects that we want to
- * see in the new css_set */
+ /* Build the set of subsystem state objects that we want to see in the
+ * new css_set. while subsystems can change globally, the entries here
+ * won't change, so no need for locking. */
for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
if (root->subsys_bits & (1UL << i)) {
/* Subsystem is in this hierarchy. So we want
@@ -869,7 +872,9 @@ void cgroup_release_and_wakeup_rmdir(struct cgroup_subsys_state *css)
css_put(css);
}

-
+/*
+ * Call with cgroup_mutex held.
+ */
static int rebind_subsystems(struct cgroupfs_root *root,
unsigned long final_bits)
{
@@ -877,6 +882,8 @@ static int rebind_subsystems(struct cgroupfs_root *root,
struct cgroup *cgrp = &root->top_cgroup;
int i;

+ BUG_ON(!mutex_is_locked(&cgroup_mutex));
+
removed_bits = root->actual_subsys_bits & ~final_bits;
added_bits = final_bits & ~root->actual_subsys_bits;
/* Check that any added subsystems are currently free */
@@ -885,6 +892,10 @@ static int rebind_subsystems(struct cgroupfs_root *root,
struct cgroup_subsys *ss = subsys[i];
if (!(bit & added_bits))
continue;
+ /* nobody should tell us to do a subsys that doesn't exist:
+ * parse_cgroupfs_options should catch that case and refcounts
+ * ensure that subsystems won't disappear once selected. */
+ BUG_ON(ss == NULL);
if (ss->root != &rootnode) {
/* Subsystem isn't free */
return -EBUSY;
@@ -904,6 +915,7 @@ static int rebind_subsystems(struct cgroupfs_root *root,
unsigned long bit = 1UL << i;
if (bit & added_bits) {
/* We're binding this subsystem to this hierarchy */
+ BUG_ON(ss == NULL);
BUG_ON(cgrp->subsys[i]);
BUG_ON(!dummytop->subsys[i]);
BUG_ON(dummytop->subsys[i]->cgroup != dummytop);
@@ -917,6 +929,7 @@ static int rebind_subsystems(struct cgroupfs_root *root,
mutex_unlock(&ss->hierarchy_mutex);
} else if (bit & removed_bits) {
/* We're removing this subsystem */
+ BUG_ON(ss == NULL);
BUG_ON(cgrp->subsys[i] != dummytop->subsys[i]);
BUG_ON(cgrp->subsys[i]->cgroup != cgrp);
mutex_lock(&ss->hierarchy_mutex);
@@ -929,6 +942,7 @@ static int rebind_subsystems(struct cgroupfs_root *root,
mutex_unlock(&ss->hierarchy_mutex);
} else if (bit & final_bits) {
/* Subsystem state should already exist */
+ BUG_ON(ss == NULL);
BUG_ON(!cgrp->subsys[i]);
} else {
/* Subsystem state shouldn't exist */
@@ -971,14 +985,18 @@ struct cgroup_sb_opts {

};

-/* Convert a hierarchy specifier into a bitmask of subsystems and
- * flags. */
+/*
+ * Convert a hierarchy specifier into a bitmask of subsystems and flags. Call
+ * with cgroup_mutex held to protect the subsys[] array.
+ */
static int parse_cgroupfs_options(char *data,
struct cgroup_sb_opts *opts)
{
char *token, *o = data ?: "all";
unsigned long mask = (unsigned long)-1;

+ BUG_ON(!mutex_is_locked(&cgroup_mutex));
+
#ifdef CONFIG_CPUSETS
mask = ~(1UL << cpuset_subsys_id);
#endif
@@ -994,6 +1012,8 @@ static int parse_cgroupfs_options(char *data,
opts->subsys_bits = 0;
for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
+ if (ss == NULL)
+ continue;
if (!ss->disabled)
opts->subsys_bits |= 1ul << i;
}
@@ -1038,6 +1058,8 @@ static int parse_cgroupfs_options(char *data,
int i;
for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
ss = subsys[i];
+ if (ss == NULL)
+ continue;
if (!strcmp(token, ss->name)) {
if (!ss->disabled)
set_bit(i, &opts->subsys_bits);
@@ -1291,7 +1313,9 @@ static int cgroup_get_sb(struct file_system_type *fs_type,
struct cgroupfs_root *new_root;

/* First find the desired set of subsystems */
+ mutex_lock(&cgroup_mutex);
ret = parse_cgroupfs_options(data, &opts);
+ mutex_unlock(&cgroup_mutex);
if (ret)
goto out_err;

@@ -3277,8 +3301,12 @@ static void cgroup_lock_hierarchy(struct cgroupfs_root *root)
/* We need to take each hierarchy_mutex in a consistent order */
int i;

+ /* no worry about a race with rebind_subsystems that might mess up the
+ * locking order, since both parties are under cgroup_mutex. */
for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
+ if (ss == NULL)
+ continue;
if (ss->root == root)
mutex_lock(&ss->hierarchy_mutex);
}
@@ -3290,6 +3318,8 @@ static void cgroup_unlock_hierarchy(struct cgroupfs_root *root)

for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
+ if (ss == NULL)
+ continue;
if (ss->root == root)
mutex_unlock(&ss->hierarchy_mutex);
}
@@ -3410,11 +3440,14 @@ static int cgroup_has_css_refs(struct cgroup *cgrp)
* synchronization other than RCU, and the subsystem linked
* list isn't RCU-safe */
int i;
+ /* we won't need to lock the subsys array, because the subsystems
+ * we're concerned about aren't going anywhere since our cgroup root
+ * has a reference on them. */
for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
struct cgroup_subsys_state *css;
- /* Skip subsystems not in this hierarchy */
- if (ss->root != cgrp->root)
+ /* Skip subsystems not present or not in this hierarchy */
+ if (ss == NULL || ss->root != cgrp->root)
continue;
css = cgrp->subsys[ss->subsys_id];
/* When called from check_for_release() it's possible
@@ -3635,7 +3668,8 @@ int __init cgroup_init_early(void)
for (i = 0; i < CSS_SET_TABLE_SIZE; i++)
INIT_HLIST_HEAD(&css_set_table[i]);

- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
+ /* at bootup time, we don't worry about modular subsystems */
+ for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];

BUG_ON(!ss->name);
@@ -3670,7 +3704,8 @@ int __init cgroup_init(void)
if (err)
return err;

- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
+ /* at bootup time, we don't worry about modular subsystems */
+ for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
if (!ss->early_init)
cgroup_init_subsys(ss);
@@ -3779,9 +3814,14 @@ static int proc_cgroupstats_show(struct seq_file *m, void *v)
int i;

seq_puts(m, "#subsys_name\thierarchy\tnum_cgroups\tenabled\n");
+ /* ideally we don't want subsystems moving around while we do this.
+ * cgroup_mutex is also necessary to guarantee an atomic snapshot of
+ * subsys/hierarchy state. */
mutex_lock(&cgroup_mutex);
for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
+ if (ss == NULL)
+ continue;
seq_printf(m, "%s\t%d\t%d\t%d\n",
ss->name, ss->root->hierarchy_id,
ss->root->number_of_cgroups, !ss->disabled);
@@ -3841,7 +3881,10 @@ void cgroup_fork_callbacks(struct task_struct *child)
{
if (need_forkexit_callback) {
int i;
- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
+ /* forkexit callbacks are only supported for builtin modules,
+ * and the builtin section of the subsys array is immutable,
+ * so we don't need to lock the subsys array here. */
+ for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
if (ss->fork)
ss->fork(ss, child);
@@ -3912,7 +3955,9 @@ void cgroup_exit(struct task_struct *tsk, int run_callbacks)
struct css_set *cg;

if (run_callbacks && need_forkexit_callback) {
- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
+ /* modular subsystems can't use callbacks, so no need to lock
+ * the subsys array */
+ for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
if (ss->exit)
ss->exit(ss, tsk);
@@ -4221,7 +4266,9 @@ static int __init cgroup_disable(char *str)
if (!*token)
continue;

- for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
+ /* cgroup_disable, being at boot time, can't know about module
+ * subsystems, so we don't worry about them. */
+ for (i = 0; i < CGROUP_BUILTIN_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];

if (!strcmp(token, ss->name)) {


Attachments:
cgroups-revamp-subsys-array.patch (10.35 kB)

2009-12-21 20:37:43

by Ben Blum

[permalink] [raw]
Subject: [PATCH 2/4] cgroups: subsystem module loading interface

Add interface between cgroups subsystem management and module loading

From: Ben Blum <[email protected]>

This patch implements rudimentary module-loading support for cgroups - namely,
a cgroup_load_subsys (similar to cgroup_init_subsys) for use as a module
initcall, and a struct module pointer in struct cgroup_subsys.

Several functions that might be wanted by modules have had EXPORT_SYMBOL added
to them, but it's unclear exactly which functions want it and which won't.

Signed-off-by: Ben Blum <[email protected]>
---

Documentation/cgroups/cgroups.txt | 3 +
include/linux/cgroup.h | 4 +
kernel/cgroup.c | 114 +++++++++++++++++++++++++++++++++++++
3 files changed, 121 insertions(+), 0 deletions(-)


diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index 3df4b9a..dd0d6f1 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -492,6 +492,9 @@ Each subsystem should:
- add an entry in linux/cgroup_subsys.h
- define a cgroup_subsys object called <name>_subsys

+If a subsystem can be compiled as a module, it should also have in its
+module initcall a call to cgroup_load_subsys().
+
Each subsystem may export the following methods. The only mandatory
methods are create/destroy. Any others that are null are presumed to
be successful no-ops.
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index d7f1545..c8474c4 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -38,6 +38,7 @@ extern void cgroup_fork_failed(struct task_struct *p, int run_callbacks,
unsigned long clone_flags);
extern int cgroupstats_build(struct cgroupstats *stats,
struct dentry *dentry);
+extern int cgroup_load_subsys(struct cgroup_subsys *ss);

extern struct file_operations proc_cgroup_operations;

@@ -477,6 +478,9 @@ struct cgroup_subsys {
/* used when use_id == true */
struct idr idr;
spinlock_t id_lock;
+
+ /* should be defined only by modular subsystems */
+ struct module *module;
};

#define SUBSYS(_x) extern struct cgroup_subsys _x ## _subsys;
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index ece9321..bddf96b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2477,6 +2477,7 @@ int cgroup_add_file(struct cgroup *cgrp,
error = PTR_ERR(dentry);
return error;
}
+EXPORT_SYMBOL_GPL(cgroup_add_file);

int cgroup_add_files(struct cgroup *cgrp,
struct cgroup_subsys *subsys,
@@ -2491,6 +2492,7 @@ int cgroup_add_files(struct cgroup *cgrp,
}
return 0;
}
+EXPORT_SYMBOL_GPL(cgroup_add_files);

/**
* cgroup_task_count - count the number of tasks in a cgroup.
@@ -3638,7 +3640,119 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
mutex_init(&ss->hierarchy_mutex);
lockdep_set_class(&ss->hierarchy_mutex, &ss->subsys_key);
ss->active = 1;
+
+ /* this function shouldn't be used with modular subsystems, since they
+ * need to register a subsys_id, among other things */
+ BUG_ON(ss->module);
+}
+
+/**
+ * cgroup_load_subsys: load and register a modular subsystem at runtime
+ * @ss: the subsystem to load
+ *
+ * This function should be called in a modular subsystem's initcall. If the
+ * subsytem is built as a module, it will be assigned a new subsys_id and set
+ * up for use. If the subsystem is built-in anyway, work is delegated to the
+ * simpler cgroup_init_subsys.
+ */
+int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
+{
+ int i;
+ struct cgroup_subsys_state *css;
+
+ /* check name and function validity */
+ if (ss->name == NULL || strlen(ss->name) > MAX_CGROUP_TYPE_NAMELEN ||
+ ss->create == NULL || ss->destroy == NULL)
+ return -EINVAL;
+
+ /* we don't support callbacks in modular subsystems. this check is
+ * before the ss->module check for consistency - a module that *could*
+ * be a module should still have no callbacks for consistency. */
+ if (ss->fork || ss->exit)
+ return -EINVAL;
+
+ /* an optionally modular subsystem is built-in: we want to do nothing,
+ * since cgroup_init_subsys will take care of it. */
+ if (ss->module == NULL) {
+ /* sanity: ss->module NULL only if the subsys is built-in and
+ * appears in subsys[] already. */
+ BUG_ON(ss->subsys_id >= CGROUP_BUILTIN_SUBSYS_COUNT);
+ BUG_ON(subsys[ss->subsys_id] != ss);
+ return 0;
+ }
+
+ /* need to register a subsys id before anything else - for example,
+ * init_cgroup_css needs it. */
+ mutex_lock(&cgroup_mutex);
+ /* find the first empty slot in the array */
+ for (i = CGROUP_BUILTIN_SUBSYS_COUNT; i < CGROUP_SUBSYS_COUNT; i++) {
+ if (subsys[i] == NULL)
+ break;
+ }
+ if (i == CGROUP_SUBSYS_COUNT) {
+ /* maximum number of subsystems already registered! */
+ mutex_unlock(&cgroup_mutex);
+ return -EBUSY;
+ }
+ /* assign ourselves the subsys_id */
+ ss->subsys_id = i;
+ subsys[i] = ss;
+
+ /* no ss->create seems to need anything important in the ss struct, so
+ * this can happen first (i.e. before the rootnode attachment). */
+ css = ss->create(ss, dummytop);
+ if (IS_ERR(css)) {
+ /* failure case - need to deassign the subsys[] slot. */
+ subsys[i] = NULL;
+ mutex_unlock(&cgroup_mutex);
+ return PTR_ERR(css);
+ }
+
+ list_add(&ss->sibling, &rootnode.subsys_list);
+ ss->root = &rootnode;
+
+ /* our new subsystem will be attached to the dummy hierarchy. */
+ init_cgroup_css(css, ss, dummytop);
+ /* now we need to entangle the css into the existing css_sets. unlike
+ * in cgroup_init_subsys, there are now multiple css_sets, so each one
+ * will need a new pointer to it; done by iterating the css_set_table.
+ * furthermore, modifying the existing css_sets will corrupt the hash
+ * table state, so each changed css_set will need its hash recomputed.
+ * this is all done under the css_set_lock. */
+ write_lock(&css_set_lock);
+ for (i = 0; i < CSS_SET_TABLE_SIZE; i++) {
+ struct css_set *cg;
+ struct hlist_node *node, *tmp;
+ struct hlist_head *bucket = &css_set_table[i], *new_bucket;
+ hlist_for_each_entry_safe(cg, node, tmp, bucket, hlist) {
+ /* skip entries that we already rehashed */
+ if (cg->subsys[ss->subsys_id])
+ continue;
+ /* remove existing entry */
+ hlist_del(&cg->hlist);
+ /* set new value */
+ cg->subsys[ss->subsys_id] = css;
+ /* recompute hash and restore entry */
+ new_bucket = css_set_hash(cg->subsys);
+ hlist_add_head(&cg->hlist, new_bucket);
+ }
+ }
+ write_unlock(&css_set_lock);
+
+ mutex_init(&ss->hierarchy_mutex);
+ lockdep_set_class(&ss->hierarchy_mutex, &ss->subsys_key);
+ ss->active = 1;
+
+ /* pin the subsystem's module so it doesn't go away. this shouldn't
+ * fail, since the module's initcall calls us.
+ * TODO: with module unloading, move this elsewhere */
+ BUG_ON(!try_module_get(ss->module));
+
+ /* success! */
+ mutex_unlock(&cgroup_mutex);
+ return 0;
}
+EXPORT_SYMBOL_GPL(cgroup_load_subsys);

/**
* cgroup_init_early - cgroup initialization at system boot


Attachments:
cgroups-subsys-module-interface.patch (7.01 kB)

2009-12-21 20:38:27

by Ben Blum

[permalink] [raw]
Subject: [PATCH 3/4] cgroups: net_cls as module

Allows the net_cls cgroup subsystem to be compiled as a module

From: Ben Blum <[email protected]>

This patch modifies net/sched/cls_cgroup.c to allow the net_cls subsystem to
be optionally compiled as a module instead of builtin. The cgroup_subsys
struct is moved around a bit to allow the subsys_id to be either declared as a
compile-time constant by the cgroup_subsys.h include in cgroup.h, or, if it's
a module, initialized within the struct by cgroup_load_subsys.

Signed-off-by: Ben Blum <[email protected]>
---

net/sched/Kconfig | 5 ++++-
net/sched/cls_cgroup.c | 37 ++++++++++++++++++++++++++++---------
2 files changed, 32 insertions(+), 10 deletions(-)


diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 929218a..08ff9bc 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -328,13 +328,16 @@ config NET_CLS_FLOW
module will be called cls_flow.

config NET_CLS_CGROUP
- bool "Control Group Classifier"
+ tristate "Control Group Classifier"
select NET_CLS
depends on CGROUPS
---help---
Say Y here if you want to classify packets based on the control
cgroup of their process.

+ To compile this code as a module, choose M here: the
+ module will be called cls_cgroup.
+
config NET_EMATCH
bool "Extended Matches"
select NET_CLS
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index e4877ca..df9723b 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -24,6 +24,25 @@ struct cgroup_cls_state
u32 classid;
};

+static struct cgroup_subsys_state *cgrp_create(struct cgroup_subsys *ss,
+ struct cgroup *cgrp);
+static void cgrp_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp);
+static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp);
+
+struct cgroup_subsys net_cls_subsys = {
+ .name = "net_cls",
+ .create = cgrp_create,
+ .destroy = cgrp_destroy,
+ .populate = cgrp_populate,
+#ifdef CONFIG_NET_CLS_CGROUP
+ .subsys_id = net_cls_subsys_id,
+#else
+#define net_cls_subsys_id net_cls_subsys.subsys_id
+#endif
+ .module = THIS_MODULE,
+};
+
+
static inline struct cgroup_cls_state *cgrp_cls_state(struct cgroup *cgrp)
{
return container_of(cgroup_subsys_state(cgrp, net_cls_subsys_id),
@@ -79,14 +98,6 @@ static int cgrp_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
return cgroup_add_files(cgrp, ss, ss_files, ARRAY_SIZE(ss_files));
}

-struct cgroup_subsys net_cls_subsys = {
- .name = "net_cls",
- .create = cgrp_create,
- .destroy = cgrp_destroy,
- .populate = cgrp_populate,
- .subsys_id = net_cls_subsys_id,
-};
-
struct cls_cgroup_head
{
u32 handle;
@@ -277,12 +288,20 @@ static struct tcf_proto_ops cls_cgroup_ops __read_mostly = {

static int __init init_cgroup_cls(void)
{
- return register_tcf_proto_ops(&cls_cgroup_ops);
+ int ret = register_tcf_proto_ops(&cls_cgroup_ops);
+ if (ret)
+ return ret;
+ ret = cgroup_load_subsys(&net_cls_subsys);
+ if (ret)
+ unregister_tcf_proto_ops(&cls_cgroup_ops);
+ return ret;
}

static void __exit exit_cgroup_cls(void)
{
unregister_tcf_proto_ops(&cls_cgroup_ops);
+ /* TODO: unload subsystem. for now, the try_module_get in load_subsys
+ * prevents us from getting here. */
}

module_init(init_cgroup_cls);


Attachments:
cgroups-net_cls-as-module.patch (3.40 kB)

2009-12-21 20:39:24

by Ben Blum

[permalink] [raw]
Subject: [PATCH 4/4] cgroups: subsystem module unloading

Provides support for unloading modular subsystems.

From: Ben Blum <[email protected]>

This patch adds a new function cgroup_unload_subsys which is to be used for
removing a loaded subsystem during module deletion. Reference counting of the
subsystems' modules is moved from once (at load time) to once per attached
hierarchy (in parse_cgroupfs_options and rebind_subsystems) (i.e., 0 or 1).

It also adds a proper module_delete call in net/sched/cls_cgroup.c.

Signed-off-by: Ben Blum <[email protected]>
---

Documentation/cgroups/cgroups.txt | 3 +
include/linux/cgroup.h | 4 +
kernel/cgroup.c | 146 +++++++++++++++++++++++++++++++------
net/sched/cls_cgroup.c | 3 -
4 files changed, 129 insertions(+), 27 deletions(-)


diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index dd0d6f1..110228e 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -493,7 +493,8 @@ Each subsystem should:
- define a cgroup_subsys object called <name>_subsys

If a subsystem can be compiled as a module, it should also have in its
-module initcall a call to cgroup_load_subsys().
+module initcall a call to cgroup_load_subsys(), and in its exitcall a
+call to cgroup_unload_subsys().

Each subsystem may export the following methods. The only mandatory
methods are create/destroy. Any others that are null are presumed to
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index c8474c4..1cbb07f 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -39,6 +39,7 @@ extern void cgroup_fork_failed(struct task_struct *p, int run_callbacks,
extern int cgroupstats_build(struct cgroupstats *stats,
struct dentry *dentry);
extern int cgroup_load_subsys(struct cgroup_subsys *ss);
+extern void cgroup_unload_subsys(struct cgroup_subsys *ss);

extern struct file_operations proc_cgroup_operations;

@@ -264,7 +265,8 @@ struct css_set {
/*
* Set of subsystem states, one for each subsystem. This array
* is immutable after creation apart from the init_css_set
- * during subsystem registration (at boot time).
+ * during subsystem registration (at boot time) and modular subsystem
+ * loading/unloading.
*/
struct cgroup_subsys_state *subsys[CGROUP_SUBSYS_COUNT];

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index bddf96b..e74887f 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -873,7 +873,9 @@ void cgroup_release_and_wakeup_rmdir(struct cgroup_subsys_state *css)
}

/*
- * Call with cgroup_mutex held.
+ * Call with cgroup_mutex held. Drops reference counts on modules, including
+ * any duplicate ones that parse_cgroupfs_options took. If this function
+ * returns an error, no reference counts are touched.
*/
static int rebind_subsystems(struct cgroupfs_root *root,
unsigned long final_bits)
@@ -927,6 +929,7 @@ static int rebind_subsystems(struct cgroupfs_root *root,
if (ss->bind)
ss->bind(ss, cgrp);
mutex_unlock(&ss->hierarchy_mutex);
+ /* refcount was already taken, and we're keeping it */
} else if (bit & removed_bits) {
/* We're removing this subsystem */
BUG_ON(ss == NULL);
@@ -940,10 +943,16 @@ static int rebind_subsystems(struct cgroupfs_root *root,
subsys[i]->root = &rootnode;
list_move(&ss->sibling, &rootnode.subsys_list);
mutex_unlock(&ss->hierarchy_mutex);
+ /* subsystem is now free - drop reference on module */
+ module_put(ss->module);
} else if (bit & final_bits) {
/* Subsystem state should already exist */
BUG_ON(ss == NULL);
BUG_ON(!cgrp->subsys[i]);
+ /* a refcount was taken, but we already had one, so
+ * drop the extra reference. */
+ module_put(ss->module);
+ BUG_ON(ss->module && !module_refcount(ss->module));
} else {
/* Subsystem state shouldn't exist */
BUG_ON(cgrp->subsys[i]);
@@ -987,13 +996,16 @@ struct cgroup_sb_opts {

/*
* Convert a hierarchy specifier into a bitmask of subsystems and flags. Call
- * with cgroup_mutex held to protect the subsys[] array.
+ * with cgroup_mutex held to protect the subsys[] array. This function takes
+ * refcounts on subsystems to be used, unless it returns error, in which case
+ * no refcounts are taken.
*/
-static int parse_cgroupfs_options(char *data,
- struct cgroup_sb_opts *opts)
+static int parse_cgroupfs_options(char *data, struct cgroup_sb_opts *opts)
{
char *token, *o = data ?: "all";
unsigned long mask = (unsigned long)-1;
+ int i;
+ bool module_pin_failed = false;

BUG_ON(!mutex_is_locked(&cgroup_mutex));

@@ -1008,7 +1020,6 @@ static int parse_cgroupfs_options(char *data,
return -EINVAL;
if (!strcmp(token, "all")) {
/* Add all non-disabled subsystems */
- int i;
opts->subsys_bits = 0;
for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
struct cgroup_subsys *ss = subsys[i];
@@ -1031,7 +1042,6 @@ static int parse_cgroupfs_options(char *data,
if (!opts->release_agent)
return -ENOMEM;
} else if (!strncmp(token, "name=", 5)) {
- int i;
const char *name = token + 5;
/* Can't specify an empty name */
if (!strlen(name))
@@ -1055,7 +1065,6 @@ static int parse_cgroupfs_options(char *data,
return -ENOMEM;
} else {
struct cgroup_subsys *ss;
- int i;
for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) {
ss = subsys[i];
if (ss == NULL)
@@ -1094,9 +1103,49 @@ static int parse_cgroupfs_options(char *data,
if (!opts->subsys_bits && !opts->name)
return -EINVAL;

+ /*
+ * Grab references on all the modules we'll need, so the subsystems
+ * don't dance around before rebind_subsystems attaches them. This may
+ * take duplicate reference counts on a subsystem that's already used,
+ * but rebind_subsystems handles this case.
+ */
+ for (i = CGROUP_BUILTIN_SUBSYS_COUNT; i < CGROUP_SUBSYS_COUNT; i++) {
+ unsigned long bit = 1UL << i;
+ if (!(bit & opts->subsys_bits))
+ continue;
+ if (!try_module_get(subsys[i]->module)) {
+ module_pin_failed = true;
+ break;
+ }
+ }
+ if (module_pin_failed) {
+ /* oops, one of the modules was going away. this means that we
+ * raced with a module_delete call, and to the user this is
+ * essentially a "subsystem doesn't exist" case. */
+ for (i--; i >= CGROUP_BUILTIN_SUBSYS_COUNT; i--) {
+ /* drop refcounts only on the ones we took */
+ unsigned long bit = 1UL << i;
+ if (!(bit & opts->subsys_bits))
+ continue;
+ module_put(subsys[i]->module);
+ }
+ return -ENOENT;
+ }
+
return 0;
}

+static void drop_parsed_module_refcounts(unsigned long subsys_bits)
+{
+ int i;
+ for (i = CGROUP_BUILTIN_SUBSYS_COUNT; i < CGROUP_SUBSYS_COUNT; i++) {
+ unsigned long bit = 1UL << i;
+ if (!(bit & subsys_bits))
+ continue;
+ module_put(subsys[i]->module);
+ }
+}
+
static int cgroup_remount(struct super_block *sb, int *flags, char *data)
{
int ret = 0;
@@ -1113,21 +1162,19 @@ static int cgroup_remount(struct super_block *sb, int *flags, char *data)
if (ret)
goto out_unlock;

- /* Don't allow flags to change at remount */
- if (opts.flags != root->flags) {
- ret = -EINVAL;
- goto out_unlock;
- }
-
- /* Don't allow name to change at remount */
- if (opts.name && strcmp(opts.name, root->name)) {
+ /* Don't allow flags or name to change at remount */
+ if (opts.flags != root->flags ||
+ (opts.name && strcmp(opts.name, root->name))) {
ret = -EINVAL;
+ drop_parsed_module_refcounts(opts.subsys_bits);
goto out_unlock;
}

ret = rebind_subsystems(root, opts.subsys_bits);
- if (ret)
+ if (ret) {
+ drop_parsed_module_refcounts(opts.subsys_bits);
goto out_unlock;
+ }

/* (re)populate subsystem files */
cgroup_populate_dir(cgrp);
@@ -1326,7 +1373,7 @@ static int cgroup_get_sb(struct file_system_type *fs_type,
new_root = cgroup_root_from_opts(&opts);
if (IS_ERR(new_root)) {
ret = PTR_ERR(new_root);
- goto out_err;
+ goto drop_modules;
}
opts.new_root = new_root;

@@ -1335,7 +1382,7 @@ static int cgroup_get_sb(struct file_system_type *fs_type,
if (IS_ERR(sb)) {
ret = PTR_ERR(sb);
cgroup_drop_root(opts.new_root);
- goto out_err;
+ goto drop_modules;
}

root = sb->s_fs_info;
@@ -1391,6 +1438,9 @@ static int cgroup_get_sb(struct file_system_type *fs_type,
free_cg_links(&tmp_cg_links);
goto drop_new_super;
}
+ /* There must be no failure case after here, since rebinding
+ * takes care of subsystems' refcounts, which are explicitly
+ * dropped in the failure exit path. */

/* EBUSY should be the only error here */
BUG_ON(ret);
@@ -1429,6 +1479,8 @@ static int cgroup_get_sb(struct file_system_type *fs_type,
* any) is not needed
*/
cgroup_drop_root(opts.new_root);
+ /* no subsys rebinding, so refcounts don't change */
+ drop_parsed_module_refcounts(opts.subsys_bits);
}

simple_set_mnt(mnt, sb);
@@ -1438,6 +1490,8 @@ static int cgroup_get_sb(struct file_system_type *fs_type,

drop_new_super:
deactivate_locked_super(sb);
+ drop_modules:
+ drop_parsed_module_refcounts(opts.subsys_bits);
out_err:
kfree(opts.release_agent);
kfree(opts.name);
@@ -3743,11 +3797,6 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
lockdep_set_class(&ss->hierarchy_mutex, &ss->subsys_key);
ss->active = 1;

- /* pin the subsystem's module so it doesn't go away. this shouldn't
- * fail, since the module's initcall calls us.
- * TODO: with module unloading, move this elsewhere */
- BUG_ON(!try_module_get(ss->module));
-
/* success! */
mutex_unlock(&cgroup_mutex);
return 0;
@@ -3755,6 +3804,57 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
EXPORT_SYMBOL_GPL(cgroup_load_subsys);

/**
+ * cgroup_unload_subsys: unload a modular subsystem
+ * @ss: the subsystem to unload
+ *
+ * This function should be called in a modular subsystem's exitcall. When this
+ * function is invoked, the refcount on the subsystem's module will be 0, so
+ * the subsystem will not be attached to any hierarchy.
+ */
+void cgroup_unload_subsys(struct cgroup_subsys *ss)
+{
+ struct cg_cgroup_link *link;
+ struct hlist_head *hhead;
+
+ BUG_ON(ss->module == NULL);
+
+ /* we shouldn't be called if the subsystem is in use, and the use of
+ * try_module_get in rebind_subsystems should ensure that it doesn't
+ * start being used while we're killing it off. */
+ BUG_ON(ss->root != &rootnode);
+
+ mutex_lock(&cgroup_mutex);
+ /* deassign the subsys_id */
+ BUG_ON(ss->subsys_id < CGROUP_BUILTIN_SUBSYS_COUNT);
+ subsys[ss->subsys_id] = NULL;
+
+ /* remove subsystem from rootnode's list of subsystems */
+ list_del(&ss->sibling);
+
+ /* disentangle the css from all css_sets attached to the dummytop. as
+ * in loading, we need to pay our respects to the hashtable gods. */
+ write_lock(&css_set_lock);
+ list_for_each_entry(link, &dummytop->css_sets, cgrp_link_list) {
+ struct css_set *cg = link->cg;
+ hlist_del(&cg->hlist);
+ BUG_ON(!cg->subsys[ss->subsys_id]);
+ cg->subsys[ss->subsys_id] = NULL;
+ hhead = css_set_hash(cg->subsys);
+ hlist_add_head(&cg->hlist, hhead);
+ }
+ write_unlock(&css_set_lock);
+
+ /* remove subsystem's css from the dummytop and free it - need to free
+ * before marking as null because ss->destroy needs the cgrp->subsys
+ * pointer to find their state. */
+ ss->destroy(ss, dummytop);
+ dummytop->subsys[ss->subsys_id] = NULL;
+
+ mutex_unlock(&cgroup_mutex);
+}
+EXPORT_SYMBOL_GPL(cgroup_unload_subsys);
+
+/**
* cgroup_init_early - cgroup initialization at system boot
*
* Initialize cgroups at system boot, and initialize any
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index df9723b..7f27d2c 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -300,8 +300,7 @@ static int __init init_cgroup_cls(void)
static void __exit exit_cgroup_cls(void)
{
unregister_tcf_proto_ops(&cls_cgroup_ops);
- /* TODO: unload subsystem. for now, the try_module_get in load_subsys
- * prevents us from getting here. */
+ cgroup_unload_subsys(&net_cls_subsys);
}

module_init(init_cgroup_cls);


Attachments:
cgroups-subsys-unloading.patch (12.01 kB)

2009-12-28 06:30:08

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 1/4] cgroups: revamp subsys array

> Make subsys[] able to be dynamically populated to support modular subsystems
>
> From: Ben Blum <[email protected]>
>
> This patch reworks the way the subsys[] array is used so that subsystems can
> register themselves after boot time, and enables the internals of cgroups to
> be able to handle when subsystems are not present or may appear/disappear.
>
> Signed-off-by: Ben Blum <[email protected]>

Acked-by: Li Zefan <[email protected]>

...

A small nitpick:

> +/* This define indicates the maximum number of subsystems that can be loaded
> + * at once. We limit to this many since cgroupfs_root has subsys_bits to keep
> + * track of all of them. */

Please always use this style:

/*
* xxx
* xxx
*/

2009-12-28 06:32:28

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 2/4] cgroups: subsystem module loading interface

> Add interface between cgroups subsystem management and module loading
>
> From: Ben Blum <[email protected]>
>
> This patch implements rudimentary module-loading support for cgroups - namely,
> a cgroup_load_subsys (similar to cgroup_init_subsys) for use as a module
> initcall, and a struct module pointer in struct cgroup_subsys.
>
> Several functions that might be wanted by modules have had EXPORT_SYMBOL added
> to them, but it's unclear exactly which functions want it and which won't.
>
> Signed-off-by: Ben Blum <[email protected]>

Acked-by: Li Zefan <[email protected]>

Some small nitpicks:

> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2477,6 +2477,7 @@ int cgroup_add_file(struct cgroup *cgrp,
> error = PTR_ERR(dentry);
> return error;
> }
> +EXPORT_SYMBOL_GPL(cgroup_add_file);

I got compile errors, because you forgot to include <linux/module.h>

...
> + write_lock(&css_set_lock);
> + for (i = 0; i < CSS_SET_TABLE_SIZE; i++) {
> + struct css_set *cg;
> + struct hlist_node *node, *tmp;
> + struct hlist_head *bucket = &css_set_table[i], *new_bucket;

Please add a blank line between variable declaration and other code.

And elsewhere in this patchset.

> + hlist_for_each_entry_safe(cg, node, tmp, bucket, hlist) {
> + /* skip entries that we already rehashed */
> + if (cg->subsys[ss->subsys_id])
> + continue;
> + /* remove existing entry */
> + hlist_del(&cg->hlist);
> + /* set new value */
> + cg->subsys[ss->subsys_id] = css;
> + /* recompute hash and restore entry */
> + new_bucket = css_set_hash(cg->subsys);
> + hlist_add_head(&cg->hlist, new_bucket);
> + }
> + }
> + write_unlock(&css_set_lock);

2009-12-28 06:34:21

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 3/4] cgroups: net_cls as module

> Allows the net_cls cgroup subsystem to be compiled as a module
>
> From: Ben Blum <[email protected]>
>
> This patch modifies net/sched/cls_cgroup.c to allow the net_cls subsystem to
> be optionally compiled as a module instead of builtin. The cgroup_subsys
> struct is moved around a bit to allow the subsys_id to be either declared as a
> compile-time constant by the cgroup_subsys.h include in cgroup.h, or, if it's
> a module, initialized within the struct by cgroup_load_subsys.
>
> Signed-off-by: Ben Blum <[email protected]>

Acked-by: Li Zefan <[email protected]>

But can we reorder this patch as the last patch?

2009-12-28 06:37:06

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 4/4] cgroups: subsystem module unloading

> Provides support for unloading modular subsystems.
>
> From: Ben Blum <[email protected]>
>
> This patch adds a new function cgroup_unload_subsys which is to be used for
> removing a loaded subsystem during module deletion. Reference counting of the
> subsystems' modules is moved from once (at load time) to once per attached
> hierarchy (in parse_cgroupfs_options and rebind_subsystems) (i.e., 0 or 1).
>
> It also adds a proper module_delete call in net/sched/cls_cgroup.c.
>
> Signed-off-by: Ben Blum <[email protected]>

Acked-by: Li Zefan <[email protected]>

Also please fix comment style and add some blank lines properly.

2009-12-28 12:38:04

by Ben Blum

[permalink] [raw]
Subject: Re: [PATCH 3/4] cgroups: net_cls as module

On Mon, Dec 28, 2009 at 02:34:12PM +0800, Li Zefan wrote:
> > Allows the net_cls cgroup subsystem to be compiled as a module
> >
> > From: Ben Blum <[email protected]>
> >
> > This patch modifies net/sched/cls_cgroup.c to allow the net_cls subsystem to
> > be optionally compiled as a module instead of builtin. The cgroup_subsys
> > struct is moved around a bit to allow the subsys_id to be either declared as a
> > compile-time constant by the cgroup_subsys.h include in cgroup.h, or, if it's
> > a module, initialized within the struct by cgroup_load_subsys.
> >
> > Signed-off-by: Ben Blum <[email protected]>
>
> Acked-by: Li Zefan <[email protected]>
>
> But can we reorder this patch as the last patch?

that does make a bit more sense, saving the unloading patch from having
to touch cls_cgroup.c to add an unload_subsys() call. will do.

2009-12-28 12:41:56

by Ben Blum

[permalink] [raw]
Subject: Re: [PATCH 2/4] cgroups: subsystem module loading interface

On Mon, Dec 28, 2009 at 02:32:20PM +0800, Li Zefan wrote:
> > Add interface between cgroups subsystem management and module loading
> >
> > From: Ben Blum <[email protected]>
> >
> > This patch implements rudimentary module-loading support for cgroups - namely,
> > a cgroup_load_subsys (similar to cgroup_init_subsys) for use as a module
> > initcall, and a struct module pointer in struct cgroup_subsys.
> >
> > Several functions that might be wanted by modules have had EXPORT_SYMBOL added
> > to them, but it's unclear exactly which functions want it and which won't.
> >
> > Signed-off-by: Ben Blum <[email protected]>
>
> Acked-by: Li Zefan <[email protected]>
>
> Some small nitpicks:
>
> > --- a/kernel/cgroup.c
> > +++ b/kernel/cgroup.c
> > @@ -2477,6 +2477,7 @@ int cgroup_add_file(struct cgroup *cgrp,
> > error = PTR_ERR(dentry);
> > return error;
> > }
> > +EXPORT_SYMBOL_GPL(cgroup_add_file);
>
> I got compile errors, because you forgot to include <linux/module.h>
>
> ...
> > + write_lock(&css_set_lock);
> > + for (i = 0; i < CSS_SET_TABLE_SIZE; i++) {
> > + struct css_set *cg;
> > + struct hlist_node *node, *tmp;
> > + struct hlist_head *bucket = &css_set_table[i], *new_bucket;
>
> Please add a blank line between variable declaration and other code.
>
> And elsewhere in this patchset.
>

checkpatch.pl didn't trip on this [and the more screenspace-intensive
comment style]; are they cgroups-specific style requirements or from
somewhere higher up?

> > + hlist_for_each_entry_safe(cg, node, tmp, bucket, hlist) {
> > + /* skip entries that we already rehashed */
> > + if (cg->subsys[ss->subsys_id])
> > + continue;
> > + /* remove existing entry */
> > + hlist_del(&cg->hlist);
> > + /* set new value */
> > + cg->subsys[ss->subsys_id] = css;
> > + /* recompute hash and restore entry */
> > + new_bucket = css_set_hash(cg->subsys);
> > + hlist_add_head(&cg->hlist, new_bucket);
> > + }
> > + }
> > + write_unlock(&css_set_lock);
>
>
>

2009-12-29 01:04:09

by Li Zefan

[permalink] [raw]
Subject: Re: [PATCH 2/4] cgroups: subsystem module loading interface

>>> + write_lock(&css_set_lock);
>>> + for (i = 0; i < CSS_SET_TABLE_SIZE; i++) {
>>> + struct css_set *cg;
>>> + struct hlist_node *node, *tmp;
>>> + struct hlist_head *bucket = &css_set_table[i], *new_bucket;
>> Please add a blank line between variable declaration and other code.
>>
>> And elsewhere in this patchset.
>>
>
> checkpatch.pl didn't trip on this [and the more screenspace-intensive
> comment style]; are they cgroups-specific style requirements or from
> somewhere higher up?
>

You'll find the comment style in Documentation/CodingStyle:

==============
The preferred style for long (multi-line) comments is:

/*
* This is the preferred style for multi-line
* comments in the Linux kernel source code.
* Please use it consistently.
*
* Description: A column of asterisks on the left side,
* with beginning and ending almost-blank lines.
*/
==============

Though the other coding style is not documented, I think it's
well accepted.

You'll hardly find code that breaks this style, say in kernel/* and
mm/*. And you can see that people pointed out this when they were
reviewing patches.

And the fact is it does make the code more readable.