2014-04-22 06:30:26

by Jianyu Zhan

[permalink] [raw]
Subject: [PATCH 0/4] cgroup: substitude per-cgroup id with per-subsys id

Currently, cgrp->id is only used to look up css's. As cgroup and
css's lifetimes is now decoupled, it should be made per-subsystem
and moved to css->css_id so that lookups are successful until the
target css is released.

Patch 1-3 are prep patches.
Patch 4 do the coverting job.

Thanks!

Jianyu Zhan (4):
cgroup: introduce helper css_to_id()
mm/memcontrol.c: use accessor to get id from css
netprio_cgroup: use accessor to get id from css
cgroup: convert from per-cgroup id to per-subsys id

include/linux/cgroup.h | 27 ++++++-------
kernel/cgroup.c | 96 +++++++++++++++++++++++++----------------------
mm/memcontrol.c | 8 ++--
net/core/netprio_cgroup.c | 8 ++--
4 files changed, 74 insertions(+), 65 deletions(-)

--
2.0.0-rc0


2014-04-22 06:30:42

by Jianyu Zhan

[permalink] [raw]
Subject: [PATCH 1/4] cgroup: introduce helper css_to_id()

This is a prepared patch for converting from per-cgroup id to
per-subsystem id.

Some subsystems dereference the per-cgrpu id directly, but this is
implementation-specific, so it should be transparent for subsystems.

Use this accessor instead.

Signed-off-by: Jianyu Zhan <[email protected]>
---
include/linux/cgroup.h | 1 +
kernel/cgroup.c | 11 +++++++++++
2 files changed, 12 insertions(+)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 4b38e2d..de31b2a 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -754,6 +754,7 @@ struct cgroup_subsys_state *css_next_child(struct cgroup_subsys_state *pos,
struct cgroup_subsys_state *parent);

struct cgroup_subsys_state *css_from_id(int id, struct cgroup_subsys *ss);
+int css_to_id(struct cgroup_subsys_state *css);

/**
* css_for_each_child - iterate through children of a css
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 930569c..80c1cf3 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -5153,6 +5153,17 @@ struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry,
}

/**
+ * css_to_id - get the id of a css
+ * @css: the css of interest
+ *
+ * Return the corresponding id for the @css.
+ */
+int css_to_id(struct cgroup_subsys_state *css)
+{
+ return css->cgroup->id;
+}
+
+/**
* css_from_id - lookup css by id
* @id: the cgroup id
* @ss: cgroup subsys to be looked into
--
2.0.0-rc0

2014-04-22 06:31:01

by Jianyu Zhan

[permalink] [raw]
Subject: [PATCH 2/4] mm/memcontrol.c: use accessor to get id from css

This is a prepared patch for converting from per-cgroup id to
per-subsystem id.

We should not access per-cgroup id directly, since this is implemetation
detail. Use the accessor css_from_id() instead.

This patch has no functional change.

Signed-off-by: Jianyu Zhan <[email protected]>
---
mm/memcontrol.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 80d9e38..46333cb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -528,10 +528,10 @@ static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg)
static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
{
/*
- * The ID of the root cgroup is 0, but memcg treat 0 as an
- * invalid ID, so we return (cgroup_id + 1).
+ * The ID of css for the root cgroup is 0, but memcg treat 0 as an
+ * invalid ID, so we return (id + 1).
*/
- return memcg->css.cgroup->id + 1;
+ return css_to_id(&memcg->css) + 1;
}

static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
@@ -6407,7 +6407,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
struct mem_cgroup *memcg = mem_cgroup_from_css(css);
struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(css));

- if (css->cgroup->id > MEM_CGROUP_ID_MAX)
+ if (css_to_id(css) > MEM_CGROUP_ID_MAX)
return -ENOSPC;

if (!parent)
--
2.0.0-rc0

2014-04-22 06:31:16

by Jianyu Zhan

[permalink] [raw]
Subject: [PATCH 3/4] netprio_cgroup: use accessor to get id from css

This is a prepared patch for converting from per-cgroup id to
per-subsystem id.

We should not access per-cgroup id directly, since this is
implemetation detail.

Use the accessor css_from_id() instead.

This patch has no functional change.

Signed-off-by: Jianyu Zhan <[email protected]>
---
net/core/netprio_cgroup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index ce285c6..fc21035 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -89,7 +89,7 @@ static int extend_netdev_table(struct net_device *dev, u32 target_idx)
static u32 netprio_prio(struct cgroup_subsys_state *css, struct net_device *dev)
{
struct netprio_map *map = rcu_dereference_rtnl(dev->priomap);
- int id = css->cgroup->id;
+ int id = css_to_id(css);

if (map && id < map->priomap_len)
return map->priomap[id];
@@ -109,7 +109,7 @@ static int netprio_set_prio(struct cgroup_subsys_state *css,
struct net_device *dev, u32 prio)
{
struct netprio_map *map;
- int id = css->cgroup->id;
+ int id = css_to_id(css);
int ret;

/* avoid extending priomap for zero writes */
@@ -170,7 +170,7 @@ static void cgrp_css_free(struct cgroup_subsys_state *css)

static u64 read_prioidx(struct cgroup_subsys_state *css, struct cftype *cft)
{
- return css->cgroup->id;
+ return css_to_id(css);
}

static int read_priomap(struct seq_file *sf, void *v)
@@ -222,7 +222,7 @@ static void net_prio_attach(struct cgroup_subsys_state *css,
struct cgroup_taskset *tset)
{
struct task_struct *p;
- void *v = (void *)(unsigned long)css->cgroup->id;
+ void *v = (void *)(unsigned long)css_to_id(css);

cgroup_taskset_for_each(p, tset) {
task_lock(p);
--
2.0.0-rc0

2014-04-22 06:31:27

by Jianyu Zhan

[permalink] [raw]
Subject: [PATCH 4/4] cgroup: convert from per-cgroup id to per-subsys id

Currently, cgrp->id is only used to look up css's. As cgroup and
css's lifetimes is now decoupled, it should be made per-subsystem
and moved to css->css_id so that lookups are successful until the
target css is released.

Signed-off-by: Jianyu Zhan <[email protected]>
---
include/linux/cgroup.h | 26 +++++++--------
kernel/cgroup.c | 87 ++++++++++++++++++++++++--------------------------
2 files changed, 55 insertions(+), 58 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index de31b2a..66085bd 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -62,6 +62,16 @@ struct cgroup_subsys_state {
/* the parent css */
struct cgroup_subsys_state *parent;

+ /*
+ * per css id
+ *
+ * The css id of cgrp_dfl_root for each subsys is always 0, and
+ * a new css will be assigned with a smallest available ID.
+ *
+ * Allocating/Removing ID must be protected by cgroup_mutex.
+ */
+ int css_id;
+
unsigned long flags;

/* percpu_ref killing and RCU release */
@@ -141,16 +151,6 @@ enum {
struct cgroup {
unsigned long flags; /* "unsigned long" so bitops work */

- /*
- * idr allocated in-hierarchy ID.
- *
- * The ID of the root cgroup is always 0, and a new cgroup
- * will be assigned with a smallest available ID.
- *
- * Allocating/Removing ID must be protected by cgroup_mutex.
- */
- int id;
-
/* the number of attached css's */
int nr_css;

@@ -329,9 +329,6 @@ struct cgroup_root {
/* Hierarchy-specific flags */
unsigned long flags;

- /* IDs for cgroups in this hierarchy */
- struct idr cgroup_idr;
-
/* The path to use for release notifications. */
char release_agent_path[PATH_MAX];

@@ -655,6 +652,9 @@ struct cgroup_subsys {
/* link to parent, protected by cgroup_lock() */
struct cgroup_root *root;

+ /* IDs for css'es of this subsys */
+ struct idr css_idr;
+
/*
* List of cftypes. Each entry is the first entry of an array
* terminated by zero length name.
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 80c1cf3..f41cf8b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -838,7 +838,6 @@ static void cgroup_free_root(struct cgroup_root *root)
/* hierarhcy ID shoulid already have been released */
WARN_ON_ONCE(root->hierarchy_id);

- idr_destroy(&root->cgroup_idr);
kfree(root);
}
}
@@ -1050,17 +1049,6 @@ static void cgroup_put(struct cgroup *cgrp)
if (WARN_ON_ONCE(cgrp->parent && !cgroup_is_dead(cgrp)))
return;

- /*
- * XXX: cgrp->id is only used to look up css's. As cgroup and
- * css's lifetimes will be decoupled, it should be made
- * per-subsystem and moved to css->id so that lookups are
- * successful until the target css is released.
- */
- mutex_lock(&cgroup_mutex);
- idr_remove(&cgrp->root->cgroup_idr, cgrp->id);
- mutex_unlock(&cgroup_mutex);
- cgrp->id = -1;
-
call_rcu(&cgrp->rcu_head, cgroup_free_rcu);
}

@@ -1512,7 +1500,6 @@ static void init_cgroup_root(struct cgroup_root *root,
atomic_set(&root->nr_cgrps, 1);
cgrp->root = root;
init_cgroup_housekeeping(cgrp);
- idr_init(&root->cgroup_idr);

root->flags = opts->flags;
if (opts->release_agent)
@@ -1533,11 +1520,6 @@ static int cgroup_setup_root(struct cgroup_root *root, unsigned long ss_mask)
lockdep_assert_held(&cgroup_tree_mutex);
lockdep_assert_held(&cgroup_mutex);

- ret = idr_alloc(&root->cgroup_idr, root_cgrp, 0, 1, GFP_KERNEL);
- if (ret < 0)
- goto out;
- root_cgrp->id = ret;
-
/*
* We're accessing css_set_count without locking css_set_rwsem here,
* but that's OK - it can only be increased by someone holding
@@ -4056,6 +4038,11 @@ static void css_free_work_fn(struct work_struct *work)

css->ss->css_free(css);
cgroup_put(cgrp);
+
+ mutex_lock(&cgroup_mutex);
+ idr_remove(&css->ss->css_idr, css->css_id);
+ mutex_unlock(&cgroup_mutex);
+ css->css_id = -1;
}

static void css_free_rcu_fn(struct rcu_head *rcu_head)
@@ -4152,9 +4139,17 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss)
if (IS_ERR(css))
return PTR_ERR(css);

+ /*
+ * Temporarily set the pointer to NULL, so idr_find() won't return
+ * a half-baked css.
+ */
+ css->css_id = idr_alloc(&ss->css_idr, NULL, 1, 0, GFP_KERNEL);
+ if (css->css_id < 0)
+ goto err_free_css;
+
err = percpu_ref_init(&css->refcnt, css_release);
if (err)
- goto err_free_css;
+ goto err_free_id;

init_css(css, ss, cgrp);

@@ -4166,6 +4161,12 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss)
if (err)
goto err_clear_dir;

+ /*
+ * @css is now fully operational.
+ * Nothing can fail from this point on.
+ */
+ idr_replace(&ss->css_idr, css, css->css_id);
+
cgroup_get(cgrp);
css_get(css->parent);

@@ -4184,6 +4185,8 @@ err_clear_dir:
cgroup_clear_dir(css->cgroup, 1 << css->ss->id);
err_free_percpu_ref:
percpu_ref_cancel_init(&css->refcnt);
+err_free_id:
+ idr_remove(&ss->css_idr, css->css_id);
err_free_css:
ss->css_free(css);
return err;
@@ -4223,16 +4226,6 @@ static long cgroup_create(struct cgroup *parent, const char *name,
goto err_unlock_tree;
}

- /*
- * Temporarily set the pointer to NULL, so idr_find() won't return
- * a half-baked cgroup.
- */
- cgrp->id = idr_alloc(&root->cgroup_idr, NULL, 1, 0, GFP_KERNEL);
- if (cgrp->id < 0) {
- err = -ENOMEM;
- goto err_unlock;
- }
-
init_cgroup_housekeeping(cgrp);

cgrp->parent = parent;
@@ -4249,7 +4242,7 @@ static long cgroup_create(struct cgroup *parent, const char *name,
kn = kernfs_create_dir(parent->kn, name, mode, cgrp);
if (IS_ERR(kn)) {
err = PTR_ERR(kn);
- goto err_free_id;
+ goto err_unlock;
}
cgrp->kn = kn;

@@ -4266,12 +4259,6 @@ static long cgroup_create(struct cgroup *parent, const char *name,
atomic_inc(&root->nr_cgrps);
cgroup_get(parent);

- /*
- * @cgrp is now fully operational. If something fails after this
- * point, it'll be released via the normal destruction path.
- */
- idr_replace(&root->cgroup_idr, cgrp, cgrp->id);
-
err = cgroup_kn_set_ugid(kn);
if (err)
goto err_destroy;
@@ -4303,8 +4290,6 @@ static long cgroup_create(struct cgroup *parent, const char *name,

return 0;

-err_free_id:
- idr_remove(&root->cgroup_idr, cgrp->id);
err_unlock:
mutex_unlock(&cgroup_mutex);
err_unlock_tree:
@@ -4617,6 +4602,7 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
mutex_lock(&cgroup_tree_mutex);
mutex_lock(&cgroup_mutex);

+ idr_init(&ss->css_idr);
INIT_LIST_HEAD(&ss->cfts);

/* Create the root cgroup state for this subsystem */
@@ -4707,9 +4693,25 @@ int __init cgroup_init(void)
mutex_unlock(&cgroup_tree_mutex);

for_each_subsys(ss, ssid) {
+ struct cgroup_subsys_state *css;
+ int id;
+
if (!ss->early_init)
cgroup_init_subsys(ss);

+ /*
+ * mm_init() runs lately after cgroup_init_early(), thus
+ * the world isn't set up yet for idr_alloc() to run, so
+ * we have to defer the id allocation to this point.
+ *
+ * And we don't gracefully handle early failure.
+ */
+ css = init_css_set.subsys[ss->id];
+ id = idr_alloc(&ss->css_idr, css, 0, 1, GFP_KERNEL);
+ if (id < 0)
+ BUG();
+ css->css_id = id;
+
list_add_tail(&init_css_set.e_cset_node[ssid],
&cgrp_dfl_root.cgrp.e_csets[ssid]);

@@ -5160,7 +5162,7 @@ struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry,
*/
int css_to_id(struct cgroup_subsys_state *css)
{
- return css->cgroup->id;
+ return css->css_id;
}

/**
@@ -5173,14 +5175,9 @@ int css_to_id(struct cgroup_subsys_state *css)
*/
struct cgroup_subsys_state *css_from_id(int id, struct cgroup_subsys *ss)
{
- struct cgroup *cgrp;
-
cgroup_assert_mutexes_or_rcu_locked();

- cgrp = idr_find(&ss->root->cgroup_idr, id);
- if (cgrp)
- return cgroup_css(cgrp, ss);
- return NULL;
+ return idr_find(&ss->css_idr, id);
}

#ifdef CONFIG_CGROUP_DEBUG
--
2.0.0-rc0

2014-04-22 07:04:16

by Jianyu Zhan

[permalink] [raw]
Subject: Re: [PATCH 2/4] mm/memcontrol.c: use accessor to get id from css

Cc Andrew.

Thanks,
Jianyu Zhan


On Tue, Apr 22, 2014 at 2:30 PM, Jianyu Zhan <[email protected]> wrote:
> This is a prepared patch for converting from per-cgroup id to
> per-subsystem id.
>
> We should not access per-cgroup id directly, since this is implemetation
> detail. Use the accessor css_from_id() instead.
>
> This patch has no functional change.
>
> Signed-off-by: Jianyu Zhan <[email protected]>
> ---
> mm/memcontrol.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 80d9e38..46333cb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -528,10 +528,10 @@ static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg)
> static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
> {
> /*
> - * The ID of the root cgroup is 0, but memcg treat 0 as an
> - * invalid ID, so we return (cgroup_id + 1).
> + * The ID of css for the root cgroup is 0, but memcg treat 0 as an
> + * invalid ID, so we return (id + 1).
> */
> - return memcg->css.cgroup->id + 1;
> + return css_to_id(&memcg->css) + 1;
> }
>
> static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
> @@ -6407,7 +6407,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
> struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(css));
>
> - if (css->cgroup->id > MEM_CGROUP_ID_MAX)
> + if (css_to_id(css) > MEM_CGROUP_ID_MAX)
> return -ENOSPC;
>
> if (!parent)
> --
> 2.0.0-rc0
>

2014-04-22 08:04:29

by Jianyu Zhan

[permalink] [raw]
Subject: Re: [PATCH 4/4] cgroup: convert from per-cgroup id to per-subsys id

Hi, all.

Sorry, previous patch has a minor fault, and cause a unitialized variable warning.
I've fixed it up in this.

Renewed patch:
---

Currently, cgrp->id is only used to look up css's. As cgroup and
css's lifetimes is now decoupled, it should be made per-subsystem
and moved to css->css_id so that lookups are successful until the
target css is released.

Signed-off-by: Jianyu Zhan <[email protected]>
---
include/linux/cgroup.h | 26 +++++++--------
kernel/cgroup.c | 88 ++++++++++++++++++++++++--------------------------
2 files changed, 56 insertions(+), 58 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index de31b2a..66085bd 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -62,6 +62,16 @@ struct cgroup_subsys_state {
/* the parent css */
struct cgroup_subsys_state *parent;

+ /*
+ * per css id
+ *
+ * The css id of cgrp_dfl_root for each subsys is always 0, and
+ * a new css will be assigned with a smallest available ID.
+ *
+ * Allocating/Removing ID must be protected by cgroup_mutex.
+ */
+ int css_id;
+
unsigned long flags;

/* percpu_ref killing and RCU release */
@@ -141,16 +151,6 @@ enum {
struct cgroup {
unsigned long flags; /* "unsigned long" so bitops work */

- /*
- * idr allocated in-hierarchy ID.
- *
- * The ID of the root cgroup is always 0, and a new cgroup
- * will be assigned with a smallest available ID.
- *
- * Allocating/Removing ID must be protected by cgroup_mutex.
- */
- int id;
-
/* the number of attached css's */
int nr_css;

@@ -329,9 +329,6 @@ struct cgroup_root {
/* Hierarchy-specific flags */
unsigned long flags;

- /* IDs for cgroups in this hierarchy */
- struct idr cgroup_idr;
-
/* The path to use for release notifications. */
char release_agent_path[PATH_MAX];

@@ -655,6 +652,9 @@ struct cgroup_subsys {
/* link to parent, protected by cgroup_lock() */
struct cgroup_root *root;

+ /* IDs for css'es of this subsys */
+ struct idr css_idr;
+
/*
* List of cftypes. Each entry is the first entry of an array
* terminated by zero length name.
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index f23cb67..9841a33 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -838,7 +838,6 @@ static void cgroup_free_root(struct cgroup_root *root)
/* hierarhcy ID shoulid already have been released */
WARN_ON_ONCE(root->hierarchy_id);

- idr_destroy(&root->cgroup_idr);
kfree(root);
}
}
@@ -1050,17 +1049,6 @@ static void cgroup_put(struct cgroup *cgrp)
if (WARN_ON_ONCE(cgrp->parent && !cgroup_is_dead(cgrp)))
return;

- /*
- * XXX: cgrp->id is only used to look up css's. As cgroup and
- * css's lifetimes will be decoupled, it should be made
- * per-subsystem and moved to css->id so that lookups are
- * successful until the target css is released.
- */
- mutex_lock(&cgroup_mutex);
- idr_remove(&cgrp->root->cgroup_idr, cgrp->id);
- mutex_unlock(&cgroup_mutex);
- cgrp->id = -1;
-
call_rcu(&cgrp->rcu_head, cgroup_free_rcu);
}

@@ -1512,7 +1500,6 @@ static void init_cgroup_root(struct cgroup_root *root,
atomic_set(&root->nr_cgrps, 1);
cgrp->root = root;
init_cgroup_housekeeping(cgrp);
- idr_init(&root->cgroup_idr);

root->flags = opts->flags;
if (opts->release_agent)
@@ -1533,11 +1520,6 @@ static int cgroup_setup_root(struct cgroup_root *root, unsigned long ss_mask)
lockdep_assert_held(&cgroup_tree_mutex);
lockdep_assert_held(&cgroup_mutex);

- ret = idr_alloc(&root->cgroup_idr, root_cgrp, 0, 1, GFP_KERNEL);
- if (ret < 0)
- goto out;
- root_cgrp->id = ret;
-
/*
* We're accessing css_set_count without locking css_set_rwsem here,
* but that's OK - it can only be increased by someone holding
@@ -4056,6 +4038,11 @@ static void css_free_work_fn(struct work_struct *work)

css->ss->css_free(css);
cgroup_put(cgrp);
+
+ mutex_lock(&cgroup_mutex);
+ idr_remove(&css->ss->css_idr, css->css_id);
+ mutex_unlock(&cgroup_mutex);
+ css->css_id = -1;
}

static void css_free_rcu_fn(struct rcu_head *rcu_head)
@@ -4152,9 +4139,18 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss)
if (IS_ERR(css))
return PTR_ERR(css);

+ /*
+ * Temporarily set the pointer to NULL, so idr_find() won't return
+ * a half-baked css.
+ */
+ err = idr_alloc(&ss->css_idr, NULL, 1, 0, GFP_KERNEL);
+ if (err < 0)
+ goto err_free_css;
+ css->css_id = err;
+
err = percpu_ref_init(&css->refcnt, css_release);
if (err)
- goto err_free_css;
+ goto err_free_id;

init_css(css, ss, cgrp);

@@ -4166,6 +4162,12 @@ static int create_css(struct cgroup *cgrp, struct cgroup_subsys *ss)
if (err)
goto err_clear_dir;

+ /*
+ * @css is now fully operational.
+ * Nothing can fail from this point on.
+ */
+ idr_replace(&ss->css_idr, css, css->css_id);
+
cgroup_get(cgrp);
css_get(css->parent);

@@ -4184,6 +4186,8 @@ err_clear_dir:
cgroup_clear_dir(css->cgroup, 1 << css->ss->id);
err_free_percpu_ref:
percpu_ref_cancel_init(&css->refcnt);
+err_free_id:
+ idr_remove(&ss->css_idr, css->css_id);
err_free_css:
ss->css_free(css);
return err;
@@ -4223,16 +4227,6 @@ static long cgroup_create(struct cgroup *parent, const char *name,
goto err_unlock_tree;
}

- /*
- * Temporarily set the pointer to NULL, so idr_find() won't return
- * a half-baked cgroup.
- */
- cgrp->id = idr_alloc(&root->cgroup_idr, NULL, 1, 0, GFP_KERNEL);
- if (cgrp->id < 0) {
- err = -ENOMEM;
- goto err_unlock;
- }
-
init_cgroup_housekeeping(cgrp);

cgrp->parent = parent;
@@ -4249,7 +4243,7 @@ static long cgroup_create(struct cgroup *parent, const char *name,
kn = kernfs_create_dir(parent->kn, name, mode, cgrp);
if (IS_ERR(kn)) {
err = PTR_ERR(kn);
- goto err_free_id;
+ goto err_unlock;
}
cgrp->kn = kn;

@@ -4266,12 +4260,6 @@ static long cgroup_create(struct cgroup *parent, const char *name,
atomic_inc(&root->nr_cgrps);
cgroup_get(parent);

- /*
- * @cgrp is now fully operational. If something fails after this
- * point, it'll be released via the normal destruction path.
- */
- idr_replace(&root->cgroup_idr, cgrp, cgrp->id);
-
err = cgroup_kn_set_ugid(kn);
if (err)
goto err_destroy;
@@ -4303,8 +4291,6 @@ static long cgroup_create(struct cgroup *parent, const char *name,

return 0;

-err_free_id:
- idr_remove(&root->cgroup_idr, cgrp->id);
err_unlock:
mutex_unlock(&cgroup_mutex);
err_unlock_tree:
@@ -4617,6 +4603,7 @@ static void __init cgroup_init_subsys(struct cgroup_subsys *ss)
mutex_lock(&cgroup_tree_mutex);
mutex_lock(&cgroup_mutex);

+ idr_init(&ss->css_idr);
INIT_LIST_HEAD(&ss->cfts);

/* Create the root cgroup state for this subsystem */
@@ -4707,9 +4694,25 @@ int __init cgroup_init(void)
mutex_unlock(&cgroup_tree_mutex);

for_each_subsys(ss, ssid) {
+ struct cgroup_subsys_state *css;
+ int id;
+
if (!ss->early_init)
cgroup_init_subsys(ss);

+ /*
+ * mm_init() runs lately after cgroup_init_early(), thus
+ * the world isn't set up yet for idr_alloc() to run, so
+ * we have to defer the id allocation to this point.
+ *
+ * And we don't gracefully handle early failure.
+ */
+ css = init_css_set.subsys[ss->id];
+ id = idr_alloc(&ss->css_idr, css, 0, 1, GFP_KERNEL);
+ if (id < 0)
+ BUG();
+ css->css_id = id;
+
list_add_tail(&init_css_set.e_cset_node[ssid],
&cgrp_dfl_root.cgrp.e_csets[ssid]);

@@ -5160,7 +5163,7 @@ struct cgroup_subsys_state *css_tryget_from_dir(struct dentry *dentry,
*/
int css_to_id(struct cgroup_subsys_state *css)
{
- return css->cgroup->id;
+ return css->css_id;
}

/**
@@ -5173,14 +5176,9 @@ int css_to_id(struct cgroup_subsys_state *css)
*/
struct cgroup_subsys_state *css_from_id(int id, struct cgroup_subsys *ss)
{
- struct cgroup *cgrp;
-
cgroup_assert_mutexes_or_rcu_locked();

- cgrp = idr_find(&ss->root->cgroup_idr, id);
- if (cgrp)
- return cgroup_css(cgrp, ss);
- return NULL;
+ return idr_find(&ss->css_idr, id);
}

#ifdef CONFIG_CGROUP_DEBUG
--
2.0.0-rc0

2014-04-22 08:08:48

by Jianyu Zhan

[permalink] [raw]
Subject: Re: [PATCH 2/4] mm/memcontrol.c: use accessor to get id from css

On Tue, Apr 22, 2014 at 2:30 PM, Jianyu Zhan <[email protected]> wrote:
> This is a prepared patch for converting from per-cgroup id to
> per-subsystem id.
>
> We should not access per-cgroup id directly, since this is implemetation
> detail. Use the accessor css_from_id() instead.
>
> This patch has no functional change.

Hi, I'm sorry. This patch should be applied on top of its previous patch:
https://lkml.org/lkml/2014/4/22/54

Sorry for my fault , not cc'ing this mail-list in that patch.

Thanks,
Jianyu Zhan

2014-04-22 10:01:09

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 2/4] mm/memcontrol.c: use accessor to get id from css

On Tue 22-04-14 14:30:41, Jianyu Zhan wrote:
> This is a prepared patch for converting from per-cgroup id to
> per-subsystem id.
>
> We should not access per-cgroup id directly, since this is implemetation
> detail. Use the accessor css_from_id() instead.
>
> This patch has no functional change.
>
> Signed-off-by: Jianyu Zhan <[email protected]>

Acked-by: Michal Hocko <[email protected]>
Thanks!

> ---
> mm/memcontrol.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 80d9e38..46333cb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -528,10 +528,10 @@ static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg)
> static inline unsigned short mem_cgroup_id(struct mem_cgroup *memcg)
> {
> /*
> - * The ID of the root cgroup is 0, but memcg treat 0 as an
> - * invalid ID, so we return (cgroup_id + 1).
> + * The ID of css for the root cgroup is 0, but memcg treat 0 as an
> + * invalid ID, so we return (id + 1).
> */
> - return memcg->css.cgroup->id + 1;
> + return css_to_id(&memcg->css) + 1;
> }
>
> static inline struct mem_cgroup *mem_cgroup_from_id(unsigned short id)
> @@ -6407,7 +6407,7 @@ mem_cgroup_css_online(struct cgroup_subsys_state *css)
> struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> struct mem_cgroup *parent = mem_cgroup_from_css(css_parent(css));
>
> - if (css->cgroup->id > MEM_CGROUP_ID_MAX)
> + if (css_to_id(css) > MEM_CGROUP_ID_MAX)
> return -ENOSPC;
>
> if (!parent)
> --
> 2.0.0-rc0
>

--
Michal Hocko
SUSE Labs

2014-04-22 11:27:58

by Neil Horman

[permalink] [raw]
Subject: Re: [PATCH 3/4] netprio_cgroup: use accessor to get id from css

On Tue, Apr 22, 2014 at 02:31:01PM +0800, Jianyu Zhan wrote:
> This is a prepared patch for converting from per-cgroup id to
> per-subsystem id.
>
> We should not access per-cgroup id directly, since this is
> implemetation detail.
>
> Use the accessor css_from_id() instead.
>
> This patch has no functional change.
>
> Signed-off-by: Jianyu Zhan <[email protected]>
> ---
> net/core/netprio_cgroup.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
> index ce285c6..fc21035 100644
> --- a/net/core/netprio_cgroup.c
> +++ b/net/core/netprio_cgroup.c
> @@ -89,7 +89,7 @@ static int extend_netdev_table(struct net_device *dev, u32 target_idx)
> static u32 netprio_prio(struct cgroup_subsys_state *css, struct net_device *dev)
> {
> struct netprio_map *map = rcu_dereference_rtnl(dev->priomap);
> - int id = css->cgroup->id;
> + int id = css_to_id(css);
>
> if (map && id < map->priomap_len)
> return map->priomap[id];
> @@ -109,7 +109,7 @@ static int netprio_set_prio(struct cgroup_subsys_state *css,
> struct net_device *dev, u32 prio)
> {
> struct netprio_map *map;
> - int id = css->cgroup->id;
> + int id = css_to_id(css);
> int ret;
>
> /* avoid extending priomap for zero writes */
> @@ -170,7 +170,7 @@ static void cgrp_css_free(struct cgroup_subsys_state *css)
>
> static u64 read_prioidx(struct cgroup_subsys_state *css, struct cftype *cft)
> {
> - return css->cgroup->id;
> + return css_to_id(css);
> }
>
> static int read_priomap(struct seq_file *sf, void *v)
> @@ -222,7 +222,7 @@ static void net_prio_attach(struct cgroup_subsys_state *css,
> struct cgroup_taskset *tset)
> {
> struct task_struct *p;
> - void *v = (void *)(unsigned long)css->cgroup->id;
> + void *v = (void *)(unsigned long)css_to_id(css);
>
> cgroup_taskset_for_each(p, tset) {
> task_lock(p);
> --
> 2.0.0-rc0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
Acked-by: Neil Horman <[email protected]>

2014-04-22 20:31:36

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/4] cgroup: introduce helper css_to_id()

On Tue, Apr 22, 2014 at 02:30:27PM +0800, Jianyu Zhan wrote:
> This is a prepared patch for converting from per-cgroup id to
> per-subsystem id.
>
> Some subsystems dereference the per-cgrpu id directly, but this is
> implementation-specific, so it should be transparent for subsystems.

Hmm... why would cgroup ID be implementation-specific? It's a
published field

> Use this accessor instead.

I'm not a big believer of trivial accessors. They tend to obfuscate
things more than helping anything. Here, we need to switch from
cgrp->id to css->id. Wrapping cgrp->id by css_to_id() doesn't really
help anything especially because there will be cases where we'd
actually want cgroup IDs instead of css IDs too.

Thanks.

--
tejun

2014-04-22 20:33:56

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 3/4] netprio_cgroup: use accessor to get id from css

On Tue, Apr 22, 2014 at 02:31:01PM +0800, Jianyu Zhan wrote:
> This is a prepared patch for converting from per-cgroup id to
> per-subsystem id.
>
> We should not access per-cgroup id directly, since this is
> implemetation detail.
>
> Use the accessor css_from_id() instead.
>
> This patch has no functional change.

netprio_cgroup is a cgroup identifying controller. We *do* want to
use per-cgroup IDs for it.

Thanks.

--
tejun

2014-04-22 20:36:44

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 4/4] cgroup: convert from per-cgroup id to per-subsys id

On Tue, Apr 22, 2014 at 04:03:31PM +0800, Jianyu Zhan wrote:
> Hi, all.
>
> Sorry, previous patch has a minor fault, and cause a unitialized variable warning.
> I've fixed it up in this.
>
> Renewed patch:
> ---
>
> Currently, cgrp->id is only used to look up css's. As cgroup and
> css's lifetimes is now decoupled, it should be made per-subsystem
> and moved to css->css_id so that lookups are successful until the
> target css is released.

So, css ID can't replace cgroup ID as whole. It's just that there are
cases where we're currently using cgroup IDs where we should be using
css IDs. Also, we need to update css creation error path RCU safe so
that idr lookup is always RCU safe. I already had patches queued for
posting. Will post them soon.

Thanks.

--
tejun

2014-04-23 06:41:42

by Jianyu Zhan

[permalink] [raw]
Subject: Re: [PATCH 2/4] mm/memcontrol.c: use accessor to get id from css

On Wed, Apr 23, 2014 at 3:49 AM, Andrew Morton
<[email protected]> wrote:
> I'd expect Tejun to process this series, but you didn't cc him on 2/4.

Oh, I reposed too much confidence in get_maintainer.pl. I thought it
would cc tj as usual. :-)

Tj said he has a patchset queued for addressing this problem and will
be sent out soon,
so just forget this patch and wait.


Thanks,
Jianyu Zhan