2022-01-12 23:12:27

by Muchun Song

Subject: Re: [PATCH v5 10/16] mm: list_lru: allocate list_lru_one only when needed

On Thu, Jan 6, 2022 at 7:00 PM Michal Koutný <[email protected]> wrote:
>
> On Mon, Dec 20, 2021 at 04:56:43PM +0800, Muchun Song <[email protected]> wrote:
> (Thanks for pointing me here.)
>
> > -void memcg_drain_all_list_lrus(int src_idx, struct mem_cgroup *dst_memcg)
> > +void memcg_drain_all_list_lrus(struct mem_cgroup *src, struct mem_cgroup *dst)
> > {
> > + struct cgroup_subsys_state *css;
> > struct list_lru *lru;
> > + int src_idx = src->kmemcg_id;
> > +
> > + /*
> > + * Change kmemcg_id of this cgroup and all its descendants to the
> > + * parent's id, and then move all entries from this cgroup's list_lrus
> > + * to ones of the parent.
> > + *
> > + * After we have finished, all list_lrus corresponding to this cgroup
> > + * are guaranteed to remain empty. So we can safely free this cgroup's
> > + * list lrus in memcg_list_lru_free().
> > + *
> > + * Changing ->kmemcg_id to the parent can prevent memcg_list_lru_alloc()
> > + * from allocating list lrus for this cgroup after memcg_list_lru_free()
> > + * call.
> > + */
> > + rcu_read_lock();
> > + css_for_each_descendant_pre(css, &src->css) {
> > + struct mem_cgroup *memcg;
> > +
> > + memcg = mem_cgroup_from_css(css);
> > + memcg->kmemcg_id = dst->kmemcg_id;
> > + }
> > + rcu_read_unlock();
>
> Do you envision using this function anywhere else besides offlining?
> If not, you shouldn't need to traverse the whole subtree, because normally
> parents are offlined only after their children (see cgroup_subsys_state.online_cnt).
>
> >
> > mutex_lock(&list_lrus_mutex);
> > list_for_each_entry(lru, &memcg_list_lrus, list)
> > - memcg_drain_list_lru(lru, src_idx, dst_memcg);
> > + memcg_drain_list_lru(lru, src_idx, dst);
> > mutex_unlock(&list_lrus_mutex);
>
> If you do, then here you only drain the list_lru of the subtree root, but
> no longer those of the descendants.
>
> So I do get that mem_cgroup.kmemcg_id references the "effective"
> kmemcg_id after offlining, so that the proper list_lrus are used afterwards.
>
> I wonder -- is this necessary when objcgs are reparented too? IOW, would
> anyone query the offlined child's kmemcg_id?
> (Maybe it's worth explaining this better in the commit message; I think even
> the current approach is OK (better safe than sorry).)
>

Sorry for the late reply.

Imagine a hierarchy of memcgs as follows: C's parent is B, B's parent is A,
and A's parent is root. The numbers in parentheses are their respective
kmemcg_ids.

root(-1) -> A(0) -> B(1) -> C(2)

CPU0:                                   CPU1:
memcg_list_lru_alloc(C)
                                        memcg_drain_all_list_lrus(C)
                                        memcg_drain_all_list_lrus(B)
                                        // Now C and B are offline. The
                                        // kmemcg_ids become the following if
                                        // we do not change the kmemcg_id of
                                        // its descendants in
                                        // memcg_drain_all_list_lrus().
                                        //
                                        // root(-1) -> A(0) -> B(0) -> C(1)

for (i = 0; memcg; memcg = parent_mem_cgroup(memcg), i++) {
        // allocate struct list_lru_per_memcg for memcg C
        table[i].mlru = memcg_init_list_lru_one(gfp);
}

spin_lock_irqsave(&lru->lock, flags);
while (i--) {
        // here index = 1
        int index = table[i].memcg->kmemcg_id;

        struct list_lru_per_memcg *mlru = table[i].mlru;
        if (index < 0 || rcu_dereference_protected(mlrus->mlru[index], true))
                kfree(mlru);
        else
                // mlrus->mlru[index] will be assigned a new value even
                // though memcg C is already offline.
                rcu_assign_pointer(mlrus->mlru[index], mlru);
}
spin_unlock_irqrestore(&lru->lock, flags);

So changing ->kmemcg_id of all its descendants prevents
memcg_list_lru_alloc() from allocating list lrus for the offlined
cgroup after memcg_list_lru_free() has been called.
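
FWIW, here is a minimal userspace sketch of just the kmemcg_id bookkeeping
(this is not the kernel code; struct memcg, drain_one() and drain_subtree()
below are simplified stand-ins made up for the illustration). It only
contrasts updating the offlined cgroup alone with updating its whole subtree,
as the patch does via css_for_each_descendant_pre():

#include <stdio.h>

struct memcg {
        int kmemcg_id;
        struct memcg *parent;
};

/* Update only the offlined cgroup itself (the variant without the patch). */
static void drain_one(struct memcg *src, struct memcg *dst)
{
        src->kmemcg_id = dst->kmemcg_id;
}

/* Update the offlined cgroup and all of its descendants (modelling the patch). */
static void drain_subtree(struct memcg *src, struct memcg *dst,
                          struct memcg **all, int n)
{
        for (int i = 0; i < n; i++) {
                struct memcg *m;

                /* all[i] is in src's subtree iff src is on its parent chain. */
                for (m = all[i]; m; m = m->parent) {
                        if (m == src) {
                                all[i]->kmemcg_id = dst->kmemcg_id;
                                break;
                        }
                }
        }
}

int main(void)
{
        struct memcg A = { .kmemcg_id = 0, .parent = NULL }; /* A(0) */
        struct memcg B = { .kmemcg_id = 1, .parent = &A };   /* B(1) */
        struct memcg C = { .kmemcg_id = 2, .parent = &B };   /* C(2) */
        struct memcg *all[] = { &A, &B, &C };

        /* Offline C, then B, touching only the cgroup itself. */
        drain_one(&C, &B);
        drain_one(&B, &A);
        printf("drain_one:     A=%d B=%d C=%d\n",   /* A=0 B=0 C=1 */
               A.kmemcg_id, B.kmemcg_id, C.kmemcg_id);

        /* Reset and offline again, this time walking the subtree. */
        B.kmemcg_id = 1;
        C.kmemcg_id = 2;
        drain_subtree(&C, &B, all, 3);
        drain_subtree(&B, &A, all, 3);
        printf("drain_subtree: A=%d B=%d C=%d\n",   /* A=0 B=0 C=0 */
               A.kmemcg_id, B.kmemcg_id, C.kmemcg_id);

        return 0;
}

With drain_one() only, C keeps kmemcg_id 1, a slot that nothing will ever
drain again, and that is exactly the slot the concurrent
memcg_list_lru_alloc() above would populate. With drain_subtree(), C resolves
to its online ancestor's id, so anything installed there belongs to an online
memcg and is still handled by a later offline.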

Thanks.


2022-01-13 21:34:55

by Michal Koutný

Subject: Re: [PATCH v5 10/16] mm: list_lru: allocate list_lru_one only when needed

On Wed, Jan 12, 2022 at 09:22:36PM +0800, Muchun Song <[email protected]> wrote:
> root(-1) -> A(0) -> B(1) -> C(2)
>
> CPU0:                                   CPU1:
> memcg_list_lru_alloc(C)
>                                         memcg_drain_all_list_lrus(C)
>                                         memcg_drain_all_list_lrus(B)
>                                         // Now C and B are offline. The
>                                         // kmemcg_ids become the following if
>                                         // we do not change the kmemcg_id of
>                                         // its descendants in
>                                         // memcg_drain_all_list_lrus().
>                                         //
>                                         // root(-1) -> A(0) -> B(0) -> C(1)
>
> for (i = 0; memcg; memcg = parent_mem_cgroup(memcg), i++) {
>         // allocate struct list_lru_per_memcg for memcg C
>         table[i].mlru = memcg_init_list_lru_one(gfp);
> }
>
> spin_lock_irqsave(&lru->lock, flags);
> while (i--) {
>         // here index = 1
>         int index = table[i].memcg->kmemcg_id;
>
>         struct list_lru_per_memcg *mlru = table[i].mlru;
>         if (index < 0 || rcu_dereference_protected(mlrus->mlru[index], true))
>                 kfree(mlru);
>         else
>                 // mlrus->mlru[index] will be assigned a new value even
>                 // though memcg C is already offline.
>                 rcu_assign_pointer(mlrus->mlru[index], mlru);
> }
> spin_unlock_irqrestore(&lru->lock, flags);
>

> So changing ->kmemcg_id of all its descendants prevents
> memcg_list_lru_alloc() from allocating list lrus for the offlined
> cgroup after memcg_list_lru_free() has been called.

Thanks for the illustrative example. I can see how this can be a problem
in a general call to memcg_list_lru_alloc(C).

However, as I understand it, the code resolves the memcg that the lru
allocation should be associated with via get_mem_cgroup_from_objcg(), and
memcg_reparent_list_lrus(C) comes after memcg_reparent_objcgs(C, B),
i.e. the allocation would target B (or even A, if it happens after
memcg_reparent_objcgs(B, A))?
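
To illustrate what I mean, a simplified userspace model (the helpers below
only mimic the roles of memcg_reparent_objcgs() and
get_mem_cgroup_from_objcg(); they are made up for this sketch and are not
the kernel API):

#include <stdio.h>

struct memcg {
        const char *name;
        int online;
};

struct obj_cgroup {
        struct memcg *memcg;    /* memcg currently charged for the objects */
};

/* Role of memcg_reparent_objcgs(): point the child's objcg at the parent. */
static void reparent_objcg(struct obj_cgroup *objcg, struct memcg *parent)
{
        objcg->memcg = parent;
}

/* Role of get_mem_cgroup_from_objcg(): resolve the effective memcg. */
static struct memcg *memcg_from_objcg(struct obj_cgroup *objcg)
{
        return objcg->memcg;
}

int main(void)
{
        struct memcg B = { "B", 1 };
        struct memcg C = { "C", 1 };
        struct obj_cgroup objcg_C = { &C };

        /* Offline C: objcgs are reparented before the list_lrus are handled. */
        C.online = 0;
        reparent_objcg(&objcg_C, &B);

        /* An allocation that resolves its memcg through the objcg now
           targets B (still online), never the offlined C. */
        printf("allocation targets %s (online=%d)\n",
               memcg_from_objcg(&objcg_C)->name,
               memcg_from_objcg(&objcg_C)->online);

        return 0;
}

If the allocation always resolves its memcg through the objcg like this, the
offlined child's own kmemcg_id would not be consulted in the first place.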

It seems to me like "wasting" the existing objcg reparenting mechanism.
Or what do you think could be a problem with relying on it?

Thanks,
Michal