2022-05-26 03:02:55

by Michal Koutný

Subject: [PATCH 0/2] cgroup_subsys_state lifecycle fixups

Two corner cases related to css lifecycles were hanging around [2][3].
Since they're loosely related, I'm sending them together.

The 2nd patch fixes problems that were encountered only in syzbot tests.
Alternative solutions could be:
- daisy-chain css_release_work_fn from the offending css_killed_work_fn call,
- rework kill_css() not to rely on multi-stage css_killed_work_fn() [1].

The simplest approach was chosen.

The other existing users of percpu_ref_kill_and_confirm are not affected by
similar issues.

[1] The rough idea is to synchronize only via a completion, as e.g.
nvmet_sq_destroy() does, and to move most of css_killed_work_fn() to the end
of kill_css(). kill_css() is only used in process context when de-configuring
controllers or rmdiring a cgroup.
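
To illustrate the idea in [1], here is a minimal userspace sketch, not kernel
code and not part of this series. It assumes a pthread-based stand-in for the
kernel's struct completion; struct toy_css, confirm_kill() and toy_kill_css()
are hypothetical names. The confirm callback only signals the completion, and
the work that would otherwise be deferred to a work item runs inline in the
caller after the wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct completion (assumption). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Mark the completion as done and wake the waiter. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Block until complete() has been called. */
static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Toy object standing in for a css; every field is illustrative. */
struct toy_css {
	struct completion confirm_done;
	bool killed;	/* set once the kill has been confirmed */
	bool offlined;	/* set once the inline offline step has run */
};

/* Plays the role of the kill confirmation callback: it only signals. */
static void *confirm_kill(void *arg)
{
	struct toy_css *css = arg;

	css->killed = true;
	complete(&css->confirm_done);
	return NULL;
}

/*
 * Hypothetical analogue of kill_css(): start the kill, wait for the
 * confirmation, then do the formerly deferred work in the caller's context.
 */
static int toy_kill_css(struct toy_css *css)
{
	pthread_t confirmer;

	init_completion(&css->confirm_done);
	if (pthread_create(&confirmer, NULL, confirm_kill, css))
		return -1;

	wait_for_completion(&css->confirm_done);
	assert(css->killed);	/* confirmation is ordered before this point */
	css->offlined = true;	/* stands in for the bulk of the offline work */

	pthread_join(confirmer, NULL);
	return 0;
}
```

Because the caller blocks, this shape only works from a context that may
sleep, which is why it matters that kill_css() is used in process context
only.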

[2] https://lore.kernel.org/lkml/[email protected]/
[3] https://lore.kernel.org/lkml/[email protected]/

Michal Koutný (2):
  cgroup: Wait for cgroup_subsys_state offlining on unmount
  cgroup: Use separate work structs on css release path

 include/linux/cgroup-defs.h |  5 +++--
 kernel/cgroup/cgroup.c      | 19 +++++++++++--------
 2 files changed, 14 insertions(+), 10 deletions(-)

--
2.35.3