2022-06-23 13:12:21

by Michal Koutný

[permalink] [raw]
Subject: [RFC PATCH] cpuset: Allow setscheduler regardless of manipulated task

When we migrate a task between two cgroups, one of the checks is a
verification that we can modify task's scheduler settings
(cap_task_setscheduler()).

An implicit migration occurs also when enabling a controller on the
unified hierarchy (think of parent to child migration). The
aforementioned check may be problematic if the caller of the migration
(enabling a controller) has no permissions over migrated tasks.
For instance, user's cgroup that ends up running a process of a
different user. Although cgroup permissions are configured favorably,
the enablement fails due to the foreign process [1].

Change the behavior by relaxing the permissions check on the unified
hierarchy (or in v2 mode). This is in accordance with unified hierarchy
attachment behavior when permissions of the source to target cgroups are
decisive whereas the migrated task is opaque (for contrast, see more
restrictive check in __cgroup1_procs_write()).

[1] https://github.com/systemd/systemd/issues/18293#issuecomment-831205649

Reasons for RFC:

1) The unified hierarchy attachment behavior -- is that the
right/consented model that migrated objects don't matter?

2) If 1) is true, have I missed any danger in allowing cpuset'ing a
possibly privileged processes?

2.2) cpuset may be in v2 mode even on v1 hierarchy with different
migration control rules (but checking migratee's creds in v1
eliminates effect of this patch).

3) Alternative approach would be to allow cpuset migrations only when
nothing effectively changes (which is the case for parent->child
migration upon controller enablement).

4) This is just idea draft, not tested in the real case.

Thanks.

Signed-off-by: Michal Koutný <[email protected]>
---
kernel/cgroup/cpuset.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 71a418858a5e..dbe78577de5b 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2232,7 +2232,9 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)

percpu_down_write(&cpuset_rwsem);

- /* allow moving tasks into an empty cpuset if on default hierarchy */
+ /* allow moving tasks into an empty cpuset if on default hierarchy,
+ * also bypass permission check (access is controlled via cgroup(s)
+ * owner in cgroup_procs_write_permission()) */
ret = -ENOSPC;
if (!is_in_v2_mode() &&
(cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed)))
@@ -2242,6 +2244,9 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
ret = task_can_attach(task, cs->cpus_allowed);
if (ret)
goto out_unlock;
+
+ if (is_in_v2_mode())
+ continue;
ret = security_task_setscheduler(task);
if (ret)
goto out_unlock;
--
2.35.3


2022-06-23 19:40:22

by Waiman Long

[permalink] [raw]
Subject: Re: [RFC PATCH] cpuset: Allow setscheduler regardless of manipulated task

On 6/23/22 08:49, Michal Koutný wrote:
> When we migrate a task between two cgroups, one of the checks is a
> verification that we can modify task's scheduler settings
> (cap_task_setscheduler()).
>
> An implicit migration occurs also when enabling a controller on the
> unified hierarchy (think of parent to child migration). The
> aforementioned check may be problematic if the caller of the migration
> (enabling a controller) has no permissions over migrated tasks.
> For instance, user's cgroup that ends up running a process of a
> different user. Although cgroup permissions are configured favorably,
> the enablement fails due to the foreign process [1].
>
> Change the behavior by relaxing the permissions check on the unified
> hierarchy (or in v2 mode). This is in accordance with unified hierarchy
> attachment behavior when permissions of the source to target cgroups are
> decisive whereas the migrated task is opaque (for contrast, see more
> restrictive check in __cgroup1_procs_write()).
>
> [1] https://github.com/systemd/systemd/issues/18293#issuecomment-831205649
>
> Reasons for RFC:
>
> 1) The unified hierarchy attachment behavior -- is that the
> right/consented model that migrated objects don't matter?
>
> 2) If 1) is true, have I missed any danger in allowing cpuset'ing a
> possibly privileged processes?
That could be an issue.

> 2.2) cpuset may be in v2 mode even on v1 hierarchy with different
> migration control rules (but checking migratee's creds in v1
> eliminates effect of this patch).
>
> 3) Alternative approach would be to allow cpuset migrations only when
> nothing effectively changes (which is the case for parent->child
> migration upon controller enablement).
What do you mean by nothing effectively changes?
>
> 4) This is just idea draft, not tested in the real case.

Since the check is done on a taskset level, if only one of the tasks in
the taskset fails, the whole taskset fails. Maybe we should consider an
option for task based migration. So all the tasks that can be migrated
will be migrated and the rests will be left behind in the original
cpuset. Just a thought.

Cheers,
Longman

2022-06-24 12:55:34

by Michal Koutný

[permalink] [raw]
Subject: Re: [RFC PATCH] cpuset: Allow setscheduler regardless of manipulated task

On Thu, Jun 23, 2022 at 02:44:33PM -0400, Waiman Long <[email protected]> wrote:
> That could be an issue.

The way how I understand here is that the privileged process isn't part
of contrained workload (if it were then, it can modify cpuset itself)
like debugging or tracing a code within a container entered by the
admin. The bypass could not be used to setscheduler (via migration) of
an arbitrary process.

> What do you mean by nothing effectively changes?

It's a freshly created child (after cpuset_css_online()), so it inherits
parent's attributes, so the migration from the parent to this child
doesn't affect CPU affinity etc.

> Since the check is done on a taskset level, if only one of the tasks in the
> taskset fails, the whole taskset fails. Maybe we should consider an option
> for task based migration. So all the tasks that can be migrated will be
> migrated and the rests will be left behind in the original cpuset.

Hm, I haven't thought about that. That might be in theory possible for
threaded controllers (like cpuset) but I imagine it'd a bit messy, in
particular for these implicit migrations upon controller enablement.

Thanks,
Michal


Attachments:
(No filename) (1.16 kB)
signature.asc (235.00 B)
Digital signature
Download all attachments

2022-06-25 04:34:48

by Tejun Heo

[permalink] [raw]
Subject: Re: [RFC PATCH] cpuset: Allow setscheduler regardless of manipulated task

Hello,

On Thu, Jun 23, 2022 at 02:49:44PM +0200, Michal Koutn? wrote:
> 1) The unified hierarchy attachment behavior -- is that the
> right/consented model that migrated objects don't matter?

Yes.

> 2) If 1) is true, have I missed any danger in allowing cpuset'ing a
> possibly privileged processes?

Given that the someone who has write perm on the cgroup or its
ancestors are allowed to change cpuset config itself, I have a hard
time imagining that check being all that useful as a security
mechanism.

> 2.2) cpuset may be in v2 mode even on v1 hierarchy with different
> migration control rules (but checking migratee's creds in v1
> eliminates effect of this patch).

Yeah, it should be fine to apply the change only to v2.

> 3) Alternative approach would be to allow cpuset migrations only when
> nothing effectively changes (which is the case for parent->child
> migration upon controller enablement).
>
> 4) This is just idea draft, not tested in the real case.

Unless I'm missing something obvious, I don't see a reason to keep the
check, so please feel free to test and send a SOB'd patch.

Thanks.

--
tejun