Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752703AbcLFSXW (ORCPT ); Tue, 6 Dec 2016 13:23:22 -0500 Received: from mail-pg0-f65.google.com ([74.125.83.65]:36224 "EHLO mail-pg0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751327AbcLFSXS (ORCPT ); Tue, 6 Dec 2016 13:23:18 -0500 Date: Tue, 6 Dec 2016 13:23:15 -0500 From: Tejun Heo To: Andy Lutomirski Cc: John Stultz , Alexei Starovoitov , Andy Lutomirski , =?iso-8859-1?Q?Micka=EBl_Sala=FCn?= , Daniel Mack , "David S. Miller" , kafai@fb.com, Florian Westphal , Harald Hoyer , Network Development , Sargun Dhillon , Pablo Neira Ayuso , lkml , Li Zefan , Jonathan Corbet , "open list:CONTROL GROUP (CGROUP)" , Android Kernel Team , Rom Lemarchand , Colin Cross , Dmitry Shmidt , Todd Kjos , Christian Poetzsch , Amit Pundir , Dmitry Torokhov , Kees Cook , "Serge E . Hallyn" , Linux API Subject: Re: [RESEND][PATCH v4] cgroup: Use CAP_SYS_RESOURCE to allow a process to migrate other tasks between cgroups Message-ID: <20161206182315.GB2625@mtj.duckdns.org> References: <20161109000342.GA42532@ast-mbp.thefacebook.com> <20161206165519.GA17648@mtj.duckdns.org> <20161206181221.GA2625@mtj.duckdns.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.7.1 (2016-10-04) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4066 Lines: 103 Hello, On Tue, Dec 06, 2016 at 10:13:53AM -0800, Andy Lutomirski wrote: > > Delegation is an explicit operation and reflected in the ownership of > > the subdirectories and cgroup interface files in them. The > > subhierarchy containment is achieved by requiring the user who's > > trying to migrate a process to have write perm on cgroup.procs on the > > common ancestor of the source and target in addition to the target. > > OK, I see what you're doing. That's interesting. It's something born out of usages of cgroup v1. People used it that way (chowning files and directories) and combined with the uid checksn it yielded something which is useful sometimes, but it always had issues with hierarchical behaviors, which files to chmod and the weird combination of uid checks. cgroup v2 has a clear delegation model but the uid checks are still left in as not changing was the default. It's not necessary and I'm thinking about queueing something like the following in the next cycle. As for the android CAP discussion, I think it'd be nice to share an existing CAP but if we can't find a good one to share, let's create a new one. Thanks. diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt index 4cc07ce..34b4b44 100644 --- a/Documentation/cgroup-v2.txt +++ b/Documentation/cgroup-v2.txt @@ -328,14 +328,12 @@ a process with a non-root euid to migrate a target process into a cgroup by writing its PID to the "cgroup.procs" file, the following conditions must be met. -- The writer's euid must match either uid or suid of the target process. - - The writer must have write access to the "cgroup.procs" file. - The writer must have write access to the "cgroup.procs" file of the common ancestor of the source and destination cgroups. -The above three constraints ensure that while a delegatee may migrate +The above two constraints ensure that while a delegatee may migrate processes around freely in the delegated sub-hierarchy it can't pull in from or push out to outside the sub-hierarchy. @@ -350,10 +348,10 @@ all processes under C0 and C1 belong to U0. Let's also say U0 wants to write the PID of a process which is currently in C10 into "C00/cgroup.procs". U0 has write access to the -file and uid match on the process; however, the common ancestor of the -source cgroup C10 and the destination cgroup C00 is above the points -of delegation and U0 would not have write access to its "cgroup.procs" -files and thus the write will be denied with -EACCES. +file; however, the common ancestor of the source cgroup C10 and the +destination cgroup C00 is above the points of delegation and U0 would +not have write access to its "cgroup.procs" files and thus the write +will be denied with -EACCES. 2-6. Guidelines diff --git a/kernel/cgroup.c b/kernel/cgroup.c index 85bc9be..49384ff 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -2854,12 +2854,18 @@ static int cgroup_procs_write_permission(struct task_struct *task, * even if we're attaching all tasks in the thread group, we only * need to check permissions on one of them. */ - if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) && - !uid_eq(cred->euid, tcred->uid) && - !uid_eq(cred->euid, tcred->suid)) - ret = -EACCES; - if (!ret && cgroup_on_dfl(dst_cgrp)) { + /* root is allowed to do anything */ + if (uid_eq(cred->euid, GLOBAL_ROOT_UID)) + goto out; + + /* + * On v2, follow the delegation model. Inside a delegated subtree, + * the delegatee can move around the processes however it sees fit. + * + * On v1, a process should match one of the target's uids. + */ + if (cgroup_on_dfl(dst_cgrp)) { struct super_block *sb = of->file->f_path.dentry->d_sb; struct cgroup *cgrp; struct inode *inode; @@ -2877,8 +2883,11 @@ static int cgroup_procs_write_permission(struct task_struct *task, ret = inode_permission(inode, MAY_WRITE); iput(inode); } + } else if (!uid_eq(cred->euid, tcred->uid) && + !uid_eq(cred->euid, tcred->suid)) { + ret = -EACCES; } - +out: put_cred(tcred); return ret; }