Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753528AbcKIAEh (ORCPT ); Tue, 8 Nov 2016 19:04:37 -0500 Received: from mail-pf0-f195.google.com ([209.85.192.195]:32810 "EHLO mail-pf0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751157AbcKIAE3 (ORCPT ); Tue, 8 Nov 2016 19:04:29 -0500 Date: Tue, 8 Nov 2016 16:03:44 -0800 From: Alexei Starovoitov To: Andy Lutomirski Cc: John Stultz , =?iso-8859-1?Q?Micka=EBl_Sala=FCn?= , Daniel Mack , "David S. Miller" , kafai@fb.com, fw@strlen.de, Harald Hoyer , Network Development , Sargun Dhillon , Pablo Neira Ayuso , lkml , Tejun Heo , Li Zefan , Jonathan Corbet , "open list:CONTROL GROUP (CGROUP)" , Android Kernel Team , Rom Lemarchand , Colin Cross , Dmitry Shmidt , Todd Kjos , Christian Poetzsch , Amit Pundir , Dmitry Torokhov , Kees Cook , "Serge E . Hallyn" , Linux API Subject: Re: [RESEND][PATCH v4] cgroup: Use CAP_SYS_RESOURCE to allow a process to migrate other tasks between cgroups Message-ID: <20161109000342.GA42532@ast-mbp.thefacebook.com> References: <1478647728-30357-1-git-send-email-john.stultz@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2447 Lines: 49 On Tue, Nov 08, 2016 at 03:51:40PM -0800, Andy Lutomirski wrote: > On Tue, Nov 8, 2016 at 3:28 PM, John Stultz wrote: > > This patch adds logic to allows a process to migrate other tasks > > between cgroups if they have CAP_SYS_RESOURCE. > > > > In Android (where this feature originated), the ActivityManager tracks > > various application states (TOP_APP, FOREGROUND, BACKGROUND, SYSTEM, > > etc), and then as applications change states, the SchedPolicy logic > > will migrate the application tasks between different cgroups used > > to control the different application states (for example, there is a > > background cpuset cgroup which can limit background tasks to stay > > on one low-power cpu, and the bg_non_interactive cpuctrl cgroup can > > then further limit those background tasks to a small percentage of > > that one cpu's cpu time). > > > > However, for security reasons, Android doesn't want to make the > > system_server (the process that runs the ActivityManager and > > SchedPolicy logic), run as root. So in the Android common.git > > kernel, they have some logic to allow cgroups to loosen their > > permissions so CAP_SYS_NICE tasks can migrate other tasks between > > cgroups. > > > > I feel the approach taken there overloads CAP_SYS_NICE a bit much > > for non-android environments. > > > > So this patch, as suggested by Michael Kerrisk, simply adds a > > check for CAP_SYS_RESOURCE. > > > > I've tested this with AOSP master, and this seems to work well > > as Zygote and system_server already use CAP_SYS_RESOURCE. I've > > also submitted patches against the android-4.4 kernel to change > > it to use CAP_SYS_RESOURCE, and the Android developers just merged > > it. > > > > I hate to say it, but I think I may see a problem. Current > developments are afoot to make cgroups do more than resource control. > For example, there's Landlock and there's Daniel's ingress/egress > filter thing. Current cgroup controllers can mostly just DoS their > controlled processes. These new controllers (or controller-like > things) can exfiltrate data and change semantics. > > Does anyone have a security model in mind for these controllers and > the cgroups that they're attached to? I'm reasonably confident that > CAP_SYS_RESOURCE is not the answer... and specifically the answer is... ? Also would be great if you start with specifying the question first and the problem you're trying to solve.