Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753391AbcKHXwM (ORCPT ); Tue, 8 Nov 2016 18:52:12 -0500 Received: from mail.kernel.org ([198.145.29.136]:36428 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753000AbcKHXwI (ORCPT ); Tue, 8 Nov 2016 18:52:08 -0500 MIME-Version: 1.0 In-Reply-To: <1478647728-30357-1-git-send-email-john.stultz@linaro.org> References: <1478647728-30357-1-git-send-email-john.stultz@linaro.org> From: Andy Lutomirski Date: Tue, 8 Nov 2016 15:51:40 -0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [RESEND][PATCH v4] cgroup: Use CAP_SYS_RESOURCE to allow a process to migrate other tasks between cgroups To: John Stultz , Alexei Starovoitov , =?UTF-8?B?TWlja2HDq2wgU2FsYcO8bg==?= , Daniel Mack , "David S. Miller" , kafai@fb.com, fw@strlen.de, Harald Hoyer , Network Development , Sargun Dhillon , Pablo Neira Ayuso Cc: lkml , Tejun Heo , Li Zefan , Jonathan Corbet , "open list:CONTROL GROUP (CGROUP)" , Android Kernel Team , Rom Lemarchand , Colin Cross , Dmitry Shmidt , Todd Kjos , Christian Poetzsch , Amit Pundir , Dmitry Torokhov , Kees Cook , "Serge E . Hallyn" , Linux API Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2149 Lines: 44 On Tue, Nov 8, 2016 at 3:28 PM, John Stultz wrote: > This patch adds logic to allows a process to migrate other tasks > between cgroups if they have CAP_SYS_RESOURCE. > > In Android (where this feature originated), the ActivityManager tracks > various application states (TOP_APP, FOREGROUND, BACKGROUND, SYSTEM, > etc), and then as applications change states, the SchedPolicy logic > will migrate the application tasks between different cgroups used > to control the different application states (for example, there is a > background cpuset cgroup which can limit background tasks to stay > on one low-power cpu, and the bg_non_interactive cpuctrl cgroup can > then further limit those background tasks to a small percentage of > that one cpu's cpu time). > > However, for security reasons, Android doesn't want to make the > system_server (the process that runs the ActivityManager and > SchedPolicy logic), run as root. So in the Android common.git > kernel, they have some logic to allow cgroups to loosen their > permissions so CAP_SYS_NICE tasks can migrate other tasks between > cgroups. > > I feel the approach taken there overloads CAP_SYS_NICE a bit much > for non-android environments. > > So this patch, as suggested by Michael Kerrisk, simply adds a > check for CAP_SYS_RESOURCE. > > I've tested this with AOSP master, and this seems to work well > as Zygote and system_server already use CAP_SYS_RESOURCE. I've > also submitted patches against the android-4.4 kernel to change > it to use CAP_SYS_RESOURCE, and the Android developers just merged > it. > I hate to say it, but I think I may see a problem. Current developments are afoot to make cgroups do more than resource control. For example, there's Landlock and there's Daniel's ingress/egress filter thing. Current cgroup controllers can mostly just DoS their controlled processes. These new controllers (or controller-like things) can exfiltrate data and change semantics. Does anyone have a security model in mind for these controllers and the cgroups that they're attached to? I'm reasonably confident that CAP_SYS_RESOURCE is not the answer...