Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S936573AbcLMRR5 (ORCPT ); Tue, 13 Dec 2016 12:17:57 -0500 Received: from nm3-vm6.bullet.mail.ne1.yahoo.com ([98.138.91.96]:55622 "EHLO nm3-vm6.bullet.mail.ne1.yahoo.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933474AbcLMRRy (ORCPT ); Tue, 13 Dec 2016 12:17:54 -0500 X-Yahoo-Newman-Id: 545371.74415.bm@smtp223.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: zOEYodoVM1n2spUeC48XqRJ05rMbOJB7eWGI26.9Uj34a1q D.s7pRx0rXuLK0Dwi_RwI0ucHrbXI7hHCLAfeZ2QbMGA8EvXjlAxG4gp9rb_ 8FbS34HnjyH.Kqn0qPxXAj73.PvlqZTq2tR.mJy19Dt.OtBAx66akGLU3X3Z Ca7dXGzpY_rJJmxX8rri2EVgoR5GJbaBTPpXSQsIO4QhkTKQoyNxqflDj1.Z 6Zh14T5pSWn0xjbcsoH194i76ApCI72RWURKiNqUp6Ot3Us6e4YOTXslcXrX 9NLJfkXLdyOb.KTAEOh6EBRJyu3r_oxym4eEO5EW0JqoLGTJoWxDegB0Q5Cw m9YSjdyjBeH8VA1Ptr2jmQ6a1oSXoBeXXLuTTrCXFkMuDGzQFLG4v2jmfY.o fVI2DUyD6jsLlK1lZVq6YaFkkwi6xosaSznE5TW_2g3s3osRIL46UC9.FAbA ktMZ1WR.jCdhPNdYLqQmd17D.mwIY_teyxhrUXY2AajIliG7buGvj3yjRzqp tLi0d6lEtdqIRx6cLDwOt1rfZAXa8IB.zLU95vnXQckFYF_XBdBxUYWomXsq p0KW1yo15LV2A X-Yahoo-SMTP: OIJXglSswBDfgLtXluJ6wiAYv6_cnw-- Subject: Re: [PATCH v5] cgroup: Add new capability to allow a process to migrate other tasks between cgroups To: John Stultz References: <1481593143-18756-1-git-send-email-john.stultz@linaro.org> <221e80bd-3d99-6c35-dcd3-b2547f0abb11@schaufler-ca.com> Cc: Michael Kerrisk , lkml , Tejun Heo , Li Zefan , Jonathan Corbet , "open list:CONTROL GROUP (CGROUP)" , Android Kernel Team , Rom Lemarchand , Colin Cross , Dmitry Shmidt , Todd Kjos , Christian Poetzsch , Amit Pundir , Dmitry Torokhov , Kees Cook , "Serge E . Hallyn" , Andy Lutomirski , Linux API From: Casey Schaufler Message-ID: Date: Tue, 13 Dec 2016 09:17:43 -0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4398 Lines: 93 On 12/13/2016 8:49 AM, John Stultz wrote: > On Tue, Dec 13, 2016 at 8:39 AM, Casey Schaufler wrote: >> On 12/13/2016 1:47 AM, Michael Kerrisk (man-pages) wrote: >>> Hi John, >>> >>> On 13 December 2016 at 02:39, John Stultz wrote: >>>> This patch adds CAP_GROUP_MIGRATE and logic to allows a process >>> s/CAP_GROUP_MIGRATE/CAP_CGROUP_MIGRATE/ >>> >>>> to migrate other tasks between cgroups. >>>> >>>> In Android (where this feature originated), the ActivityManager >>>> tracks various application states (TOP_APP, FOREGROUND, >>>> BACKGROUND, SYSTEM, etc), and then as applications change >>>> states, the SchedPolicy logic will migrate the application tasks >>>> between different cgroups used to control the different >>>> application states (for example, there is a background cpuset >>>> cgroup which can limit background tasks to stay on one low-power >>>> cpu, and the bg_non_interactive cpuctrl cgroup can then further >>>> limit those background tasks to a small percentage of that one >>>> cpu's cpu time). >>>> >>>> However, for security reasons, Android doesn't want to make the >>>> system_server (the process that runs the ActivityManager and >>>> SchedPolicy logic), run as root. So in the Android common.git >>>> kernel, they have some logic to allow cgroups to loosen their >>>> permissions so CAP_SYS_NICE tasks can migrate other tasks between >>>> cgroups. >>>> >>>> I feel the approach taken there overloads CAP_SYS_NICE a bit much >>>> for non-android environments. Efforts to re-use CAP_SYS_RESOURCE >>>> for this purpose (which Android has since adopted) was also >>>> stymied by concerns about risks from future cgroups that could be >>>> considered "dangerous" by how they might change system semantics. >>>> >>>> So to avoid overlapping usage, this patch adds a brand new >>>> process capability flag (CAP_CGROUP_MIGRATE), and uses it when >>>> checking if a task can migrate other tasks between cgroups. >>>> >>>> I've tested this with AOSP master (though its a bit hacked in as >>>> I still need to properly get the selinux bits aware of the new >>>> capability bit) with selinux set to permissive and it seems to be >>>> working well. >>>> >>>> Thoughts and feedback would be appreciated! >>> So, back to the discussion of silos. I understand the argument for >>> wanting a new silo. But, in that case can we at least try not to make >>> it a single-use silo? >>> >>> How about CAP_CGROUP_CONTROL or some such, with the idea that this >>> might be a capability that allows the holder to step outside usual >>> cgroup rules? At the moment, that capability would allow only one such >>> step, but maybe there would be others in the future. >> I agree, but want to put it more strongly. The granularity of >> capabilities can never be fine enough for some people, and this >> is an example of a case where you're going a bit too far. If the >> use case is Android as you say, you don't need this. As my friends >> on the far side of the aisle would say, "just write SELinux policy" >> to correctly control access as required. > So.. The trouble is that while selinux is good for restricting > permissions, the in-kernel permission checks here are already too > restrictive. Why did the original authors of cgroups make it that restrictive? If there isn't a good reason, loosen it up. If there is a good reason, then pay heed to it. > It seems one must first loosen things up before we can > tighten it with selinux rules. You're looking at splitting the granularity hair. Is your userspace code really so delicate that it can't handle the existing, "coarse" privilege and needs to protect at the "fine" granularity you're proposing? > Or are you suggesting the system_server > run as root + further selinux limitations? I worry, the Android > developers may still be hesitant to do that. Unlike many of my peers, I am not afraid of running good solid services with privilege. A proper implementation of system_server ought to be able to run completely unconstrained without causing anyone the least concern. I understand all the arguments against that, and am disinclined to get into the religious debates that ensue. So no, I am not going to suggest running system server as root, but I am going to suggest giving it the capability currently required and clamping it down with SELinux policy. > > thanks > -john >