Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932418AbbFDRLW (ORCPT ); Thu, 4 Jun 2015 13:11:22 -0400 Received: from mail-ig0-f179.google.com ([209.85.213.179]:36971 "EHLO mail-ig0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932293AbbFDRLS (ORCPT ); Thu, 4 Jun 2015 13:11:18 -0400 MIME-Version: 1.0 In-Reply-To: <20150603055057.GE20091@mtj.duckdns.org> References: <1432179674-19154-1-git-send-email-john.stultz@linaro.org> <20150603055057.GE20091@mtj.duckdns.org> Date: Thu, 4 Jun 2015 10:11:17 -0700 Message-ID: Subject: Re: [RFC][PATCH 0/2] Android style loosening of cgroup attach permissions From: John Stultz To: Tejun Heo Cc: lkml , Li Zefan , Jonathan Corbet , cgroups@vger.kernel.org, Android Kernel Team , Rom Lemarchand , Colin Cross , Johannes Weiner Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3818 Lines: 80 On Tue, Jun 2, 2015 at 10:50 PM, Tejun Heo wrote: > Hello, John. > > On Tue, Jun 02, 2015 at 12:07:23PM -0700, John Stultz wrote: >> On Wed, May 20, 2015 at 8:41 PM, John Stultz wrote: >> > As a heads up, this is just a first RFC and not a submission. >> > >> > Android currently loosens the cgroup attchment permissions, allowing >> > tasks with CAP_SYS_NICE to be able to allow tasks to move arbitrary >> > tasks across cgroups. >> > >> > At first glance, overloading CAP_SYS_NICE seems a bit hackish, but this >> > shows that there is a active and widely deployed use for different cgroup >> > attachment rules then what is currently available. >> > >> > I've tried to rework the patches so this attachment policy is build >> > time configurable, and wanted to send them out for review so folks >> > might give their thoughts on this implementation and what they might >> > see as a better way to go about achieving the same goal. >> > >> > Thoughts and feedback would be appriciated! >> >> Ping? Not sure if I hit folks at a busy time or if I didn't cc the right folks? > > Can you explain why this is necessary? android's cgroups usage is > pretty weird at this point. IIRC, it moves tasks between fg and bg > cgroups with memory immigration turned on which basically forces memcg > to walk every single page relabeling them on each fg/bg switch, which > is a pretty silly thing to do and for this android kernel also removes > a rcu sync operation which makes break things on removal of cgroups, > which android doesn't do. So Colin and Rom could likely do a better job explaining this then me, but my understanding is that from the scheduling side, they use cgroups to bound cpu usage of tasks in various states (ie: foreground applications, background applications, audio application, system audio, and system tasks). This allows for things like audio applications to be SCHED_FIFO but not run-away hogging infinite cpu, and background task cpu usage to be similarly limited. The migration of a task from the foreground to background, or to elevate a task to audio priority, may be done by system service that does not run as root. So this patch allows processes with CAP_SYS_NICE to be able to migrate tasks between cgroups. I suspect if there was a specific cap (CAP_SYS_CHANGE_CGROUP) for this, it would be usable here, but they just overloaded CAP_SYS_NICE. One example of this: https://android.googlesource.com/platform/frameworks/base/+/b267554/services/java/com/android/server/os/SchedulingPolicyService.java As for the move_charge_at_immigrate, that does seem to be true, and I'm not sure at this moment of the details (Rom, Colin?), but I suspect it has to do with the memgc mempressure notifiers. And at least in the current android-3.18 kernel, I don't see any rcu sync modifications. But maybe I'm missing what you mean? > memcg usage came up a while ago and there wasn't anything major which > can't be achieved (usually better) by following more standard cgroup > usage - changing knobs rather than moving tasks around. Do you have a pointer to that discussion or maybe even just a sense of who was involved so I can trawl the list and better understand it? > Given that, I'm not sure this patchset makes sense for upstream. Right, I'm not suggesting the patch go in "as-is". I just wanted to show what Android is currently using, and start a discussion of what would be a better approach for Android to use, or what the kernel might need to be able to support Android's use case. thanks -john -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/