Date: Tue, 15 Mar 2016 18:21:36 +0100
From: Michal Hocko
To: Tejun Heo
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org, a.p.zijlstra@chello.nl, mingo@redhat.com, lizefan@huawei.com, hannes@cmpxchg.org, pjt@google.com, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-api@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCHSET RFC cgroup/for-4.6] cgroup, sched: implement resource group and PRIO_RGRP
Message-ID: <20160315172136.GA6114@dhcp22.suse.cz>
References: <1457710888-31182-1-git-send-email-tj@kernel.org>
In-Reply-To: <1457710888-31182-1-git-send-email-tj@kernel.org>

On Fri 11-03-16 10:41:18, Tejun Heo wrote:
> Hello,
>
> This patchset extends cgroup v2 to support rgroup (resource group) for
> in-process hierarchical resource control and implements PRIO_RGRP for
> setpriority(2) on top to allow in-process hierarchical CPU cycle
> control in a seamless way.
>
> cgroup v1 allowed putting threads of a process in different cgroups
> which enabled ad-hoc in-process resource control of some resources.
> Unfortunately, this approach was fraught with problems such as
> membership ambiguity with per-process resources

[Sorry if this has already been discussed, I haven't followed all the
discussions regarding this topic]

I agree that per-thread granularity is no fun for controllers which
operate on entities other than task_struct (like the memory cgroup
controller), but I am afraid that all the complications will not go
away if we are strictly per-process anyway. For example, the memcg
controller is not strictly per-process either: it operates on the
mm_struct, and that might be shared between different _processes_. So
we might still end up in the same schizophrenic situation where two
different processes are living in different cgroups while one of them
is silently operating in a different memcg cgroup. I really hate this,
but this is what our clone(CLONE_VM) (without CLONE_THREAD) allows.

I do not know about other controllers, maybe only memcg is so special,
but that would suggest that even a process-only restriction might turn
out to be a problem in the future, and controllers would have to face
the same problem later on.

Now I have to admit I do not have great ideas how to cover all the
possible cases, but wouldn't it make more sense to allow for more
flexibility and allow thread migration, while letting any controller
veto the migration should it cross into a different/incompatible
cgroup?

[...]

Thanks!
-- 
Michal Hocko
SUSE Labs
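
For illustration, the clone(CLONE_VM) without CLONE_THREAD case described
above can be sketched in userspace as follows. This is only a minimal
sketch, assuming Linux and glibc; the names run_clone_vm_demo, child_fn,
shared_value and child_tgid are hypothetical and do not come from the
patchset. It shows two distinct processes (different PIDs, separate
thread groups) that share one address space, which is the situation in
which the two tasks could sit in different cgroups while being backed by
a single mm_struct.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (1024 * 1024)

/* Written by the child, read by the parent. The parent can only see the
 * child's stores because both processes share the same address space. */
static volatile int shared_value;
static volatile pid_t child_tgid;

static int child_fn(void *arg)
{
	(void)arg;
	shared_value = 42;
	/* Raw syscall so that no PID caching in an older libc can interfere. */
	child_tgid = (pid_t)syscall(SYS_getpid);
	return 0;
}

/* Hypothetical helper: returns 0 on success and reports what the parent
 * observed after the child has exited. */
int run_clone_vm_demo(int *value_seen, pid_t *tgid_seen)
{
	char *stack;
	pid_t pid;

	assert(value_seen && tgid_seen);

	stack = malloc(CHILD_STACK_SIZE);
	if (!stack)
		return -1;

	/* CLONE_VM shares the address space. CLONE_THREAD is deliberately
	 * absent, so the child is a separate process with its own thread
	 * group. SIGCHLD lets the parent reap it with waitpid(). The stack
	 * grows down, hence the pointer to the top of the allocation. */
	pid = clone(child_fn, stack + CHILD_STACK_SIZE, CLONE_VM | SIGCHLD, NULL);
	if (pid < 0) {
		free(stack);
		return -1;
	}

	if (waitpid(pid, NULL, 0) != pid) {
		free(stack);
		return -1;
	}
	free(stack);

	*value_seen = shared_value;
	*tgid_seen = child_tgid;
	return 0;
}
```

Calling run_clone_vm_demo() from a small main() and printing the two
results alongside getpid() should show the value stored by the child and
a PID that differs from the parent's. A fork() based child, by contrast,
would get its own copy of the address space and the parent would not see
the store.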