Date: Tue, 16 Nov 2010 11:50:13 +0900
From: KAMEZAWA Hiroyuki
To: Vivek Goyal
Cc: Jens Axboe, linux kernel mailing list, Gui Jianfeng, Balbir Singh, Li Zefan, Nauman Rafique, "Daniel P. Berrange"
Subject: Re: [RFC] blk-cgroup: Allow creation of hierarchical cgroups
Message-Id: <20101116115013.5ec4e452.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20101102222030.GI7198@redhat.com>
References: <20101102222030.GI7198@redhat.com>

On Tue, 2 Nov 2010 18:20:30 -0400
Vivek Goyal wrote:

> o Allow hierarchical cgroup creation for blkio controller
>
> o Currently we disallow it as both the io controller policies (throttling
>   as well as proportion bandwidth) do not support hierarchical accounting
>   and control. But the flip side is that blkio controller can not be used with
>   libvirt as libvirt creates a cgroup hierarchy deeper than 1 level.
>
>       //libvirt/qemu/
>
> o So this patch will allow creation of cgroup hierarchy but at the backend
>   everything will be treated as flat. So if somebody created a hierarchy
>   like as follows.
>
>                 root
>                 /  \
>             test1  test2
>               |
>             test3
>
>   CFQ and throttling will practically treat all groups at same level.
>
>                     pivot
>                 /   |    \    \
>             root  test1  test2  test3
>
> o Once we have actual support for hierarchical accounting and control
>   then we can introduce another cgroup tunable file "blkio.use_hierarchy"
>   which will be 0 by default but if user wants to enforce hierarchical
>   control then it can be set to 1. This way there should not be any
>   ABI problems down the line.
>
> o The only not so pretty part is introduction of extra file "use_hierarchy"
>   down the line. Kame-san had mentioned that hierarchical accounting is
>   expensive in memory controller hence they keep it off by default. I
>   suspect same will be the case for IO controller also as for each IO
>   completion we shall have to account IO through hierarchy up to the root.
>   If yes, then it probably is not a very bad idea to introduce this extra
>   file so that it will be used only when somebody needs it and some people
>   might enable hierarchy only in part of the hierarchy.
>
> o This is how basically memory controller also uses "use_hierarchy" and
>   they also allowed creation of hierarchies when actual backend support
>   was not available.
>
> Signed-off-by: Vivek Goyal

Thank you!
Reviewed-by: KAMEZAWA Hiroyuki

> ---
>  Documentation/cgroups/blkio-controller.txt |   27 +++++++++++++++++++++++++++
>  block/blk-cgroup.c                         |    4 ----
>  2 files changed, 27 insertions(+), 4 deletions(-)
>
> Index: linux-2.6/block/blk-cgroup.c
> ===================================================================
> --- linux-2.6.orig/block/blk-cgroup.c	2010-10-28 14:19:02.000000000 -0400
> +++ linux-2.6/block/blk-cgroup.c	2010-11-02 13:10:13.000000000 -0400
> @@ -1452,10 +1452,6 @@ blkiocg_create(struct cgroup_subsys *sub
>  		goto done;
>  	}
>  
> -	/* Currently we do not support hierarchy deeper than two level (0,1) */
> -	if (parent != cgroup->top_cgroup)
> -		return ERR_PTR(-EPERM);
> -
>  	blkcg = kzalloc(sizeof(*blkcg), GFP_KERNEL);
>  	if (!blkcg)
>  		return ERR_PTR(-ENOMEM);
> Index: linux-2.6/Documentation/cgroups/blkio-controller.txt
> ===================================================================
> --- linux-2.6.orig/Documentation/cgroups/blkio-controller.txt	2010-10-28 14:19:01.000000000 -0400
> +++ linux-2.6/Documentation/cgroups/blkio-controller.txt	2010-11-02 17:51:52.000000000 -0400
> @@ -89,6 +89,33 @@ Throttling/Upper Limit policy
>  
>  Limits for writes can be put using blkio.write_bps_device file.
>  
> +Hierarchical Cgroups
> +====================
> +- Currently none of the IO control policy supports hierarhical groups. But
> +  cgroup interface does allow creation of hierarhical cgroups and internally
> +  IO policies treat them as flat hierarchy.
> +
> +  So this patch will allow creation of cgroup hierarhcy but at the backend
> +  everything will be treated as flat. So if somebody created a hierarchy like
> +  as follows.
> +
> +                root
> +                /  \
> +            test1  test2
> +              |
> +            test3
> +
> +  CFQ and throttling will practically treat all groups at same level.
> +
> +                    pivot
> +                /   |    \    \
> +            root  test1  test2  test3
> +
> +  Down the line we can implement hierarchical accounting/control support
> +  and also introduce a new cgroup file "use_hierarchy" which will control
> +  whether cgroup hierarchy is viewed as flat or hierarchical by the policy.
> +  This is how memory controller also has implemented the things.
> +
>  Various user visible config options
>  ===================================
>  CONFIG_BLK_CGROUP
>
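
The difference between the flat treatment in this patch and the hierarchical accounting that a future "use_hierarchy" file might enable can be sketched roughly as below. This is only an illustration, not kernel code: struct grp, charge_flat() and charge_hier() are hypothetical names made up for this sketch, and the real blkio structures and accounting paths look different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a blkio cgroup; NOT the kernel's struct blkio_cgroup. */
struct grp {
	const char *name;
	struct grp *parent;	/* NULL for the root group */
	unsigned long ios;	/* IO completions accounted to this group */
};

/*
 * Flat treatment (the behaviour described in the patch): only the group
 * that issued the IO is charged, wherever it sits in the cgroup tree.
 */
static void charge_flat(struct grp *g)
{
	assert(g != NULL);
	g->ios++;
}

/*
 * Hierarchical treatment (assumed behaviour for a possible use_hierarchy=1):
 * every IO completion is charged to the group and to each ancestor up to
 * the root. Returns how many groups were touched, i.e. the per-IO cost.
 */
static unsigned int charge_hier(struct grp *g)
{
	unsigned int touched = 0;

	assert(g != NULL);
	for (; g != NULL; g = g->parent) {
		g->ios++;
		touched++;
	}
	return touched;
}
```

With the root/test1/test2/test3 layout from the commit message, charge_flat() on test3 touches one counter, while charge_hier() on test3 walks test3, test1 and root. That walk on every IO completion is the extra cost the commit message expects, and the reason given for keeping it off by default.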