Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932805AbbHYTSs (ORCPT ); Tue, 25 Aug 2015 15:18:48 -0400
Received: from mail-yk0-f174.google.com ([209.85.160.174]:34769 "EHLO
        mail-yk0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S932468AbbHYTSp (ORCPT ); Tue, 25 Aug 2015 15:18:45 -0400
Date: Tue, 25 Aug 2015 15:18:42 -0400
From: Tejun Heo
To: Paul Turner
Cc: Austin S Hemmelgarn , Peter Zijlstra , Ingo Molnar , Johannes Weiner ,
        lizefan@huawei.com, cgroups , LKML , kernel-team , Linus Torvalds ,
        Andrew Morton
Subject: Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy
Message-ID: <20150825191842.GD26785@mtj.duckdns.org>
References: <20150824170427.GA27262@mtj.duckdns.org>
        <20150824210223.GH28944@mtj.duckdns.org>
        <20150824211707.GJ28944@mtj.duckdns.org>
        <20150824214000.GL28944@mtj.duckdns.org>
        <20150824224936.GO28944@mtj.duckdns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3169
Lines: 75

Hello, Paul.

On Mon, Aug 24, 2015 at 04:15:59PM -0700, Paul Turner wrote:
> > Hmmm... if that's the case, would limiting iops on those IO devices
> > (or classes of them) work? qemu already implements IO limit mechanism
> > after all.
>
> No.
>
> 1) They should proceed at the maximum rate that they can that's still
> within their provisioning budget.

Ooh, right.

> 2) The cost/IO is both inconsistent and changes over time. Attempting
> to micro-optimize every backend for this is infeasible, this is
> exactly the type of problem that the scheduler can usefully help
> arbitrate.
> 3) Even pretending (2) is fixable, dynamically dividing these
> right-to-work tokens between different I/O device backends is
> extremely complex.
>
> > Anyways, a point here is that threads of the same process competing
> > isn't a new problem. There are many ways to make those threads play
> > nice as the application itself often has to be involved anyway,
> > especially for something like qemu which is heavily involved in
> > provisioning resources.
>
> It's certainly not a new problem, but it's a real one, and it's
> _hard_. You're proposing removing the best known solution.

Well, I'm trying to figure out whether we actually need it and
implement something sane if so. We actually can't do hierarchical
resource distribution with existing mechanisms, so if that is
something which is beneficial enough, let's go ahead and figure it
out.

> > cgroups can be a nice brute-force add-on which lets sysadmins do wild
> > things but it's inherently hacky and incomplete for coordinating
> > threads. For example, what is it gonna do if qemu cloned vcpus and IO
> > helpers dynamically off of the same parent thread?
>
> We're talking about sub-process usage here. This is the application
> coordinating itself, NOT the sysadmin. Processes are becoming larger
> and larger, we need many of the same controls within them that we have
> between them.
>
> > It requires
> > application's cooperation anyway but at the same time is painful to
> > actually interact from those applications.
>
> As discussed elsewhere on thread this is really not a problem if you
> define consistent rules with respect to which parts are managed by
> who. The argument of potential interference is no different to
> messing with an application's on-disk configuration behind its back.
> Alternate strawmen which greatly improve this from where we are today
> have also been proposed.

Let's continue in the other sub-thread but it's not just system
management and applications not stepping on each other's toes although
even just that is extremely painful with the current interface.
cgroup membership is inherently tied to process tree no matter who's
managing it which requires coordination from the application side for
sub-process management and at that point it's really matter of putting
one and one together.

Thanks.

-- 
tejun