Date: Fri, 31 Jul 2015 09:24:58 -0700 (PDT)
From: Vikas Shivappa
To: Tejun Heo
cc: Vikas Shivappa, linux-kernel@vger.kernel.org, vikas.shivappa@intel.com,
    x86@kernel.org, hpa@zytor.com, tglx@linutronix.de, mingo@kernel.org,
    peterz@infradead.org, matt.fleming@intel.com, will.auld@intel.com,
    glenn.p.williamson@intel.com, kanaka.d.juvva@intel.com
Subject: Re: [PATCH 5/9] x86/intel_rdt: Add new cgroup and Class of service management
In-Reply-To: <20150730194458.GD3504@mtj.duckdns.org>
References: <1435789270-27010-1-git-send-email-vikas.shivappa@linux.intel.com>
    <1435789270-27010-6-git-send-email-vikas.shivappa@linux.intel.com>
    <20150730194458.GD3504@mtj.duckdns.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 30 Jul 2015, Tejun Heo wrote:

> Hello, Vikas.
>
> On Wed, Jul 01, 2015 at 03:21:06PM -0700, Vikas Shivappa wrote:
>> This patch adds a cgroup subsystem for Intel Resource Director
>> Technology(RDT) feature and Class of service(CLOSid) management which is
>> part of common RDT framework. This cgroup would eventually be used by
>> all sub-features of RDT and hence be associated with the common RDT
>> framework as well as sub-feature specific framework. However current
>> patch series only adds cache allocation sub-feature specific code.
>>
>> When a cgroup directory is created it has a CLOSid associated with it
>> which is inherited from its parent. The Closid is mapped to a
>> cache_mask which represents the L3 cache allocation to the cgroup.
>> Tasks belonging to the cgroup get to fill the cache represented by the
>> cache_mask.
>
> First of all, I apologize for being so late. I've been thinking about
> it but the thoughts didn't quite crystalize (which isn't to say that
> it's very crystal now) until recently. If I understand correctly,
> there are a couple suggested use cases for explicitly managing cache
> usage.
>
> 1. Pinning known hot areas of memory in cache.

No, cache allocation doesn't do this (nor is it expected to).

>
> 2. Explicitly regulating cache usage so that cacheline allocation can
>    be better than CPU itself doing it.

Yes, this is what we want to do using cache allocation.

>
> #1 isn't part of this patchset, right? Is there any plan for working
> towards this too?

Cache allocation is not intended to do #1, so we don't have to support
it.

>
> For #2, it is likely that the targeted use cases would involve threads
> of a process or at least cooperating processes and having a simple API
> which just goes "this (or the current) thread is only gonna use this
> part of cache" would be a lot easier to use and actually beneficial.
>
> I don't really think it makes sense to implement a fully hierarchical
> cgroup solution when there isn't the basic affinity-adjusting
> interface and it isn't clear whether fully hierarchical resource
> distribution would be necessary especially given that the granularity
> of the target resource is very coarse.
>
> I can see that how cpuset would seem to invite this sort of usage but
> cpuset itself is more of an arbitrary outgrowth (regardless of
> history) in terms of resource control and most things controlled by
> cpuset already have counterpart interface which is readily accessible
> to the normal applications.

Yes, today we don't have an alternative interface, but we can always
build one. We simply don't have it because until now the Linux kernel
just tolerated the degradation that could occur from cache contention,
and this is the first interface we are building.

>
> Given that what the feature allows is restricting usage rather than
> granting anything exclusively, a programmable interface wouldn't need
> to worry about complications around privileges while being able to
> reap most of the benefits in an a lot easier way. Am I missing
> something?
>

For #2, the intel_rdt cgroup gives us a framework where the user can
regulate cache allocation. A user space app could also eventually use
this as underlying support and build on top of it, depending on
enterprise or other requirements.

A typical use case: an application continuously pollutes the cache (a
low priority app from a cache usage perspective) by bringing in data
from the network (a copying/streaming app), and so keeps an app with a
legitimate need for the cache (a high priority app) from using it.

We need to map a group of tasks to a particular class of service, and
we need a way for the user to specify the cache capacity for that class
of service. We also need a default cgroup which could hold all the
tasks and use all the cache.

The hierarchical interface can be used as required and does not really
interfere with allocating exclusive blocks of cache: all the user needs
to do is make sure the masks don't overlap. The user can configure the
masks to be exclusive of one another. But note that overlapping masks
provide a very easy way to share the cache, which is what you may
sometimes want.

The current implementation can easily be extended to *enforce*
exclusive capacity masks between child nodes if required. But since the
super user is expected to be the one using this, the need for that may
be limited, or the user can still take care of it as I said above.
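
To make the overlap point concrete, here is a minimal sketch in Python.
It is not kernel code and not what the patch implements. The function
names, the cgroup names and the 8-bit mask values are all made up for
illustration; the only assumption taken from the discussion is that each
cache_mask is a bitmask whose set bits stand for portions of the L3
cache.

```python
from itertools import combinations


def masks_overlap(mask_a: int, mask_b: int) -> bool:
    """Return True if the two capacity bitmasks share at least one bit."""
    return (mask_a & mask_b) != 0


def shared_bits(mask_a: int, mask_b: int) -> int:
    """Return the bits (cache portions) that both masks may fill."""
    return mask_a & mask_b


def all_exclusive(masks: dict) -> bool:
    """Return True if no two cgroups' masks share any bit."""
    return not any(
        masks_overlap(a, b) for a, b in combinations(masks.values(), 2)
    )


# Hypothetical 8-bit masks; real mask widths depend on the hardware.
exclusive = {"high_prio": 0b11110000, "streaming": 0b00001111}
overlapping = {"high_prio": 0b11111100, "streaming": 0b00001111}

if __name__ == "__main__":
    print("exclusive config is exclusive:", all_exclusive(exclusive))
    print("overlapping config is exclusive:", all_exclusive(overlapping))
    print("shared bits:", bin(shared_bits(overlapping["high_prio"],
                                          overlapping["streaming"])))
```

A user wanting exclusive blocks would pick masks for which
all_exclusive() holds; a user wanting to share part of the cache would
deliberately leave some bits set in both masks.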

Some of the earlier emails may have given the impression that we cannot
do exclusive allocations, but that's not altogether true: we can
configure the masks to have exclusive cache blocks for different
cgroups; it's just left to the user...

We did have a lot of discussions during the design and V3, if you
remember, and had closed on using a separate controller... Below is one
such thread where we discussed the same. I don't want to loop through
this again with this already marathon patch :)

https://lkml.org/lkml/2015/1/27/846

Quick copy from the V3 thread -

"
> proposal but was removed as we did not get agreement on lkml.
>
> the original lkml thread is here from 10/2014 for your reference -
> https://lkml.org/lkml/2014/10/16/568

Yeap, I followed that thread and this being a separate controller
definitely makes a lot more sense.
"

Thanks,
Vikas

> Thanks.
>
> --
> tejun
>