Date: Fri, 31 Jul 2015 11:45:41 -0300
From: Marcelo Tosatti
To: Vikas Shivappa
Cc: "Auld, Will", Vikas Shivappa, linux-kernel@vger.kernel.org,
    x86@kernel.org, hpa@zytor.com, tglx@linutronix.de, mingo@kernel.org,
    tj@kernel.org, peterz@infradead.org, "Fleming, Matt",
    "Williamson, Glenn P", "Juvva, Kanaka D"
Subject: Re: [PATCH 3/9] x86/intel_rdt: Cache Allocation documentation and
    cgroup usage guide
Message-ID: <20150731144541.GB22948@amt.cnet>

On Thu, Jul 30, 2015 at 04:03:07PM -0700, Vikas Shivappa wrote:
> On Thu, 30 Jul 2015, Marcelo Tosatti wrote:
> 
> >On Thu, Jul 30, 2015 at 10:47:23AM -0700, Vikas Shivappa wrote:
> >>
> >>Marcelo,
> >>
> >>On Wed, 29 Jul 2015, Marcelo Tosatti wrote:
> >>>
> >>>How about this:
> >>>
> >>>desiredclos (closid  p1 p2 p3 p4)
> >>>         1       1  0  0  0
> >>>         2       0  0  0  1
> >>>         3       0  1  1  0
> >>
> >>#1 Currently in the rdt cgroup, the root cgroup always has all the
> >>bits set and they cannot be changed (the cgroup hierarchy by default
> >>forces this, because all the children need to have a subset of the
> >>root's bitmask). So if the user creates a cgroup and does not put
> >>any task in it, the tasks in the root cgroup could still be using
> >>that part of the cache. That's the reason I say we can't have truly
> >>'exclusive' masks.
> >>
> >>Or in other words - there is always a desired clos (0) which has all
> >>parts set and acts as a default pool.
> >>
> >>Also, the parts can overlap. Please apply this to all the comments
> >>below, as it changes the way they work.
> >>
> >>>p means part.
> >>
> >>I am assuming p = (a contiguous cache capacity bit mask)
> >>
> >>>closid 1 is an exclusive cgroup.
> >>>closid 2 is a "cache hog" class.
> >>>closid 3 is the "default closid".
> >>>
> >>>desiredclos is what the user has specified.
> >>>
> >>>Transition 1: desiredclos --> effectiveclos
> >>>Clear all bits of unused closids
> >>>(that must be updated whenever a
> >>>closid 1 cgroup goes from empty->nonempty
> >>>and vice-versa).
> >>>
> >>>effectiveclos (closid  p1 p2 p3 p4)
> >>>          1       0  0  0  0
> >>>          2       0  0  0  1
> >>>          3       0  1  1  0
> >>>
> >>>Transition 2: effectiveclos --> expandedclos
> >>>(each part used by no closid is handed to the
> >>>default closid)
> >>>
> >>>expandedclos (closid  p1 p2 p3 p4)
> >>>          1       0  0  0  0
> >>>          2       0  0  0  1
> >>>          3       1  1  1  0
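
To make the two transitions concrete, here is a minimal user-space
sketch; all names in it (NR_CLOSID, closid_empty, DEFAULT_CLOSID, the
compute_* helpers) are illustrative only, not from the patch set, and
it merely mirrors the tables above:

	#include <stdbool.h>
	#include <stdint.h>
	#include <string.h>
	#include <stdio.h>

	#define NR_CLOSID      3    /* closids 1..3 above, stored as 0..2 */
	#define NR_PART        4    /* cache parts p1..p4 */
	#define DEFAULT_CLOSID 2    /* closid 3, the default pool */

	/* clos[c][p] == 1 means closid c owns cache part p */
	static uint8_t desiredclos[NR_CLOSID][NR_PART] = {
		{ 1, 0, 0, 0 },	/* closid 1: exclusive cgroup */
		{ 0, 0, 0, 1 },	/* closid 2: "cache hog" class */
		{ 0, 1, 1, 0 },	/* closid 3: default closid */
	};
	static bool closid_empty[NR_CLOSID] = { true, false, false };
	static uint8_t effectiveclos[NR_CLOSID][NR_PART];
	static uint8_t expandedclos[NR_CLOSID][NR_PART];

	/* Transition 1: clear every bit of a closid that has no tasks. */
	static void compute_effectiveclos(void)
	{
		for (int c = 0; c < NR_CLOSID; c++)
			for (int p = 0; p < NR_PART; p++)
				effectiveclos[c][p] =
					closid_empty[c] ? 0 : desiredclos[c][p];
	}

	/* Transition 2: hand each part no closid uses to the default closid. */
	static void compute_expandedclos(void)
	{
		memcpy(expandedclos, effectiveclos, sizeof(expandedclos));
		for (int p = 0; p < NR_PART; p++) {
			bool used = false;

			for (int c = 0; c < NR_CLOSID; c++)
				used |= effectiveclos[c][p];
			if (!used)
				expandedclos[DEFAULT_CLOSID][p] = 1;
		}
	}

	int main(void)
	{
		/* rerun whenever a closid goes empty <-> nonempty */
		compute_effectiveclos();
		compute_expandedclos();

		for (int c = 0; c < NR_CLOSID; c++) {
			printf("closid %d:", c + 1);
			for (int p = 0; p < NR_PART; p++)
				printf(" %d", expandedclos[c][p]);
			printf("\n");
		}
		return 0;
	}

Compiled and run, it prints the expandedclos table above (closid 1
cleared because it is empty, closid 3 picking up the unused part p1).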
> >>>Then you have a different inplacecos for each
> >>>CPU (see pseudo-code below), updated on the
> >>>following events:
> >>>
> >>>- task migration to a new pCPU:
> >>>- task creation:
> >>>
> >>>	id = smp_processor_id();
> >>>	for (part = desiredclos.p1; ...; part++)
> >>>		/*
> >>>		 * If my closid is set and every other
> >>>		 * closid is clear for this part,
> >>>		 * synchronize desiredclos --> inplacecos[id].
> >>>		 */
> >>>		if (part[myclosid] == 1 &&
> >>>		    part[any_otherclosid] == 0)
> >>>			wrmsr(part, desiredclos);
> >>
> >>Currently the root cgroup would have all the bits set, which will
> >>act like a default cgroup where all the otherwise unused parts
> >>(assuming they are a set of contiguous cache capacity bits) will
> >>be used.
> >
> >Right, but we don't want to place tasks in there in case one cgroup
> >wants exclusive cache access.
> >
> >So whenever you want an exclusive cgroup you'd do:
> >
> >create cgroup-exclusive; reserve the desired part of the cache
> >for it.
> >create cgroup-default; reserve all cache minus that of
> >cgroup-exclusive for it.
> >
> >place tasks that belong to cgroup-exclusive into it.
> >place all other tasks (including init) into cgroup-default.
> >
> >Is that right?
> 
> Yes you could do that.
> 
> You can create cgroups whose masks are exclusive in today's
> implementation; it is just that you could also create more cgroups
> that overlap the masks again. In other words, we don't have an
> exclusive flag for the cgroup mask.
> 
> Is that a common use case in the server environment, that you need
> to prevent other cgroups from using a certain mask? (Since the root
> user should control these allocations... he should know?)

Yes, there are two known use-cases that have this characteristic:

1) High performance numeric application which has been optimized
to use a certain fraction of the cache.

2) Low latency application in a multi-application OS.

For both cases exclusive cache access is wanted; the
cgroup-exclusive/cgroup-default setup above is sketched below.
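
For reference, that setup could look like the sketch below. The mount
point (/sys/fs/cgroup/rdt) and the mask file name
(intel_rdt.cache_mask) are assumptions that depend on the patch
revision, so treat them as placeholders; tasks would still be moved by
writing pids into each cgroup's "tasks" file.

	#include <stdio.h>
	#include <sys/stat.h>

	/* write a capacity bitmask to a cgroup's (assumed) mask file */
	static int set_cbm(const char *cgroup, const char *mask)
	{
		char path[256];
		FILE *f;

		mkdir(cgroup, 0755);	/* ignore error if it already exists */
		snprintf(path, sizeof(path), "%s/intel_rdt.cache_mask", cgroup);
		f = fopen(path, "w");
		if (!f)
			return -1;
		fprintf(f, "%s\n", mask);
		return fclose(f);
	}

	int main(void)
	{
		/*
		 * 16-bit CBM example: the exclusive cgroup takes the top
		 * 4 contiguous bits, the default cgroup the other 12.
		 * Nothing prevents a later cgroup from overlapping
		 * 0xf000 again; exclusivity is only a convention kept
		 * by the root user, as discussed above.
		 */
		if (set_cbm("/sys/fs/cgroup/rdt/exclusive", "0xf000"))
			return 1;
		if (set_cbm("/sys/fs/cgroup/rdt/default", "0x0fff"))
			return 1;
		return 0;
	}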