Date: Mon, 13 Jul 2015 10:13:28 -0700 (PDT)
From: Vikas Shivappa
To: Vikas Shivappa
Cc: linux-kernel@vger.kernel.org, vikas.shivappa@intel.com,
    x86@kernel.org, hpa@zytor.com, tglx@linutronix.de, mingo@kernel.org,
    tj@kernel.org, peterz@infradead.org, Matt Fleming, "Auld, Will",
    "Williamson, Glenn P", Marcelo Tosatti, "Juvva, Kanaka D"
Subject: Re: [PATCH V12 0/9] Hot cpu handling changes to cqm, rapl and
    Intel Cache Allocation support
In-Reply-To: <1435789270-27010-1-git-send-email-vikas.shivappa@linux.intel.com>

Hello Thomas,

Just a ping in case you have any feedback. I have tried to fix the
issues you pointed out, in V11 and V12.

Thanks,
Vikas

On Wed, 1 Jul 2015, Vikas Shivappa wrote:

> This patch series makes some changes to the hot cpu handling code in
> the existing cache monitoring and RAPL kernel code. It improves hot
> cpu notification handling by not looping through all online cpus,
> which could be expensive on large systems.
>
> The cache allocation patches (dependent on the prep patches) add a
> cgroup subsystem to support the new Cache Allocation feature found in
> future Intel Xeon processors. Cache Allocation is a sub-feature within
> the Resource Director Technology (RDT) feature, which provides support
> for controlling the sharing of platform resources such as the L3
> cache.
>
> Cache Allocation Technology provides a way for the software (OS/VMM)
> to restrict cache allocation to a defined 'subset' of the cache, which
> may overlap with other 'subsets'. The feature takes effect when
> allocating a line in the cache, i.e. when pulling new data into the
> cache. The hardware is configured by programming MSRs. This patch
> series adds support to perform L3 cache allocation.
>
> In today's processors the number of cores keeps increasing, which in
> turn increases the number of threads or workloads that can run
> simultaneously. When multi-threaded applications run concurrently,
> they compete for shared resources, including the L3 cache. At times,
> this L3 cache contention may result in inefficient space utilization:
> for example, a higher-priority thread may end up with less L3 cache,
> or a cache-sensitive application may not get optimal cache occupancy,
> thereby degrading performance. The Cache Allocation patches provide a
> framework for sharing the L3 cache so that users can allocate the
> resource according to their requirements.
>
> More information about the feature can be found in the Intel SDM,
> Volume 3, section 17.15. The SDM does not use the 'RDT' term yet; that
> is planned to change at a later time.
>
> *All the patches will apply on tip/perf/core*.
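For anyone jumping into the series at V12, the hardware interface being
programmed is roughly the following. This is a minimal sketch based on
SDM section 17.15, for orientation only; the helper names are mine, not
the ones used in the patches:

#include <asm/msr.h>

#define MSR_IA32_PQR_ASSOC	0x0c8f	/* per logical cpu: RMID + CLOSid */
#define MSR_IA32_L3_CBM_BASE	0x0c90	/* one mask MSR per CLOSid, per package */

/*
 * Program the L3 cache bitmask for one class of service (CLOSid).
 * Must run on a cpu in the target package, e.g. via on_each_cpu_mask.
 */
static void cbm_cpu_update(unsigned int closid, u64 cbm)
{
	wrmsrl(MSR_IA32_L3_CBM_BASE + closid, cbm);
}

/*
 * Associate the current cpu with a CLOSid: new cache line allocations
 * are then restricted to the bits set in that CLOSid's mask. The
 * CLOSid lives in bits 63:32 of PQR_ASSOC; real code must preserve the
 * RMID in the low bits, which cache monitoring uses.
 */
static void closid_sched_in(unsigned int closid)
{
	wrmsrl(MSR_IA32_PQR_ASSOC, (u64)closid << 32);
}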
> Changes in v12:
>
> - From Matt's feedback, replaced the function-scope static cpumask_t
>   tmp used at multiple locations with a single file-scope static
>   cpumask_t tmp_cpumask. This is a temporary mask used while handling
>   hot cpu notifications in the cqm/rapl and rdt code (1/9, 2/9 and
>   8/9). Although all usage was already serialized by hot cpu locking,
>   this makes it more readable.
>
> Changes in V11: As per feedback from Thomas and discussions:
>
> - Removed cpumask_any_online_but; its usage could easily be replaced
>   by 'and'ing with the cpu_online mask during hot cpu notifications.
>   Thomas pointed out the API had an issue where the tmp mask wasn't
>   thread safe. I also realized the functionality it intends to provide
>   does not match the other helpers in cpumask.h.
> - The cqm patch which added a mutex to the hot cpu notification had
>   been merged into the cqm hot plug patch, without commit logs, to
>   improve notification handling; that wasn't correct. Separated them:
>   I am sending just the cqm hot plug patch here and will send the cqm
>   mutex patch separately.
> - Fixed issues in the hot cpu rdt handling. Since cpu_starting was
>   replaced with cpu_online, the wrmsr now needs to actually be
>   scheduled on the target cpu, which the previous patch wasn't doing.
>   Replaced cpu_dead with cpu_down_prepare; cpu_down_failed is handled
>   the same way as cpu_online. By waiting until cpu_dead to update
>   rdt_cpumask, we could miss some of the MSR updates.
>
> Changes in V10:
>
> - Changed the hot cpu notifications we handle in cqm and cache
>   allocation to cpu_online and cpu_dead, and removed the others, as
>   the cpu_*_prepare notifications also had corresponding cancel
>   notifications which we did not handle.
> - Renamed the file in the rdt cgroup to l3_cache_mask to indicate that
>   it is for the L3 cache.
>
> Changes as per Thomas and PeterZ feedback:
> - Made the cpumask declarations in cpumask.h and the rdt, cmt and rapl
>   code static so that they don't burden stack space, since they can be
>   large.
> - Removed the mutex in the cpu_starting notifications; the locking was
>   replaced by moving to cpu_online.
> - Changed the name from hsw_probetest to cache_alloc_hsw_probe.
> - Changed x86_rdt_max_closid to x86_cache_max_closid and
>   x86_rdt_max_cbm_len to x86_cache_max_cbm_len, as they relate only to
>   cache allocation and not to all of RDT.
>
> Changes in V9:
> Changes made as per Thomas' feedback:
> - Added a comment where we call the scheduling code only when RDT is
>   enabled.
> - Reordered the local declarations in intel_cqm_xchg_rmid to follow
>   convention.
>
> Changes in V8: Thanks to feedback from Thomas; the following changes
> were made based on it:
>
> Generic changes/preparatory patches:
> - Added a new cpumask_any_online_but which returns the next core
>   sibling that is online.
> - Changed the Intel Cache Monitoring and Intel RAPL (Running Average
>   Power Limit) code to use the new function above to find the next cpu
>   that can be the designated reader for the package. Also changed the
>   way the package masks are computed, which can be simplified using
>   topology_core_cpumask. (A sketch of this designated-reader handoff
>   follows below.)
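To make the hot cpu handling concrete, the designated-reader handoff
looks roughly like this. A sketch only, using the file-scope
tmp_cpumask from v12; the body of intel_cqm_cpu_exit here is my
illustration, not the patch itself:

#include <linux/cpumask.h>
#include <linux/topology.h>

static cpumask_t cqm_cpumask;	/* one designated reader cpu per package */
static cpumask_t tmp_cpumask;	/* serialized by hot cpu locking */

static void intel_cqm_cpu_exit(unsigned int cpu)
{
	unsigned int target;

	/* The dying cpu was not the designated reader: nothing to do. */
	if (!cpumask_test_and_clear_cpu(cpu, &cqm_cpumask))
		return;

	/*
	 * All online siblings in this package, minus the dying cpu --
	 * the 'and' with cpu_online_mask that replaced the
	 * cpumask_any_online_but helper from V8.
	 */
	cpumask_and(&tmp_cpumask, topology_core_cpumask(cpu), cpu_online_mask);
	cpumask_clear_cpu(cpu, &tmp_cpumask);

	/* Hand the reader role to any remaining sibling, if one exists. */
	target = cpumask_any(&tmp_cpumask);
	if (target < nr_cpu_ids)
		cpumask_set_cpu(target, &cqm_cpumask);
}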
> Cache allocation specific changes:
> - Moved the documentation to the beginning of the patch series.
> - Added more documentation for the rdt cgroup files.
> - Changed the dmesg output when cache alloc is enabled to be more
>   helpful, and updated a few other comments to be more readable.
> - Removed the __ prefix from functions like clos_get which were not
>   following convention.
> - Added code to take action on a WARN_ON in clos_put. Made a few other
>   changes to reduce code text.
> - Updated the comments for the call to rdt_css_alloc and for the data
>   structures to a more readable/kernel-doc format.
> - Removed cgroup_init.
> - Changed function names so that only external APIs carry the intel_
>   prefix.
> - Replaced (void *)&closid with (void *)closid when calling
>   on_each_cpu_mask.
> - Fixed the reference release of the closid during the cache bitmask
>   write.
> - Changed the code to return an error instead of ignoring a cache mask
>   which has bits set outside of the maximum allowed.
> - Replaced bitmap_set(&max_mask, 0, max_cbm_len) with
>   max_mask = (1ULL << max_cbm) - 1.
> - Updated the rdt_cpu_mask, which has one cpu for each package, using
>   topology_core_cpumask instead of looping through the existing
>   rdt_cpu_mask. Realized the topology_core_cpumask name is misleading:
>   it actually returns the cores in a cpu package!
> - Arranged the code so that code relating to similar tasks is grouped
>   together.
> - Improved searching for the next online cpu sibling and maintaining
>   the rdt_cpu_mask, which has one cpu per package.
> - Removed the unnecessary wrapper rdt_enabled.
> - Removed an unnecessary spinlock and rcu lock in the scheduling code.
> - Merged all scheduling code into one patch rather than separating out
>   the RDT common software cache code.
>
> Changes in V7: Based on feedback from PeterZ and Matt and the
> discussions that followed:
> - Changed a lot of naming to reflect which data structures are common
>   to RDT and which are specific to cache allocation.
> - Removed all usage of 'cat'; replaced it with the friendlier 'cache
>   allocation'.
> - Fixed a lot of convention issues (whitespace, return paradigm, etc).
> - Changed the scheduling hook for RDT to not use an inline.
> - Dropped the new scheduling hook and just reused the existing one,
>   similar to the perf hook.
>
> Changes in V6:
> - Rebased to 4.1-rc1, which has the CMT (cache monitoring) support
>   included.
> - Fixed support for hot cpu handling of the IA32_L3_QOS MSRs (thanks
>   to Marcelo's feedback). Although the MSRs need not be restored
>   during deep C-states, this is needed when a new package is
>   physically added.
> - Some other coding convention changes, including renaming to
>   cache_mask and using a refcnt to track the number of cgroups using a
>   closid in the clos_cbm map.
> - 1-bit CBM support for non-HSW SKUs. HSW is an exception which needs
>   the cache bitmasks to be at least 2 bits.
>
> Changes in V5:
> - Added support to propagate the cache bitmask update to each package.
> - Removed the cache bitmask reference in the intel_rdt structure as
>   there was no need for it; we already maintain a separate
>   closid<->cbm mapping.
> - Made a few coding convention changes, which include adding an
>   assertion when freeing the CLOSID.
>
> Changes in V4:
> - Integrated with the latest V5 CMT patches.
> - Changed the naming of the cgroup to rdt (Resource Director
>   Technology) from cat (Cache Allocation Technology). This was done
>   because RDT is the umbrella term for platform shared resource
>   allocation; hence in future it will be easier to add other resource
>   allocation to the same cgroup.
> - The naming changes were also applied to a lot of other data
>   structures/APIs.
> - Added documentation on cgroup usage for cache allocation to address
>   a lot of questions from academia and industry regarding cache
>   allocation usage.
>
> Changes in V3:
> - Implements a common software cache for the IA32_PQR_MSR.
> - Implements support for hsw Cache Allocation enumeration. This does
>   not use brand strings like the earlier version, but does a probe
>   test instead. The probe test is done only on the hsw family of
>   processors.
> - Made a few coding convention and naming changes.
> - Check that the lock is held when ClosID manipulation happens. (See
>   the refcounting sketch below.)
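Since several revisions touch the CLOSid refcounting (the refcnt in V6,
the free-time assertion in V5, the lock-held check in V3), here is the
shape of it. A sketch with hypothetical names; the real table and
locking live in the rdt patches:

#include <linux/mutex.h>
#include <linux/bitops.h>
#include <linux/bug.h>

#define MAX_CLOSID 64	/* enumerated via CPUID in the real code */

/* closid -> (cache bitmask, refcount) table. */
static struct clos_cbm_entry {
	unsigned long cbm;	/* cache bitmask for this CLOSid */
	unsigned int ref;	/* number of cgroups using this CLOSid */
} clos_cbm_table[MAX_CLOSID];

static DECLARE_BITMAP(used_closids, MAX_CLOSID);
static DEFINE_MUTEX(rdt_group_mutex);

static void closid_get(u32 closid)
{
	lockdep_assert_held(&rdt_group_mutex);
	clos_cbm_table[closid].ref++;
}

static void closid_put(u32 closid)
{
	lockdep_assert_held(&rdt_group_mutex);
	/* Act on the WARN_ON instead of corrupting the refcount. */
	if (WARN_ON(!clos_cbm_table[closid].ref))
		return;
	if (!--clos_cbm_table[closid].ref)
		clear_bit(closid, used_closids);	/* free to recycle */
}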
> Changes in V2:
> - Removed the HSW-specific enumeration changes; plan to include them
>   later as a separate patch.
> - Fixed the code in prep_arch_switch to be specific to x86 and removed
>   the x86 defines.
> - Fixed cbm_write to not write all 1s when a cgroup is freed.
> - Fixed one possible memory leak in init.
> - Changed some manual bitmap manipulation to use the predefined bitmap
>   APIs to make the code more readable.
> - Changed the name in the sources from cqe to cat.
> - Changed the global cat enable flag to a static_key and disabled
>   cgroup early_init. (A sketch of the static_key hook is at the end of
>   this mail.)
>
> [PATCH 1/9] x86/intel_cqm: Modify hot cpu notification handling
> [PATCH 2/9] x86/intel_rapl: Modify hot cpu notification handling for
> [PATCH 3/9] x86/intel_rdt: Cache Allocation documentation and cgroup
> [PATCH 4/9] x86/intel_rdt: Add support for Cache Allocation detection
> [PATCH 5/9] x86/intel_rdt: Add new cgroup and Class of service
> [PATCH 6/9] x86/intel_rdt: Add support for cache bit mask management
> [PATCH 7/9] x86/intel_rdt: Implement scheduling support for Intel RDT
> [PATCH 8/9] x86/intel_rdt: Hot cpu support for Cache Allocation
> [PATCH 9/9] x86/intel_rdt: Intel haswell Cache Allocation enumeration
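P.S. On the static_key change noted under V2: the intent is that the
context-switch hook costs only a patched nop when cache allocation is
disabled. Roughly, as a sketch (the key and hook names are my
assumptions, mirroring how the perf hook works):

#include <linux/jump_label.h>

static struct static_key rdt_enable_key = STATIC_KEY_INIT_FALSE;

void __intel_rdt_sched_in(void);	/* writes PQR_ASSOC if the closid changed */

/*
 * Called on every context switch; compiled to a nop until the key is
 * flipped after successful probing via static_key_slow_inc().
 */
static inline void intel_rdt_sched_in(void)
{
	if (static_key_false(&rdt_enable_key))
		__intel_rdt_sched_in();
}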