From: Vikas Shivappa
To: linux-kernel@vger.kernel.org
Cc: vikas.shivappa@intel.com, x86@kernel.org, hpa@zytor.com,
    tglx@linutronix.de, mingo@kernel.org, peterz@infradead.org,
    matt.fleming@intel.com, will.auld@intel.com,
    linux-rdt@eclists.intel.com, vikas.shivappa@linux.intel.com
Subject: [PATCH V9 00/10] New cpumask API and Intel Cache Allocation support
Date: Fri, 12 Jun 2015 11:17:07 -0700
Message-Id: <1434133037-25189-1-git-send-email-vikas.shivappa@linux.intel.com>

This series starts with preparatory patches which add a new API,
cpumask_any_online_but, and change the hot cpu handling code in the
existing cache monitoring and RAPL kernel code. This improves hot cpu
notification handling by not looping through all online cpus, which
could be expensive on large systems.

The Cache Allocation patches (dependent on the prep patches) add a
cgroup subsystem to support the new Cache Allocation feature found in
future Intel Xeon processors. Cache Allocation is a sub-feature within
the Resource Director Technology (RDT) feature. RDT provides support
to control the sharing of platform resources like the L3 cache.

Cache Allocation Technology provides a way for the software (OS/VMM)
to restrict cache allocation to a defined 'subset' of the cache, which
may overlap with other 'subsets'. This feature is used when allocating
a line in the cache, i.e. when pulling new data into the cache.
The hardware is programmed through MSRs. This patch series supports
L3 cache allocation.

In today's processors the number of cores keeps increasing, which in
turn increases the number of threads or workloads that can run
simultaneously. When multi-threaded applications run concurrently,
they compete for shared resources including the L3 cache. At times,
this L3 cache contention may result in inefficient space utilization.
For example, a higher priority thread may end up with less L3 cache,
or a cache sensitive app may not get optimal cache occupancy, thereby
degrading performance. The Cache Allocation kernel patches provide a
framework for sharing the L3 cache so that users can allocate the
resource according to set requirements.

More information about the feature can be found in the Intel SDM,
Volume 3, section 17.15. The SDM does not yet use the 'RDT' term; this
is planned to change at a later time.

*All the patches will apply on tip/perf/core*.

Changes in V9:
Changes made as per Thomas' feedback:
- Added a comment where we call schedule in code only when RDT is
  enabled.
- Reordered the local declarations to follow convention in
  intel_cqm_xchg_rmid.

Changes in V8:
Thanks to Thomas for his feedback; the following changes are based on
it.

Generic changes/Preparatory patches:
- Added a new cpumask_any_online_but which returns the next core
  sibling that is online.
- Changed the Intel Cache monitoring and Intel RAPL (Running Average
  Power Limit) code to use the new function above to find the next cpu
  that can be the designated reader for the package. Also changed the
  way the package masks are computed, which can be simplified using
  topology_core_cpumask.

Cache allocation specific changes:
- Moved the documentation to the beginning of the patch series.
- Added more documentation for the rdt cgroup files in the
  documentation.
- Changed the dmesg output when cache alloc is enabled to be more
  helpful and updated a few other comments to be more readable.
- Removed the __ prefix from functions like clos_get which were not
  following convention.
- Added code to take action on a WARN_ON in clos_put. Made a few other
  changes to reduce code text.
- Updated the comments for the call to rdt_css_alloc and for the data
  structures to be more readable and to follow kernel-doc format.
- Removed cgroup_init.
- Changed the names of functions to only have the intel_ prefix for
  external APIs.
- Replaced (void *)&closid with (void *)closid when calling
  on_each_cpu_mask.
- Fixed the reference release of closid during cache bitmask write.
- Changed the code to not ignore a cache mask which has bits set
  outside of the max bits allowed. It returns an error instead.
- Replaced bitmap_set(&max_mask, 0, max_cbm_len) with
  max_mask = (1ULL << max_cbm) - 1.
- Update the rdt_cpu_mask, which has one cpu for each package, using
  topology_core_cpumask instead of looping through the existing
  rdt_cpu_mask. Realized the topology_core_cpumask name is misleading;
  it actually returns the cores in a cpu package!
- Arranged the code better to keep the code for similar tasks
  together.
- Improved searching for the next online cpu sibling and maintaining
  the rdt_cpu_mask which has one cpu per package.
- Removed the unnecessary wrapper rdt_enabled.
- Removed the unnecessary spin lock and rculock in the scheduling code.
- Merged all scheduling code into one patch, without separating out
  the RDT common software cache code.

Changes in V7:
Based on feedback from PeterZ and Matt and the following discussions:
- Changed a lot of naming to reflect the data structures which are
  common to RDT and those specific to Cache allocation.
- Removed all usage of 'cat'. Replaced it with the more friendly
  'cache allocation'.
- Fixed a lot of convention issues (whitespace, return paradigm etc).
- Changed the scheduling hook for RDT to not use an inline.
- Removed the addition of a new scheduling hook and just reused the
  existing one, similar to the perf hook.

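For readers following the V8 cache mask items above, the range check
can be pictured with a small user-space sketch. This is only a model
under stated assumptions, not code from the series: cbm_validate is a
hypothetical helper, max_cbm_len stands in for the number of mask bits
the hardware reports, and -EINVAL is assumed as the error value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the cache bit mask (CBM) range
 * check. Not taken from the patches: the helper name, the parameter
 * names and the -EINVAL return value are assumptions.
 */
static int cbm_validate(uint64_t cbm, unsigned int max_cbm_len)
{
	uint64_t max_mask;

	/* The model only handles masks that fit in one 64-bit word. */
	assert(max_cbm_len >= 1 && max_cbm_len <= 64);

	/* Avoid the undefined shift 1ULL << 64 when all bits are valid. */
	max_mask = (max_cbm_len == 64) ? UINT64_MAX
				       : (1ULL << max_cbm_len) - 1;

	/* Reject bits outside the allowed range instead of ignoring them. */
	if (cbm & ~max_mask)
		return -EINVAL;

	return 0;
}
```

With a 20-bit maximum, cbm_validate(0xFFFFF, 20) is accepted while
cbm_validate(0x100000, 20) is rejected, since bit 20 lies outside the
allowed range.
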
Changes in V6:
- Rebased to 4.1-rc1 which has the CMT (cache monitoring) support
  included.
- (Thanks to Marcelo's feedback.) Fixed support for hot cpu handling
  for IA32_L3_QOS MSRs. Although the MSR need not be restored during
  deep C states, this is needed when a new package is physically
  added.
- Some other coding convention changes, including renaming to
  cache_mask and using a refcnt to track the number of cgroups using a
  closid in the clos_cbm map.
- 1-bit cbm support for non-hsw SKUs. HSW is an exception which needs
  the cache bit masks to be at least 2 bits.

Changes in V5:
- Added support to propagate the cache bit mask update for each
  package.
- Removed the cache bit mask reference in the intel_rdt structure as
  there was no need for it and we already maintain a separate
  closid<->cbm mapping.
- Made a few coding convention changes which include adding the
  assertion while freeing the CLOSID.

Changes in V4:
- Integrated with the latest V5 CMT patches.
- Changed the naming of the cgroup to rdt (resource director
  technology) from cat (cache allocation technology). This was done as
  RDT is the umbrella term for platform shared resources allocation.
  Hence in future it would be easier to add resource allocation to the
  same cgroup.
- Naming changes also applied to a lot of other data structures/APIs.
- Added documentation on cgroup usage for cache allocation to address
  a lot of questions from academia and industry regarding cache
  allocation usage.

Changes in V3:
- Implements a common software cache for IA32_PQR_MSR.
- Implements support for hsw Cache Allocation enumeration. This does
  not use the brand strings like the earlier version but does a probe
  test. The probe test is done only on the hsw family of processors.
- Made a few coding convention and name changes.
- Check for the lock being held when ClosID manipulation happens.

Changes in V2:
- Removed HSW specific enumeration changes. Plan to include it later
  as a separate patch.
- Fixed the code in prep_arch_switch to be specific for x86 and
  removed x86 defines.
- Fixed cbm_write to not write all 1s when a cgroup is freed.
- Fixed one possible memory leak in init.
- Changed some of the manual bitmap manipulation to use the predefined
  bitmap APIs to make the code more readable.
- Changed name in sources from cqe to cat.
- Global cat enable flag changed to static_key and disabled cgroup
  early_init.

[PATCH 01/10] cpumask: Introduce cpumask_any_online_but
[PATCH 02/10] x86/intel_cqm: Modify hot cpu notification handling
[PATCH 03/10] x86/intel_rapl: Modify hot cpu notification handling
[PATCH 04/10] x86/intel_rdt: Cache Allocation documentation and
[PATCH 05/10] x86/intel_rdt: Add support for Cache Allocation
[PATCH 06/10] x86/intel_rdt: Add new cgroup and Class of service
[PATCH 07/10] x86/intel_rdt: Add support for cache bit mask
[PATCH 08/10] x86/intel_rdt: Implement scheduling support for Intel
[PATCH 09/10] x86/intel_rdt: Hot cpu support for Cache Allocation
[PATCH 10/10] x86/intel_rdt: Intel haswell Cache Allocation
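
As an illustration of the idea behind cpumask_any_online_but (pick
another online cpu from a mask, skipping the cpu that is going away),
here is a rough user-space model. It is a sketch under assumptions and
not the code from patch 01/10: model_any_online_but and MODEL_NR_CPUS
are hypothetical names, plain 64-bit words stand in for struct cpumask,
and returning MODEL_NR_CPUS when no candidate exists is an assumed
convention.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64	/* assumption: masks fit in one 64-bit word */

/*
 * Hypothetical user-space model: return the lowest-numbered cpu that is
 * set in both mask and online and is not cpu, or MODEL_NR_CPUS if there
 * is none. Names and return convention are assumptions, not the patch.
 */
static unsigned int model_any_online_but(uint64_t mask, uint64_t online,
					 unsigned int cpu)
{
	uint64_t candidates;
	unsigned int i;

	assert(cpu < MODEL_NR_CPUS);

	/* Only online cpus in the mask, excluding the departing cpu. */
	candidates = mask & online & ~(1ULL << cpu);

	for (i = 0; i < MODEL_NR_CPUS; i++)
		if (candidates & (1ULL << i))
			return i;

	return MODEL_NR_CPUS;
}
```

For a package with cpus 0-3 (mask 0xF) where cpu 2 is offline (online
0xB) and cpu 0 is going away, the model picks cpu 1 as the next
designated reader. A hot cpu handler only has to look at the package
mask this way instead of walking every online cpu.
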