by Andy Lutomirski

Subject: Re: [PATCH 7/9] x86/intel_rdt: Implement scheduling support for Intel RDT

On 08/06/2015 02:55 PM, Vikas Shivappa wrote:
> Adds support for IA32_PQR_ASSOC MSR writes during task scheduling. For
> Cache Allocation, the MSR write lets the task fill the cache
> 'subset' represented by the task's intel_rdt cgroup cache_mask.
>
> The high 32 bits of the per-processor MSR IA32_PQR_ASSOC represent the
> CLOSid. During a context switch the kernel implements this by writing
> the CLOSid of the cgroup to which the task belongs to the CPU's
> IA32_PQR_ASSOC MSR.
>
> This patch also implements a common software cache for IA32_PQR_MSR
> (RMID 0:9, CLOSid 32:63) to be used by both Cache Monitoring (CMT) and
> Cache Allocation. CMT updates the RMID, whereas cache_alloc updates the
> CLOSid in the software cache. During scheduling, when the new RMID/CLOSid
> value differs from the cached value, IA32_PQR_MSR is updated.
> Since the measured rdmsr latency for IA32_PQR_MSR is very high (~250
> cycles), this software cache is necessary to avoid reading the MSR to
> compare the current CLOSid value.
>
> The following considerations apply to the PQR MSR write so that it
> minimally impacts the scheduler hot path:
> - This path does not exist on any non-Intel platforms.
> - On Intel platforms, it does not exist by default unless CGROUP_RDT
> is enabled.
> - It remains a no-op when CGROUP_RDT is enabled and the Intel SKU does
> not support the feature.
> - When the feature is available and enabled, no MSR write is done until
> the user manually creates a cgroup directory *and* assigns a cache_mask
> different from that of the root cgroup directory. Since the child node
> inherits the parent's cache mask, cgroup creation alone has no
> scheduling hot path impact.
> - The MSR write is only done when a task with a different CLOSid is
> scheduled on the CPU. Typically, if the task groups are bound to be
> scheduled on a set of CPUs, the number of MSR writes is greatly
> reduced.
> - A per-CPU cache of CLOSids is maintained to do the check so that we
> don't have to do a rdmsr, which actually costs a lot of cycles.
> - For cgroup directories having the same cache_mask, the CLOSids are
> reused. This minimizes the number of CLOSids used and hence reduces
> the MSR write frequency.
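
For readers following along, the compare-before-write idea in the description above can be sketched in plain user-space C. This is only an illustration, not the code from the patch: `struct pqr_cache`, `pqr_compose()`, `fake_wrmsr()` and `pqr_sched_in()` are hypothetical names, and the MSR is simulated by a struct field. Only the bit layout (RMID in bits 0:9, CLOSid in bits 32:63) comes from the description.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout as stated in the patch description:
 * RMID in bits 0:9, CLOSid in the high 32 bits (32:63). */
#define PQR_RMID_MASK  0x3ffu
#define PQR_CLOS_SHIFT 32

/* Hypothetical per-CPU software copy of the last value written. */
struct pqr_cache {
    uint32_t rmid;        /* owned by cache monitoring (CMT) */
    uint32_t closid;      /* owned by cache allocation */
    uint64_t msr_value;   /* stand-in for the hardware register */
    unsigned long writes; /* number of simulated MSR writes */
};

/* Build the 64-bit register value from the two fields. */
static inline uint64_t pqr_compose(uint32_t rmid, uint32_t closid)
{
    assert((rmid & ~PQR_RMID_MASK) == 0); /* RMID must fit in bits 0:9 */
    return ((uint64_t)closid << PQR_CLOS_SHIFT) | rmid;
}

/* Stand-in for wrmsr; a kernel would write the real MSR here. */
static inline void fake_wrmsr(struct pqr_cache *c, uint64_t val)
{
    c->msr_value = val;
    c->writes++;
}

/* Called at context switch with the incoming task's CLOSid. */
static inline void pqr_sched_in(struct pqr_cache *c, uint32_t closid)
{
    if (c->closid == closid)
        return; /* same CLOSid as last time: no MSR access at all */

    c->closid = closid;
    fake_wrmsr(c, pqr_compose(c->rmid, closid));
}
```

If the cache starts out with the root group's CLOSid (assumed here to be 0) and every task still belongs to a group with that CLOSid, `pqr_sched_in()` never reaches the write. The comparison is made against the cached field, so no rdmsr is needed either.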

What happens if a user process sets a painfully restrictive CLOS and
then spends most of its time in the kernel doing work on behalf of
unrelated tasks? Does performance suck?

--Andy

2015-08-07 18:52:20

by Shivappa Vikas

Subject: Re: [PATCH 7/9] x86/intel_rdt: Implement scheduling support for Intel RDT

On Thu, 6 Aug 2015, Andy Lutomirski wrote:

> On 08/06/2015 02:55 PM, Vikas Shivappa wrote:
>> [...]
>
> What happens if a user process sets a painfully restrictive CLOS

The patches currently let the system admin/root user configure the cache
allocation for threads using cgroups. A user process can't decide this for
itself.

Thanks,
Vikas

> and then
> spends most of its time in the kernel doing work on behalf of unrelated
> tasks? Does performance suck?
>
> --Andy
>