On Fri, 17 Feb 2017, Vikas Shivappa wrote:
> For closid and rmid, clear both the per-cpu cache and the PQR MSR only
> when offlining the cpu, in the respective handlers. The other places
> that clear them may not be required, so they are removed. Clearing at
> offline time means the cache occupancy is not counted soon after the
> cpu goes down, rather than waiting to clear it when the cpu comes back
> online.
Yet another unstructured lump of blurb describing the WHAT and not the WHY.
> diff --git a/arch/x86/events/intel/cqm.c b/arch/x86/events/intel/cqm.c
> index 8c00dc0..681e32f 100644
> --- a/arch/x86/events/intel/cqm.c
> +++ b/arch/x86/events/intel/cqm.c
> @@ -1569,13 +1569,8 @@ static inline void cqm_pick_event_reader(int cpu)
>
> static int intel_cqm_cpu_starting(unsigned int cpu)
> {
> - struct intel_pqr_state *state = &per_cpu(pqr_state, cpu);
> struct cpuinfo_x86 *c = &cpu_data(cpu);
>
> - state->rmid = 0;
> - state->closid = 0;
> - state->rmid_usecnt = 0;
> -
> WARN_ON(c->x86_cache_max_rmid != cqm_max_rmid);
> WARN_ON(c->x86_cache_occ_scale != cqm_l3_scale);
>
> @@ -1585,12 +1580,17 @@ static int intel_cqm_cpu_starting(unsigned int cpu)
>
> static int intel_cqm_cpu_exit(unsigned int cpu)
> {
> + struct intel_pqr_state *state = &per_cpu(pqr_state, cpu);
Can be this_cpu_ptr() because the callback is guaranteed to run on the
outgoing CPU.
> int target;
>
> /* Is @cpu the current cqm reader for this package ? */
> if (!cpumask_test_and_clear_cpu(cpu, &cqm_cpumask))
> return 0;
So if the CPU is not the current cqm reader then the per cpu state of this
CPU is left stale. Great improvement.
> + state->rmid = 0;
> + state->rmid_usecnt = 0;
> + wrmsr(MSR_IA32_PQR_ASSOC, 0, state->closid);
What clears state->closid? And what guarantees that state->rmid is not
updated before the CPU has really gone away?
I doubt that this is correct, but if it is, then this lacks a big fat
comment explaining WHY.
Thanks,
tglx
On Wed, 1 Mar 2017, Thomas Gleixner wrote:
>> WARN_ON(c->x86_cache_occ_scale != cqm_l3_scale);
>>
>> @@ -1585,12 +1580,17 @@ static int intel_cqm_cpu_starting(unsigned int cpu)
>>
>> static int intel_cqm_cpu_exit(unsigned int cpu)
>> {
>> + struct intel_pqr_state *state = &per_cpu(pqr_state, cpu);
>
> Can be this_cpu_ptr() because the callback is guaranteed to run on the
> outgoing CPU.
Will fix this. I had assumed the callbacks were registered the cache alloc way,
via cpuhp_setup_state(CPUHP_AP_ONLINE_DYN ..
>
>> int target;
>>
>> /* Is @cpu the current cqm reader for this package ? */
>> if (!cpumask_test_and_clear_cpu(cpu, &cqm_cpumask))
>> return 0;
>
> So if the CPU is not the current cqm reader then the per cpu state of this
> CPU is left stale. Great improvement.
>
>> + state->rmid = 0;
>> + state->rmid_usecnt = 0;
>> + wrmsr(MSR_IA32_PQR_ASSOC, 0, state->closid);
>
> What clears state->closid? And what guarantees that state->rmid is not
> updated before the CPU has really gone away?
- The rdt code takes care of clearing the closid state now. Will update the comment.
- cqm, however, was never writing a zero to PQR_ASSOC.
So the update needs to remove state->closid = 0 from the cqm code, since the
rdt code handles the closid state in clear_closid(), which is called from both
the offline and online cpu callbacks, and to also write rmid = 0 to PQR_ASSOC.
We can integrate these two hotplug callbacks (from cat and cqm) so that PQR_ASSOC
is written only once.
I guess I can skip all of this and send it as part of the cqm changes we planned
anyway, because this is really a cqm change.
Thanks,
Vikas