Added lock in event reader function. The cqm_pick_event_reader() function
accesses cqm_cpumask and it is critical section between this and
cqm_stable().
This situation is true when a CPU is hotplugged. Mutex is used to protect
the critical section.
Signed-off-by: Kanaka Juvva <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_cqm.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
index 1880761..e17e37f 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
@@ -1239,12 +1239,15 @@ static inline void cqm_pick_event_reader(int cpu)
int phys_id = topology_physical_package_id(cpu);
int i;
+ mutex_lock(&cache_mutex);
for_each_cpu(i, &cqm_cpumask) {
if (phys_id == topology_physical_package_id(i))
- return; /* already got reader for this socket */
+ goto out; /* already got reader for this socket */
}
cpumask_set_cpu(cpu, &cqm_cpumask);
+out:
+ mutex_unlock(&cache_mutex);
}
static void intel_cqm_cpu_prepare(unsigned int cpu)
--
2.1.0
On Thu, 18 Jun 2015, Kanaka Juvva wrote:
> Added lock in event reader function. The cqm_pick_event_reader() function
> accesses cqm_cpumask and it is critical section between this and
> cqm_stable().
>
> This situation is true when a CPU is hotplugged. Mutex is used to protect
> the critical section.
>
> Signed-off-by: Kanaka Juvva <[email protected]>
> ---
> arch/x86/kernel/cpu/perf_event_intel_cqm.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
> index 1880761..e17e37f 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
> @@ -1239,12 +1239,15 @@ static inline void cqm_pick_event_reader(int cpu)
> int phys_id = topology_physical_package_id(cpu);
> int i;
>
> + mutex_lock(&cache_mutex);
I already explained it to Vikas. You CANNOT take a mutex in that code
path as it runs with interrupts disabled on a CPU which cannot
schedule.
Sigh.
tglx
On 06/18/2015 12:47 PM, Thomas Gleixner wrote:
>> > @@ -1239,12 +1239,15 @@ static inline void cqm_pick_event_reader(int cpu)
>> > int phys_id = topology_physical_package_id(cpu);
>> > int i;
>> >
>> > + mutex_lock(&cache_mutex);
> I already explained it to Vikas. You CANNOT take a mutex in that code
> path as it runs with interrupts disabled on a CPU which cannot
> schedule.
How did lockdep not blow up and scream about this?
On Thu, 18 Jun 2015, Dave Hansen wrote:
> On 06/18/2015 12:47 PM, Thomas Gleixner wrote:
> >> > @@ -1239,12 +1239,15 @@ static inline void cqm_pick_event_reader(int cpu)
> >> > int phys_id = topology_physical_package_id(cpu);
> >> > int i;
> >> >
> >> > + mutex_lock(&cache_mutex);
> > I already explained it to Vikas. You CANNOT take a mutex in that code
> > path as it runs with interrupts disabled on a CPU which cannot
> > schedule.
>
> How did lockdep not blow up and scream about this?
Dunno. I did not test that stuff.
The might_sleep() in mutex_lock() should catch it as well, IF that
code was ever executed AFTER boot on a running system.
Thanks,
tglx
On Thu, 18 Jun 2015, Thomas Gleixner wrote:
> On Thu, 18 Jun 2015, Kanaka Juvva wrote:
>
>> Added lock in event reader function. The cqm_pick_event_reader() function
>> accesses cqm_cpumask and it is critical section between this and
>> cqm_stable().
>>
>> This situation is true when a CPU is hotplugged. Mutex is used to protect
>> the critical section.
>>
>> Signed-off-by: Kanaka Juvva <[email protected]>
>> ---
>> arch/x86/kernel/cpu/perf_event_intel_cqm.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/perf_event_intel_cqm.c b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
>> index 1880761..e17e37f 100644
>> --- a/arch/x86/kernel/cpu/perf_event_intel_cqm.c
>> +++ b/arch/x86/kernel/cpu/perf_event_intel_cqm.c
>> @@ -1239,12 +1239,15 @@ static inline void cqm_pick_event_reader(int cpu)
>> int phys_id = topology_physical_package_id(cpu);
>> int i;
>>
>> + mutex_lock(&cache_mutex);
>
> I already explained it to Vikas. You CANNOT take a mutex in that code
> path as it runs with interrupts disabled on a CPU which cannot
> schedule.
This patch also needs to be merged with the new package mask changes that
was added.
Will merge and send the fix.
Thanks,
Vikas
>
> Sigh.
>
> tglx
>