2014-04-23 14:37:38

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Thu, Mar 13, 2014 at 8:36 PM, Venkatesh Srinivas
<[email protected]> wrote:
> CPUs which should support the RAPL counters according to
> Family/Model/Stepping may still issue #GP when attempting to access
> the RAPL MSRs. This may happen when Linux is running under KVM and
> we are passing-through host F/M/S data, for example. Use rdmsrl_safe
> to first access the RAPL_POWER_UNIT MSR; if this fails, do not
> attempt to use this PMU.
>
> Signed-off-by: Venkatesh Srinivas <[email protected]>
> ---
> arch/x86/kernel/cpu/perf_event_intel_rapl.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
> index 5ad35ad..95700e5 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
> @@ -511,6 +511,7 @@ static int rapl_cpu_prepare(int cpu)
> struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
> int phys_id = topology_physical_package_id(cpu);
> u64 ms;
> + u64 msr_rapl_power_unit_bits;
>
> if (pmu)
> return 0;
> @@ -518,6 +519,9 @@ static int rapl_cpu_prepare(int cpu)
> if (phys_id < 0)
> return -1;
>
> + if (!rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
> + return -1;
> +
I have a problem with this patch on native hardware. This
rdmsrl_safe() systematically
fails when I know the MSR is perfectly valid on the CPU. Consequently, RAPL PMU
is disabled when I tried on IvyBridge and Haswell CPUs.

I don't know the internals of rdmsrl_safe(). Maybe it is invoked too
early in the boot process.

Please fix this.

> pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
> if (!pmu)
> return -1;
> @@ -531,8 +535,7 @@ static int rapl_cpu_prepare(int cpu)
> *
> * we cache in local PMU instance
> */
> - rdmsrl(MSR_RAPL_POWER_UNIT, pmu->hw_unit);
> - pmu->hw_unit = (pmu->hw_unit >> 8) & 0x1FULL;
> + pmu->hw_unit = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
> pmu->pmu = &rapl_pmu_class;
>
> /*
> @@ -649,7 +652,9 @@ static int __init rapl_pmu_init(void)
> get_online_cpus();
>
> for_each_online_cpu(cpu) {
> - rapl_cpu_prepare(cpu);
> + ret = rapl_cpu_prepare(cpu);
> + if (ret)
> + goto out;
> rapl_cpu_init(cpu);
> }
>
> @@ -672,6 +677,7 @@ static int __init rapl_pmu_init(void)
> hweight32(rapl_cntr_mask),
> ktime_to_ms(pmu->timer_interval));
>
> +out:
> put_online_cpus();
>
> return 0;
> --
> 1.9.0.279.gdc9e3eb
>


2014-04-23 14:45:48

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Wed, Apr 23, 2014 at 04:31:32PM +0200, Stephane Eranian wrote:
> On Thu, Mar 13, 2014 at 8:36 PM, Venkatesh Srinivas
> <[email protected]> wrote:
> > CPUs which should support the RAPL counters according to
> > Family/Model/Stepping may still issue #GP when attempting to access
> > the RAPL MSRs. This may happen when Linux is running under KVM and
> > we are passing-through host F/M/S data, for example. Use rdmsrl_safe
> > to first access the RAPL_POWER_UNIT MSR; if this fails, do not
> > attempt to use this PMU.
> >
> > Signed-off-by: Venkatesh Srinivas <[email protected]>
> > ---
> > arch/x86/kernel/cpu/perf_event_intel_rapl.c | 12 +++++++++---
> > 1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
> > index 5ad35ad..95700e5 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
> > @@ -511,6 +511,7 @@ static int rapl_cpu_prepare(int cpu)
> > struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
> > int phys_id = topology_physical_package_id(cpu);
> > u64 ms;
> > + u64 msr_rapl_power_unit_bits;
> >
> > if (pmu)
> > return 0;
> > @@ -518,6 +519,9 @@ static int rapl_cpu_prepare(int cpu)
> > if (phys_id < 0)
> > return -1;
> >
> > + if (!rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
> > + return -1;
> > +
> I have a problem with this patch on native hardware. This
> rdmsrl_safe() systematically
> fails when I know the MSR is perfectly valid on the CPU. Consequently, RAPL PMU
> is disabled when I tried on IvyBridge and Haswell CPUs.
>
> I don't know the internals of rdmsrl_safe(). Maybe it is invoked too
> early in the boot process.

Weird; so the way it works is that it adds an exception table entry for
the wrmsr instruction, so when the wrmsr generates a fault due to being
an invalid msr the fault handler looks at the exception table, and finds
the entry, which instructs it to continue execution at the error path
and return -EFAULT.

2014-04-23 14:58:06

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Wed, Apr 23, 2014 at 4:45 PM, Peter Zijlstra <[email protected]> wrote:
> On Wed, Apr 23, 2014 at 04:31:32PM +0200, Stephane Eranian wrote:
>> On Thu, Mar 13, 2014 at 8:36 PM, Venkatesh Srinivas
>> <[email protected]> wrote:
>> > CPUs which should support the RAPL counters according to
>> > Family/Model/Stepping may still issue #GP when attempting to access
>> > the RAPL MSRs. This may happen when Linux is running under KVM and
>> > we are passing-through host F/M/S data, for example. Use rdmsrl_safe
>> > to first access the RAPL_POWER_UNIT MSR; if this fails, do not
>> > attempt to use this PMU.
>> >
>> > Signed-off-by: Venkatesh Srinivas <[email protected]>
>> > ---
>> > arch/x86/kernel/cpu/perf_event_intel_rapl.c | 12 +++++++++---
>> > 1 file changed, 9 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
>> > index 5ad35ad..95700e5 100644
>> > --- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
>> > +++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
>> > @@ -511,6 +511,7 @@ static int rapl_cpu_prepare(int cpu)
>> > struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
>> > int phys_id = topology_physical_package_id(cpu);
>> > u64 ms;
>> > + u64 msr_rapl_power_unit_bits;
>> >
>> > if (pmu)
>> > return 0;
>> > @@ -518,6 +519,9 @@ static int rapl_cpu_prepare(int cpu)
>> > if (phys_id < 0)
>> > return -1;
>> >
>> > + if (!rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
>> > + return -1;
>> > +
>> I have a problem with this patch on native hardware. This
>> rdmsrl_safe() systematically
>> fails when I know the MSR is perfectly valid on the CPU. Consequently, RAPL PMU
>> is disabled when I tried on IvyBridge and Haswell CPUs.
>>
>> I don't know the internals of rdmsrl_safe(). Maybe it is invoked too
>> early in the boot process.
>
> Weird; so the way it works is that it adds an exception table entry for
> the wrmsr instruction, so when the wrmsr generates a fault due to being
> an invalid msr the fault handler looks at the exception table, and finds
> the entry, which instructs it to continue execution at the error path
> and return -EFAULT.
>
On your machine, booted with 3.15-rc2, do you have /sys/devices/power?
If not, and you have at least an SNB, you should have RAPL and that
RAPL_UNIT MSR.

Proof is that if you read that MSR using /dev/cpu/msr it works just fine:
# modprobe msr
# rdmsr 0x606
a1003

So something is broken somewhere.

2014-04-23 15:09:20

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Wed, Apr 23, 2014 at 04:49:55PM +0200, Stephane Eranian wrote:
> On your machine, booted with 3.15-rc2, do you have /sys/devices/power?
> If not, and you have at least an SNB, you should have RAPL and that
> RAPL_UNIT MSR.
>
> Proof is that if you read that MSR using /dev/cpu/msr it works just fine:
> # modprobe msr
> # rdmsr 0x606
> a1003
>
> So something is broken somewhere.

I've not got SNB+ class hardware (for testing). But I believe you that
there's something funny, I just do not understanding how *msr_safe()
could cause this.

2014-04-23 15:14:40

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Wed, Apr 23, 2014 at 5:09 PM, Peter Zijlstra <[email protected]> wrote:
> On Wed, Apr 23, 2014 at 04:49:55PM +0200, Stephane Eranian wrote:
>> On your machine, booted with 3.15-rc2, do you have /sys/devices/power?
>> If not, and you have at least an SNB, you should have RAPL and that
>> RAPL_UNIT MSR.
>>
>> Proof is that if you read that MSR using /dev/cpu/msr it works just fine:
>> # modprobe msr
>> # rdmsr 0x606
>> a1003
>>
>> So something is broken somewhere.
>
> I've not got SNB+ class hardware (for testing). But I believe you that
> there's something funny, I just do not understanding how *msr_safe()
> could cause this.

Or the logic of the test in rapl_cpu_prepare() is wrong, maybe?
Wouldn't rdmsrl_safe() return 0 on success?

2014-04-23 15:16:46

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Wed, Apr 23, 2014 at 05:14:33PM +0200, Stephane Eranian wrote:
> Wouldn't rdmsrl_safe() return 0 on success?

Yes.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-23 15:18:36

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Wed, Apr 23, 2014 at 5:16 PM, Borislav Petkov <[email protected]> wrote:
> On Wed, Apr 23, 2014 at 05:14:33PM +0200, Stephane Eranian wrote:
>> Wouldn't rdmsrl_safe() return 0 on success?
>
> Yes.
>
then the if() test is wrong:
if (!rdmsrl_safe())
return -1;

Should be:
if (rdmsrl_safe())
return -1;

Or am I missing something?


> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --

2014-04-23 15:35:53

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Wed, Apr 23, 2014 at 05:18:29PM +0200, Stephane Eranian wrote:
> On Wed, Apr 23, 2014 at 5:16 PM, Borislav Petkov <[email protected]> wrote:
> > On Wed, Apr 23, 2014 at 05:14:33PM +0200, Stephane Eranian wrote:
> >> Wouldn't rdmsrl_safe() return 0 on success?
> >
> > Yes.
> >
> then the if() test is wrong:
> if (!rdmsrl_safe())
> return -1;
>
> Should be:
> if (rdmsrl_safe())
> return -1;
>
> Or am I missing something?

Yeah, that is wrong:

if (!rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
return -1;

On error we return -EIO, on success 0. Just remove the "!".


--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2014-04-23 15:44:33

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Wed, Apr 23, 2014 at 5:35 PM, Borislav Petkov <[email protected]> wrote:
> On Wed, Apr 23, 2014 at 05:18:29PM +0200, Stephane Eranian wrote:
>> On Wed, Apr 23, 2014 at 5:16 PM, Borislav Petkov <[email protected]> wrote:
>> > On Wed, Apr 23, 2014 at 05:14:33PM +0200, Stephane Eranian wrote:
>> >> Wouldn't rdmsrl_safe() return 0 on success?
>> >
>> > Yes.
>> >
>> then the if() test is wrong:
>> if (!rdmsrl_safe())
>> return -1;
>>
>> Should be:
>> if (rdmsrl_safe())
>> return -1;
>>
>> Or am I missing something?
>
> Yeah, that is wrong:
>
> if (!rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
> return -1;
>
> On error we return -EIO, on success 0. Just remove the "!".
>
Yeah, will post a patch ASAP. A comment describing return values
where rdmsrl_safe() is defined would maybe help avoid problems in the future.
Thanks.

>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --

2014-04-23 15:55:46

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

On Wed, Apr 23, 2014 at 05:44:30PM +0200, Stephane Eranian wrote:
> A comment describing return values where rdmsrl_safe() is defined
> would maybe help avoid problems in the future.

If it helps, sure. But please as a separate patch and to x86 guys.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--