2018-08-21 21:28:15

by Eduardo Valentin

[permalink] [raw]
Subject: [PATCH 1/1] perf/x86/intel: make error messages less confusing

On a system with X86_FEATURE_ARCH_PERFMON disabled
and with a model not known by family PMU drivers,
user gets a kernel message log like the following:
[ 0.100114] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only.

The "unsupported .. CPU" part may be confusing for some
users. Rewording the messages on the failure path to:
[ 0.667154] Performance Events: unknown p6 PMU on CPU model 85: !X86_FEATURE_ARCH_PERFMON: no PMU driver, software events only.

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Peter Zijlstra <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: Eduardo Valentin <[email protected]>
Cc: Jia Zhang <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Reported-by: Matt Wilson <[email protected]>
Signed-off-by: Eduardo Valentin <[email protected]>
---
arch/x86/events/intel/core.c | 15 +++++++++++----
arch/x86/events/intel/p4.c | 2 +-
arch/x86/events/intel/p6.c | 3 ++-
3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 86f0c15dcc2d..b57a16997ee6 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3884,15 +3884,22 @@ __init int intel_pmu_init(void)
char *name;

if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
+ int ret = -ENODEV;
+
switch (boot_cpu_data.x86) {
case 0x6:
- return p6_pmu_init();
+ ret = p6_pmu_init();
+ break;
case 0xb:
- return knc_pmu_init();
+ ret = knc_pmu_init();
+ break;
case 0xf:
- return p4_pmu_init();
+ ret = p4_pmu_init();
+ break;
}
- return -ENODEV;
+ if (ret)
+ pr_cont(" !X86_FEATURE_ARCH_PERFMON: ");
+ return ret;
}

/*
diff --git a/arch/x86/events/intel/p4.c b/arch/x86/events/intel/p4.c
index d32c0eed38ca..963d2b0600f6 100644
--- a/arch/x86/events/intel/p4.c
+++ b/arch/x86/events/intel/p4.c
@@ -1346,7 +1346,7 @@ __init int p4_pmu_init(void)

rdmsr(MSR_IA32_MISC_ENABLE, low, high);
if (!(low & (1 << 7))) {
- pr_cont("unsupported Netburst CPU model %d ",
+ pr_cont("unknown Netburst PMU on CPU model %d: ",
boot_cpu_data.x86_model);
return -ENODEV;
}
diff --git a/arch/x86/events/intel/p6.c b/arch/x86/events/intel/p6.c
index 408879b0c0d4..221e374299b2 100644
--- a/arch/x86/events/intel/p6.c
+++ b/arch/x86/events/intel/p6.c
@@ -269,7 +269,8 @@ __init int p6_pmu_init(void)
break;

default:
- pr_cont("unsupported p6 CPU model %d ", boot_cpu_data.x86_model);
+ pr_cont("unknown p6 PMU on CPU model %d: ",
+ boot_cpu_data.x86_model);
return -ENODEV;
}

--
2.18.0



2018-08-21 22:31:47

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 1/1] perf/x86/intel: make error messages less confusing

On Tue, Aug 21, 2018 at 02:15:28PM -0700, Eduardo Valentin wrote:
> On a system with X86_FEATURE_ARCH_PERFMON disabled
> and with a model not known by family PMU drivers,
> user gets a kernel message log like the following:
> [ 0.100114] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only.
>
> The "unsupported .. CPU" part may be confusing for some
> users. Rewording the messages on the failure path to:
> [ 0.667154] Performance Events: unknown p6 PMU on CPU model 85: !X86_FEATURE_ARCH_PERFMON: no PMU driver, software events only.

Are you sure users even know what ARCH_PERFMON is?

Maybe it is confusing (why exactly?), but it doesn't seem to me that your
new message is any better.

If you refer to VMs not exposing the PMU perhaps that should be explicitely mentioned.

Of course the real fix is to always expose the PMU, not improve the error messages...

-Andi


2018-08-21 23:11:29

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [PATCH 1/1] perf/x86/intel: make error messages less confusing

On Tue, Aug 21, 2018 at 03:09:37PM -0700, Andi Kleen wrote:
> On Tue, Aug 21, 2018 at 02:15:28PM -0700, Eduardo Valentin wrote:
> > On a system with X86_FEATURE_ARCH_PERFMON disabled
> > and with a model not known by family PMU drivers,
> > user gets a kernel message log like the following:
> > [ 0.100114] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only.
> >
> > The "unsupported .. CPU" part may be confusing for some
> > users. Rewording the messages on the failure path to:
> > [ 0.667154] Performance Events: unknown p6 PMU on CPU model 85: !X86_FEATURE_ARCH_PERFMON: no PMU driver, software events only.
>
> Are you sure users even know what ARCH_PERFMON is?
>
> Maybe it is confusing (why exactly?), but it doesn't seem to me that your
> new message is any better.

Yeah, the part that says "unsupported CPU" is the confusing part,
I get people thinking that the specific reported CPU model is not
supported by the kernel :-)

>
> If you refer to VMs not exposing the PMU perhaps that should be explicitely mentioned.
>
> Of course the real fix is to always expose the PMU, not improve the error messages...

I agree that best is simply to enable PMU. But it does not hurt to improve the error messaging, does it?

Any suggestions there, given that the initial attempt seams to make it even worse :-)

Was it only the ARCH_PERFMON part? I can probably just take that out.
Something like:

[ 0.667154] Performance Events: unknown p6 PMU on CPU model 85: no PMU driver, software events only.

>
> -Andi
>
>

--
All the best,
Eduardo Valentin

2018-08-22 00:00:39

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 1/1] perf/x86/intel: make error messages less confusing

On Tue, Aug 21, 2018 at 04:05:22PM -0700, Eduardo Valentin wrote:
> On Tue, Aug 21, 2018 at 03:09:37PM -0700, Andi Kleen wrote:
> > On Tue, Aug 21, 2018 at 02:15:28PM -0700, Eduardo Valentin wrote:
> > > On a system with X86_FEATURE_ARCH_PERFMON disabled
> > > and with a model not known by family PMU drivers,
> > > user gets a kernel message log like the following:
> > > [ 0.100114] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only.
> > >
> > > The "unsupported .. CPU" part may be confusing for some
> > > users. Rewording the messages on the failure path to:
> > > [ 0.667154] Performance Events: unknown p6 PMU on CPU model 85: !X86_FEATURE_ARCH_PERFMON: no PMU driver, software events only.
> >
> > Are you sure users even know what ARCH_PERFMON is?
> >
> > Maybe it is confusing (why exactly?), but it doesn't seem to me that your
> > new message is any better.
>
> Yeah, the part that says "unsupported CPU" is the confusing part,

That makes sense.

> I get people thinking that the specific reported CPU model is not
> supported by the kernel :-)
>
> >
> > If you refer to VMs not exposing the PMU perhaps that should be explicitely mentioned.
> >
> > Of course the real fix is to always expose the PMU, not improve the error messages...
>
> I agree that best is simply to enable PMU. But it does not hurt to improve the error messaging, does it?
>
> Any suggestions there, given that the initial attempt seams to make it even worse :-)

Perhaps just say

"CPU does not support PMU"

which is really what the problem is here.

The other option would be to move this message after the big model switch,
but would need to be very careful that it doesn't have any unintended
side effects.

-Andi

2018-08-22 09:53:57

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 1/1] perf/x86/intel: make error messages less confusing

On Tue, Aug 21, 2018 at 04:05:22PM -0700, Eduardo Valentin wrote:
> On Tue, Aug 21, 2018 at 03:09:37PM -0700, Andi Kleen wrote:
> > On Tue, Aug 21, 2018 at 02:15:28PM -0700, Eduardo Valentin wrote:

> > > [ 0.100114] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only.

> > Maybe it is confusing (why exactly?), but it doesn't seem to me that your
> > new message is any better.
>
> Yeah, the part that says "unsupported CPU" is the confusing part,
> I get people thinking that the specific reported CPU model is not
> supported by the kernel :-)

It is prefixed by: "Performance Events:", what is the problem?

2018-08-22 20:59:09

by Eduardo Valentin

[permalink] [raw]
Subject: Re: [PATCH 1/1] perf/x86/intel: make error messages less confusing

Hey,

On Tue, Aug 21, 2018 at 04:59:08PM -0700, Andi Kleen wrote:
> On Tue, Aug 21, 2018 at 04:05:22PM -0700, Eduardo Valentin wrote:
> > On Tue, Aug 21, 2018 at 03:09:37PM -0700, Andi Kleen wrote:
> > > On Tue, Aug 21, 2018 at 02:15:28PM -0700, Eduardo Valentin wrote:
> > > > On a system with X86_FEATURE_ARCH_PERFMON disabled
> > > > and with a model not known by family PMU drivers,
> > > > user gets a kernel message log like the following:
> > > > [ 0.100114] Performance Events: unsupported p6 CPU model 85 no PMU driver, software events only.
> > > >
> > > > The "unsupported .. CPU" part may be confusing for some
> > > > users. Rewording the messages on the failure path to:
> > > > [ 0.667154] Performance Events: unknown p6 PMU on CPU model 85: !X86_FEATURE_ARCH_PERFMON: no PMU driver, software events only.
> > >
> > > Are you sure users even know what ARCH_PERFMON is?
> > >
> > > Maybe it is confusing (why exactly?), but it doesn't seem to me that your
> > > new message is any better.
> >
> > Yeah, the part that says "unsupported CPU" is the confusing part,
>
> That makes sense.
>
> > I get people thinking that the specific reported CPU model is not
> > supported by the kernel :-)
> >
> > >
> > > If you refer to VMs not exposing the PMU perhaps that should be explicitely mentioned.
> > >
> > > Of course the real fix is to always expose the PMU, not improve the error messages...
> >
> > I agree that best is simply to enable PMU. But it does not hurt to improve the error messaging, does it?
> >
> > Any suggestions there, given that the initial attempt seams to make it even worse :-)
>
> Perhaps just say
>
> "CPU does not support PMU"
>

Ok.

> which is really what the problem is here.
>

Fair enough.

> The other option would be to move this message after the big model switch,
> but would need to be very careful that it doesn't have any unintended
> side effects.

I will simply remove those messages with CPU model details and on
the failure path build the message to look like:
[ 0.666785] Performance Events: CPU does not support PMU: no PMU driver, software events only.


>
> -Andi
>

--
All the best,
Eduardo Valentin