2010-11-23 15:22:17

by Sedat Dilek

Subject: Re: [PATCH] x86, perf, nmi: Disable perf if counters are not accessable

Hi,

I have been seeing this warning in my system-wide logs for a while:

Nov 23 11:54:14 tbox kernel: [ 0.040335] NMI watchdog failed to create perf event on cpu0: ffffffa1

When I saw this patch [1], I was hoping it would also fix my problem
on an Intel Pentium-M (Banias) single-core CPU:

"So I changed it to:

static bool check_hw_exists(void)
{
	u64 val, val_new = 0;
	int ret = 0;

	val = 0xabcdUL;
	ret |= checking_wrmsrl(x86_pmu.perfctr, val);
	ret |= rdmsrl_safe(x86_pmu.perfctr, &val_new);
	if (ret || val != val_new)
		return false;

	return true;
}
And have applied the patch,

Thanks Don!"

Where did you apply it? Shouldn't that be in the linux-2.6-tip perf/* GIT tree [2]?

Regards,
- Sedat -

[1] http://lkml.org/lkml/2010/11/23/121
[2] http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=summary


2010-11-23 15:19:40

by Peter Zijlstra

Subject: Re: [PATCH] x86, perf, nmi: Disable perf if counters are not accessable

On Tue, 2010-11-23 at 16:15 +0100, Sedat Dilek wrote:
>
> Where did you apply it? Shouldn't that be in linux-2.6-tip perf/* GIT
> tree?
>
No, I don't have commit access to -tip. I've applied it to my local
quilt queue, which I feed to Ingo every few days.

2010-11-23 16:57:05

by Don Zickus

Subject: Re: [PATCH] x86, perf, nmi: Disable perf if counters are not accessable

On Tue, Nov 23, 2010 at 04:15:05PM +0100, Sedat Dilek wrote:
> Hi,
>
> I am seeing for a while this warning in my system-wide logs:
>
> Nov 23 11:54:14 tbox kernel: [ 0.040335] NMI watchdog failed to
> create perf event on cpu0: ffffffa1
>
> As I saw this patch from [1], I was hoping it's also fixing my problem
> on an Intel Pentium-M (Banias) Single-Core CPU:

I doubt it. This patch was intended for virtualization where the perf
counters are not emulated but the perf subsystem didn't know that.

Your error code is 'ffffffa1'. That translates to EOPNOTSUPP. The only
place I can see where that is returned is if your system does not have a
local apic on it (as set by the cpu feature bits).
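
A minimal sketch of that translation, assuming the logged value is the 32-bit two's-complement form of a negative errno (the helper name `decode_errcode` is made up for illustration and is not a kernel function):

```c
#include <stdint.h>

/*
 * Reinterpret a 32-bit value that was printed in hex as a signed
 * return code. Values with the top bit set are negative.
 */
int decode_errcode(uint32_t raw)
{
	int64_t v = (int64_t)raw;

	if (raw & 0x80000000u)
		v -= 0x100000000LL;	/* undo the 32-bit wraparound */
	return (int)v;
}
```

Calling `decode_errcode(0xffffffa1u)` gives -95, and 95 is the value of `EOPNOTSUPP` in `<errno.h>` on Linux.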

Applying this patch may still get you the same result because the perf
counters might be there but there is no local apic to deliver the
interrupts.

I would have to see in your log file the output starting at the line with

Performance Events:

and pasting the next dozen lines or so to have a better understanding what
is going on. Or you can just attach the whole log in your reply.

Cheers,
Don

2010-11-23 18:21:32

by Sedat Dilek

Subject: Re: [PATCH] x86, perf, nmi: Disable perf if counters are not accessable

On Tue, Nov 23, 2010 at 5:56 PM, Don Zickus <[email protected]> wrote:
> On Tue, Nov 23, 2010 at 04:15:05PM +0100, Sedat Dilek wrote:
>> Hi,
>>
>> I am seeing for a while this warning in my system-wide logs:
>>
>> Nov 23 11:54:14 tbox kernel: [    0.040335] NMI watchdog failed to
>> create perf event on cpu0: ffffffa1
>>
>> As I saw this patch from [1], I was hoping it's also fixing my problem
>> on an Intel Pentium-M (Banias) Single-Core CPU:
>
> I doubt it.  This patch was intended for virtualization where the perf
> counters are not emulated but the perf subsystem didn't know that.
>
> Your error code is 'ffffffa1'.  That translates to EOPNOTSUPP.  The only
> place I can see where that is returned is if your system does not have a
> local apic on it (as set by the cpu feature bits).
>

The problem remains with your original patch [1] and Peter's
follow-up patch (attached).

The local APIC (lapic) is not available because the BIOS disables it:

# dmesg | grep -i apic
[ 0.000000] Using APIC driver default
[ 0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
[ 0.000000] APIC: disable apic facility
[ 0.000000] APIC: switched to apic NOOP
[ 0.008891] no APIC, boot with the "lapic" boot parameter to force-enable it.
[ 0.036141] Local APIC not detected. Using dummy APIC emulation.
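
For reference, the "cpu feature bits" mentioned above correspond to CPUID leaf 1, where EDX bit 9 reports the on-chip APIC. A small sketch of that bit test, assuming the EDX value has already been read (the function name `edx_reports_apic` and the macro are hypothetical, chosen only for this illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1 reports the on-chip local APIC in EDX bit 9. */
#define CPUID1_EDX_APIC_BIT 9

/* Return true if the given CPUID.1:EDX value advertises a local APIC. */
bool edx_reports_apic(uint32_t edx)
{
	return ((edx >> CPUID1_EDX_APIC_BIT) & 1u) != 0;
}
```

On x86 with GCC or Clang, the EDX value can be obtained with `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>`. When the local APIC is globally disabled, as the log above suggests the BIOS has done here, this bit is expected to read as 0.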

> Applying this patch may still get you the same result because the perf
> counters might be there but there is no local apic to deliver the
> interrupts.
>
> I would have to see in your log file the output starting at the line with
>
> Performance Events:
>
> and pasting the next dozen lines or so to have a better understanding what
> is going on.  Or you can just attach the whole log in your reply.
>
> Cheers,
> Don
>

Full dmesg is attached.

- Sedat -

[1] https://patchwork.kernel.org/patch/348341/


Attachments:
dmesg.txt (53.71 kB)
x86-perf-nmi-Disable-perf-if-counters-are-not-accessable-followup-patch-by-peter-zijlstra.patch (635.00 B)

2010-11-23 18:26:50

by Peter Zijlstra

Subject: Re: [PATCH] x86, perf, nmi: Disable perf if counters are not accessable

On Tue, 2010-11-23 at 19:21 +0100, Sedat Dilek wrote:
> Due to BIOS l(ocal)apic is not possible:
>
> # dmesg | grep -i apic
> [ 0.000000] Using APIC driver default
> [ 0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
> [ 0.000000] APIC: disable apic facility
> [ 0.000000] APIC: switched to apic NOOP
> [ 0.008891] no APIC, boot with the "lapic" boot parameter to force-enable it.
> [ 0.036141] Local APIC not detected. Using dummy APIC emulation.

Have you tried booting with "lapic" as the second last msg suggests you
do?

2010-11-23 18:29:25

by Sedat Dilek

Subject: Re: [PATCH] x86, perf, nmi: Disable perf if counters are not accessable

On Tue, Nov 23, 2010 at 7:27 PM, Peter Zijlstra <[email protected]> wrote:
> On Tue, 2010-11-23 at 19:21 +0100, Sedat Dilek wrote:
>> Due to BIOS l(ocal)apic is not possible:
>>
>> # dmesg | grep -i apic
>> [    0.000000] Using APIC driver default
>> [    0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
>> [    0.000000] APIC: disable apic facility
>> [    0.000000] APIC: switched to apic NOOP
>> [    0.008891] no APIC, boot with the "lapic" boot parameter to force-enable it.
>> [    0.036141] Local APIC not detected. Using dummy APIC emulation.
>
> Have you tried booting with "lapic" as the second last msg suggests you
> do?
>

Yes, I did try before there was [1], booting with "lapic" had no effect.

- Sedat -

[1] bugfix/x86/Skip-looking-for-ioapic-overrides-when-ioapics-are-not-present.patch

2010-11-23 18:37:25

by Sedat Dilek

Subject: Re: [PATCH] x86, perf, nmi: Disable perf if counters are not accessable

On Tue, Nov 23, 2010 at 7:29 PM, Sedat Dilek <[email protected]> wrote:
> On Tue, Nov 23, 2010 at 7:27 PM, Peter Zijlstra <[email protected]> wrote:
>> On Tue, 2010-11-23 at 19:21 +0100, Sedat Dilek wrote:
>>> Due to BIOS l(ocal)apic is not possible:
>>>
>>> # dmesg | grep -i apic
>>> [    0.000000] Using APIC driver default
>>> [    0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
>>> [    0.000000] APIC: disable apic facility
>>> [    0.000000] APIC: switched to apic NOOP
>>> [    0.008891] no APIC, boot with the "lapic" boot parameter to force-enable it.
>>> [    0.036141] Local APIC not detected. Using dummy APIC emulation.
>>
>> Have you tried booting with "lapic" as the second last msg suggests you
>> do?
>>
>
> Yes, I did try before there was [1], booting with "lapic" had no effect.
>
> - Sedat -
>
> [1] bugfix/x86/Skip-looking-for-ioapic-overrides-when-ioapics-are-not-present.patch
>

OK, some months have gone by since I last tested this option.

Now, with linux-next (next-20101123) things look better.

# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-2.6.37-rc3-686 root=UUID=1ceb69a7-ecf4-47e9-a231-b74e0f0a9b62 ro radeon.modeset=1 lapic 3

# dmesg | egrep -i 'NMI|APIC'
[ 0.000000] Using APIC driver default
[ 0.000000] Local APIC disabled by BIOS -- reenabling.
[ 0.000000] Found and enabled local APIC!
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.37-rc3-686 root=UUID=1ceb69a7-ecf4-47e9-a231-b74e0f0a9b62 ro radeon.modeset=1 lapic 3
[ 0.032158] Enabling APIC mode: Flat. Using 0 I/O APICs
[ 0.036000] NMI watchdog enabled, takes one hw-pmu counter.

- Sedat -

2010-11-23 19:04:24

by Cyrill Gorcunov

Subject: Re: [PATCH] x86, perf, nmi: Disable perf if counters are not accessable

On Tue, Nov 23, 2010 at 07:27:07PM +0100, Peter Zijlstra wrote:
> On Tue, 2010-11-23 at 19:21 +0100, Sedat Dilek wrote:
> > Due to BIOS l(ocal)apic is not possible:
> >
> > # dmesg | grep -i apic
> > [ 0.000000] Using APIC driver default
> > [ 0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
> > [ 0.000000] APIC: disable apic facility
> > [ 0.000000] APIC: switched to apic NOOP
> > [ 0.008891] no APIC, boot with the "lapic" boot parameter to force-enable it.
> > [ 0.036141] Local APIC not detected. Using dummy APIC emulation.
>
> Have you tried booting with "lapic" as the second last msg suggests you
> do?

Peter, Don, might we not need something like the patch below, i.e. to check
for the apic earlier and not query the cpu for the PERF cpu bit, its cpu model,
etc. if there is no active apic? And perhaps for the nmi-watchdog we should not
try to create a perf event, for the same reason, and simply report that the
nmi-watchdog is disabled (though of course the hpet-based one should still try
to continue).

No?

Cyrill
---
arch/x86/kernel/cpu/perf_event.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)

Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event.c
=====================================================================
--- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event.c
+++ linux-2.6.git/arch/x86/kernel/cpu/perf_event.c
@@ -1329,14 +1329,16 @@ x86_pmu_notifier(struct notifier_block *
 	return ret;
 }

-static void __init pmu_check_apic(void)
+static int __init pmu_check_apic(void)
 {
 	if (cpu_has_apic)
-		return;
+		return 0;

 	x86_pmu.apic = 0;
 	pr_info("no APIC, boot with the \"lapic\" boot parameter to force-enable it.\n");
 	pr_info("no hardware sampling interrupt available.\n");
+
+	return -1;
 }

 void __init init_hw_perf_events(void)
@@ -1346,6 +1348,10 @@ void __init init_hw_perf_events(void)

 	pr_info("Performance Events: ");

+	/* apic is required */
+	if (pmu_check_apic())
+		goto no_pmu;
+
 	switch (boot_cpu_data.x86_vendor) {
 	case X86_VENDOR_INTEL:
 		err = intel_pmu_init();
@@ -1356,12 +1362,8 @@ void __init init_hw_perf_events(void)
 	default:
 		return;
 	}
-	if (err != 0) {
-		pr_cont("no PMU driver, software events only.\n");
-		return;
-	}
-
-	pmu_check_apic();
+	if (err != 0)
+		goto no_pmu;

 	pr_cont("%s PMU driver.\n", x86_pmu.name);

@@ -1411,6 +1413,12 @@ void __init init_hw_perf_events(void)

 	perf_pmu_register(&pmu);
 	perf_cpu_notifier(x86_pmu_notifier);
+
+	return;
+
+no_pmu:
+	pr_cont("no PMU driver, software events only.\n");
+	return;
 }

static inline void x86_pmu_read(struct perf_event *event)

2010-11-23 19:07:37

by Peter Zijlstra

Subject: Re: [PATCH] x86, perf, nmi: Disable perf if counters are not accessable

On Tue, 2010-11-23 at 22:04 +0300, Cyrill Gorcunov wrote:
> On Tue, Nov 23, 2010 at 07:27:07PM +0100, Peter Zijlstra wrote:
> > On Tue, 2010-11-23 at 19:21 +0100, Sedat Dilek wrote:
> > > Due to BIOS l(ocal)apic is not possible:
> > >
> > > # dmesg | grep -i apic
> > > [ 0.000000] Using APIC driver default
> > > [ 0.000000] Local APIC disabled by BIOS -- you can enable it with "lapic"
> > > [ 0.000000] APIC: disable apic facility
> > > [ 0.000000] APIC: switched to apic NOOP
> > > [ 0.008891] no APIC, boot with the "lapic" boot parameter to force-enable it.
> > > [ 0.036141] Local APIC not detected. Using dummy APIC emulation.
> >
> > Have you tried booting with "lapic" as the second last msg suggests you
> > do?
>
> Peter, Don, might not we need something like the patch below -- ie to check for
> apic earlier and do not acquire cpu for PERF cpu bit, and its cpu model, etc
> if there is no active apic? And perhaps for nmi-watchdog, we should not try
> to creat perf event for same reason and simply report that nmi-watchdog is
> disabled (though of course hpet based one should try to continue).
>
> No?
>

Ah, no.. now I get what you mean.

We can use the pmu without interrupts when we lack the lapic; that is,
perf-stat will still work.
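
To illustrate the distinction from userspace: a counting event of the kind perf-stat uses sets no sample period, so it never asks for an overflow interrupt. The following is only a sketch under that assumption; the helper names `make_counting_attr` and `open_counter` are hypothetical, while `struct perf_event_attr` and the `perf_event_open` syscall are the real Linux interface.

```c
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Build an attr for a pure counting event. With sample_period left
 * at 0 the event only accumulates a count and requests no overflow
 * interrupt, so it does not depend on the local APIC.
 */
struct perf_event_attr make_counting_attr(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HARDWARE;
	attr.size = sizeof(attr);
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 0;	/* counting mode, not sampling */
	attr.disabled = 1;	/* start stopped; enable later via ioctl */
	attr.exclude_kernel = 1;
	return attr;
}

/*
 * glibc has no perf_event_open() wrapper, so go through syscall().
 * pid 0 = calling thread, cpu -1 = any CPU, no group leader, no flags.
 * Returns a file descriptor, or -1 with errno set.
 */
int open_counter(struct perf_event_attr *attr)
{
	return (int)syscall(SYS_perf_event_open, attr, 0, -1, -1, 0UL);
}
```

A sampling event would instead set a non-zero `sample_period`, and that is the mode that needs the interrupt. Whether `open_counter()` succeeds on a given machine depends on the hardware, the kernel, and the perf permissions in effect.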

> void __init init_hw_perf_events(void)
> @@ -1346,6 +1348,10 @@ void __init init_hw_perf_events(void)
>
>  	pr_info("Performance Events: ");
>
> +	/* apic is required */
> +	if (pmu_check_apic())
> +		goto no_pmu;
> +

> +no_pmu:
> +	pr_cont("no PMU driver, software events only.\n");
> +	return;
> }
>
> static inline void x86_pmu_read(struct perf_event *event)