2015-07-02 15:12:02

by Vince Weaver

[permalink] [raw]
Subject: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()


So sad to say the lack of fuzzer reports was because I was out of town for
a bit, not due to the kernel suddenly getting amazingly better.

In any case I am running against current git and getting a lot of
warnings, but most of them seem to be old ones. This following one looks
new though.

This is current linus-git on a Haswell machine with peterz's patch to fix
the aux buffer spinlock recursion (I can still crash the kernel if that
patch is not applied).

It corresponds to:

WARN_ON_ONCE(!event->attr.precise_ip);

[ 584.352324] WARNING: CPU: 2 PID: 18924 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
[ 584.364649] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm_intel kvm snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel hmac drbg i915 ansi_cprng snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt ppdev snd_hda_intel snd_hda_codec aesni_intel aes_x86_64 snd_hda_core lrw evdev snd_hwdep drm_kms_helper drm psmouse iTCO_vendor_support snd_pcm gf128mul glue_helper ablk_helper serio_raw pcspkr cryptd snd_timer i2c_i801 processor battery video snd mei_me parport_pc i2c_algo_bit button soundcore lpc_ich mfd_core tpm_tis mei parport tpm wmi sg sr_mod sd_mod cdrom ehci_pci ehci_hcd ahci libahci libata xhci_pci xhci_hcd e1000e ptp usbcore crc32c_intel fan scsi_mod pps_core usb_common thermal thermal_sys
[ 584.440755] CPU: 2 PID: 18924 Comm: perf_fuzzer Tainted: G W 4.1.0+ #160
[ 584.449736] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 584.458241] ffffffff81a100e0 ffff8800cd793898 ffffffff8169e763 0000000000000000
[ 584.466868] 0000000000000000 ffff8800cd7938d8 ffffffff8106ecba ffffffff810b7806
[ 584.475466] 0000000000000000 0000000000000001 ffff88011ea8bd80 ffff8801190400c0
[ 584.484101] Call Trace:
[ 584.487287] [<ffffffff8169e763>] dump_stack+0x45/0x57
[ 584.493360] [<ffffffff8106ecba>] warn_slowpath_common+0x8a/0xc0
[ 584.500393] [<ffffffff810b7806>] ? __lock_acquire.isra.31+0x3a6/0xf90
[ 584.507953] [<ffffffff8106edaa>] warn_slowpath_null+0x1a/0x20
[ 584.514770] [<ffffffff8102f783>] intel_pmu_drain_pebs_nhm+0x283/0x2e0
[ 584.522287] [<ffffffff8102ec48>] intel_pmu_drain_pebs_buffer+0x18/0x20
[ 584.529870] [<ffffffff8115bb9b>] ? ___perf_sw_event+0x13b/0x170
[ 584.536856] [<ffffffff811569e8>] ? perf_pmu_sched_task+0xa8/0x140
[ 584.543978] [<ffffffff810303c1>] ? intel_pmu_pebs_disable_all+0x31/0x40
[ 584.551590] [<ffffffff81030a49>] ? __intel_pmu_disable_all+0x49/0x70
[ 584.559021] [<ffffffff81030064>] intel_pmu_pebs_sched_task+0x14/0x20
[ 584.566387] [<ffffffff8103070d>] intel_pmu_sched_task+0x3d/0x40
[ 584.573311] [<ffffffff81028646>] x86_pmu_sched_task+0x16/0x20
[ 584.580002] [<ffffffff81156a01>] perf_pmu_sched_task+0xc1/0x140
[ 584.586954] [<ffffffff8115695e>] ? perf_pmu_sched_task+0x1e/0x140
[ 584.594023] [<ffffffff81158596>] __perf_event_task_sched_out+0x66/0x4c0
[ 584.601657] [<ffffffff810b641d>] ? __lock_is_held+0x4d/0x70
[ 584.608184] [<ffffffff816a0a07>] __schedule+0x427/0xa50
[ 584.614334] [<ffffffff816a1067>] schedule+0x37/0x80
[ 584.620114] [<ffffffff816a502c>] schedule_timeout+0x1bc/0x250
[ 584.626829] [<ffffffff816a2228>] ? wait_for_completion+0x28/0x100
[ 584.633877] [<ffffffff816a22be>] ? wait_for_completion+0xbe/0x100
[ 584.640955] [<ffffffff816a22c6>] wait_for_completion+0xc6/0x100
[ 584.647842] [<ffffffff81098cb0>] ? wake_up_q+0x70/0x70
[ 584.653888] [<ffffffff810d63c0>] ? call_rcu_bh+0x20/0x20
[ 584.660135] [<ffffffff810d2fcd>] wait_rcu_gp+0x5d/0x80
[ 584.666191] [<ffffffff810d2f10>] ? trace_raw_output_rcu_utilization+0x60/0x60
[ 584.674389] [<ffffffff81152156>] ? perf_unpin_context+0x16/0x30
[ 584.681272] [<ffffffff810d52eb>] synchronize_sched+0x3b/0x50
[ 584.687867] [<ffffffff8113d998>] perf_trace_event_unreg.isra.1+0x38/0x90
[ 584.695620] [<ffffffff8113dcec>] perf_trace_destroy+0x3c/0x50
[ 584.702349] [<ffffffff81152d79>] tp_perf_event_destroy+0x9/0x10
[ 584.709207] [<ffffffff81158ed1>] _free_event+0xc1/0x250
[ 584.715306] [<ffffffff811591ff>] free_event+0x1f/0x50
[ 584.721224] [<ffffffff8115c8e0>] SYSC_perf_event_open+0x400/0xd40
[ 584.728289] [<ffffffff8105f89b>] ? __do_page_fault+0x1ab/0x3f0
[ 584.735056] [<ffffffff8115d699>] SyS_perf_event_open+0x9/0x10
[ 584.741649] [<ffffffff816a64b2>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 584.748915] ---[ end trace e3d880444bf5705b ]---


2015-07-02 19:43:26

by Peter Zijlstra

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Thu, Jul 02, 2015 at 11:18:10AM -0400, Vince Weaver wrote:
> This is current linus-git on a Haswell machine with peterz's patch to fix
> the aux buffer spinlock recursion (I can still crash the kernel if that
> patch is not applied).

Argh, thanks for reminding me of that patch.

2015-07-03 13:13:51

by Peter Zijlstra

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Thu, Jul 02, 2015 at 11:18:10AM -0400, Vince Weaver wrote:
>
> So sad to say the lack of fuzzer reports was because I was out of town for
> a bit, not due to the kernel suddenly getting amazingly better.
>
> In any case I am running against current git and getting a lot of
> warnings, but most of them seem to be old ones. This following one looks
> new though.
>
> This is current linus-git on a Haswell machine with peterz's patch to fix
> the aux buffer spinlock recursion (I can still crash the kernel if that
> patch is not applied).
>
> It corresponds to:
>
> WARN_ON_ONCE(!event->attr.precise_ip);
>
> [ 584.352324] WARNING: CPU: 2 PID: 18924 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()

I've not yet tried to reproduce, but the below could explain things.

On disabling an event we first clear our cpuc->pebs_enabled bits, only
to then check them to see if there are any set, and if so, drain the
buffer.

If we just cleared the last bit, we'll fail to drain the buffer.

If we then program another event on that counter and another PEBS event,
we can hit the above WARN with the 'stale' entries left over from the
previous event.

---
arch/x86/kernel/cpu/perf_event_intel_ds.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 71fc40238843..041a30ba5654 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -548,7 +548,7 @@ int intel_pmu_drain_bts_buffer(void)

static inline void intel_pmu_drain_pebs_buffer(void)
{
- struct pt_regs regs;
+ struct pt_regs regs; /* SAMPLE_REGS_INTR must not be set for FREERUNNING */

x86_pmu.drain_pebs(&regs);
}
@@ -755,13 +755,6 @@ void intel_pmu_pebs_disable(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
struct debug_store *ds = cpuc->ds;

- cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
-
- if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
- cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
- else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
- cpuc->pebs_enabled &= ~(1ULL << 63);
-
if (ds->pebs_interrupt_threshold >
ds->pebs_buffer_base + x86_pmu.pebs_record_size) {
intel_pmu_drain_pebs_buffer();
@@ -769,6 +762,13 @@ void intel_pmu_pebs_disable(struct perf_event *event)
perf_sched_cb_dec(event->ctx->pmu);
}

+ cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
+
+ if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
+ cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
+ else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
+ cpuc->pebs_enabled &= ~(1ULL << 63);
+
if (cpuc->enabled)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);

2015-07-03 18:57:17

by Stephane Eranian

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

Peter,

On Fri, Jul 3, 2015 at 3:13 PM, Peter Zijlstra <[email protected]> wrote:
> On Thu, Jul 02, 2015 at 11:18:10AM -0400, Vince Weaver wrote:
>>
>> So sad to say the lack of fuzzer reports was because I was out of town for
>> a bit, not due to the kernel suddenly getting amazingly better.
>>
>> In any case I am running against current git and getting a lot of
>> warnings, but most of them seem to be old ones. This following one looks
>> new though.
>>
>> This is current linus-git on a Haswell machine with peterz's patch to fix
>> the aux buffer spinlock recursion (I can still crash the kernel if that
>> patch is not applied).
>>
>> It corresponds to:
>>
>> WARN_ON_ONCE(!event->attr.precise_ip);
>>
>> [ 584.352324] WARNING: CPU: 2 PID: 18924 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
>
> I've not yet tried to reproduce, but the below could explain things.
>
> On disabling an event we first clear our cpuc->pebs_enabled bits, only
> to then check them to see if there are any set, and if so, drain the
> buffer.
>
> If we just cleared the last bit, we'll fail to drain the buffer.
>
Where do you see that we use cpuc->pebs_enabled after clearing it
in pebs_disable() to check for overflow or active in drain_pebs()?
I only see it used in get_next_pebs_record_by_bit()?

> If we then program another event on that counter and another PEBS event,
> we can hit the above WARN with the 'stale' entries left over from the
> previous event.
>
> ---
> arch/x86/kernel/cpu/perf_event_intel_ds.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> index 71fc40238843..041a30ba5654 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -548,7 +548,7 @@ int intel_pmu_drain_bts_buffer(void)
>
> static inline void intel_pmu_drain_pebs_buffer(void)
> {
> - struct pt_regs regs;
> + struct pt_regs regs; /* SAMPLE_REGS_INTR must not be set for FREERUNNING */
>
> x86_pmu.drain_pebs(&regs);
> }
> @@ -755,13 +755,6 @@ void intel_pmu_pebs_disable(struct perf_event *event)
> struct hw_perf_event *hwc = &event->hw;
> struct debug_store *ds = cpuc->ds;
>
> - cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
> -
> - if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
> - cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
> - else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
> - cpuc->pebs_enabled &= ~(1ULL << 63);
> -
> if (ds->pebs_interrupt_threshold >
> ds->pebs_buffer_base + x86_pmu.pebs_record_size) {
> intel_pmu_drain_pebs_buffer();
> @@ -769,6 +762,13 @@ void intel_pmu_pebs_disable(struct perf_event *event)
> perf_sched_cb_dec(event->ctx->pmu);
> }
>
> + cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
> +
> + if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
> + cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
> + else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
> + cpuc->pebs_enabled &= ~(1ULL << 63);
> +
> if (cpuc->enabled)
> wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
>

2015-07-03 18:57:04

by Vince Weaver

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Fri, 3 Jul 2015, Peter Zijlstra wrote:

> On Thu, Jul 02, 2015 at 11:18:10AM -0400, Vince Weaver wrote:
> >
> > So sad to say the lack of fuzzer reports was because I was out of town for
> > a bit, not due to the kernel suddenly getting amazingly better.
> >
> > In any case I am running against current git and getting a lot of
> > warnings, but most of them seem to be old ones. This following one looks
> > new though.
> >
> > This is current linus-git on a Haswell machine with peterz's patch to fix
> > the aux buffer spinlock recursion (I can still crash the kernel if that
> > patch is not applied).
> >
> > It corresponds to:
> >
> > WARN_ON_ONCE(!event->attr.precise_ip);
> >
> > [ 584.352324] WARNING: CPU: 2 PID: 18924 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
>
> I've not yet tried to reproduce, but the below could explain things.
>
> On disabling an event we first clear our cpuc->pebs_enabled bits, only
> to then check them to see if there are any set, and if so, drain the
> buffer.
>
> If we just cleared the last bit, we'll fail to drain the buffer.
>
> If we then program another event on that counter and another PEBS event,
> we can hit the above WARN with the 'stale' entries left over from the
> previous event.

with that patch applied I still managed to hit this:

WARN_ON_ONCE(!event->attr.precise_ip);

I'll let it run some more and see if the watchdog still gets triggered.

[ 2217.544901] ------------[ cut here ]------------
[ 2217.550351] WARNING: CPU: 2 PID: 9136 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
[ 2217.563534] Modules linked in: fuse snd_hda_codec_hdmi i915 x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel psmouse hmac drbg evdev serio_raw ansi_cprng snd_hda_codec_realtek drm_kms_helper snd_hda_codec_generic ppdev iTCO_wdt iTCO_vendor_support pcspkr drm i2c_algo_bit aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper snd_hda_intel cryptd mei_me mei snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer tpm_tis tpm wmi button processor video battery i2c_i801 parport_pc parport snd lpc_ich mfd_core soundcore sg sr_mod sd_mod cdrom ehci_pci ehci_hcd ahci libahci xhci_pci xhci_hcd e1000e libata ptp crc32c_intel scsi_mod pps_core usbcore usb_common fan thermal thermal_sys
[ 2217.640998] CPU: 2 PID: 9136 Comm: perf_fuzzer Tainted: G W 4.1.0+ #163
[ 2217.649810] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 2217.658281] ffffffff81a105a0 ffff88011ea85b10 ffffffff8169f823 0000000000000000
[ 2217.666818] 0000000000000000 ffff88011ea85b50 ffffffff8106ec8a ffff88011ea85ba0
[ 2217.675329] 0000000000000002 0000000000000001 ffff88011ea8bd80 ffff8801190400c0
[ 2217.683821] Call Trace:
[ 2217.686960] <NMI> [<ffffffff8169f823>] dump_stack+0x45/0x57
[ 2217.693638] [<ffffffff8106ec8a>] warn_slowpath_common+0x8a/0xc0
[ 2217.700549] [<ffffffff8106ed7a>] warn_slowpath_null+0x1a/0x20
[ 2217.707296] [<ffffffff8102f783>] intel_pmu_drain_pebs_nhm+0x283/0x2e0
[ 2217.714775] [<ffffffff81031834>] ? intel_pmu_disable_event+0xa4/0x130
[ 2217.722216] [<ffffffff81032235>] intel_pmu_handle_irq+0x255/0x440
[ 2217.729339] [<ffffffff8115413e>] ? perf_event_ctx_lock_nested+0x5e/0xf0
[ 2217.737026] [<ffffffff81028e76>] perf_event_nmi_handler+0x26/0x40
[ 2217.744070] [<ffffffff810181ad>] nmi_handle+0x9d/0x140
[ 2217.750160] [<ffffffff81018115>] ? nmi_handle+0x5/0x140
[ 2217.756290] [<ffffffff8101843a>] default_do_nmi+0x4a/0x120
[ 2217.762688] [<ffffffff8101859d>] do_nmi+0x8d/0xc0
[ 2217.768280] [<ffffffff816a979f>] end_repeat_nmi+0x1e/0x2e
[ 2217.774627] [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
[ 2217.781894] [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
[ 2217.789153] [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
[ 2217.796415] <<EOE>> <IRQ> [<ffffffff81030a30>] intel_pmu_enable_all+0x10/0x20
[ 2217.804847] [<ffffffff8102a95c>] x86_pmu_enable+0x25c/0x2e0
[ 2217.811383] [<ffffffff811560e2>] perf_pmu_enable+0x22/0x30
[ 2217.817837] [<ffffffff81157a80>] perf_mux_hrtimer_handler+0x120/0x1f0
[ 2217.825316] [<ffffffff81157960>] ? perf_event_context_sched_in+0x150/0x150
[ 2217.833239] [<ffffffff810dcf43>] __hrtimer_run_queues+0xd3/0x260
[ 2217.840239] [<ffffffff810dd4bb>] hrtimer_interrupt+0xab/0x1b0
[ 2217.846930] [<ffffffff8104b32c>] local_apic_timer_interrupt+0x3c/0x70
[ 2217.854367] [<ffffffff816aa1a1>] smp_apic_timer_interrupt+0x41/0x60
[ 2217.861630] [<ffffffff816a83eb>] apic_timer_interrupt+0x6b/0x70
[ 2217.868540] <EOI>
[ 2217.870633] ---[ end trace 3a31b4d07b4f3450 ]---
[ 2353.824071] Uhhuh. NMI received for unknown reason 31 on CPU 1.
[ 2353.831238] Do you have a strange power saving mode enabled?
[ 2353.838120] Dazed and confused, but trying to continue

2015-07-03 19:04:37

by Peter Zijlstra

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Fri, Jul 03, 2015 at 08:56:57PM +0200, Stephane Eranian wrote:
> Where do you see that we use cpuc->pebs_enabled after clearing it
> in pebs_disable() to check for overflow or active in drain_pebs()?
> I only see it used in get_next_pebs_record_by_bit()?


| static inline bool pebs_is_enabled(struct cpu_hw_events *cpuc)
| {
| return (cpuc->pebs_enabled & ((1ULL << MAX_PEBS_EVENTS) - 1));
| }


| void intel_pmu_pebs_disable(struct perf_event *event)
| {
| struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
| struct hw_perf_event *hwc = &event->hw;
| struct debug_store *ds = cpuc->ds;
|
| cpuc->pebs_enabled &= ~(1ULL << hwc->idx);

Here we clear it...

| if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
| cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
| else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
| cpuc->pebs_enabled &= ~(1ULL << 63);
|
| if (ds->pebs_interrupt_threshold >
| ds->pebs_buffer_base + x86_pmu.pebs_record_size) {
| intel_pmu_drain_pebs_buffer();

Here we drain the buffer; which uses ->pebs_enabled

| if (!pebs_is_enabled(cpuc))

And here we test it

| perf_sched_cb_dec(event->ctx->pmu);
| }
|
| if (cpuc->enabled)
| wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
|
| hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
| }

That said, its far too warm and I might just not be making sense.

2015-07-03 19:43:37

by Vince Weaver

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Fri, 3 Jul 2015, Peter Zijlstra wrote:

> That said, its far too warm and I might just not be making sense.

you need to come visit Maine! Although I am not sure the cooler weather
necessarily improves my kernel debugging skills.

I managed to lock the machine (again this is with the patch applied).

[ 299.366027] ------------[ cut here ]------------
[ 299.370985] WARNING: CPU: 2 PID: 8241 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
[ 299.456929] CPU: 2 PID: 8241 Comm: perf_fuzzer Tainted: G W 4.1.0+ #164
[ 299.465750] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 299.474274] ffffffff81a105a0 ffff88011ea85b10 ffffffff8169f823 0000000000000000
[ 299.482864] 0000000000000000 ffff88011ea85b50 ffffffff8106ec8a ffff88011ea85ba0
[ 299.491488] 0000000000000000 0000000000000001 ffff88011ea8bd80 ffff8801190400c0
[ 299.500029] Call Trace:
[ 299.503190] <NMI> [<ffffffff8169f823>] dump_stack+0x45/0x57
[ 299.509936] [<ffffffff8106ec8a>] warn_slowpath_common+0x8a/0xc0
[ 299.516901] [<ffffffff8106ed7a>] warn_slowpath_null+0x1a/0x20
[ 299.523715] [<ffffffff8102f783>] intel_pmu_drain_pebs_nhm+0x283/0x2e0
[ 299.531268] [<ffffffff81032235>] intel_pmu_handle_irq+0x255/0x440
[ 299.538487] [<ffffffff81028e76>] perf_event_nmi_handler+0x26/0x40
[ 299.545638] [<ffffffff810181ad>] nmi_handle+0x9d/0x140
[ 299.551772] [<ffffffff81018115>] ? nmi_handle+0x5/0x140
[ 299.558013] [<ffffffff8101843a>] default_do_nmi+0x4a/0x120
[ 299.564527] [<ffffffff8101859d>] do_nmi+0x8d/0xc0
[ 299.570185] [<ffffffff816a979f>] end_repeat_nmi+0x1e/0x2e
[ 299.576580] [<ffffffff811bc9d2>] ? check_poison_obj+0x92/0x230
[ 299.583390] [<ffffffff811bc9d2>] ? check_poison_obj+0x92/0x230
[ 299.590163] [<ffffffff811bc9d2>] ? check_poison_obj+0x92/0x230
[ 299.596922] <<EOE>> [<ffffffff8115bea8>] ? perf_event_alloc+0x58/0x680
[ 299.604594] [<ffffffff811bcf7d>] cache_alloc_debugcheck_after.isra.51+0x1cd/0x250
[ 299.613140] [<ffffffff811c08b6>] kmem_cache_alloc_trace+0xa6/0x510
[ 299.620330] [<ffffffff8115bea8>] ? perf_event_alloc+0x58/0x680
[ 299.627088] [<ffffffff8106ee48>] ? get_online_cpus+0x58/0x70
[ 299.633688] [<ffffffff8115bea8>] perf_event_alloc+0x58/0x680
[ 299.640319] [<ffffffff8115c897>] SYSC_perf_event_open+0x3c7/0xd40
[ 299.647353] [<ffffffff8105f86b>] ? __do_page_fault+0x1ab/0x3f0
[ 299.654172] [<ffffffff8115d689>] SyS_perf_event_open+0x9/0x10
[ 299.660871] [<ffffffff816a7572>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 299.668236] ---[ end trace 3356c74581c13f1d ]---
[ 299.673648] Uhhuh. NMI received for unknown reason 31 on CPU 2.
[ 299.680427] Do you have a strange power saving mode enabled?
[ 299.686963] Dazed and confused, but trying to continue
[ 299.692904] Uhhuh. NMI received for unknown reason 31 on CPU 2.
[ 299.699748] Do you have a strange power saving mode enabled?
[ 299.706227] Dazed and confused, but trying to continue
[ 299.712172] Uhhuh. NMI received for unknown reason 31 on CPU 2.
[ 299.718946] Do you have a strange power saving mode enabled?
[ 299.725446] Dazed and confused, but trying to continue
[ 299.731419] Uhhuh. NMI received for unknown reason 31 on CPU 2.
[ 299.738235] Do you have a strange power saving mode enabled?
[ 299.744740] Dazed and confused, but trying to continue
[ 299.750660] Uhhuh. NMI received for unknown reason 21 on CPU 2.
[ 299.757398] Do you have a strange power saving mode enabled?
[ 299.763862] Dazed and confused, but trying to continue

(machine eventually locks up after lots of these messages)

2015-07-03 20:08:41

by Liang, Kan

[permalink] [raw]
Subject: RE: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()


>
> I've not yet tried to reproduce, but the below could explain things.
>
> On disabling an event we first clear our cpuc->pebs_enabled bits, only to
> then check them to see if there are any set, and if so, drain the buffer.
>
> If we just cleared the last bit, we'll fail to drain the buffer.
>
> If we then program another event on that counter and another PEBS event,
> we can hit the above WARN with the 'stale' entries left over from the
> previous event.
>
> ---
> arch/x86/kernel/cpu/perf_event_intel_ds.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> index 71fc40238843..041a30ba5654 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -548,7 +548,7 @@ int intel_pmu_drain_bts_buffer(void)
>
> static inline void intel_pmu_drain_pebs_buffer(void) {
> - struct pt_regs regs;
> + struct pt_regs regs; /* SAMPLE_REGS_INTR must not be set for
> +FREERUNNING */
>
> x86_pmu.drain_pebs(&regs);
> }
> @@ -755,13 +755,6 @@ void intel_pmu_pebs_disable(struct perf_event
> *event)
> struct hw_perf_event *hwc = &event->hw;
> struct debug_store *ds = cpuc->ds;
>
> - cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
> -
> - if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
> - cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
> - else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
> - cpuc->pebs_enabled &= ~(1ULL << 63);
> -
> if (ds->pebs_interrupt_threshold >
> ds->pebs_buffer_base + x86_pmu.pebs_record_size) {
> intel_pmu_drain_pebs_buffer();
> @@ -769,6 +762,13 @@ void intel_pmu_pebs_disable(struct perf_event
> *event)
> perf_sched_cb_dec(event->ctx->pmu);
> }
>
> + cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
> +
> + if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
> + cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
> + else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
> + cpuc->pebs_enabled &= ~(1ULL << 63);
> +
> if (cpuc->enabled)
> wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
>

If we cleared the last bit, we not only drain the buffer but also decrease
the event->ctx->pmu, which is used to flush the PEBS buffer during
context switches.
We need to disable cpuc->pebs_enabled before changing
event->ctx->pmu as below.


diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 71fc402..76285c1 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -754,6 +754,11 @@ void intel_pmu_pebs_disable(struct
perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
struct debug_store *ds = cpuc->ds;
+ bool large_pebs = ds->pebs_interrupt_threshold >
+ ds->pebs_buffer_base + x86_pmu.pebs_record_size;
+
+ if (large_pebs)
+ intel_pmu_drain_pebs_buffer();

cpuc->pebs_enabled &= ~(1ULL << hwc->idx);

@@ -762,12 +767,8 @@ void intel_pmu_pebs_disable(struct
perf_event *event)
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
cpuc->pebs_enabled &= ~(1ULL << 63);

- if (ds->pebs_interrupt_threshold >
- ds->pebs_buffer_base + x86_pmu.pebs_record_size) {
- intel_pmu_drain_pebs_buffer();
- if (!pebs_is_enabled(cpuc))
- perf_sched_cb_dec(event->ctx->pmu);
- }
+ if (large_pebs && !pebs_is_enabled(cpuc))
+ perf_sched_cb_dec(event->ctx->pmu);

if (cpuc->enabled)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);



Thanks,
Kan

2015-07-06 10:55:27

by Peter Zijlstra

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Fri, Jul 03, 2015 at 08:08:27PM +0000, Liang, Kan wrote:
> If we cleared the last bit, we not only drain the buffer but also decrease
> the event->ctx->pmu, which is used to flush the PEBS buffer during
> context switches.
> We need to disable cpuc->pebs_enabled before changing
> event->ctx->pmu as below.
>

Indeed, mind sending a proper patch so I can press 'A' on it?

> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> index 71fc402..76285c1 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -754,6 +754,11 @@ void intel_pmu_pebs_disable(struct
> perf_event *event)
> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> struct hw_perf_event *hwc = &event->hw;
> struct debug_store *ds = cpuc->ds;
> + bool large_pebs = ds->pebs_interrupt_threshold >
> + ds->pebs_buffer_base + x86_pmu.pebs_record_size;
> +
> + if (large_pebs)
> + intel_pmu_drain_pebs_buffer();
>
> cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
>
> @@ -762,12 +767,8 @@ void intel_pmu_pebs_disable(struct
> perf_event *event)
> else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
> cpuc->pebs_enabled &= ~(1ULL << 63);
>
> - if (ds->pebs_interrupt_threshold >
> - ds->pebs_buffer_base + x86_pmu.pebs_record_size) {
> - intel_pmu_drain_pebs_buffer();
> - if (!pebs_is_enabled(cpuc))
> - perf_sched_cb_dec(event->ctx->pmu);
> - }
> + if (large_pebs && !pebs_is_enabled(cpuc))
> + perf_sched_cb_dec(event->ctx->pmu);
>
> if (cpuc->enabled)
> wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
>
>
>
> Thanks,
> Kan

2015-07-06 13:48:07

by Liang, Kan

[permalink] [raw]
Subject: RE: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()


> On Fri, Jul 03, 2015 at 08:08:27PM +0000, Liang, Kan wrote:
> > If we cleared the last bit, we not only drain the buffer but also
> > decrease the event->ctx->pmu, which is used to flush the PEBS buffer
> > during context switches.
> > We need to disable cpuc->pebs_enabled before changing
> > event->ctx->pmu as below.
> >
>
> Indeed, mind sending a proper patch so I can press 'A' on it?

Sure, I will do that.
But I didn't verify the patch, since I cannot reproduce the issue.

Vince, would you mind testing the patch?
If the issue is gone, I will send a proper patch then.

Thanks,
Kan

>
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> > b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> > index 71fc402..76285c1 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> > @@ -754,6 +754,11 @@ void intel_pmu_pebs_disable(struct perf_event
> > *event)
> > struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> > struct hw_perf_event *hwc = &event->hw;
> > struct debug_store *ds = cpuc->ds;
> > + bool large_pebs = ds->pebs_interrupt_threshold >
> > + ds->pebs_buffer_base +
> > + x86_pmu.pebs_record_size;
> > +
> > + if (large_pebs)
> > + intel_pmu_drain_pebs_buffer();
> >
> > cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
> >
> > @@ -762,12 +767,8 @@ void intel_pmu_pebs_disable(struct perf_event
> > *event)
> > else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
> > cpuc->pebs_enabled &= ~(1ULL << 63);
> >
> > - if (ds->pebs_interrupt_threshold >
> > - ds->pebs_buffer_base + x86_pmu.pebs_record_size) {
> > - intel_pmu_drain_pebs_buffer();
> > - if (!pebs_is_enabled(cpuc))
> > - perf_sched_cb_dec(event->ctx->pmu);
> > - }
> > + if (large_pebs && !pebs_is_enabled(cpuc))
> > + perf_sched_cb_dec(event->ctx->pmu);
> >
> > if (cpuc->enabled)
> > wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
> >
> >
> >
> > Thanks,
> > Kan

2015-07-06 16:16:09

by Vince Weaver

[permalink] [raw]
Subject: RE: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Mon, 6 Jul 2015, Liang, Kan wrote:

>
> > On Fri, Jul 03, 2015 at 08:08:27PM +0000, Liang, Kan wrote:
> > > If we cleared the last bit, we not only drain the buffer but also
> > > decrease the event->ctx->pmu, which is used to flush the PEBS buffer
> > > during context switches.
> > > We need to disable cpuc->pebs_enabled before changing
> > > event->ctx->pmu as below.
> > >
> >
> > Indeed, mind sending a proper patch so I can press 'A' on it?
>
> Sure, I will do that.
> But I didn't verify the patch, since I cannot reproduce the issue.
>
> Vince, would you mind testing the patch?
> If the issue is gone, I will send a proper patch then.

I've got too many patches floating around so I forget what this one is
trying to fix. The pebs related lockup? Or the warning?

Was this patch meant to be in addition to PeterZ's, or standalone?

Also please send proper patches in the future, this one was whitespace
damaged and a pain to get applied.

With just this patch applied (without PeterZ's) I still managed to trigger
the following warning.

[ 1328.103920] ------------[ cut here ]------------
[ 1328.109367] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1199 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
[ 1328.199193] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.2.0-rc1+ #166
[ 1328.207955] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 1328.216461] ffffffff81a10f88 ffff88011ea05b10 ffffffff816a10a3 0000000000000000
[ 1328.225043] 0000000000000000 ffff88011ea05b50 ffffffff8106ec8a ffff88011ea05ba0
[ 1328.233654] 0000000000000000 0000000000000001 ffff88011ea0bd80 ffff8801190000c0
[ 1328.242260] Call Trace:
[ 1328.245452] <NMI> [<ffffffff816a10a3>] dump_stack+0x45/0x57
[ 1328.252243] [<ffffffff8106ec8a>] warn_slowpath_common+0x8a/0xc0
[ 1328.259212] [<ffffffff8106ed7a>] warn_slowpath_null+0x1a/0x20
[ 1328.266000] [<ffffffff8102f783>] intel_pmu_drain_pebs_nhm+0x283/0x2e0
[ 1328.273551] [<ffffffff81032235>] intel_pmu_handle_irq+0x255/0x440
[ 1328.280742] [<ffffffff811574bf>] ? __perf_event_enable+0x1f/0x2a0
[ 1328.287887] [<ffffffff8114b19d>] ? irq_work_queue+0x5d/0x80
[ 1328.294500] [<ffffffff81028e76>] perf_event_nmi_handler+0x26/0x40
[ 1328.301659] [<ffffffff810181ad>] nmi_handle+0x9d/0x140
[ 1328.307783] [<ffffffff81018115>] ? nmi_handle+0x5/0x140
[ 1328.314010] [<ffffffff8101843a>] default_do_nmi+0x4a/0x120
[ 1328.320495] [<ffffffff8101859d>] do_nmi+0x8d/0xc0
[ 1328.326053] [<ffffffff816ab01f>] end_repeat_nmi+0x1e/0x2e
[ 1328.332405] [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
[ 1328.339627] [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
[ 1328.346894] [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
[ 1328.354130] <<EOE>> <IRQ> [<ffffffff81030a30>] intel_pmu_enable_all+0x10/0x20
[ 1328.362556] [<ffffffff8102a95c>] x86_pmu_enable+0x25c/0x2e0
[ 1328.369079] [<ffffffff81156202>] perf_pmu_enable+0x22/0x30
[ 1328.375447] [<ffffffff81157da1>] __perf_install_in_context+0x131/0x1d0
[ 1328.382936] [<ffffffff811533b0>] ? cpu_clock_event_start+0x40/0x40
[ 1328.390106] [<ffffffff811533f2>] remote_function+0x42/0x50
[ 1328.396525] [<ffffffff810f0e9b>] flush_smp_call_function_queue+0x7b/0x170
[ 1328.404360] [<ffffffff810f1883>] generic_smp_call_function_single_interrupt+0x13/0x60
[ 1328.413345] [<ffffffff81049217>] smp_call_function_single_interrupt+0x27/0x40
[ 1328.421544] [<ffffffff816aa1ab>] call_function_single_interrupt+0x6b/0x70
[ 1328.429358] <EOI> [<ffffffff81545454>] ? cpuidle_enter_state+0xf4/0x220
[ 1328.437120] [<ffffffff81545430>] ? cpuidle_enter_state+0xd0/0x220
[ 1328.444210] [<ffffffff815455b7>] cpuidle_enter+0x17/0x20
[ 1328.450491] [<ffffffff810b06eb>] call_cpuidle+0x3b/0x70
[ 1328.456680] [<ffffffff81545593>] ? cpuidle_select+0x13/0x20
[ 1328.463196] [<ffffffff810b0965>] cpu_startup_entry+0x245/0x310
[ 1328.469992] [<ffffffff81695b3b>] rest_init+0xbb/0xd0
[ 1328.475860] [<ffffffff81d4af8b>] start_kernel+0x460/0x46d
[ 1328.482154] [<ffffffff81d4a120>] ? early_idt_handler_array+0x120/0x120
[ 1328.489661] [<ffffffff81d4a4d7>] x86_64_start_reservations+0x2a/0x2c
[ 1328.496974] [<ffffffff81d4a614>] x86_64_start_kernel+0x13b/0x14a
[ 1328.503905] ---[ end trace a75b257dea18211b ]---

2015-07-06 16:24:04

by Liang, Kan

[permalink] [raw]
Subject: RE: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()


> On Mon, 6 Jul 2015, Liang, Kan wrote:
>
> >
> > > On Fri, Jul 03, 2015 at 08:08:27PM +0000, Liang, Kan wrote:
> > > > If we cleared the last bit, we not only drain the buffer but also
> > > > decrease the event->ctx->pmu, which is used to flush the PEBS
> > > > buffer during context switches.
> > > > We need to disable cpuc->pebs_enabled before changing
> > > > event->ctx->pmu as below.
> > > >
> > >
> > > Indeed, mind sending a proper patch so I can press 'A' on it?
> >
> > Sure, I will do that.
> > But I didn't verify the patch, since I cannot reproduce the issue.
> >
> > Vince, would you mind testing the patch?
> > If the issue is gone, I will send a proper patch then.
>
> I've got too many patches floating around so I forget what this one is trying
> to fix. The pebs related lockup? Or the warning?
>

It's trying to fix the warning issue as below.
For the test result, it looks the patch doesn't help, does it?

> > WARN_ON_ONCE(!event->attr.precise_ip);
> >
> > [ 584.352324] WARNING: CPU: 2 PID: 18924 at
> > arch/x86/kernel/cpu/perf_event_intel_ds.c:1198
> > intel_pmu_drain_pebs_nhm+0x283/0x2e0()


> Was this patch meant to be in addition to PeterZ's, or standalone?
>

Standalone.

> Also please send proper patches in the future, this one was whitespace
> damaged and a pain to get applied.
>
> With just this patch applied (without PeterZ's) I still managed to trigger the
> following warning.

Thanks for the test.

>
> [ 1328.103920] ------------[ cut here ]------------ [ 1328.109367] WARNING:
> CPU: 0 PID: 0 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1199
> intel_pmu_drain_pebs_nhm+0x283/0x2e0()
> [ 1328.199193] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.2.0-
> rc1+ #166
> [ 1328.207955] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS
> FBKT72AUS 01/26/2014 [ 1328.216461] ffffffff81a10f88 ffff88011ea05b10
> ffffffff816a10a3 0000000000000000 [ 1328.225043] 0000000000000000
> ffff88011ea05b50 ffffffff8106ec8a ffff88011ea05ba0 [ 1328.233654]
> 0000000000000000 0000000000000001 ffff88011ea0bd80 ffff8801190000c0
> [ 1328.242260] Call Trace:
> [ 1328.245452] <NMI> [<ffffffff816a10a3>] dump_stack+0x45/0x57
> [ 1328.252243] [<ffffffff8106ec8a>] warn_slowpath_common+0x8a/0xc0
> [ 1328.259212] [<ffffffff8106ed7a>] warn_slowpath_null+0x1a/0x20
> [ 1328.266000] [<ffffffff8102f783>]
> intel_pmu_drain_pebs_nhm+0x283/0x2e0
> [ 1328.273551] [<ffffffff81032235>] intel_pmu_handle_irq+0x255/0x440
> [ 1328.280742] [<ffffffff811574bf>] ? __perf_event_enable+0x1f/0x2a0
> [ 1328.287887] [<ffffffff8114b19d>] ? irq_work_queue+0x5d/0x80
> [ 1328.294500] [<ffffffff81028e76>] perf_event_nmi_handler+0x26/0x40
> [ 1328.301659] [<ffffffff810181ad>] nmi_handle+0x9d/0x140 [ 1328.307783]
> [<ffffffff81018115>] ? nmi_handle+0x5/0x140 [ 1328.314010]
> [<ffffffff8101843a>] default_do_nmi+0x4a/0x120 [ 1328.320495]
> [<ffffffff8101859d>] do_nmi+0x8d/0xc0 [ 1328.326053] [<ffffffff816ab01f>]
> end_repeat_nmi+0x1e/0x2e [ 1328.332405] [<ffffffff810309ba>] ?
> __intel_pmu_enable_all+0x5a/0xc0 [ 1328.339627] [<ffffffff810309ba>] ?
> __intel_pmu_enable_all+0x5a/0xc0 [ 1328.346894] [<ffffffff810309ba>] ?
> __intel_pmu_enable_all+0x5a/0xc0 [ 1328.354130] <<EOE>> <IRQ>
> [<ffffffff81030a30>] intel_pmu_enable_all+0x10/0x20 [ 1328.362556]
> [<ffffffff8102a95c>] x86_pmu_enable+0x25c/0x2e0 [ 1328.369079]
> [<ffffffff81156202>] perf_pmu_enable+0x22/0x30 [ 1328.375447]
> [<ffffffff81157da1>] __perf_install_in_context+0x131/0x1d0
> [ 1328.382936] [<ffffffff811533b0>] ? cpu_clock_event_start+0x40/0x40
> [ 1328.390106] [<ffffffff811533f2>] remote_function+0x42/0x50
> [ 1328.396525] [<ffffffff810f0e9b>]
> flush_smp_call_function_queue+0x7b/0x170
> [ 1328.404360] [<ffffffff810f1883>]
> generic_smp_call_function_single_interrupt+0x13/0x60
> [ 1328.413345] [<ffffffff81049217>]
> smp_call_function_single_interrupt+0x27/0x40
> [ 1328.421544] [<ffffffff816aa1ab>]
> call_function_single_interrupt+0x6b/0x70
> [ 1328.429358] <EOI> [<ffffffff81545454>] ?
> cpuidle_enter_state+0xf4/0x220 [ 1328.437120] [<ffffffff81545430>] ?
> cpuidle_enter_state+0xd0/0x220 [ 1328.444210] [<ffffffff815455b7>]
> cpuidle_enter+0x17/0x20 [ 1328.450491] [<ffffffff810b06eb>]
> call_cpuidle+0x3b/0x70 [ 1328.456680] [<ffffffff81545593>] ?
> cpuidle_select+0x13/0x20 [ 1328.463196] [<ffffffff810b0965>]
> cpu_startup_entry+0x245/0x310 [ 1328.469992] [<ffffffff81695b3b>]
> rest_init+0xbb/0xd0 [ 1328.475860] [<ffffffff81d4af8b>]
> start_kernel+0x460/0x46d [ 1328.482154] [<ffffffff81d4a120>] ?
> early_idt_handler_array+0x120/0x120
> [ 1328.489661] [<ffffffff81d4a4d7>] x86_64_start_reservations+0x2a/0x2c
> [ 1328.496974] [<ffffffff81d4a614>] x86_64_start_kernel+0x13b/0x14a
> [ 1328.503905] ---[ end trace a75b257dea18211b ]---

2015-07-06 16:45:27

by Vince Weaver

[permalink] [raw]
Subject: RE: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Mon, 6 Jul 2015, Liang, Kan wrote:

>
> > On Mon, 6 Jul 2015, Liang, Kan wrote:
> >
> > >
> > > > On Fri, Jul 03, 2015 at 08:08:27PM +0000, Liang, Kan wrote:
> > > > > If we cleared the last bit, we not only drain the buffer but also
> > > > > decrease the event->ctx->pmu, which is used to flush the PEBS
> > > > > buffer during context switches.
> > > > > We need to disable cpuc->pebs_enabled before changing
> > > > > event->ctx->pmu as below.
> > > > >
> > > >
> > > > Indeed, mind sending a proper patch so I can press 'A' on it?
> > >
> > > Sure, I will do that.
> > > But I didn't verify the patch, since I cannot reproduce the issue.
> > >
> > > Vince, would you mind testing the patch?
> > > If the issue is gone, I will send a proper patch then.
> >
> > I've got too many patches floating around so I forget what this one is trying
> > to fix. The pebs related lockup? Or the warning?
> >
>
> It's trying to fix the warning issue as below.
> For the test result, it looks the patch doesn't help, does it?
>
> > > WARN_ON_ONCE(!event->attr.precise_ip);
> > >
> > > [ 584.352324] WARNING: CPU: 2 PID: 18924 at
> > > arch/x86/kernel/cpu/perf_event_intel_ds.c:1198
> > > intel_pmu_drain_pebs_nhm+0x283/0x2e0()
>
>
> > Was this patch meant to be in addition to PeterZ's, or standalone?
> >
>
> Standalone.
>
> > Also please send proper patches in the future, this one was whitespace
> > damaged and a pain to get applied.
> >
> > With just this patch applied (without PeterZ's) I still managed to trigger the
> > following warning.
>
> Thanks for the test.

The machine also crashed a few minutes later.

[ 2972.105858] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 7, t=5482 jiffies, g=338012, c=338011, q=205)
[ 2972.118544] Task dump for CPU 3:
[ 2972.122706] perf_fuzzer R running task 0 9409 2404 0x0000000c
[ 2972.131021] 0000000000000092 ffffffff81030dfb 0000000300000005 0000000400000004
[ 2972.139762] ffff88011eacc4a0 0000000000000005 0000000000000092 0000000500000002
[ 2972.148507] ffffffff81030e6b ffff8801197b2800 ffff8801197b2800 ffff88011eacbd80
[ 2972.157225] Call Trace:
[ 2972.160547] [<ffffffff81030dfb>] ? intel_start_scheduling+0x4b/0x70
[ 2972.168098] [<ffffffff81030e6b>] ? intel_stop_scheduling+0x4b/0x70
[ 2972.175513] [<ffffffff816a853b>] ? _raw_spin_unlock+0x2b/0x40
[ 2972.182502] [<ffffffff81030e6b>] ? intel_stop_scheduling+0x4b/0x70
[ 2972.189920] [<ffffffff81029f32>] ? x86_schedule_events+0x1e2/0x260
[ 2972.197340] [<ffffffff810b78f6>] ? __lock_acquire.isra.31+0x3a6/0xf90
[ 2972.204998] [<ffffffff810b78f6>] ? __lock_acquire.isra.31+0x3a6/0xf90
[ 2972.212670] [<ffffffff811585b2>] ? perf_event_update_userpage+0x102/0x170
[ 2972.220680] [<ffffffff811585ca>] ? perf_event_update_userpage+0x11a/0x170
[ 2972.228699] [<ffffffff811584b0>] ? perf_event_task_disable+0xd0/0xd0
[ 2972.236281] [<ffffffff81031d7b>] ? intel_pmu_enable_event+0xfb/0x210
[ 2972.243882] [<ffffffff810303a4>] ? intel_pmu_pebs_enable_all+0x34/0x40
[ 2972.251652] [<ffffffff810309ed>] ? __intel_pmu_enable_all+0x8d/0xc0
[ 2972.259115] [<ffffffff81030a30>] ? intel_pmu_enable_all+0x10/0x20
[ 2972.266402] [<ffffffff8102a95c>] ? x86_pmu_enable+0x25c/0x2e0
[ 2972.273316] [<ffffffff81156202>] ? perf_pmu_enable+0x22/0x30
[ 2972.280152] [<ffffffff81157da1>] ? __perf_install_in_context+0x131/0x1d0
[ 2972.288087] [<ffffffff811533f2>] ? remote_function+0x42/0x50
[ 2972.294856] [<ffffffff810f1046>] ? generic_exec_single+0xb6/0x120
[ 2972.302156] [<ffffffff8115d13a>] ? SYSC_perf_event_open+0xb4a/0xd40
[ 2972.309584] [<ffffffff811533b0>] ? cpu_clock_event_start+0x40/0x40
[ 2972.316895] [<ffffffff810f1160>] ? smp_call_function_single+0xb0/0x110
[ 2972.324617] [<ffffffff81152484>] ? task_function_call+0x44/0x50
[ 2972.331698] [<ffffffff81157c70>] ? perf_mux_hrtimer_handler+0x1f0/0x1f0
[ 2972.339490] [<ffffffff81152843>] ? perf_install_in_context+0x83/0xf0
[ 2972.347014] [<ffffffff8115d171>] ? SYSC_perf_event_open+0xb81/0xd40
[ 2972.354436] [<ffffffff8115d7a9>] ? SyS_perf_event_open+0x9/0x10
[ 2972.361469] [<ffffffff816a8df2>] ? entry_SYSCALL_64_fastpath+0x16/0x7a
[ 2973.026502] ------------[ cut here ]------------
[ 2973.031822] WARNING: CPU: 3 PID: 9409 at kernel/watchdog.c:311 watchdog_overflow_callback+0x84/0xa0()
[ 2973.042038] Watchdog detected hard LOCKUP on cpu 3
[ 2973.124350] CPU: 3 PID: 9409 Comm: perf_fuzzer Tainted: G W 4.2.0-rc1+ #166
[ 2973.133447] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 2973.141844] ffffffff81a28ae2 ffff88011eac5af0 ffffffff816a10a3 0000000000000000
[ 2973.150325] ffff88011eac5b40 ffff88011eac5b30 ffffffff8106ec8a ffff88011eac5c40
[ 2973.158787] ffff880119133800 0000000000000001 ffff88011eac5c40 ffff88011eac5ef8
[ 2973.167248] Call Trace:
[ 2973.170317] <NMI> [<ffffffff816a10a3>] dump_stack+0x45/0x57
[ 2973.176960] [<ffffffff8106ec8a>] warn_slowpath_common+0x8a/0xc0
[ 2973.183818] [<ffffffff8106ed06>] warn_slowpath_fmt+0x46/0x50
[ 2973.190416] [<ffffffff8102f676>] ? intel_pmu_drain_pebs_nhm+0x176/0x2e0
[ 2973.198013] [<ffffffff8111b784>] watchdog_overflow_callback+0x84/0xa0
[ 2973.205422] [<ffffffff8115af7c>] __perf_event_overflow+0x8c/0x1c0
[ 2973.212460] [<ffffffff8115bae4>] perf_event_overflow+0x14/0x20
[ 2973.219226] [<ffffffff810321b4>] intel_pmu_handle_irq+0x1d4/0x440
[ 2973.226340] [<ffffffff81028e76>] perf_event_nmi_handler+0x26/0x40
[ 2973.233400] [<ffffffff810181ad>] nmi_handle+0x9d/0x140
[ 2973.239427] [<ffffffff81018115>] ? nmi_handle+0x5/0x140
[ 2973.245540] [<ffffffff810184b9>] default_do_nmi+0xc9/0x120
[ 2973.251932] [<ffffffff8101859d>] do_nmi+0x8d/0xc0
[ 2973.257507] [<ffffffff816ab01f>] end_repeat_nmi+0x1e/0x2e
[ 2973.263846] [<ffffffff81035d76>] ? intel_bts_enable_local+0x26/0x40
[ 2973.271087] [<ffffffff81035d76>] ? intel_bts_enable_local+0x26/0x40
[ 2973.278330] [<ffffffff81035d76>] ? intel_bts_enable_local+0x26/0x40
[ 2973.285556] <<EOE>> [<ffffffff810309ed>] ? __intel_pmu_enable_all+0x8d/0xc0
[ 2973.293646] [<ffffffff81030a30>] intel_pmu_enable_all+0x10/0x20
[ 2973.300538] [<ffffffff8102a95c>] x86_pmu_enable+0x25c/0x2e0
[ 2973.307048] [<ffffffff81156202>] perf_pmu_enable+0x22/0x30
[ 2973.313470] [<ffffffff81157da1>] __perf_install_in_context+0x131/0x1d0
[ 2973.320973] [<ffffffff811533f2>] remote_function+0x42/0x50
[ 2973.327406] [<ffffffff810f1046>] generic_exec_single+0xb6/0x120
[ 2973.334300] [<ffffffff8115d13a>] ? SYSC_perf_event_open+0xb4a/0xd40
[ 2973.341513] [<ffffffff811533b0>] ? cpu_clock_event_start+0x40/0x40
[ 2973.348676] [<ffffffff810f1160>] smp_call_function_single+0xb0/0x110
[ 2973.355976] [<ffffffff81152484>] task_function_call+0x44/0x50
[ 2973.362705] [<ffffffff81157c70>] ? perf_mux_hrtimer_handler+0x1f0/0x1f0
[ 2973.370291] [<ffffffff81152843>] perf_install_in_context+0x83/0xf0
[ 2973.377478] [<ffffffff8115d171>] SYSC_perf_event_open+0xb81/0xd40
[ 2973.384552] [<ffffffff8115d7a9>] SyS_perf_event_open+0x9/0x10
[ 2973.391242] [<ffffffff816a8df2>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 2973.398575] ---[ end trace a75b257dea18211c ]---

2015-07-15 06:42:53

by Stephane Eranian

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Fri, Jul 3, 2015 at 9:49 PM, Vince Weaver <[email protected]> wrote:
> On Fri, 3 Jul 2015, Peter Zijlstra wrote:
>
>> That said, its far too warm and I might just not be making sense.
>
> you need to come visit Maine! Although I am not sure the cooler weather
> necessarily improves my kernel debugging skills.
>
> I managed to lock the machine (again this is with the patch applied).
>
I can reproduce the problem on my HSW running the fuzzer.

I can see why this could be happening if you are mixing PEBS and non PEBS events
in the bottom 4 counters. I suspect:
for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
if ((counts[bit] == 0) && (error[bit] == 0))
continue;

This test is not correct when you have non-PEBS events mixed with PEBS
events and
they overflow at the same time. They will have counts[i] != 0 but
error[i] == 0, and thus
you fall thru the loop and hit the assert. Or it is something along those lines.


> [ 299.366027] ------------[ cut here ]------------
> [ 299.370985] WARNING: CPU: 2 PID: 8241 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
> [ 299.456929] CPU: 2 PID: 8241 Comm: perf_fuzzer Tainted: G W 4.1.0+ #164
> [ 299.465750] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> [ 299.474274] ffffffff81a105a0 ffff88011ea85b10 ffffffff8169f823 0000000000000000
> [ 299.482864] 0000000000000000 ffff88011ea85b50 ffffffff8106ec8a ffff88011ea85ba0
> [ 299.491488] 0000000000000000 0000000000000001 ffff88011ea8bd80 ffff8801190400c0
> [ 299.500029] Call Trace:
> [ 299.503190] <NMI> [<ffffffff8169f823>] dump_stack+0x45/0x57
> [ 299.509936] [<ffffffff8106ec8a>] warn_slowpath_common+0x8a/0xc0
> [ 299.516901] [<ffffffff8106ed7a>] warn_slowpath_null+0x1a/0x20
> [ 299.523715] [<ffffffff8102f783>] intel_pmu_drain_pebs_nhm+0x283/0x2e0
> [ 299.531268] [<ffffffff81032235>] intel_pmu_handle_irq+0x255/0x440
> [ 299.538487] [<ffffffff81028e76>] perf_event_nmi_handler+0x26/0x40
> [ 299.545638] [<ffffffff810181ad>] nmi_handle+0x9d/0x140
> [ 299.551772] [<ffffffff81018115>] ? nmi_handle+0x5/0x140
> [ 299.558013] [<ffffffff8101843a>] default_do_nmi+0x4a/0x120
> [ 299.564527] [<ffffffff8101859d>] do_nmi+0x8d/0xc0
> [ 299.570185] [<ffffffff816a979f>] end_repeat_nmi+0x1e/0x2e
> [ 299.576580] [<ffffffff811bc9d2>] ? check_poison_obj+0x92/0x230
> [ 299.583390] [<ffffffff811bc9d2>] ? check_poison_obj+0x92/0x230
> [ 299.590163] [<ffffffff811bc9d2>] ? check_poison_obj+0x92/0x230
> [ 299.596922] <<EOE>> [<ffffffff8115bea8>] ? perf_event_alloc+0x58/0x680
> [ 299.604594] [<ffffffff811bcf7d>] cache_alloc_debugcheck_after.isra.51+0x1cd/0x250
> [ 299.613140] [<ffffffff811c08b6>] kmem_cache_alloc_trace+0xa6/0x510
> [ 299.620330] [<ffffffff8115bea8>] ? perf_event_alloc+0x58/0x680
> [ 299.627088] [<ffffffff8106ee48>] ? get_online_cpus+0x58/0x70
> [ 299.633688] [<ffffffff8115bea8>] perf_event_alloc+0x58/0x680
> [ 299.640319] [<ffffffff8115c897>] SYSC_perf_event_open+0x3c7/0xd40
> [ 299.647353] [<ffffffff8105f86b>] ? __do_page_fault+0x1ab/0x3f0
> [ 299.654172] [<ffffffff8115d689>] SyS_perf_event_open+0x9/0x10
> [ 299.660871] [<ffffffff816a7572>] entry_SYSCALL_64_fastpath+0x16/0x7a
> [ 299.668236] ---[ end trace 3356c74581c13f1d ]---
> [ 299.673648] Uhhuh. NMI received for unknown reason 31 on CPU 2.
> [ 299.680427] Do you have a strange power saving mode enabled?
> [ 299.686963] Dazed and confused, but trying to continue
> [ 299.692904] Uhhuh. NMI received for unknown reason 31 on CPU 2.
> [ 299.699748] Do you have a strange power saving mode enabled?
> [ 299.706227] Dazed and confused, but trying to continue
> [ 299.712172] Uhhuh. NMI received for unknown reason 31 on CPU 2.
> [ 299.718946] Do you have a strange power saving mode enabled?
> [ 299.725446] Dazed and confused, but trying to continue
> [ 299.731419] Uhhuh. NMI received for unknown reason 31 on CPU 2.
> [ 299.738235] Do you have a strange power saving mode enabled?
> [ 299.744740] Dazed and confused, but trying to continue
> [ 299.750660] Uhhuh. NMI received for unknown reason 21 on CPU 2.
> [ 299.757398] Do you have a strange power saving mode enabled?
> [ 299.763862] Dazed and confused, but trying to continue
>
> (machine eventually locks up after lots of these messages)
>

2015-07-15 12:35:55

by Peter Zijlstra

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Wed, Jul 15, 2015 at 08:42:50AM +0200, Stephane Eranian wrote:
> On Fri, Jul 3, 2015 at 9:49 PM, Vince Weaver <[email protected]> wrote:
> > On Fri, 3 Jul 2015, Peter Zijlstra wrote:
> >
> >> That said, its far too warm and I might just not be making sense.
> >
> > you need to come visit Maine! Although I am not sure the cooler weather
> > necessarily improves my kernel debugging skills.
> >
> > I managed to lock the machine (again this is with the patch applied).
> >
> I can reproduce the problem on my HSW running the fuzzer.
>
> I can see why this could be happening if you are mixing PEBS and non PEBS events
> in the bottom 4 counters. I suspect:
> for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
> if ((counts[bit] == 0) && (error[bit] == 0))
> continue;
>
> This test is not correct when you have non-PEBS events mixed with PEBS
> events and
> they overflow at the same time. They will have counts[i] != 0 but
> error[i] == 0, and thus
> you fall thru the loop and hit the assert. Or it is something along those lines.
>

The only way I can make this work is if ->status only has !PEBS events
set, because if it has both set we'll take that slow path which masks
out the !PEBS bits.

After masking there are 3 options:

- there is one bit set, and its @bit, we increment counts[bit].
- there are multiple bits set, we increment error[] for each set bit,
we do not increment counts[].
- there are no bits set, we do nothing.

The intent was to never increment counts[] for !PEBS events.

Now if we start out with only a single !PEBS event set, we'll pass the
test and increment counts[] for a !PEBS and hit the warn.

The below patch modifies the code such that it can deal with that
particular issue. Can you try?

---
arch/x86/kernel/cpu/perf_event_intel_ds.c | 29 +++++++++++++----------------
1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 71fc40238843..68d0ced1d229 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -1142,6 +1142,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)

for (at = base; at < top; at += x86_pmu.pebs_record_size) {
struct pebs_record_nhm *p = at;
+ u64 pebs_status;

/* PEBS v3 has accurate status bits */
if (x86_pmu.intel_cap.pebs_format >= 3) {
@@ -1152,12 +1153,14 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
continue;
}

- bit = find_first_bit((unsigned long *)&p->status,
+ pebs_status = p->status & cpuc->pebs_enabled;
+ pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
+
+ bit = find_first_bit((unsigned long *)&pebs_status,
x86_pmu.max_pebs_events);
if (bit >= x86_pmu.max_pebs_events)
continue;
- if (!test_bit(bit, cpuc->active_mask))
- continue;
+
/*
* The PEBS hardware does not deal well with the situation
* when events happen near to each other and multiple bits
@@ -1172,27 +1175,21 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
* one, and it's not possible to reconstruct all events
* that caused the PEBS record. It's called collision.
* If collision happened, the record will be dropped.
- *
*/
- if (p->status != (1 << bit)) {
- u64 pebs_status;
-
- /* slow path */
- pebs_status = p->status & cpuc->pebs_enabled;
- pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
- if (pebs_status != (1 << bit)) {
- for_each_set_bit(i, (unsigned long *)&pebs_status,
- MAX_PEBS_EVENTS)
- error[i]++;
- continue;
- }
+ if (p->status != (1ULL << bit)) {
+ for_each_set_bit(i, (unsigned long *)&pebs_status,
+ x86_pmu.max_pebs_events)
+ error[i]++;
+ continue;
}
+
counts[bit]++;
}

for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
if ((counts[bit] == 0) && (error[bit] == 0))
continue;
+
event = cpuc->events[bit];
WARN_ON_ONCE(!event);
WARN_ON_ONCE(!event->attr.precise_ip);

2015-07-16 06:02:20

by Stephane Eranian

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Wed, Jul 15, 2015 at 2:35 PM, Peter Zijlstra <[email protected]> wrote:
> On Wed, Jul 15, 2015 at 08:42:50AM +0200, Stephane Eranian wrote:
>> On Fri, Jul 3, 2015 at 9:49 PM, Vince Weaver <[email protected]> wrote:
>> > On Fri, 3 Jul 2015, Peter Zijlstra wrote:
>> >
>> >> That said, its far too warm and I might just not be making sense.
>> >
>> > you need to come visit Maine! Although I am not sure the cooler weather
>> > necessarily improves my kernel debugging skills.
>> >
>> > I managed to lock the machine (again this is with the patch applied).
>> >
>> I can reproduce the problem on my HSW running the fuzzer.
>>
>> I can see why this could be happening if you are mixing PEBS and non PEBS events
>> in the bottom 4 counters. I suspect:
>> for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
>> if ((counts[bit] == 0) && (error[bit] == 0))
>> continue;
>>
>> This test is not correct when you have non-PEBS events mixed with PEBS
>> events and
>> they overflow at the same time. They will have counts[i] != 0 but
>> error[i] == 0, and thus
>> you fall thru the loop and hit the assert. Or it is something along those lines.
>>
>
> The only way I can make this work is if ->status only has !PEBS events
> set, because if it has both set we'll take that slow path which masks
> out the !PEBS bits.
>
> After masking there are 3 options:
>
> - there is one bit set, and its @bit, we increment counts[bit].
> - there are multiple bits set, we increment error[] for each set bit,
> we do not increment counts[].
> - there are no bits set, we do nothing.
>
> The intent was to never increment counts[] for !PEBS events.
>
> Now if we start out with only a single !PEBS event set, we'll pass the
> test and increment counts[] for a !PEBS and hit the warn.
>
> The below patch modifies the code such that it can deal with that
> particular issue. Can you try?
>
Been running it for a couple of hours, so far so good. I will let it
run all night.
Testing it on HSW and NHM, my SNB is broken at the moment.

Don't know if the fuzzer has already reproduced the test case.
Thanks.

> ---
> arch/x86/kernel/cpu/perf_event_intel_ds.c | 29 +++++++++++++----------------
> 1 file changed, 13 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> index 71fc40238843..68d0ced1d229 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -1142,6 +1142,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>
> for (at = base; at < top; at += x86_pmu.pebs_record_size) {
> struct pebs_record_nhm *p = at;
> + u64 pebs_status;
>
> /* PEBS v3 has accurate status bits */
> if (x86_pmu.intel_cap.pebs_format >= 3) {
> @@ -1152,12 +1153,14 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
> continue;
> }
>
> - bit = find_first_bit((unsigned long *)&p->status,
> + pebs_status = p->status & cpuc->pebs_enabled;
> + pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
> +
> + bit = find_first_bit((unsigned long *)&pebs_status,
> x86_pmu.max_pebs_events);
> if (bit >= x86_pmu.max_pebs_events)
> continue;
> - if (!test_bit(bit, cpuc->active_mask))
> - continue;
> +
> /*
> * The PEBS hardware does not deal well with the situation
> * when events happen near to each other and multiple bits
> @@ -1172,27 +1175,21 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
> * one, and it's not possible to reconstruct all events
> * that caused the PEBS record. It's called collision.
> * If collision happened, the record will be dropped.
> - *
> */
> - if (p->status != (1 << bit)) {
> - u64 pebs_status;
> -
> - /* slow path */
> - pebs_status = p->status & cpuc->pebs_enabled;
> - pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
> - if (pebs_status != (1 << bit)) {
> - for_each_set_bit(i, (unsigned long *)&pebs_status,
> - MAX_PEBS_EVENTS)
> - error[i]++;
> - continue;
> - }
> + if (p->status != (1ULL << bit)) {
> + for_each_set_bit(i, (unsigned long *)&pebs_status,
> + x86_pmu.max_pebs_events)
> + error[i]++;
> + continue;
> }
> +
> counts[bit]++;
> }
>
> for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
> if ((counts[bit] == 0) && (error[bit] == 0))
> continue;
> +
> event = cpuc->events[bit];
> WARN_ON_ONCE(!event);
> WARN_ON_ONCE(!event->attr.precise_ip);

2015-07-16 07:15:47

by Peter Zijlstra

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Thu, Jul 16, 2015 at 08:02:03AM +0200, Stephane Eranian wrote:
> Been running it for a couple of hours, so far so good. I will let it
> run all night.

Thanks!

> > ---
> > arch/x86/kernel/cpu/perf_event_intel_ds.c | 29 +++++++++++++----------------
> > 1 file changed, 13 insertions(+), 16 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> > index 71fc40238843..68d0ced1d229 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> > @@ -1142,6 +1142,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
> >
> > for (at = base; at < top; at += x86_pmu.pebs_record_size) {
> > struct pebs_record_nhm *p = at;
> > + u64 pebs_status;
> >
> > /* PEBS v3 has accurate status bits */
> > if (x86_pmu.intel_cap.pebs_format >= 3) {
> > @@ -1152,12 +1153,14 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
> > continue;
> > }
> >
> > - bit = find_first_bit((unsigned long *)&p->status,
> > + pebs_status = p->status & cpuc->pebs_enabled;
> > + pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
> > +
> > + bit = find_first_bit((unsigned long *)&pebs_status,
> > x86_pmu.max_pebs_events);
> > if (bit >= x86_pmu.max_pebs_events)
> > continue;

Maybe we should WARN in this case? A PEBS entry without any PEBS bits
set in the status field would be 'weird', right?

Maybe something like:

if (WARN(bit >= x86_pmu.max_pebs_events,
"PEBS record without PEBS event! status=%Lx pebs_enabled=%Lx active_mask=%Lx",
p->status, cpuc->pebs_enabled, cpuc->active_mask))
continue;

If that triggers we at least get more info.

2015-07-16 07:30:21

by Stephane Eranian

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

On Thu, Jul 16, 2015 at 12:15 AM, Peter Zijlstra <[email protected]> wrote:
> On Thu, Jul 16, 2015 at 08:02:03AM +0200, Stephane Eranian wrote:
>> Been running it for a couple of hours, so far so good. I will let it
>> run all night.
>
> Thanks!
>
Well, it died on NHM in the same function despite your patch. Need to
look at the exact warning.\
So more work is needed. But then I also saw the irq loop stuck message
before that.


>> > ---
>> > arch/x86/kernel/cpu/perf_event_intel_ds.c | 29 +++++++++++++----------------
>> > 1 file changed, 13 insertions(+), 16 deletions(-)
>> >
>> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> > index 71fc40238843..68d0ced1d229 100644
>> > --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> > +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> > @@ -1142,6 +1142,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>> >
>> > for (at = base; at < top; at += x86_pmu.pebs_record_size) {
>> > struct pebs_record_nhm *p = at;
>> > + u64 pebs_status;
>> >
>> > /* PEBS v3 has accurate status bits */
>> > if (x86_pmu.intel_cap.pebs_format >= 3) {
>> > @@ -1152,12 +1153,14 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>> > continue;
>> > }
>> >
>> > - bit = find_first_bit((unsigned long *)&p->status,
>> > + pebs_status = p->status & cpuc->pebs_enabled;
>> > + pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
>> > +
>> > + bit = find_first_bit((unsigned long *)&pebs_status,
>> > x86_pmu.max_pebs_events);
>> > if (bit >= x86_pmu.max_pebs_events)
>> > continue;
>
> Maybe we should WARN in this case? A PEBS entry without any PEBS bits
> set in the status field would be 'weird', right?
>
> Maybe something like:
>
> if (WARN(bit >= x86_pmu.max_pebs_events,
> "PEBS record without PEBS event! status=%Lx pebs_enabled=%Lx active_mask=%Lx",
> p->status, cpuc->pebs_enabled, cpuc->active_mask))
> continue;
>
> If that triggers we at least get more info.

2015-07-16 21:12:13

by Stephane Eranian

[permalink] [raw]
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

hi,


The assertion I am seeing on HSW now is:
[ 114.652263] WARNING: CPU: 6 PID: 3538 at
arch/x86/kernel/cpu/perf_event.c:1209 x86_pmu_start+0xaa/0x100()
[ 114.652264] Modules linked in: snd_hda_codec_hdmi i915 bnep rfcomm
bluetooth drm_kms_helper snd_hda_codec_realtek snd_hda_codec_generic
drm snd_hda_intel snd_hda_codec snd_hda_core intel_rapl iosf_mbi
snd_hwdep x86_pkg_temp_thermal snd_pcm intel_powerclamp snd_seq_midi
snd_seq_midi_event coretemp snd_rawmidi snd_seq kvm_intel kvm
snd_seq_device snd_timer snd crct10dif_pclmul crc32_pclmul soundcore
mei_me mei ghash_clmulni_intel aesni_intel mxm_wmi aes_x86_64 lrw
i2c_algo_bit gf128mul lpc_ich glue_helper shpchp serio_raw ablk_helper
cryptd tpm_infineon soc_button_array wmi video mac_hid
intel_smartconnect nls_iso8859_1 parport_pc ppdev lp parport uas
usb_storage psmouse r8169 ahci libahci mii
[ 114.652287] CPU: 6 PID: 3538 Comm: perf_fuzzer Not tainted 4.2.0-rc2+ #3
[ 114.652287] Hardware name: MSI MS-7816/Z87-G43 (MS-7816), BIOS V1.0
04/02/2013
[ 114.652288] ffffffff81a8e078 ffff880232e33da8 ffffffff8178881a
0000000000000007
[ 114.652289] 0000000000000000 ffff880232e33de8 ffffffff81073a4a
ffff88021f844900
[ 114.652291] ffff88023f38bbc0 ffff8800a85d8800 0000000000000000
ffff88023f31a8f8
[ 114.652292] Call Trace:
[ 114.652295] [<ffffffff8178881a>] dump_stack+0x45/0x57
[ 114.652299] [<ffffffff81073a4a>] warn_slowpath_common+0x8a/0xc0
[ 114.652300] [<ffffffff81073b3a>] warn_slowpath_null+0x1a/0x20
[ 114.652302] [<ffffffff8102aeea>] x86_pmu_start+0xaa/0x100
[ 114.652304] [<ffffffff81162e09>] perf_ioctl+0x3b9/0x400
[ 114.652306] [<ffffffff811f851a>] do_vfs_ioctl+0x2ba/0x490
[ 114.652307] [<ffffffff811f700f>] ? f_modown+0x4f/0xa0
[ 114.652308] [<ffffffff811f70e5>] ? f_setown+0x45/0x50
[ 114.652309] [<ffffffff811f8769>] SyS_ioctl+0x79/0x90
[ 114.652310] [<ffffffff8178f52e>] tracesys_phase2+0x88/0x8d

Which corresponds to:
static void x86_pmu_start(struct perf_event *event, int flags)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
int idx = event->hw.idx;

if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
return;
^^^^^ this one

The kernel is trying to start an event which was not stopped.

As for NHM, the irqstuck loop is there. After more instrumentation, it
seems to hit
only when sampling on fixed counter 3.


On Thu, Jul 16, 2015 at 9:30 AM, Stephane Eranian
<[email protected]> wrote:
> On Thu, Jul 16, 2015 at 12:15 AM, Peter Zijlstra <[email protected]> wrote:
>> On Thu, Jul 16, 2015 at 08:02:03AM +0200, Stephane Eranian wrote:
>>> Been running it for a couple of hours, so far so good. I will let it
>>> run all night.
>>
>> Thanks!
>>
> Well, it died on NHM in the same function despite your patch. Need to
> look at the exact warning.\
> So more work is needed. But then I also saw the irq loop stuck message
> before that.
>
>
>>> > ---
>>> > arch/x86/kernel/cpu/perf_event_intel_ds.c | 29 +++++++++++++----------------
>>> > 1 file changed, 13 insertions(+), 16 deletions(-)
>>> >
>>> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
>>> > index 71fc40238843..68d0ced1d229 100644
>>> > --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
>>> > +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
>>> > @@ -1142,6 +1142,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>>> >
>>> > for (at = base; at < top; at += x86_pmu.pebs_record_size) {
>>> > struct pebs_record_nhm *p = at;
>>> > + u64 pebs_status;
>>> >
>>> > /* PEBS v3 has accurate status bits */
>>> > if (x86_pmu.intel_cap.pebs_format >= 3) {
>>> > @@ -1152,12 +1153,14 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>>> > continue;
>>> > }
>>> >
>>> > - bit = find_first_bit((unsigned long *)&p->status,
>>> > + pebs_status = p->status & cpuc->pebs_enabled;
>>> > + pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
>>> > +
>>> > + bit = find_first_bit((unsigned long *)&pebs_status,
>>> > x86_pmu.max_pebs_events);
>>> > if (bit >= x86_pmu.max_pebs_events)
>>> > continue;
>>
>> Maybe we should WARN in this case? A PEBS entry without any PEBS bits
>> set in the status field would be 'weird', right?
>>
>> Maybe something like:
>>
>> if (WARN(bit >= x86_pmu.max_pebs_events,
>> "PEBS record without PEBS event! status=%Lx pebs_enabled=%Lx active_mask=%Lx",
>> p->status, cpuc->pebs_enabled, cpuc->active_mask))
>> continue;
>>
>> If that triggers we at least get more info.

Subject: [tip:perf/core] perf/x86/intel/pebs: Fix event disable PEBS buffer drain

Commit-ID: 2a853e1123cbbb43ff74cc47b4ec582ce34df262
Gitweb: http://git.kernel.org/tip/2a853e1123cbbb43ff74cc47b4ec582ce34df262
Author: Liang, Kan <[email protected]>
AuthorDate: Fri, 3 Jul 2015 20:08:27 +0000
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 4 Aug 2015 10:17:00 +0200

perf/x86/intel/pebs: Fix event disable PEBS buffer drain

When disabling a PEBS event, we need to drain the buffer. Doing so
requires a correct cpuc->pebs_active mask.

The current code clears the pebs_active bit before draining the
buffer. Fix that.

Signed-off-by: "Liang, Kan" <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vince Weaver<[email protected]>
Link: http://lkml.kernel.org/r/37D7C6CF3E00A74B8858931C1DB2F07701885A65@SHSMSX103.ccr.corp.intel.com
[ Fixed the SOB. ]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_ds.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 2f7ee05..8e564f8 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -789,6 +789,11 @@ void intel_pmu_pebs_disable(struct perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
struct debug_store *ds = cpuc->ds;
+ bool large_pebs = ds->pebs_interrupt_threshold >
+ ds->pebs_buffer_base + x86_pmu.pebs_record_size;
+
+ if (large_pebs)
+ intel_pmu_drain_pebs_buffer();

cpuc->pebs_enabled &= ~(1ULL << hwc->idx);

@@ -797,12 +802,8 @@ void intel_pmu_pebs_disable(struct perf_event *event)
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
cpuc->pebs_enabled &= ~(1ULL << 63);

- if (ds->pebs_interrupt_threshold >
- ds->pebs_buffer_base + x86_pmu.pebs_record_size) {
- intel_pmu_drain_pebs_buffer();
- if (!pebs_is_enabled(cpuc))
- perf_sched_cb_dec(event->ctx->pmu);
- }
+ if (large_pebs && !pebs_is_enabled(cpuc))
+ perf_sched_cb_dec(event->ctx->pmu);

if (cpuc->enabled)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);