From: "Liang, Kan" <kan.liang@intel.com>
To: Vince Weaver <vincent.weaver@maine.edu>
CC: Peter Zijlstra <peterz@infradead.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Ingo Molnar <mingo@redhat.com>,
        "Arnaldo Carvalho de Melo" <acme@kernel.org>,
        Stephane Eranian <eranian@gmail.com>
Subject: RE: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()
Thread-Topic: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()
Thread-Index: AQHQtZIrWo8WhOngikicIZVMqW8Mb53KKJlQgAOao4CAAKBqAP//uvqAgACFYRA=
Date: Mon, 6 Jul 2015 16:23:56 +0000
Message-ID: <37D7C6CF3E00A74B8858931C1DB2F0770188655F@SHSMSX103.ccr.corp.intel.com>
References: <alpine.DEB.2.20.1507021111380.14637@vincent-weaver-1.umelst.maine.edu>
 <20150703131336.GI19282@twins.programming.kicks-ass.net>
 <37D7C6CF3E00A74B8858931C1DB2F07701885A65@SHSMSX103.ccr.corp.intel.com>
 <20150706105517.GZ3644@twins.programming.kicks-ass.net>
 <37D7C6CF3E00A74B8858931C1DB2F07701886203@SHSMSX103.ccr.corp.intel.com>
 <alpine.DEB.2.20.1507061220110.19467@vincent-weaver-1.umelst.maine.edu>
In-Reply-To: <alpine.DEB.2.20.1507061220110.19467@vincent-weaver-1.umelst.maine.edu>
Accept-Language: zh-CN, en-US
Content-Language: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 5031
Lines: 109


> On Mon, 6 Jul 2015, Liang, Kan wrote:
> 
> >
> > > On Fri, Jul 03, 2015 at 08:08:27PM +0000, Liang, Kan wrote:
> > > > If we cleared the last bit, we not only drain the buffer but also
> > > > decrease the event->ctx->pmu, which is used to flush the PEBS
> > > > buffer during context switches.
> > > > We need to disable cpuc->pebs_enabled before changing
> > > > event->ctx->pmu as below.
> > > >
> > >
> > > Indeed, mind sending a proper patch so I can press 'A' on it?
> >
> > Sure, I will do that.
> > But I didn't verify the patch, since I cannot reproduce the issue.
> >
> > Vince, would you mind testing the patch?
> > If the issue is gone, I will send a proper patch then.
> 
> I've got too many patches floating around so I forget what this one is trying
> to fix.  The pebs related lockup?  Or the warning?
>

It's trying to fix the warning below.
From your test result, it looks like the patch doesn't help, does it?

> > 	WARN_ON_ONCE(!event->attr.precise_ip);
> >
> > [  584.352324] WARNING: CPU: 2 PID: 18924 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()

 
> Was this patch meant to be in addition to PeterZ's, or standalone?
>

Standalone.
 
> Also please send proper patches in the future, this one was whitespace
> damaged and a pain to get applied.
> 
> With just this patch applied (without PeterZ's) I still managed to trigger the
> following warning.

Thanks for the test.

> 
> [ 1328.103920] ------------[ cut here ]------------
> [ 1328.109367] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1199 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
> [ 1328.199193] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.2.0-rc1+ #166
> [ 1328.207955] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> [ 1328.216461]  ffffffff81a10f88 ffff88011ea05b10 ffffffff816a10a3 0000000000000000
> [ 1328.225043]  0000000000000000 ffff88011ea05b50 ffffffff8106ec8a ffff88011ea05ba0
> [ 1328.233654]  0000000000000000 0000000000000001 ffff88011ea0bd80 ffff8801190000c0
> [ 1328.242260] Call Trace:
> [ 1328.245452]  <NMI>  [<ffffffff816a10a3>] dump_stack+0x45/0x57
> [ 1328.252243]  [<ffffffff8106ec8a>] warn_slowpath_common+0x8a/0xc0
> [ 1328.259212]  [<ffffffff8106ed7a>] warn_slowpath_null+0x1a/0x20
> [ 1328.266000]  [<ffffffff8102f783>] intel_pmu_drain_pebs_nhm+0x283/0x2e0
> [ 1328.273551]  [<ffffffff81032235>] intel_pmu_handle_irq+0x255/0x440
> [ 1328.280742]  [<ffffffff811574bf>] ? __perf_event_enable+0x1f/0x2a0
> [ 1328.287887]  [<ffffffff8114b19d>] ? irq_work_queue+0x5d/0x80
> [ 1328.294500]  [<ffffffff81028e76>] perf_event_nmi_handler+0x26/0x40
> [ 1328.301659]  [<ffffffff810181ad>] nmi_handle+0x9d/0x140
> [ 1328.307783]  [<ffffffff81018115>] ? nmi_handle+0x5/0x140
> [ 1328.314010]  [<ffffffff8101843a>] default_do_nmi+0x4a/0x120
> [ 1328.320495]  [<ffffffff8101859d>] do_nmi+0x8d/0xc0
> [ 1328.326053]  [<ffffffff816ab01f>] end_repeat_nmi+0x1e/0x2e
> [ 1328.332405]  [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
> [ 1328.339627]  [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
> [ 1328.346894]  [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
> [ 1328.354130]  <<EOE>>  <IRQ>  [<ffffffff81030a30>] intel_pmu_enable_all+0x10/0x20
> [ 1328.362556]  [<ffffffff8102a95c>] x86_pmu_enable+0x25c/0x2e0
> [ 1328.369079]  [<ffffffff81156202>] perf_pmu_enable+0x22/0x30
> [ 1328.375447]  [<ffffffff81157da1>] __perf_install_in_context+0x131/0x1d0
> [ 1328.382936]  [<ffffffff811533b0>] ? cpu_clock_event_start+0x40/0x40
> [ 1328.390106]  [<ffffffff811533f2>] remote_function+0x42/0x50
> [ 1328.396525]  [<ffffffff810f0e9b>] flush_smp_call_function_queue+0x7b/0x170
> [ 1328.404360]  [<ffffffff810f1883>] generic_smp_call_function_single_interrupt+0x13/0x60
> [ 1328.413345]  [<ffffffff81049217>] smp_call_function_single_interrupt+0x27/0x40
> [ 1328.421544]  [<ffffffff816aa1ab>] call_function_single_interrupt+0x6b/0x70
> [ 1328.429358]  <EOI>  [<ffffffff81545454>] ? cpuidle_enter_state+0xf4/0x220
> [ 1328.437120]  [<ffffffff81545430>] ? cpuidle_enter_state+0xd0/0x220
> [ 1328.444210]  [<ffffffff815455b7>] cpuidle_enter+0x17/0x20
> [ 1328.450491]  [<ffffffff810b06eb>] call_cpuidle+0x3b/0x70
> [ 1328.456680]  [<ffffffff81545593>] ? cpuidle_select+0x13/0x20
> [ 1328.463196]  [<ffffffff810b0965>] cpu_startup_entry+0x245/0x310
> [ 1328.469992]  [<ffffffff81695b3b>] rest_init+0xbb/0xd0
> [ 1328.475860]  [<ffffffff81d4af8b>] start_kernel+0x460/0x46d
> [ 1328.482154]  [<ffffffff81d4a120>] ? early_idt_handler_array+0x120/0x120
> [ 1328.489661]  [<ffffffff81d4a4d7>] x86_64_start_reservations+0x2a/0x2c
> [ 1328.496974]  [<ffffffff81d4a614>] x86_64_start_kernel+0x13b/0x14a
> [ 1328.503905] ---[ end trace a75b257dea18211b ]---
