Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754938AbbGFQQJ (ORCPT ); Mon, 6 Jul 2015 12:16:09 -0400
Received: from mail-ig0-f173.google.com ([209.85.213.173]:33715 "EHLO
	mail-ig0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751439AbbGFQQG (ORCPT ); Mon, 6 Jul 2015 12:16:06 -0400
From: Vince Weaver
X-Google-Original-From: Vince Weaver
Date: Mon, 6 Jul 2015 12:22:23 -0400 (EDT)
To: "Liang, Kan"
cc: Peter Zijlstra, "Vince Weaver (vincent.weaver@maine.edu)",
	"linux-kernel@vger.kernel.org", Ingo Molnar,
	Arnaldo Carvalho de Melo, Stephane Eranian
Subject: RE: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F07701886203@SHSMSX103.ccr.corp.intel.com>
Message-ID:
References: <20150703131336.GI19282@twins.programming.kicks-ass.net>
	<37D7C6CF3E00A74B8858931C1DB2F07701885A65@SHSMSX103.ccr.corp.intel.com>
	<20150706105517.GZ3644@twins.programming.kicks-ass.net>
	<37D7C6CF3E00A74B8858931C1DB2F07701886203@SHSMSX103.ccr.corp.intel.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4531
Lines: 82

On Mon, 6 Jul 2015, Liang, Kan wrote:

>
> > On Fri, Jul 03, 2015 at 08:08:27PM +0000, Liang, Kan wrote:
> > > If we cleared the last bit, we not only drain the buffer but also
> > > decrease the event->ctx->pmu, which is used to flush the PEBS buffer
> > > during context switches.
> > > We need to disable cpuc->pebs_enabled before changing
> > > event->ctx->pmu as below.
> > >
> >
> > Indeed, mind sending a proper patch so I can press 'A' on it?
>
> Sure, I will do that.
> But I didn't verify the patch, since I cannot reproduce the issue.
>
> Vince, would you mind testing the patch?
> If the issue is gone, I will send a proper patch then.

I've got too many patches floating around so I forget what this one is
trying to fix. The pebs related lockup? Or the warning?

Was this patch meant to be in addition to PeterZ's, or standalone?

Also please send proper patches in the future, this one was whitespace
damaged and a pain to get applied.

With just this patch applied (without PeterZ's) I still managed to
trigger the following warning.

[ 1328.103920] ------------[ cut here ]------------
[ 1328.109367] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1199 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
[ 1328.199193] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.2.0-rc1+ #166
[ 1328.207955] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 1328.216461] ffffffff81a10f88 ffff88011ea05b10 ffffffff816a10a3 0000000000000000
[ 1328.225043] 0000000000000000 ffff88011ea05b50 ffffffff8106ec8a ffff88011ea05ba0
[ 1328.233654] 0000000000000000 0000000000000001 ffff88011ea0bd80 ffff8801190000c0
[ 1328.242260] Call Trace:
[ 1328.245452] [] dump_stack+0x45/0x57
[ 1328.252243] [] warn_slowpath_common+0x8a/0xc0
[ 1328.259212] [] warn_slowpath_null+0x1a/0x20
[ 1328.266000] [] intel_pmu_drain_pebs_nhm+0x283/0x2e0
[ 1328.273551] [] intel_pmu_handle_irq+0x255/0x440
[ 1328.280742] [] ? __perf_event_enable+0x1f/0x2a0
[ 1328.287887] [] ? irq_work_queue+0x5d/0x80
[ 1328.294500] [] perf_event_nmi_handler+0x26/0x40
[ 1328.301659] [] nmi_handle+0x9d/0x140
[ 1328.307783] [] ? nmi_handle+0x5/0x140
[ 1328.314010] [] default_do_nmi+0x4a/0x120
[ 1328.320495] [] do_nmi+0x8d/0xc0
[ 1328.326053] [] end_repeat_nmi+0x1e/0x2e
[ 1328.332405] [] ? __intel_pmu_enable_all+0x5a/0xc0
[ 1328.339627] [] ? __intel_pmu_enable_all+0x5a/0xc0
[ 1328.346894] [] ? __intel_pmu_enable_all+0x5a/0xc0
[ 1328.354130] <> [] intel_pmu_enable_all+0x10/0x20
[ 1328.362556] [] x86_pmu_enable+0x25c/0x2e0
[ 1328.369079] [] perf_pmu_enable+0x22/0x30
[ 1328.375447] [] __perf_install_in_context+0x131/0x1d0
[ 1328.382936] [] ? cpu_clock_event_start+0x40/0x40
[ 1328.390106] [] remote_function+0x42/0x50
[ 1328.396525] [] flush_smp_call_function_queue+0x7b/0x170
[ 1328.404360] [] generic_smp_call_function_single_interrupt+0x13/0x60
[ 1328.413345] [] smp_call_function_single_interrupt+0x27/0x40
[ 1328.421544] [] call_function_single_interrupt+0x6b/0x70
[ 1328.429358] [] ? cpuidle_enter_state+0xf4/0x220
[ 1328.437120] [] ? cpuidle_enter_state+0xd0/0x220
[ 1328.444210] [] cpuidle_enter+0x17/0x20
[ 1328.450491] [] call_cpuidle+0x3b/0x70
[ 1328.456680] [] ? cpuidle_select+0x13/0x20
[ 1328.463196] [] cpu_startup_entry+0x245/0x310
[ 1328.469992] [] rest_init+0xbb/0xd0
[ 1328.475860] [] start_kernel+0x460/0x46d
[ 1328.482154] [] ? early_idt_handler_array+0x120/0x120
[ 1328.489661] [] x86_64_start_reservations+0x2a/0x2c
[ 1328.496974] [] x86_64_start_kernel+0x13b/0x14a
[ 1328.503905] ---[ end trace a75b257dea18211b ]---

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/