Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754549AbbGFQp1 (ORCPT ); Mon, 6 Jul 2015 12:45:27 -0400
Received: from mail-ig0-f178.google.com ([209.85.213.178]:38470 "EHLO
        mail-ig0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751045AbbGFQpW (ORCPT ); Mon, 6 Jul 2015 12:45:22 -0400
From: Vince Weaver
X-Google-Original-From: Vince Weaver
Date: Mon, 6 Jul 2015 12:51:37 -0400 (EDT)
To: "Liang, Kan"
cc: Vince Weaver, Peter Zijlstra, "linux-kernel@vger.kernel.org",
        Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian
Subject: RE: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F0770188655F@SHSMSX103.ccr.corp.intel.com>
Message-ID:
References: <20150703131336.GI19282@twins.programming.kicks-ass.net>
        <37D7C6CF3E00A74B8858931C1DB2F07701885A65@SHSMSX103.ccr.corp.intel.com>
        <20150706105517.GZ3644@twins.programming.kicks-ass.net>
        <37D7C6CF3E00A74B8858931C1DB2F07701886203@SHSMSX103.ccr.corp.intel.com>
        <37D7C6CF3E00A74B8858931C1DB2F0770188655F@SHSMSX103.ccr.corp.intel.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 7387
Lines: 135

On Mon, 6 Jul 2015, Liang, Kan wrote:

>
> > On Mon, 6 Jul 2015, Liang, Kan wrote:
> >
> > >
> > > > On Fri, Jul 03, 2015 at 08:08:27PM +0000, Liang, Kan wrote:
> > > > > If we cleared the last bit, we not only drain the buffer but also
> > > > > decrease the event->ctx->pmu, which is used to flush the PEBS
> > > > > buffer during context switches.
> > > > > We need to disable cpuc->pebs_enabled before changing
> > > > > event->ctx->pmu as below.
> > > > >
> > > >
> > > > Indeed, mind sending a proper patch so I can press 'A' on it?
> > >
> > > Sure, I will do that.
> > > But I didn't verify the patch, since I cannot reproduce the issue.
> > >
> > > Vince, would you mind testing the patch?
> > > If the issue is gone, I will send a proper patch then.
> >
> > I've got too many patches floating around so I forget what this one is trying
> > to fix. The pebs related lockup? Or the warning?
> >
>
> It's trying to fix the warning issue as below.
> For the test result, it looks the patch doesn't help, does it?
>
> > > WARN_ON_ONCE(!event->attr.precise_ip);
> > >
> > > [ 584.352324] WARNING: CPU: 2 PID: 18924 at
> > > arch/x86/kernel/cpu/perf_event_intel_ds.c:1198
> > > intel_pmu_drain_pebs_nhm+0x283/0x2e0()
> >
> > Was this patch meant to be in addition to PeterZ's, or standalone?
> >
>
> Standalone.
>
> > Also please send proper patches in the future, this one was whitespace
> > damaged and a pain to get applied.
> >
> > With just this patch applied (without PeterZ's) I still managed to trigger the
> > following warning.
>
> Thanks for the test.

The machine also crashed a few minutes later.

[ 2972.105858] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 7, t=5482 jiffies, g=338012, c=338011, q=205)
[ 2972.118544] Task dump for CPU 3:
[ 2972.122706] perf_fuzzer R running task 0 9409 2404 0x0000000c
[ 2972.131021] 0000000000000092 ffffffff81030dfb 0000000300000005 0000000400000004
[ 2972.139762] ffff88011eacc4a0 0000000000000005 0000000000000092 0000000500000002
[ 2972.148507] ffffffff81030e6b ffff8801197b2800 ffff8801197b2800 ffff88011eacbd80
[ 2972.157225] Call Trace:
[ 2972.160547] [] ? intel_start_scheduling+0x4b/0x70
[ 2972.168098] [] ? intel_stop_scheduling+0x4b/0x70
[ 2972.175513] [] ? _raw_spin_unlock+0x2b/0x40
[ 2972.182502] [] ? intel_stop_scheduling+0x4b/0x70
[ 2972.189920] [] ? x86_schedule_events+0x1e2/0x260
[ 2972.197340] [] ? __lock_acquire.isra.31+0x3a6/0xf90
[ 2972.204998] [] ? __lock_acquire.isra.31+0x3a6/0xf90
[ 2972.212670] [] ? perf_event_update_userpage+0x102/0x170
[ 2972.220680] [] ? perf_event_update_userpage+0x11a/0x170
[ 2972.228699] [] ? perf_event_task_disable+0xd0/0xd0
[ 2972.236281] [] ? intel_pmu_enable_event+0xfb/0x210
[ 2972.243882] [] ? intel_pmu_pebs_enable_all+0x34/0x40
[ 2972.251652] [] ? __intel_pmu_enable_all+0x8d/0xc0
[ 2972.259115] [] ? intel_pmu_enable_all+0x10/0x20
[ 2972.266402] [] ? x86_pmu_enable+0x25c/0x2e0
[ 2972.273316] [] ? perf_pmu_enable+0x22/0x30
[ 2972.280152] [] ? __perf_install_in_context+0x131/0x1d0
[ 2972.288087] [] ? remote_function+0x42/0x50
[ 2972.294856] [] ? generic_exec_single+0xb6/0x120
[ 2972.302156] [] ? SYSC_perf_event_open+0xb4a/0xd40
[ 2972.309584] [] ? cpu_clock_event_start+0x40/0x40
[ 2972.316895] [] ? smp_call_function_single+0xb0/0x110
[ 2972.324617] [] ? task_function_call+0x44/0x50
[ 2972.331698] [] ? perf_mux_hrtimer_handler+0x1f0/0x1f0
[ 2972.339490] [] ? perf_install_in_context+0x83/0xf0
[ 2972.347014] [] ? SYSC_perf_event_open+0xb81/0xd40
[ 2972.354436] [] ? SyS_perf_event_open+0x9/0x10
[ 2972.361469] [] ? entry_SYSCALL_64_fastpath+0x16/0x7a
[ 2973.026502] ------------[ cut here ]------------
[ 2973.031822] WARNING: CPU: 3 PID: 9409 at kernel/watchdog.c:311 watchdog_overflow_callback+0x84/0xa0()
[ 2973.042038] Watchdog detected hard LOCKUP on cpu 3
[ 2973.124350] CPU: 3 PID: 9409 Comm: perf_fuzzer Tainted: G W 4.2.0-rc1+ #166
[ 2973.133447] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 2973.141844] ffffffff81a28ae2 ffff88011eac5af0 ffffffff816a10a3 0000000000000000
[ 2973.150325] ffff88011eac5b40 ffff88011eac5b30 ffffffff8106ec8a ffff88011eac5c40
[ 2973.158787] ffff880119133800 0000000000000001 ffff88011eac5c40 ffff88011eac5ef8
[ 2973.167248] Call Trace:
[ 2973.170317] [] dump_stack+0x45/0x57
[ 2973.176960] [] warn_slowpath_common+0x8a/0xc0
[ 2973.183818] [] warn_slowpath_fmt+0x46/0x50
[ 2973.190416] [] ? intel_pmu_drain_pebs_nhm+0x176/0x2e0
[ 2973.198013] [] watchdog_overflow_callback+0x84/0xa0
[ 2973.205422] [] __perf_event_overflow+0x8c/0x1c0
[ 2973.212460] [] perf_event_overflow+0x14/0x20
[ 2973.219226] [] intel_pmu_handle_irq+0x1d4/0x440
[ 2973.226340] [] perf_event_nmi_handler+0x26/0x40
[ 2973.233400] [] nmi_handle+0x9d/0x140
[ 2973.239427] [] ? nmi_handle+0x5/0x140
[ 2973.245540] [] default_do_nmi+0xc9/0x120
[ 2973.251932] [] do_nmi+0x8d/0xc0
[ 2973.257507] [] end_repeat_nmi+0x1e/0x2e
[ 2973.263846] [] ? intel_bts_enable_local+0x26/0x40
[ 2973.271087] [] ? intel_bts_enable_local+0x26/0x40
[ 2973.278330] [] ? intel_bts_enable_local+0x26/0x40
[ 2973.285556] <> [] ? __intel_pmu_enable_all+0x8d/0xc0
[ 2973.293646] [] intel_pmu_enable_all+0x10/0x20
[ 2973.300538] [] x86_pmu_enable+0x25c/0x2e0
[ 2973.307048] [] perf_pmu_enable+0x22/0x30
[ 2973.313470] [] __perf_install_in_context+0x131/0x1d0
[ 2973.320973] [] remote_function+0x42/0x50
[ 2973.327406] [] generic_exec_single+0xb6/0x120
[ 2973.334300] [] ? SYSC_perf_event_open+0xb4a/0xd40
[ 2973.341513] [] ? cpu_clock_event_start+0x40/0x40
[ 2973.348676] [] smp_call_function_single+0xb0/0x110
[ 2973.355976] [] task_function_call+0x44/0x50
[ 2973.362705] [] ? perf_mux_hrtimer_handler+0x1f0/0x1f0
[ 2973.370291] [] perf_install_in_context+0x83/0xf0
[ 2973.377478] [] SYSC_perf_event_open+0xb81/0xd40
[ 2973.384552] [] SyS_perf_event_open+0x9/0x10
[ 2973.391242] [] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 2973.398575] ---[ end trace a75b257dea18211c ]---

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/