Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754113AbbGPHaV (ORCPT ); Thu, 16 Jul 2015 03:30:21 -0400
Received: from mail-ob0-f178.google.com ([209.85.214.178]:33766
        "EHLO mail-ob0-f178.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1752068AbbGPHaU (ORCPT );
        Thu, 16 Jul 2015 03:30:20 -0400
MIME-Version: 1.0
Reply-To: eranian@gmail.com
In-Reply-To: <20150716071536.GY19282@twins.programming.kicks-ass.net>
References: <20150703131336.GI19282@twins.programming.kicks-ass.net>
        <20150703190420.GS3644@twins.programming.kicks-ass.net>
        <20150715123546.GK2859@worktop.programming.kicks-ass.net>
        <20150716071536.GY19282@twins.programming.kicks-ass.net>
Date: Thu, 16 Jul 2015 00:30:18 -0700
Message-ID:
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()
From: Stephane Eranian
To: Peter Zijlstra
Cc: Vince Weaver, LKML, Ingo Molnar, Arnaldo Carvalho de Melo,
        kan.liang@intel.com
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2568
Lines: 58

On Thu, Jul 16, 2015 at 12:15 AM, Peter Zijlstra wrote:
> On Thu, Jul 16, 2015 at 08:02:03AM +0200, Stephane Eranian wrote:
>> Been running it for a couple of hours, so far so good. I will let it
>> run all night.
>
> Thanks!
>
Well, it died on NHM in the same function despite your patch. Need to
look at the exact warning. So more work is needed. But then I also saw
the irq loop stuck message before that.
>> > ---
>> >  arch/x86/kernel/cpu/perf_event_intel_ds.c | 29 +++++++++++++----------------
>> >  1 file changed, 13 insertions(+), 16 deletions(-)
>> >
>> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> > index 71fc40238843..68d0ced1d229 100644
>> > --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> > +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> > @@ -1142,6 +1142,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>> >
>> >         for (at = base; at < top; at += x86_pmu.pebs_record_size) {
>> >                 struct pebs_record_nhm *p = at;
>> > +               u64 pebs_status;
>> >
>> >                 /* PEBS v3 has accurate status bits */
>> >                 if (x86_pmu.intel_cap.pebs_format >= 3) {
>> > @@ -1152,12 +1153,14 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>> >                         continue;
>> >                 }
>> >
>> > -               bit = find_first_bit((unsigned long *)&p->status,
>> > +               pebs_status = p->status & cpuc->pebs_enabled;
>> > +               pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
>> > +
>> > +               bit = find_first_bit((unsigned long *)&pebs_status,
>> >                                         x86_pmu.max_pebs_events);
>> >                 if (bit >= x86_pmu.max_pebs_events)
>> >                         continue;
>
> Maybe we should WARN in this case? A PEBS entry without any PEBS bits
> set in the status field would be 'weird', right?
>
> Maybe something like:
>
>         if (WARN(bit >= x86_pmu.max_pebs_events,
>                  "PEBS record without PEBS event! status=%Lx pebs_enabled=%Lx active_mask=%Lx",
>                  p->status, cpuc->pebs_enabled, cpuc->active_mask))
>                 continue;
>
> If that triggers we at least get more info.
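
For anyone following along, the filtering done by the quoted hunk can be
modeled in user space. This is only a sketch under stated assumptions:
MAX_PEBS_EVENTS stands in for x86_pmu.max_pebs_events (the value 4 is
illustrative), first_bit() is a simplified single-word stand-in for the
kernel's find_first_bit(), and pebs_record_counter() is a hypothetical
helper name, not something in the kernel tree. The fprintf() only
approximates what the proposed WARN() would print.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: stand-in for x86_pmu.max_pebs_events; 4 is illustrative only. */
#define MAX_PEBS_EVENTS 4u

/*
 * User-space stand-in for the kernel's find_first_bit() on a single word:
 * index of the lowest set bit below 'size', or 'size' if there is none.
 */
static unsigned int first_bit(uint64_t word, unsigned int size)
{
        unsigned int i;

        for (i = 0; i < size; i++)
                if (word & (1ULL << i))
                        return i;
        return size;
}

/*
 * Hypothetical helper: pick the counter a PEBS record belongs to.
 * Returns the counter index, or -1 when the record should be skipped.
 */
static int pebs_record_counter(uint64_t status, uint64_t pebs_enabled)
{
        uint64_t pebs_status;
        unsigned int bit;

        /* Keep only bits for counters that have PEBS enabled ... */
        pebs_status = status & pebs_enabled;
        /* ... and that fall inside the PEBS-capable counter range. */
        pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;

        bit = first_bit(pebs_status, MAX_PEBS_EVENTS);
        if (bit >= MAX_PEBS_EVENTS) {
                /* Roughly the information the proposed WARN() would report. */
                fprintf(stderr,
                        "PEBS record without PEBS event! status=%" PRIx64
                        " pebs_enabled=%" PRIx64 "\n",
                        status, pebs_enabled);
                return -1;
        }
        return (int)bit;
}
```

With status=0x5 and pebs_enabled=0x4 the helper returns 2: bit 0 is set
in the record but not PEBS-enabled, so it is ignored. A record whose
status has only bits outside the low MAX_PEBS_EVENTS range returns -1,
which is the case the WARN is meant to flag.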