Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758583AbcLORAA (ORCPT ); Thu, 15 Dec 2016 12:00:00 -0500 Received: from mail-yw0-f176.google.com ([209.85.161.176]:33675 "EHLO mail-yw0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758347AbcLOQ76 (ORCPT ); Thu, 15 Dec 2016 11:59:58 -0500 MIME-Version: 1.0 In-Reply-To: <20161215084213.GV3124@twins.programming.kicks-ass.net> References: <20160307202556.GQ6344@twins.programming.kicks-ass.net> <20160308210707.GG6344@twins.programming.kicks-ass.net> <20160310104236.GV6344@twins.programming.kicks-ass.net> <20161214175552.GW3207@twins.programming.kicks-ass.net> <20161215084213.GV3124@twins.programming.kicks-ass.net> From: Stephane Eranian Date: Thu, 15 Dec 2016 08:59:56 -0800 Message-ID: Subject: Re: [PATCH 2/3] perf/x86/pebs: add workaround for broken OVFL status on HSW To: Peter Zijlstra Cc: Jiri Olsa , Andi Kleen , LKML , Arnaldo Carvalho de Melo , "mingo@elte.hu" , "Liang, Kan" , Namhyung Kim , Adrian Hunter Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3889 Lines: 87 On Thu, Dec 15, 2016 at 12:42 AM, Peter Zijlstra wrote: > On Wed, Dec 14, 2016 at 11:26:49PM -0800, Stephane Eranian wrote: >> On Wed, Dec 14, 2016 at 9:55 AM, Peter Zijlstra wrote: >> > >> > Just spotted this again, ping? >> > >> Ok, on what processor running what command, so I can try and reproduce? > > For me its more of a correctness issue, i've not actually spotted a > problem as such. > > But every time I read this code it makes me wonder. > > Supposing that the hardware sets the CTRL overflow flags but hasn't > generated the PEBS record yet (or not enough records to reach the PEBS > buffer threshold) we still don't want to process these events as if they > were !PEBS. > I am suspicious about the case where you have multiple PEBS events and they do not quite fire at the same time but close enough that you may have PEBS in-flight by the time you enter handle_irq. Last night I ran a simple test on SKL using tip.git: $ perf record --e cpu/event=0xd0,umask=0x81/upp,cpu/event=0xc0,umask=1/upp,cpu/event=0xd0,umask=0x81/upp multichase; perf report -D | fgrep SAMPLE | grep -v 'IP, 0x4' | grep -v events Basically, looking for samples missing the EXACT tag, i.e., samples processed a regular event when I only have PEBS events. Over 8h, I got about 3 or 4 such samples. So there is still a condition where we see the overflow as regular and not PEBS. So we need to examine that code again looking for possible race with PEBS in flight and not having the PEBS overflow bits yet. > That is, we _never_ want to look at pebs_enabled, hence unconditional > masking of those bits. > > Hardware _should_ not set them in the first place, but clearly it does > sometimes. > >> >> How about we make the clear of pebs_enabled unconditional? >> >> >> >> --- >> >> arch/x86/events/intel/core.c | 20 ++++++++++---------- >> >> 1 file changed, 10 insertions(+), 10 deletions(-) >> >> >> >> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c >> >> index 68fa55b4d42e..dc9579665425 100644 >> >> --- a/arch/x86/events/intel/core.c >> >> +++ b/arch/x86/events/intel/core.c >> >> @@ -1883,6 +1883,16 @@ static int intel_pmu_handle_irq(struct pt_regs *regs) >> >> status &= ~(GLOBAL_STATUS_COND_CHG | >> >> GLOBAL_STATUS_ASIF | >> >> GLOBAL_STATUS_LBRS_FROZEN); >> >> + /* >> >> + * There are cases where, even though, the PEBS ovfl bit is set >> >> + * in GLOBAL_OVF_STATUS, the PEBS events may also have their >> >> + * overflow bits set for their counters. We must clear them >> >> + * here because they have been processed as exact samples in >> >> + * the drain_pebs() routine. They must not be processed again >> >> + * in the for_each_bit_set() loop for regular samples below. >> >> + */ >> >> + status &= ~cpuc->pebs_enabled; >> >> + >> >> if (!status) >> >> goto done; >> >> >> >> @@ -1892,16 +1902,6 @@ static int intel_pmu_handle_irq(struct pt_regs *regs) >> >> if (__test_and_clear_bit(62, (unsigned long *)&status)) { >> >> handled++; >> >> x86_pmu.drain_pebs(regs); >> >> - /* >> >> - * There are cases where, even though, the PEBS ovfl bit is set >> >> - * in GLOBAL_OVF_STATUS, the PEBS events may also have their >> >> - * overflow bits set for their counters. We must clear them >> >> - * here because they have been processed as exact samples in >> >> - * the drain_pebs() routine. They must not be processed again >> >> - * in the for_each_bit_set() loop for regular samples below. >> >> - */ >> >> - status &= ~cpuc->pebs_enabled; >> >> - status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI; >> >> } >> >> >> >> /*