Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753773AbcCHNR2 (ORCPT ); Tue, 8 Mar 2016 08:17:28 -0500 Received: from torg.zytor.com ([198.137.202.12]:51952 "EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751295AbcCHNRS (ORCPT ); Tue, 8 Mar 2016 08:17:18 -0500 Date: Tue, 8 Mar 2016 05:16:31 -0800 From: tip-bot for Stephane Eranian Message-ID: Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, eranian@google.com, mingo@kernel.org, alexander.shishkin@linux.intel.com, stable@vger.kernel.org, tglx@linutronix.de, torvalds@linux-foundation.org, hpa@zytor.com, acme@redhat.com, jolsa@redhat.com, vincent.weaver@maine.edu Reply-To: peterz@infradead.org, eranian@google.com, linux-kernel@vger.kernel.org, mingo@kernel.org, alexander.shishkin@linux.intel.com, stable@vger.kernel.org, torvalds@linux-foundation.org, tglx@linutronix.de, hpa@zytor.com, acme@redhat.com, vincent.weaver@maine.edu, jolsa@redhat.com In-Reply-To: <1457034642-21837-3-git-send-email-eranian@google.com> References: <1457034642-21837-3-git-send-email-eranian@google.com> To: linux-tip-commits@vger.kernel.org Subject: [tip:perf/core] perf/x86/pebs: Add workaround for broken OVFL status on HSW+ Git-Commit-ID: 8077eca079a212f26419c57226f28696b7100683 X-Mailer: tip-git-log-daemon Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3962 Lines: 92 Commit-ID: 8077eca079a212f26419c57226f28696b7100683 Gitweb: http://git.kernel.org/tip/8077eca079a212f26419c57226f28696b7100683 Author: Stephane Eranian AuthorDate: Thu, 3 Mar 2016 20:50:41 +0100 Committer: Ingo Molnar CommitDate: Tue, 8 Mar 2016 12:18:35 +0100 perf/x86/pebs: Add workaround for broken OVFL status on HSW+ This patch fixes an issue with the GLOBAL_OVERFLOW_STATUS bits on Haswell, Broadwell and Skylake processors when using PEBS. The SDM stipulates that when the PEBS iterrupt threshold is crossed, an interrupt is posted and the kernel is interrupted. The kernel will find GLOBAL_OVF_SATUS bit 62 set indicating there are PEBS records to drain. But the bits corresponding to the actual counters should NOT be set. The kernel follows the SDM and assumes that all PEBS events are processed in the drain_pebs() callback. The kernel then checks for remaining overflows on any other (non-PEBS) events and processes these in the for_each_bit_set(&status) loop. As it turns out, under certain conditions on HSW and later processors, on PEBS buffer interrupt, bit 62 is set but the counter bits may be set as well. In that case, the kernel drains PEBS and generates SAMPLES with the EXACT tag, then it processes the counter bits, and generates normal (non-EXACT) SAMPLES. I ran into this problem by trying to understand why on HSW sampling on a PEBS event was sometimes returning SAMPLES without the EXACT tag. This should not happen on user level code because HSW has the eventing_ip which always point to the instruction that caused the event. The workaround in this patch simply ensures that the bits for the counters used for PEBS events are cleared after the PEBS buffer has been drained. With this fix 100% of the PEBS samples on my user code report the EXACT tag. Before: $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase $ perf report -D | fgrep SAMPLES PERF_RECORD_SAMPLE(IP, 0x2): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y \--- EXACT tag is missing After: $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase $ perf report -D | fgrep SAMPLES PERF_RECORD_SAMPLE(IP, 0x4002): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y \--- EXACT tag is set The problem tends to appear more often when multiple PEBS events are used. Signed-off-by: Stephane Eranian Signed-off-by: Peter Zijlstra (Intel) Cc: Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Vince Weaver Cc: adrian.hunter@intel.com Cc: kan.liang@intel.com Cc: namhyung@kernel.org Link: http://lkml.kernel.org/r/1457034642-21837-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar --- arch/x86/events/intel/core.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index b3f6349..6567c62 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -1892,6 +1892,16 @@ again: if (__test_and_clear_bit(62, (unsigned long *)&status)) { handled++; x86_pmu.drain_pebs(regs); + /* + * There are cases where, even though, the PEBS ovfl bit is set + * in GLOBAL_OVF_STATUS, the PEBS events may also have their + * overflow bits set for their counters. We must clear them + * here because they have been processed as exact samples in + * the drain_pebs() routine. They must not be processed again + * in the for_each_bit_set() loop for regular samples below. + */ + status &= ~cpuc->pebs_enabled; + status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI; } /*