MIME-Version: 1.0
In-Reply-To: <20161214175552.GW3207@twins.programming.kicks-ass.net>
References: <20160307121840.GF6375@twins.programming.kicks-ass.net>
 <20160307182731.GA12153@krava.redhat.com> <20160307202556.GQ6344@twins.programming.kicks-ass.net>
 <CABPqkBT7ogN2Og20WtpRhKo29Twf0Lt_eZpPFie=dzPps-HWTQ@mail.gmail.com>
 <20160308210707.GG6344@twins.programming.kicks-ass.net> <CABPqkBQ3T1hx6Ncmhf_GN3ivw7JyOguXbbrCBpGkN7=qsXbf=Q@mail.gmail.com>
 <CABPqkBTW2b_ozpumgF2A7dDETCw9TxnPfXnjGZuN6DQS0wd29g@mail.gmail.com>
 <CABPqkBRAEi+79VkB7w5WtG+UwvnLm4sWqm6mqJ+mL-J7CK2Fyg@mail.gmail.com>
 <CABPqkBRhoae0F1eUbeBzk_bDPm2KmvGHBGSp7tPZFfhpACOZeg@mail.gmail.com>
 <20160310104236.GV6344@twins.programming.kicks-ass.net> <20161214175552.GW3207@twins.programming.kicks-ass.net>
From: Stephane Eranian <eranian@google.com>
Date: Wed, 14 Dec 2016 23:26:49 -0800
Message-ID: <CABPqkBSQzw=jX+7QwN296ozOVoCneXe+3xZJH142taM=nO1TvA@mail.gmail.com>
Subject: Re: [PATCH 2/3] perf/x86/pebs: add workaround for broken OVFL status
 on HSW
To: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>, Andi Kleen <ak@linux.intel.com>,
        LKML <linux-kernel@vger.kernel.org>,
        Arnaldo Carvalho de Melo <acme@redhat.com>,
        "mingo@elte.hu" <mingo@elte.hu>, "Liang, Kan" <kan.liang@intel.com>,
        Namhyung Kim <namhyung@kernel.org>,
        Adrian Hunter <adrian.hunter@intel.com>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3551
Lines: 76

On Wed, Dec 14, 2016 at 9:55 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> Just spotted this again, ping?
>
OK, on which processor and with which command did you see it, so I can try to
reproduce it?

> On Thu, Mar 10, 2016 at 11:42:36AM +0100, Peter Zijlstra wrote:
>> On Wed, Mar 09, 2016 at 09:40:07AM -0800, Stephane Eranian wrote:
>> > With your queue.tip perf/core branch, I run into another problem.
>> > I am monitoring with 2 PEBS events and I have the NMI watchdog enabled.
>> >
>> > I see non-EXACT PEBS records again, despite my change (which is in).
>> > I tracked it down to the following issue after the testing of bit 62:
>> >
>> > [31137.273061] CPU71 status=0x200000001 orig_status=0x200000001 bit62=0
>> >
>> > The IRQ handler is called because the fixed counter for the NMI has overflowed
>> > and it sees this in bit 33, but it also sees that one of the PEBS events has
>> > also overflowed, yet bit 62 is not set. Therefore both overflows are treated
>> > as regular and the drain_pebs() is not called generating a non-EXACT record
>> > for the PEBS counter (counter 0). So something is wrong still and this is on
>> > Broadwell.
>> >
>> > First, I don't understand why the OVF bit for counter 0 is set. It should not
>> > according to specs because the counter is in PEBS mode. There must be a race
>> > there. So we have to handle it by relying on cpuc->pebs_enabled. I will try
>> > that. We likely also need to force OVF bit 62 to 1 so we can ack it in the
>> > end (and in case it gets set).
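
For reference, the status decoding described above can be sketched as a small
standalone C snippet. This is only an illustration and not the kernel code: the
sketch_* names are hypothetical, and the pebs_enabled value of 0x1 (only
counter 0 in PEBS mode) is an assumption made to match the log line above. It
shows that status=0x200000001 has bit 62 clear, and that masking with
~pebs_enabled unconditionally leaves only bit 33 for the regular overflow loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit position, not the kernel's own definition. */
#define SKETCH_PEBS_OVF_BIT 62 /* PEBS buffer overflow bit in the global status */

/* True if the PEBS overflow bit (bit 62) is set in the status word. */
bool sketch_pebs_ovf_set(uint64_t status)
{
	return (status >> SKETCH_PEBS_OVF_BIT) & 1;
}

/*
 * Bits left for the regular (non-PEBS) overflow loop when the counters
 * listed in pebs_enabled are masked out regardless of bit 62.
 * pebs_enabled is assumed to hold one bit per counter in PEBS mode.
 */
uint64_t sketch_regular_overflows(uint64_t status, uint64_t pebs_enabled)
{
	status &= ~(1ULL << SKETCH_PEBS_OVF_BIT);
	return status & ~pebs_enabled;
}

/* Self-check against the value from the log line quoted above. */
void sketch_self_check(void)
{
	const uint64_t status = 0x200000001ULL; /* bit 33 and bit 0 set */
	const uint64_t pebs_enabled = 0x1ULL;   /* assumption: counter 0 only */

	assert(!sketch_pebs_ovf_set(status));
	assert(sketch_regular_overflows(status, pebs_enabled) == (1ULL << 33));
}
```

With the masking done only inside the bit 62 branch, the same input would keep
bit 0 set and counter 0 would be handled as a regular overflow, which matches
the non-EXACT record described above.
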
>>
>> How about we make the clear of pebs_enabled unconditional?
>>
>> ---
>>  arch/x86/events/intel/core.c | 20 ++++++++++----------
>>  1 file changed, 10 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 68fa55b4d42e..dc9579665425 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -1883,6 +1883,16 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>>       status &= ~(GLOBAL_STATUS_COND_CHG |
>>                   GLOBAL_STATUS_ASIF |
>>                   GLOBAL_STATUS_LBRS_FROZEN);
>> +     /*
>> +      * There are cases where, even though, the PEBS ovfl bit is set
>> +      * in GLOBAL_OVF_STATUS, the PEBS events may also have their
>> +      * overflow bits set for their counters. We must clear them
>> +      * here because they have been processed as exact samples in
>> +      * the drain_pebs() routine. They must not be processed again
>> +      * in the for_each_bit_set() loop for regular samples below.
>> +      */
>> +     status &= ~cpuc->pebs_enabled;
>> +
>>       if (!status)
>>               goto done;
>>
>> @@ -1892,16 +1902,6 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>>       if (__test_and_clear_bit(62, (unsigned long *)&status)) {
>>               handled++;
>>               x86_pmu.drain_pebs(regs);
>> -             /*
>> -              * There are cases where, even though, the PEBS ovfl bit is set
>> -              * in GLOBAL_OVF_STATUS, the PEBS events may also have their
>> -              * overflow bits set for their counters. We must clear them
>> -              * here because they have been processed as exact samples in
>> -              * the drain_pebs() routine. They must not be processed again
>> -              * in the for_each_bit_set() loop for regular samples below.
>> -              */
>> -             status &= ~cpuc->pebs_enabled;
>> -             status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
>>       }
>>
>>       /*