Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757841AbcK2Oqc convert rfc822-to-8bit (ORCPT ); Tue, 29 Nov 2016 09:46:32 -0500 Received: from mga06.intel.com ([134.134.136.31]:10264 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757782AbcK2OqT (ORCPT ); Tue, 29 Nov 2016 09:46:19 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.31,717,1473145200"; d="scan'208";a="11057976" From: "Liang, Kan" To: Peter Zijlstra CC: "mingo@redhat.com" , "linux-kernel@vger.kernel.org" , "eranian@google.com" , "alexander.shishkin@linux.intel.com" , "ak@linux.intel.com" , "Odzioba, Lukasz" Subject: RE: [PATCH] perf/x86: fix event counter update issue Thread-Topic: [PATCH] perf/x86: fix event counter update issue Thread-Index: AQHSSa1zo+RK785pc06zvVqph8o/bKDvK4oAgADaq3A= Date: Tue, 29 Nov 2016 14:46:14 +0000 Message-ID: <37D7C6CF3E00A74B8858931C1DB2F07750CA4064@SHSMSX103.ccr.corp.intel.com> References: <1480361206-1702-1-git-send-email-kan.liang@intel.com> <20161129092520.GB3092@twins.programming.kicks-ass.net> In-Reply-To: <20161129092520.GB3092@twins.programming.kicks-ass.net> Accept-Language: zh-CN, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiZGQwMTIxZDgtNDY3ZC00NTBmLTlkODMtOGEyMmU0OWRlN2MxIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE1LjkuNi42IiwiVHJ1c3RlZExhYmVsSGFzaCI6InY1eDVcL3NLYXJaMHRQdjY4UzFLaW9ybjVuZ1wvR1Bqc2VuM1RkeTMwaEtDdz0ifQ== x-ctpclassification: CTP_IC x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2560 Lines: 69 > So caveat that I'm ill and cannot think much.. > > On Mon, Nov 28, 2016 at 11:26:46AM -0800, kan.liang@intel.com wrote: > > > Here, all the possible failure cases are listed. > > Terms: > > - new: current PMU counter value which read from rdpmcl. > > - prev: previous counter value which is stored in &hwc->prev_count. > > - in PMI/not in PMI: the event update happens in PMI handler or not. > > > Current code to calculate delta: > > delta = (new << shift) - (prev << shift); > > delta >>= shift; > > > > Case A: Not in PMI. new > prev. But delta is negative. > > That's the failure case of Test 2. > > delta is s64 type. new and prev are u64 type. If the new is big > > enough, after doing left shift and sub, the bit 64 of the result may > > still be 1. > > After converting to s64, the sign flag will be set. Since delta is > > s64 type, arithmetic right shift is applied, which copy the sign flag > > into empty bit positions on the upper end of delta. > > It can be fixed by adding the max count value. > > > > Here is the real data for test2 on KNL. > > new = aea96e1927 > > prev = ffffff0000000001 > > delta = aea96e1927000000 - 1000000 = aea96e1926000000 > > aea96e1926000000 >> 24 = ffffffaea96e1926 << negative delta > > How can this happen? IIRC the thing increments, we program a negative > value, and when it passes 0 we generate a PMI. > > And note that we _ALWAYS_ set the IN bits, even for !sampling events. > Also note we set max_period to (1<<31) - 1, so we should never exceed 31 > bits. > The max_period is 0xfffffffff. The limit is breaked by this patch. 069e0c3c4058 ("perf/x86/intel: Support full width counting") https://patchwork.kernel.org/patch/2784191/ /* Support full width counters using alternative MSR range */ if (x86_pmu.intel_cap.full_width_write) { x86_pmu.max_period = x86_pmu.cntval_mask; x86_pmu.perfctr = MSR_IA32_PMC0; pr_cont("full-width counters, "); } > > > Case B: In PMI. new > prev. delta is positive. > > That's the failure case of Test 3. > > The PMI is triggered by overflow. But there is latency between > > overflow and PMI handler. So new has small amount. > > Current calculation lose the max count value. > > That doesn't make sense, per the 31bit limit. > > > > Case C: In PMI. new < prev. delta is negative. > > The PMU counter may be start from a big value. E.g. the fixed period > > is small. > > It can be fixed by adding the max count value. > > Doesn't make sense, how can this happen?