Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755409AbcK2TIw (ORCPT ); Tue, 29 Nov 2016 14:08:52 -0500 Received: from mga05.intel.com ([192.55.52.43]:41071 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753542AbcK2TIn (ORCPT ); Tue, 29 Nov 2016 14:08:43 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.31,570,1473145200"; d="scan'208";a="36899031" From: "Odzioba, Lukasz" To: Stephane Eranian , Peter Zijlstra CC: "Liang, Kan" , "mingo@redhat.com" , LKML , Alexander Shishkin , "ak@linux.intel.com" Subject: RE: [PATCH] perf/x86: fix event counter update issue Thread-Topic: [PATCH] perf/x86: fix event counter update issue Thread-Index: AQHSSa1z5JhtztyVjkewN4oL0nBfbaDvsacAgACEqgCAAAlsMA== Date: Tue, 29 Nov 2016 19:08:39 +0000 Message-ID: References: <1480361206-1702-1-git-send-email-kan.liang@intel.com> <20161129092520.GB3092@twins.programming.kicks-ass.net> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiMWY5MWEwNjMtNDE5OS00MjdmLWI3YWQtNjE4NDNkNThlNDRkIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX1BVQkxJQyJ9XX1dfSwiU3ViamVjdExhYmVscyI6W10sIlRNQ1ZlcnNpb24iOiIxNS45LjYuNiIsIlRydXN0ZWRMYWJlbEhhc2giOiJ5UnVTTm9uN2dGb1p6am5IR0NiRjk1ajlDcUNLRm5qZEVXUFduMlRzWExnPSJ9 x-ctpclassification: CTP_PUBLIC x-originating-ip: [163.33.239.181] Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id uATJ8t3r006522 Content-Length: 3242 Lines: 47 On Tuesday, November 29, 2016 6:20 PM Stephane Eranian wrote: >> On Tue, Nov 29, 2016 at 1:25 AM, Peter Zijlstra wrote: >> How can this happen? IIRC the thing increments, we program a negative >> value, and when it passes 0 we generate a PMI. >> > Yeah, that's the part I don't quite understand thus my initial > question. What you describe > is what is supposed to happen regardless of the counter width. I don't see any interrupts flying around: # cat /proc/interrupts | grep PMI PMI: 10 14 12 18 Performance monitoring interrupts (test case here) # cat /proc/interrupts | grep PMI PMI: 10 14 12 18 Performance monitoring interrupts What I also don't understand is why this is only seen on per cpu, or system wide mode. PMIs seems to be programmed because if we wait long enough it will eventually overflow, what we saw in one of tests cases from commit msg. In the failing case (system wide) x86_perf_event_update is called only once at the end of workload: 259.154530: x86_perf_event_set_period: x86_perf_event_set_period left = ffffffffff, ret = 0 550.079623: x86_pmu_stop: x86_pmu_stop -> need update 550.079625: x86_perf_event_update: x86_perf_event_update config = 11003c, period = ffff880276212000 550.079627: x86_perf_event_update: x86_perf_event_update sample_period = ffffffffff, last_period = ffffffffff, period_left = ffffffffff 550.079629: x86_perf_event_update: x86_perf_event_update(1) prev = ffffff0000000001, new = 8bba934b9e, delta = ffffff8bba934b9d, shift=24 In the good case -per pid case, x86_perf_event_update is called every ~3 seconds. 568.448083: x86_perf_event_set_period: x86_perf_event_set_period left = ffffffffff, ret = 0 568.949105: x86_pmu_stop: x86_pmu_stop -> need update 568.949106: x86_perf_event_update: x86_perf_event_update config = 11003c, period = ffff880276212000 568.949108: x86_perf_event_update: x86_perf_event_update sample_period = ffffffffff, last_period = ffffffffff, period_left = ffffffffff 568.949110: x86_perf_event_update: x86_perf_event_update(1) prev = ffffff0000000001, new = 3d9525cc, delta = 3d9525cb, shift=24 568.949135: x86_perf_event_set_period: x86_perf_event_set_period left = ffc26ada34, ret = 0 570.679697: x86_pmu_stop: x86_pmu_stop -> need update 570.679699: x86_perf_event_update: x86_perf_event_update config = 11003c, period = ffff880276212000 570.679700: x86_perf_event_update: x86_perf_event_update sample_period = ffffffffff, last_period = ffffffffff, period_left = ffc26ada34 570.679701: x86_perf_event_update: x86_perf_event_update(1) prev = ffffff003d9525cc, new = 112601ddf, delta = d4caf813, shift=24 570.679723: x86_perf_event_set_period: x86_perf_event_set_period left = feed9fe221, ret = 0 (... ~115 similar calls here) 859.431686: x86_pmu_stop: x86_pmu_stop -> need update 859.431687: x86_perf_event_update: x86_perf_event_update config = 11003c, period = ffff880276209000 859.431688: x86_perf_event_update: x86_perf_event_update sample_period = ffffffffff, last_period = ffffffffff, period_left = 7453fc7d0d 859.431689: x86_perf_event_update: x86_perf_event_update(1) prev = ffffff8bac0382f3, new = 8bbaa4a147, delta = ea11e54, shift=24 Thanks, Lukas