From: "Robert O'Callahan"
Date: Wed, 5 Jul 2017 22:07:59 -0700
Subject: Re: [PATCH] perf/core: generate overflow signal when samples are dropped (WAS: Re: [REGRESSION] perf/core: PMU interrupts dropped if we entered the kernel in the "skid" region)
To: Mark Rutland
Cc: Peter Zijlstra, Kyle Huey, Vince Weaver, "Jin, Yao", Ingo Molnar, stable@vger.kernel.org, Alexander Shishkin, Arnaldo Carvalho de Melo, Jiri Olsa, Linus Torvalds, Namhyung Kim, Stephane Eranian, Thomas Gleixner, acme@kernel.org, jolsa@kernel.org, kan.liang@intel.com, Will Deacon, gregkh@linuxfoundation.org, open list
Reply-To: robert@ocallahan.org
In-Reply-To: <20170704102159.GB20062@leverpostej>

On Tue, Jul 4, 2017 at 3:21 AM, Mark Rutland wrote:
> Should any of those be moved into the "should be dropped" pile?

Why not be conservative and clear every sample you're not sure about?

We'd appreciate a fix sooner rather than later here, since rr is
currently broken on every stable Linux kernel and our attempts to
implement a workaround have failed.

(We have separate "interrupt" and "measure" counters, and I thought we
might work around this regression by programming the "interrupt"
counter to count kernel events as well as user events (interrupting
early is OK), but that caused our (completely separate) "measure"
counter to report off-by-one results (!), which seems to be a
different bug present on a range of older kernels.)

Thanks,
Rob