Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751643AbdF1NIu (ORCPT ); Wed, 28 Jun 2017 09:08:50 -0400
Received: from foss.arm.com ([217.140.101.70]:41412 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751312AbdF1NIn (ORCPT ); Wed, 28 Jun 2017 09:08:43 -0400
Date: Wed, 28 Jun 2017 14:07:49 +0100
From: Mark Rutland
To: Vince Weaver
Cc: Kyle Huey, "Jin, Yao", Ingo Molnar, "Peter Zijlstra (Intel)",
	stable@vger.kernel.org, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Jiri Olsa, Linus Torvalds, Namhyung Kim,
	Stephane Eranian, Thomas Gleixner, acme@kernel.org,
	jolsa@kernel.org, kan.liang@intel.com, Will Deacon,
	gregkh@linuxfoundation.org, "Robert O'Callahan", open list
Subject: Re: [PATCH] perf/core: generate overflow signal when samples are
	dropped (WAS: Re: [REGRESSION] perf/core: PMU interrupts dropped if
	we entered the kernel in the "skid" region)
Message-ID: <20170628130748.GI5981@leverpostej>
References: <2256f9b5-1277-c4b1-1472-61a10cd1db9a@linux.intel.com>
	<20170628101248.GB5981@leverpostej>
	<20170628105600.GC5981@leverpostej>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1970
Lines: 48

On Wed, Jun 28, 2017 at 08:40:30AM -0400, Vince Weaver wrote:
> On Wed, 28 Jun 2017, Mark Rutland wrote:
>
> > On Wed, Jun 28, 2017 at 11:12:48AM +0100, Mark Rutland wrote:
>
> > Instead of bailing out early in perf_event_overflow, we can bail prior
> > to performing the actual sampling in __perf_event_output(). This avoids
> > the information leak, but preserves the generation of the signal.
> >
> > Since we don't place any sample data into the ring buffer, the signal is
> > arguably spurious. However, a userspace ringbuffer consumer can already
> > consume data prior to taking the associated signals, and therefore must
> > handle spurious signals to operate correctly. Thus, this signal
> > shouldn't be harmful.
>
> this could still break some of my perf_event validation tests.
>
> Ones that set up a sampling event for every 1M instructions, run for 100M
> instructions, and expect there to be 100 samples received.

Is that test reliable today?

I'd expect that at least on ARM it's not, given that events can be
counted imprecisely, and mode filters can be applied imprecisely. So you
might get fewer (or more) samples. I'd imagine the same is true on other
architectures.

If sampling took long enough, the existing ratelimiting could come into
effect, too.

Surely that already has some error margin?

> If we're so worried about info leakage, can't we just zero-out the problem
> address (or randomize the kernel address) rather than just pretending the
> interrupt didn't happen?

Making up zeroed or randomized data is going to confuse users. I can't
imagine that real users are going to want bogus samples that they have
to identify (somehow) in order to skip when processing the data.

I can see merit in signalling "lost" samples to userspace, so long as
they're easily distinguished from real samples.

One option is to fake up a sample using the user regs regardless, but
that's both fragile and surprising in other cases.

Thanks,
Mark.
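
To make the error-margin question above concrete, here is a minimal
sketch of how a validation test might compare the observed sample count
against the expected one. The helper name, its parameters, and the idea
of a fixed tolerance are all assumptions made for illustration; none of
this is taken from the existing perf_event tests or from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of perf or of any existing test suite).
 *
 * Returns true if 'observed' samples is within 'tolerance' of the count
 * expected when sampling every 'period' events over 'total_events'
 * events. The tolerance is an assumption: a suitable value would depend
 * on the architecture and on how imprecisely events are counted.
 */
static bool sample_count_ok(uint64_t total_events, uint64_t period,
			    uint64_t observed, uint64_t tolerance)
{
	uint64_t expected, diff;

	/* A zero period is a caller bug, not a runtime condition. */
	assert(period != 0);

	expected = total_events / period;

	/* Absolute difference, avoiding unsigned underflow. */
	diff = observed > expected ? observed - expected
				   : expected - observed;

	return diff <= tolerance;
}
```

With the numbers from the quoted test (a period of 1M over 100M
instructions), sample_count_ok(100000000, 1000000, 98, 2) would accept
98 samples, while a tolerance of 0 reproduces the strict "exactly 100"
expectation.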