Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751657AbdF1K5A (ORCPT ); Wed, 28 Jun 2017 06:57:00 -0400 Received: from foss.arm.com ([217.140.101.70]:40300 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751584AbdF1K4z (ORCPT ); Wed, 28 Jun 2017 06:56:55 -0400 Date: Wed, 28 Jun 2017 11:56:01 +0100 From: Mark Rutland To: Kyle Huey Cc: "Jin, Yao" , Ingo Molnar , "Peter Zijlstra (Intel)" , stable@vger.kernel.org, Alexander Shishkin , Arnaldo Carvalho de Melo , Jiri Olsa , Linus Torvalds , Namhyung Kim , Stephane Eranian , Thomas Gleixner , Vince Weaver , acme@kernel.org, jolsa@kernel.org, kan.liang@intel.com, Will Deacon , gregkh@linuxfoundation.org, "Robert O'Callahan" , open list Subject: [PATCH] perf/core: generate overflow signal when samples are dropped (WAS: Re: [REGRESSION] perf/core: PMU interrupts dropped if we entered the kernel in the "skid" region) Message-ID: <20170628105600.GC5981@leverpostej> References: <2256f9b5-1277-c4b1-1472-61a10cd1db9a@linux.intel.com> <20170628101248.GB5981@leverpostej> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170628101248.GB5981@leverpostej> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5481 Lines: 154 On Wed, Jun 28, 2017 at 11:12:48AM +0100, Mark Rutland wrote: > On Tue, Jun 27, 2017 at 09:51:00PM -0700, Kyle Huey wrote: > As we're trying to avoid smapling state, I think we can move the check > into perf_prepare_sample() or __perf_event_output(), where that state is > actually sampled. I'll take a look at that momentarily. > > Just to clarify, you don't care about the sample state at all? i.e. you > don't need the user program counter? > > Is that signal delivered to the tracee, or to a different process that > traces it? If the latter, what ensures that the task is stopped > sufficiently quickly? > > > It seems to me that it might be reasonable to ignore the interrupt if > > the purpose of the interrupt is to trigger sampling of the CPUs > > register state. But if the interrupt will trigger some other > > operation, such as a signal on an fd, then there's no reason to drop > > it. > > Agreed. I'll try to have a patch for this soon. > > I just need to figure out exactly where that overflow signal is > generated by the perf core. I've figured that out now. That's handled by perf_pending_event(), whcih is the irq_work we schedule in __perf_event_overflow(). Does the below patch work for you? ---->8---- >From bb1f99dace508ce34ab0818f91d59e73450fa142 Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Wed, 28 Jun 2017 11:39:25 +0100 Subject: [PATCH] perf/core: generate overflow signal when samples are dropped We recently tried to kill an information leak where users could receive kernel samples due to skid on the PMU interrupt. To block this, we bailed out early in perf_event_overflow(), as we do for non-sampling events. This broke rr, which uses sampling events to receive a signal on overflow (but does not care about the contents of the sample). These signals are critical to the correct operation of rr. Instead of bailing out early in perf_event_overflow, we can bail prior to performing the actual sampling in __perf_event_output(). This avoids the information leak, but preserves the generation of the signal. Since we don't place any sample data into the ring buffer, the signal is arguably spurious. However, a userspace ringbuffer consumer can already consume data prior to taking the associated signals, and therefore must handle spurious signals to operate correctly. Thus, this signal shouldn't be harmful. Fixes: cc1582c231ea041f ("perf/core: Drop kernel samples even though :u is specified") Reported-by: Kyle Huey Signed-off-by: Mark Rutland Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Ingo Molnar Cc: Jin Yao Cc: Peter Zijlstra (Intel) --- kernel/events/core.c | 46 +++++++++++++++++++++++++--------------------- 1 file changed, 25 insertions(+), 21 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 6c4e523..6b263f3 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -6090,6 +6090,21 @@ void perf_prepare_sample(struct perf_event_header *header, } } +static bool sample_is_allowed(struct perf_event *event, struct pt_regs *regs) +{ + /* + * Due to interrupt latency (AKA "skid"), we may enter the + * kernel before taking an overflow, even if the PMU is only + * counting user events. + * To avoid leaking information to userspace, we must always + * reject kernel samples when exclude_kernel is set. + */ + if (event->attr.exclude_kernel && !user_mode(regs)) + return false; + + return true; +} + static void __always_inline __perf_event_output(struct perf_event *event, struct perf_sample_data *data, @@ -6101,6 +6116,12 @@ void perf_prepare_sample(struct perf_event_header *header, struct perf_output_handle handle; struct perf_event_header header; + /* + * For security, drop the skid kernel samples if necessary. + */ + if (!sample_is_allowed(event, regs)) + return ret; + /* protect the callchain buffers */ rcu_read_lock(); @@ -7316,21 +7337,6 @@ int perf_event_account_interrupt(struct perf_event *event) return __perf_event_account_interrupt(event, 1); } -static bool sample_is_allowed(struct perf_event *event, struct pt_regs *regs) -{ - /* - * Due to interrupt latency (AKA "skid"), we may enter the - * kernel before taking an overflow, even if the PMU is only - * counting user events. - * To avoid leaking information to userspace, we must always - * reject kernel samples when exclude_kernel is set. - */ - if (event->attr.exclude_kernel && !user_mode(regs)) - return false; - - return true; -} - /* * Generic event overflow handling, sampling. */ @@ -7352,12 +7358,6 @@ static int __perf_event_overflow(struct perf_event *event, ret = __perf_event_account_interrupt(event, throttle); /* - * For security, drop the skid kernel samples if necessary. - */ - if (!sample_is_allowed(event, regs)) - return ret; - - /* * XXX event_limit might not quite work as expected on inherited * events */ @@ -7372,6 +7372,10 @@ static int __perf_event_overflow(struct perf_event *event, READ_ONCE(event->overflow_handler)(event, data, regs); + /* + * We must generate a wakeup regardless of whether we actually + * generated a sample. This is relied upon by rr. + */ if (*perf_event_fasync(event) && event->pending_kill) { event->pending_wakeup = 1; irq_work_queue(&event->pending); -- 1.9.1