Date: Thu, 11 Jun 2015 16:15:48 +0200
From: Peter Zijlstra
To: Adrian Hunter
Cc: Andi Kleen, Arnaldo Carvalho de Melo, Ingo Molnar,
    linux-kernel@vger.kernel.org, Jiri Olsa, Stephane Eranian,
    mathieu.poirier@linaro.org, Pawel Moll
Subject: Re: [RFC PATCH] perf: Add PERF_RECORD_SWITCH to indicate context switches
Message-ID: <20150611141548.GW19282@twins.programming.kicks-ass.net>
In-Reply-To: <1433859670-10806-1-git-send-email-adrian.hunter@intel.com>

On Tue, Jun 09, 2015 at 05:21:10PM +0300, Adrian Hunter wrote:
> Tracepoints are no good at all for non-privileged users
> because they need either CAP_SYS_ADMIN or
> /proc/sys/kernel/perf_event_paranoid <= -1.
>
> On the other hand, kernel software events need either
> CAP_SYS_ADMIN or /proc/sys/kernel/perf_event_paranoid <= 1.

So while I think it makes sense to allow some tracepoint outside of that
priv level, IOW have a per tracepoint priv level filter thingy, I don't
think sched_switch() is one of those because it explicitly exposes
timing information on other tasks.

> This new PERF_RECORD_SWITCH event does not have those problems
> and it also has a couple of other small advantages. It is
> easier to use because it is an auxiliary event (like mmap,
> comm and task events) which can be enabled by setting a single
> bit. It is smaller than sched:sched_switch and easier to parse.

Right, so the one wee problem I have is that this only provides sched_in
data, I imagine people might be interested in sched_out as well.
Typically the switch event provides prev and next and thereby is
complete, but since we're limiting it to the one specific task, we'll
not have the sched_out data.

> @@ -812,6 +813,18 @@ enum perf_event_type {
>           */
>          PERF_RECORD_ITRACE_START                = 12,
>
> +        /*
> +         *
> +         *
> +         * struct {
> +         *      struct perf_event_header        header;
> +         *      u32                             pid, tid;
> +         *      u64                             time;

all 3 are already part of sample_id.

> +         *      struct sample_id                sample_id;
> +         * };
> +         */
> +        PERF_RECORD_SWITCH                      = 13,
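
Editorial sketch, not part of the original message: the remark that pid, tid and time are already part of sample_id can be illustrated with a small decoder. It assumes a record made of only the header plus the trailing sample_id, an event opened with sample_id_all set and sample_type equal to PERF_SAMPLE_TID | PERF_SAMPLE_TIME, and host byte order. The names record_header, switch_info, parse_switch_record and the SKETCH_* macros are hypothetical; only the header layout, the two sample bit values and the record number 13 from the quoted patch are taken from existing definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as struct perf_event_header in <linux/perf_event.h>,
 * redefined here so the sketch builds without kernel headers. */
struct record_header {
    uint32_t type;
    uint16_t misc;
    uint16_t size;  /* total record size in bytes, header included */
};

/* Bit values match PERF_SAMPLE_TID and PERF_SAMPLE_TIME. */
#define SKETCH_SAMPLE_TID    (1U << 1)
#define SKETCH_SAMPLE_TIME   (1U << 2)

/* Record number proposed in the quoted patch (assumption: unchanged). */
#define SKETCH_RECORD_SWITCH 13

/* Size of sample_id when only TID and TIME are selected:
 * u32 pid, u32 tid, u64 time. */
#define SKETCH_SAMPLE_ID_LEN 16

/* Hypothetical result type for the decoder below. */
struct switch_info {
    uint32_t pid;
    uint32_t tid;
    uint64_t time;
};

/* Decode pid, tid and time from the sample_id tail of a switch record.
 * Returns 0 on success and -1 if the buffer is not a well-formed
 * switch record under the assumptions stated above. */
int parse_switch_record(const unsigned char *buf, size_t len,
                        struct switch_info *out)
{
    struct record_header hdr;
    const unsigned char *tail;

    assert(buf != NULL && out != NULL);

    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr));

    if (hdr.type != SKETCH_RECORD_SWITCH)
        return -1;
    if (hdr.size > len || hdr.size < sizeof(hdr) + SKETCH_SAMPLE_ID_LEN)
        return -1;

    /* sample_id sits at the end of the record, so no separate
     * pid/tid/time fields are needed in the record body. */
    tail = buf + hdr.size - SKETCH_SAMPLE_ID_LEN;
    memcpy(&out->pid, tail, sizeof(out->pid));
    memcpy(&out->tid, tail + 4, sizeof(out->tid));
    memcpy(&out->time, tail + 8, sizeof(out->time));
    return 0;
}
```

Under these assumptions the whole record is 24 bytes: an 8-byte header followed by the 16-byte sample_id. A consumer that selects further sample_type bits would have to account for the extra sample_id fields that follow time.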