Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932758AbcKJMNF (ORCPT ); Thu, 10 Nov 2016 07:13:05 -0500
Received: from bombadil.infradead.org ([198.137.202.9]:39574 "EHLO
        bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S932197AbcKJMND (ORCPT );
        Thu, 10 Nov 2016 07:13:03 -0500
Date: Thu, 10 Nov 2016 13:12:53 +0100
From: Peter Zijlstra
To: Mark Rutland
Cc: kan.liang@intel.com, mingo@redhat.com, linux-kernel@vger.kernel.org,
        acme@kernel.org, tglx@linutronix.de,
        alexander.shishkin@linux.intel.com, vince@deater.net,
        eranian@google.com, andi@firstfloor.org
Subject: Re: [PATCH] perf/core: introduce context per CPU event list
Message-ID: <20161110121253.GX3142@twins.programming.kicks-ass.net>
References: <1478718286-12824-1-git-send-email-kan.liang@intel.com>
        <20161110083355.GO3568@worktop.programming.kicks-ass.net>
        <20161110110516.GA4418@leverpostej>
        <20161110113704.GU3142@twins.programming.kicks-ass.net>
        <20161110120423.GC4418@leverpostej>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161110120423.GC4418@leverpostej>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1697
Lines: 41

On Thu, Nov 10, 2016 at 12:04:23PM +0000, Mark Rutland wrote:
> On Thu, Nov 10, 2016 at 12:37:05PM +0100, Peter Zijlstra wrote:
> > So the problem is finding which events are active when.
>
> Sure.
>
> If we only care about PERF_EVENT_STATE_ACTIVE, then I think we can
> fairly easily maintain a perf_event_context::active_event_list at
> event_sched_{in,out}() time (or somewhere close to that).
>
> If we need PERF_EVENT_STATE_INACTIVE events, then that doesn't work,
> since we can give up early and not schedule some eligible events.
>
> > If we stick all events in an RB-tree sorted on: {pmu,cpu,runtime} we
> > can, fairly easily, find the relevant subtree and limit the iteration.
> > Esp. if we use a threaded tree.
>
> That would cater for big.LITTLE, certainly, but I'm not sure I follow
> how that helps to find active events -- you'll still have to iterate
> through the whole PMU subtree to find which are active, no?

Ah, so the tree would in fact only contain 'INACTIVE' events :-)

That is, when no events are on the hardware, all events (if there are
any) are INACTIVE.

Then on sched-in, we find the relevant subtree, and linearly try to
program all events from that subtree onto the PMU. Once programming an
event fails, we stop (like we do now).

These programmed events transition from INACTIVE to ACTIVE, and we take
them out of the tree.

Then on sched-out, we remove all events from the hardware, increase
each event's runtime value by however long it was ACTIVE, flip them to
INACTIVE and stuff them back in the tree.

(I can't quite recall if we can easily find ACTIVE events from a PMU,
but if not, we can easily track those on a separate list.)
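
To make the sched-in/sched-out cycle above concrete, here is a rough
user-space sketch. It is not kernel code and not a proposed patch. All
names (struct event, struct ctx, inactive_insert, sched_in, sched_out,
MAX_EVENTS) are made up for illustration. A sorted array stands in for
the RB-tree, and "programming fails" is modelled simply as running out
of a caller-supplied number of free counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are kernel names. */
enum state { INACTIVE, ACTIVE };

struct event {
    int pmu, cpu;
    unsigned long runtime; /* total time spent ACTIVE so far */
    unsigned long start;   /* timestamp of the most recent sched-in */
    enum state state;
};

#define MAX_EVENTS 16

struct ctx {
    struct event *inactive[MAX_EVENTS]; /* sorted on {pmu,cpu,runtime} */
    size_t nr_inactive;
    struct event *active[MAX_EVENTS];   /* separate list of ACTIVE events */
    size_t nr_active;
};

/* Three-way compare on the {pmu,cpu,runtime} sort key. */
static int key_cmp(const struct event *a, const struct event *b)
{
    if (a->pmu != b->pmu)
        return a->pmu < b->pmu ? -1 : 1;
    if (a->cpu != b->cpu)
        return a->cpu < b->cpu ? -1 : 1;
    if (a->runtime != b->runtime)
        return a->runtime < b->runtime ? -1 : 1;
    return 0;
}

/* Sorted-array insert, O(n); an RB-tree would make this O(log n). */
void inactive_insert(struct ctx *c, struct event *e)
{
    size_t i = c->nr_inactive;

    assert(c->nr_inactive + c->nr_active < MAX_EVENTS);
    while (i > 0 && key_cmp(c->inactive[i - 1], e) > 0) {
        c->inactive[i] = c->inactive[i - 1];
        i--;
    }
    c->inactive[i] = e;
    c->nr_inactive++;
    e->state = INACTIVE;
}

/*
 * sched-in: find the {pmu,cpu} range, program events in key order and
 * stop at the first failure (here: no free counter left). Programmed
 * events leave the inactive set. Returns how many were programmed.
 */
size_t sched_in(struct ctx *c, int pmu, int cpu, size_t nr_counters,
                unsigned long now)
{
    size_t first = 0, n = 0, i;

    /* Linear scan for the start of the range; a tree would descend to it. */
    while (first < c->nr_inactive &&
           (c->inactive[first]->pmu != pmu || c->inactive[first]->cpu != cpu))
        first++;

    while (first + n < c->nr_inactive && n < nr_counters) {
        struct event *e = c->inactive[first + n];

        if (e->pmu != pmu || e->cpu != cpu)
            break; /* walked off the end of the range */
        e->state = ACTIVE;
        e->start = now;
        c->active[c->nr_active++] = e;
        n++;
    }

    /* Close the gap left by the events that became ACTIVE. */
    for (i = first + n; i < c->nr_inactive; i++)
        c->inactive[i - n] = c->inactive[i];
    c->nr_inactive -= n;
    return n;
}

/* sched-out: account runtime, flip to INACTIVE, put back in sorted order. */
void sched_out(struct ctx *c, unsigned long now)
{
    while (c->nr_active > 0) {
        struct event *e = c->active[--c->nr_active];

        e->runtime += now - e->start;
        inactive_insert(c, e);
    }
}
```

Because runtime is part of the key, events that have already run sort
behind those that have not, so the next sched_in() on the same
{pmu,cpu} picks up the ones that were skipped last time.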