Message-ID: <497106C5.60703@us.ibm.com>
Date: Fri, 16 Jan 2009 16:14:29 -0600
From: Maynard Johnson
To: Ingo Molnar
CC: Corey Ashford, Andi Kleen, Paul Mackerras, Peter Zijlstra, Ingo Molnar,
    linux-kernel@vger.kernel.org, Thomas Gleixner, Andrew Morton,
    Stephane Eranian, Eric Dumazet, Robert Richter, Arjan van de Ven,
    Peter Anvin, "David S. Miller", perfctr-devel@lists.sourceforge.net
Subject: Re: [patch] Performance Counters for Linux, v4
References: <20081214212829.GA9435@elte.hu>
    <18758.18810.350923.806445@cargo.ozlabs.ibm.com>
    <1229437341.7025.11.camel@twins>
    <18760.13407.568536.198724@cargo.ozlabs.ibm.com>
    <87ljuf1s75.fsf@basil.nowhere.org>
    <4970CB6F.9000301@linux.vnet.ibm.com>
In-Reply-To: <4970CB6F.9000301@linux.vnet.ibm.com>

Corey Ashford wrote:
> Andi Kleen wrote:
>> Paul Mackerras writes:
>>> The perf counter subsystem will, in Ingo's design, naturally try to
>>> schedule as many counters and groups on as it can. Given a list of
>>> counters/groups, it could start with the first and keep on trying to
>>> add counters or groups while it can, essentially trying all possible
>>> combinations until it either fills up all the hardware counters or
>>> exhausts the possible combinations.
>>> If it moves all the counters/groups that do fit on up to the head of
>>> the list, and then rotates them to the back of the list when the
>>> timeslice expires, that would probably be OK. In fact the computation
>>> about what set of counters/groups to put on should be done when
>>> adding/removing a counter/group and when the timeslice expires,
>>> rather than at context switch time. (I'm talking about the list of
>>> part-time counters/groups here, of course.)
>> One issue is that PMU counts can cover more than one CPU. One example
>> for this are the Uncore events on Nehalem (which cover a whole socket)
>> or when you are in AnyThreads monitoring mode (then you get events
>> from both SMT siblings in a core)
>>
>> With that you would need to examine other CPU's state at context switch
>> time. Probably not a good idea for scalability.
>>
>> -Andi
>>
>
> Over time, it seems clear that we will see multi-core processor designs
> with increasingly large uncore/nest facilities, so this could become
> more and more of an issue.

Ingo, I'll add my voice to the chorus here. To reiterate the point, some
PMUs count events that are external to the processor cores, and these
events cannot be attributed to any one particular CPU -- and certainly
not to a particular pid. The current interface has a restriction that
the user cannot pass -1 for both pid and cpu. But it seems to me that's
exactly what would be needed for such off-core events. Can this feature
fit in with the current interface or is some sort of extension needed?

Thanks.
-Maynard

>
> - Corey
>
> Corey Ashford
> Software Engineer
> IBM Linux Technology Center, Linux Toolchain
> Beaverton, OR
> 503-578-3507
> cjashfor@us.ibm.com
>
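
For readers following the thread, the rotation scheme Paul describes in the quoted text can be sketched roughly as below. This is a toy model in Python, not code from the patch. It assumes every hardware counter is interchangeable and that a group only needs some number of them, which ignores real PMU constraints. It also uses a single greedy first-fit pass in place of trying all possible combinations. The names schedule, rotate and the (name, size) tuples are made up for illustration.

```python
def schedule(groups, num_hw_counters):
    """One greedy first-fit pass over the list, in list order.

    groups is a list of (name, counters_needed) tuples (hypothetical shape).
    Returns (scheduled, waiting); both keep the original relative order.
    """
    free = num_hw_counters
    scheduled, waiting = [], []
    for name, size in groups:
        if size <= free:
            scheduled.append((name, size))
            free -= size
        else:
            waiting.append((name, size))
    return scheduled, waiting


def rotate(groups, num_hw_counters):
    """On timeslice expiry, move the groups that were on the PMU to the
    back of the list so the ones that were waiting get the first chance."""
    scheduled, waiting = schedule(groups, num_hw_counters)
    return waiting + scheduled


# Four groups competing for a PMU with 4 counters (assumed numbers).
groups = [("A", 2), ("B", 3), ("C", 1), ("D", 2)]
on, off = schedule(groups, 4)   # A and C fit; B and D wait
groups = rotate(groups, 4)      # B and D move to the head of the list
```

Both functions would run when a counter/group is added or removed and when the timeslice expires, never at context switch time, which is the point Paul makes. The sketch does not address Andi's and Maynard's concern: an uncore event that covers a whole socket has no single per-CPU list to live on.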