Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758172AbZA2UBc (ORCPT ); Thu, 29 Jan 2009 15:01:32 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751754AbZA2UBW (ORCPT ); Thu, 29 Jan 2009 15:01:22 -0500 Received: from e1.ny.us.ibm.com ([32.97.182.141]:37768 "EHLO e1.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751558AbZA2UBV (ORCPT ); Thu, 29 Jan 2009 15:01:21 -0500 Message-ID: <49820B03.6010807@linux.vnet.ibm.com> Date: Thu, 29 Jan 2009 12:01:07 -0800 From: Corey Ashford User-Agent: Thunderbird 2.0.0.19 (Windows/20081209) MIME-Version: 1.0 To: eranian@gmail.com CC: Ingo Molnar , linux-kernel@vger.kernel.org, Thomas Gleixner , Andrew Morton , Eric Dumazet , Robert Richter , Arjan van de Ven , Peter Anvin , Peter Zijlstra , Paul Mackerras , "David S. Miller" , Mike Galbraith , "perfmon2-devel@lists.sourceforge.net" , Papi Subject: Re: [announce] Performance Counters for Linux, v6 References: <20090121185021.GA8852@elte.hu> <4981102C.2000400@linux.vnet.ibm.com> <7c86c4470901290432h77d479c9k160a7b793b2812bb@mail.gmail.com> In-Reply-To: <7c86c4470901290432h77d479c9k160a7b793b2812bb@mail.gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7014 Lines: 176 stephane eranian wrote: > Corey, > > On Thu, Jan 29, 2009 at 3:10 AM, Corey Ashford > wrote: >> Ingo Molnar wrote: >>> We are pleased to announce version 6 of our performance counters subsystem >>> implementation. The shortlog, diffstat and the combo patch can be found >>> below. The combo patch against latest -git (2.6.29-rc2) can be also found >>> at: >>> >>> >>> http://people.redhat.com/mingo/perfcounters/perfcounters-v6-v2.6.29-rc2.patch >>> >>> It's also available in tip/master at: >>> >>> http://people.redhat.com/mingo/tip.git/README >>> >>> There are many changes in the v6 release: >>> >>> - PowerPC performance counters support from Paul Mackerras, for POWER6 >>> and for the PPC970 family. >>> >>> - ioctl API to disable/enable individual counters and groups without >>> closing their fd. This can be useful for libraries, ad-hoc >>> instrumentation and PAPI support. >>> >>> - 'pinned' and 'exclusive' counter attributes - for those >>> applications that want to influence counter scheduling explicitly. >>> >>> - The 'perfstat' utility (ex 'timec') has been updated: >>> >>> http://people.redhat.com/mingo/perfcounters/perfstat.c >>> >>> - 'kerneltop' (easy-to-use text mode NMI profiler) has been updated: >>> http://people.redhat.com/mingo/perfcounters/kerneltop.c >>> >>> - Merged to latest mainline >>> >>> - Various fixes and other updates >>> >>> Ingo >> I'm not sure if this is the right place to propose such a thing, but I think >> it would be very valuable to have a standardized user-side library to >> accompany this addition to the kernel. >> >> In particular, as a starting place for the discussion, I'd like to see >> functions in it that are very similar to a subset of what is currently in >> libpfm. Specifically, I'd like to see the following functions (with the >> names changed to pcl_* perhaps): >> >> extern pfm_err_t pfm_find_event(const char *str, unsigned int *idx); >> extern pfm_err_t pfm_find_event_bycode(int code, unsigned int *idx); >> extern pfm_err_t pfm_find_event_bycode_next(int code, unsigned int start, >> unsigned int *next); >> extern pfm_err_t pfm_find_event_mask(unsigned int event_idx, const char >> *str, >> unsigned int *mask_idx); >> extern pfm_err_t pfm_find_full_event(const char *str, pfmlib_event_t *e); >> >> extern pfm_err_t pfm_get_max_event_name_len(size_t *len); >> >> extern pfm_err_t pfm_get_num_events(unsigned int *count); >> extern pfm_err_t pfm_get_num_event_masks(unsigned int event_idx, >> unsigned int *count); >> extern pfm_err_t pfm_get_event_name(unsigned int idx, char *name, >> size_t maxlen); >> extern pfm_err_t pfm_get_full_event_name(pfmlib_event_t *e, char *name, >> size_t maxlen); >> extern pfm_err_t pfm_get_event_code(unsigned int idx, int *code); >> extern pfm_err_t pfm_get_event_mask_code(unsigned int idx, >> unsigned int mask_idx, >> unsigned int *code); >> extern pfm_err_t pfm_get_event_description(unsigned int idx, char **str); >> extern pfm_err_t pfm_get_event_code_counter(unsigned int idx, unsigned int >> cnt, >> int *code); >> extern pfm_err_t pfm_get_event_mask_name(unsigned int event_idx, >> unsigned int mask_idx, >> char *name, size_t maxlen); >> extern pfm_err_t pfm_get_event_mask_description(unsigned int event_idx, >> unsigned int mask_idx, >> char **desc); >> >> >> Now, since it's not clear right now how unit masks are going to be handled >> in your proposal, I'm not sure the that *_event_mask_* functions are >> applicable, but I think something that fills that function will be needed. >> >> Architectures that have need for additional functionality should be free to >> add arch-specific functions. >> >> Full descriptions of these functions can be found in the man pages of the >> libpfm documentation. >> >> Any thoughts on this? Do you already have a us > er library structure in mind? > Yes, I did give some thoughts to all of this. In fact, I have been > playing a bit with > libpfm and the LPC proposal. > > I think, given that LPC is dealing with event -> counter assignment in > the kernel, libpfm > does not have to do it. All it needs to do is event:attributes -> > value, and that value is > then passed to the kernel in raw mode. > > Event attributes includes on x86, for instance, the edge, invert, > counter-mask, plm, field. > I think we could do something more generic than what is currently > there. That would not > require PMU specific data structures for attributes. Just pass > everything into a string. > > To that extent, I have been experimenting with something along those lines: > > int pfm_get_event_encoding(char *event_str, uint64_t **values, int *count); > > events are encoded as follows: > > event_name:[unit_mask1:unit_mask2:...:unit_maskn][::A1=V1:A2=V2:..:An=Vn] > > Attribute names and values depend on each PMU model. Attributes names > are strings. > Values can have any type. > > For X86, most attributes would be identical, same thing on Itanium > because they are > architected. > > Some PMU models may need more than one 64-bit value to configure one > event, That is > is why there is vector and a count. Libpfm should not be concerned > with how those values > are encoded and passed to the kernel. It should be concerned with the > event -> value > as described in the PMU documentation. > > Given that LPC manages events independently of each other, libpfm does > not reallly need > to process multiple events at a time to get a global view of what is > being measured. > > Here is an example: > > $ self inst_retired:any_p::i=1:c=1:u=1:k=1 > [0x1d300c0 event_sel=0xc0 umask=0x0 os=1 usr=1 en=1 int=1 inv=1 edge=0 > cnt_mask=1] INST_RETIRED This looks encouraging! I assume the library would still retain the functions that allow us to iterate through the available events, and obtain text description of events. Would it make sense to have similar functions to obtain the available unit masks and attributes for a particular event? For debugging purposes at least, it might make sense to have a function that does the inverse of pfm_get_event_encoding as well. -- Regards, - Corey Corey Ashford Software Engineer IBM Linux Technology Center, Linux Toolchain Beaverton, OR 503-578-3507 cjashfor@us.ibm.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/