2012-05-25 22:36:49

by Andi Kleen

Subject: Re: [RFCv2 0/8] perf tool: Add new event group management

On Wed, Apr 04, 2012 at 11:16:08PM +0200, Jiri Olsa wrote:
> hi,
> adding support for creating event groups based on the way they
> are specified on the command line.

Any updates on this patchkit? We have a situation here where better
group management is needed to avoid problems with inconsistent
counters that depend on each other.

I don't see it in tip.

-Andi


2012-05-26 12:39:33

by Jiri Olsa

Subject: Re: [RFCv2 0/8] perf tool: Add new event group management

On Sat, May 26, 2012 at 12:36:46AM +0200, Andi Kleen wrote:
> On Wed, Apr 04, 2012 at 11:16:08PM +0200, Jiri Olsa wrote:
> > hi,
> > adding support for creating event groups based on the way they
> > are specified on the command line.
>
> Any updates on this patchkit? We have a situation here where better
> group management is needed to avoid problems with inconsistent
> counters that depend on each other.

The startup patches just got in recently
http://marc.info/?l=linux-kernel&m=133758460912306&w=2

so I'll continue on this shortly..

If you have some ideas on this or real world examples,
that would really help.. so far, here's the latest discussion:
http://marc.info/?t=133357436900005&r=1&w=2

and here are some thoughts on how the display is going to look:
http://marc.info/?l=linux-kernel&m=133458636513108&w=2


jirka

2012-05-26 19:23:19

by Andi Kleen

Subject: Re: [RFCv2 0/8] perf tool: Add new event group management

On Sat, May 26, 2012 at 02:38:58PM +0200, Jiri Olsa wrote:
> The startup patches just got in recently
> http://marc.info/?l=linux-kernel&m=133758460912306&w=2
>
> so I'll continue on this shortly..

Great.

> If you have some ideas on this or real world examples,

Any of the proposed syntaxes looked fine to me. The important
part is that it works in some form.

> that would really help.. so far, here's the latest discussion:
> http://marc.info/?t=133357436900005&r=1&w=2

For example, say you want to measure Sandy Bridge frontend contention in a
more useful way than the dubious event in standard perf.

The formula for this is

N = 4*CPU_CLK_UNHALTED.THREAD (4 execution slots)
Percent_FE_bound = 100*(IDQ_UOPS_NOT_DELIVERED.CORE / N)

Translated into perf this is

-e r53003c -e r53019c

and some glue to compute the formula:

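#!/usr/bin/python
# Reads two "count,event" lines written by perf stat -x, from stdin.
import sys

# Line 1: CPU_CLK_UNHALTED.THREAD, line 2: IDQ_UOPS_NOT_DELIVERED.CORE
cyc, e1 = sys.stdin.readline().split(",")
uops, e2 = sys.stdin.readline().split(",")

N = 4 * float(cyc)  # 4 execution slots per cycle
P_FE = 100.0 * (float(uops) / N)
# Parenthesized form works as a statement in Python 2 and a call in Python 3
print("percent frontend bound: %.2f" % (P_FE))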


perf stat -x, -e r53003c -e r53019c /bin/ls 2>log
./frontend.py < log
percent frontend bound: 41.53

My /bin/ls is 42% frontend bound.

Now you see we always have to measure the CPU_CLK_UNHALTED and
IDQ_UOPS_NOT_DELIVERED.CORE together. Otherwise there is no useful output
from the formula.
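
As a rough sketch only, the same formula can be written so that it looks
the counts up by event name instead of relying on line order. This assumes
each CSV line has the count in the first field and the raw event name in
one of the later fields; the function name and the sample counts are made
up for illustration.

```python
import csv

CYCLES = "r53003c"   # CPU_CLK_UNHALTED.THREAD
UOPS_ND = "r53019c"  # IDQ_UOPS_NOT_DELIVERED.CORE


def frontend_bound_percent(lines, slots=4):
    """Return 100 * uops_not_delivered / (slots * cycles).

    Assumption: the count is the first CSV field and the event name
    appears in one of the remaining fields.
    """
    counts = {}
    for row in csv.reader(lines):
        for name in (CYCLES, UOPS_ND):
            if name in row[1:]:
                counts[name] = float(row[0])
    return 100.0 * counts[UOPS_ND] / (slots * counts[CYCLES])


# Made-up counts, not real measurements.
sample = ["2000000,r53003c", "3000000,r53019c"]
print("percent frontend bound: %.2f" % frontend_bound_percent(sample))
```

With these invented counts the result is 100 * 3000000 / (4 * 2000000),
i.e. 37.50.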

The problem happens when we want to measure other things too. You tend
to quickly run out of the 4 counters per CPU thread, so you have to
multiplex. And that is where the groups are needed. Without the groups we
have to do multiple runs, instead of one run that measures all of this
time-sliced.
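
To make the inconsistency concrete, here is a toy model (not how perf
actually schedules counters). It assumes a count taken over only some time
slices is scaled up by enabled/running time, and compares the case where
the two events land in different slices with the case where they are
always counted together. All numbers and helper names are invented.

```python
def scaled(slices, index, measured):
    """Sum one event over the measured slices, then scale the sum by
    (total slices / measured slices), like an enabled/running correction."""
    raw = sum(slices[i][index] for i in measured)
    return raw * len(slices) / len(measured)


def fe_bound(cycles, uops, slots=4):
    return 100.0 * uops / (slots * cycles)


# (cycles, uops_not_delivered) per time slice; invented numbers:
# busy, fully stalled slices alternate with nearly idle ones.
slices = [(1000, 4000), (100, 0), (1000, 4000), (100, 0)]

# Not grouped: cycles counted in slices 1 and 3, uops in slices 0 and 2.
ungrouped = fe_bound(scaled(slices, 0, [1, 3]), scaled(slices, 1, [0, 2]))

# Grouped: both events counted in the same slices (0 and 2).
grouped = fe_bound(scaled(slices, 0, [0, 2]), scaled(slices, 1, [0, 2]))

print("ungrouped: %.1f%%  grouped: %.1f%%" % (ungrouped, grouped))
```

In this model the ungrouped result is 1000%, which is impossible, while
the grouped result of 100% is at least correct for the slices that were
actually measured.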

This is pretty common with all kinds of measurements.

-Andi

--
[email protected] -- Speaking for myself only.

2012-05-27 07:56:45

by Ulrich Drepper

Subject: Re: [RFCv2 0/8] perf tool: Add new event group management

On Sat, May 26, 2012 at 8:38 AM, Jiri Olsa <[email protected]> wrote:
> If you have some ideas on this or real world examples,
> that would really help.. so far, here's the latest discussion:
> http://marc.info/?t=133357436900005&r=1&w=2

If you're looking for a definitive source, just point to the Intel
optimization manual. Absolute values of counters are not really
useful and so they are defining many (50+) ratios which people should
investigate. These ratios are only really accurate if the counters
are swapped in and out at the same time.

This reminds me of a detail I looked at when starting an
implementation for this (glad you got more time to devote to it). The
problem with ratios is that there are so many. So efficient
scheduling is going to be important. Many ratios use as a base the
same counters over and over again (e.g., cycle count, instruction
count, etc). Therefore it is important to recognize when two groups
can be scheduled concurrently: even if the total number of counters
they list is too high, shared counters may make them fit.
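
A minimal sketch of that check, assuming every event can go on any of the
available counters (real hardware has fixed counters and per-event
constraints that this ignores). The function and the event names are
hypothetical.

```python
def fits_together(groups, num_counters):
    """True if all groups can be counted at the same time.

    Each group is a set of event names; an event that appears in
    several groups occupies only one counter.
    """
    needed = set().union(*groups)
    return len(needed) <= num_counters


# Hypothetical event names, for illustration only.
ipc = {"cycles", "instructions"}
fe_bound = {"cycles", "uops_not_delivered"}
branch = {"instructions", "branch_misses"}

groups = [ipc, fe_bound, branch]
print(sum(len(g) for g in groups))  # counters needed if nothing is shared
print(fits_together(groups, 4))     # shared events are counted once
```

Here the three groups list 6 events in total but only 4 distinct ones, so
they fit on 4 counters.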

One last comment, not critical. From a parsing point of view the
colon in the proposed syntax

name : { counter1, counter2 }

is unnecessary. Just one more thing people can get wrong. How about
leaving it out? An open curly brace to indicate a group should be
sufficient.

2012-05-27 15:08:49

by Andi Kleen

Subject: Re: [RFCv2 0/8] perf tool: Add new event group management

> same counters over and over again (e.g., cycle count, instruction
> count, etc). Therefore it is important to recognize when two groups

Luckily these two are fixed counters, so they are always available
(at least as long as you disable the NMI watchdog)

-Andi

--
[email protected] -- Speaking for myself only.

2012-05-28 19:22:22

by Jiri Olsa

Subject: Re: [RFCv2 0/8] perf tool: Add new event group management

On Sun, May 27, 2012 at 03:56:22AM -0400, Ulrich Drepper wrote:
> On Sat, May 26, 2012 at 8:38 AM, Jiri Olsa <[email protected]> wrote:
> > If you have some ideas on this or real world examples,
> > that would really help.. so far, here's the latest discussion:
> > http://marc.info/?t=133357436900005&r=1&w=2
>
> If you're looking for a definitive source, just point to the Intel
> optimization manual. Absolute values of counters are not really
> useful and so they are defining many (50+) ratios which people should
> investigate. These ratios are only really accurate if the counters
> are swapped in and out at the same time.

thanks a lot for the pointer, very useful

>
> This reminds me of a detail I looked at when starting an
> implementation for this (glad you got more time to devote to it). The
> problem with ratios is that there are so many. So efficient
> scheduling is going to be important. Many ratios use as a base the
> same counters over and over again (e.g., cycle count, instruction
> count, etc). Therefore it is important to recognize when two groups
> can be scheduled concurrently: even if the total number of counters
> they list is too high, shared counters may make them fit.
>
> One last comment, not critical. From a parsing point of view the
> colon in the proposed syntax
>
> name : { counter1, counter2 }
>
> is unnecessary. Just one more thing people can get wrong. How about
> leaving it out? An open curly brace to indicate a group should be
> sufficient.

yep, we'll omit the first colon

I'll CC you guys on next patchset

thanks,
jirka

2012-05-29 08:39:31

by Peter Zijlstra

Subject: Re: [RFCv2 0/8] perf tool: Add new event group management

On Sun, 2012-05-27 at 03:56 -0400, Ulrich Drepper wrote:
> Therefore it is important to recognize when two groups
> can be scheduled concurrently: even if the total number of counters
> they list is too high, shared counters may make them fit.

We currently don't do counter multiplexing like that. I'd love to see a
readable and efficient implementation of that though, preferably in the
core and not the arch code.