2011-05-26 22:10:25

by David Ahern

[permalink] [raw]
Subject: perf: h/w counters not counted and no error/info reported to user


By default (tip-perf-core) perf-stat enables a number of H/W counters:

perf stat -- sleep 1

Performance counter stats for 'sleep 1':

1.346889 task-clock
1 context-switches
0 CPU-migrations
157 page-faults
<not counted> cycles
<not counted> stalled-cycles-frontend
<not counted> stalled-cycles-backend
493,740 instructions
137,062 branches
6,773 branch-misses

In this case 3 of them are not counted and perf does not tell me why.
I'm guessing that my processor (Core2 Duo T9550) does not support the
stalled cycles counters. Should perf not emit a message telling me that?

However, cycles is supported by the processor:

perf stat -e cycles -- sleep 1

Performance counter stats for 'sleep 1':

1,091,102 cycles

Based on that I am guessing that only the last N requested events are
counted where N is processor specific. In my case N is apparently 3 as
this works:

perf stat -e cycles -e instructions -e branches -- sleep 1

Performance counter stats for 'sleep 1':

1,028,331 cycles
634,101 instructions
130,253 branches

but adding a fourth causes the first one to roll off -- again without
any message from perf:

perf stat -e cycles -e instructions -e branches -e branch-misses -v --
sleep 1
Performance counter stats for 'sleep 1':

<not counted> cycles
219,727 instructions
128,071 branches
6,625 branch-misses

Which is confusing -- cycles are not counted, yet it should be given
that it is supported.

David


2011-05-27 06:31:49

by Corey Ashford

[permalink] [raw]
Subject: Re: perf: h/w counters not counted and no error/info reported to user

On 05/26/2011 03:10 PM, David Ahern wrote:
>
> By default (tip-perf-core) perf-stat enables a number of H/W counters:
>
> perf stat -- sleep 1
>
> Performance counter stats for 'sleep 1':
>
> 1.346889 task-clock
> 1 context-switches
> 0 CPU-migrations
> 157 page-faults
> <not counted> cycles
> <not counted> stalled-cycles-frontend
> <not counted> stalled-cycles-backend
> 493,740 instructions
> 137,062 branches
> 6,773 branch-misses
>
> In this case 3 of them are not counted and perf does not tell me why.
> I'm guessing that my processor (Core2 Duo T9550) does not support the
> stalled cycles counters. Should perf not emit a message telling me that?
>
> However, cycles is supported by the processor:
>
> perf stat -e cycles -- sleep 1
>
> Performance counter stats for 'sleep 1':
>
> 1,091,102 cycles

I suspect that what's happening is that "sleep 1" is not a very good
test case, because it's sleeping most of the time. The result is that
some of the events don't have time enough to get counted because not
enough system ticks occur while the "sleep 1" is running to get all of
the events to be scheduled onto the PMU.

Try a simple problem which spins, or otherwise chews up cycles, for a
second or so instead of sleeps. I think you will get more events counted.

- Corey

2011-05-27 09:50:42

by Ingo Molnar

[permalink] [raw]
Subject: Re: perf: h/w counters not counted and no error/info reported to user


* David Ahern <[email protected]> wrote:

> By default (tip-perf-core) perf-stat enables a number of H/W counters:
>
> perf stat -- sleep 1

did you mean to do 'perf stat -a sleep 1' ?

'perf stat sleep 1' in itself will only measure that very simple
command - which does not do much.

Thanks,

Ingo

2011-05-27 15:04:45

by David Ahern

[permalink] [raw]
Subject: Re: perf: h/w counters not counted and no error/info reported to user



On 05/27/2011 03:50 AM, Ingo Molnar wrote:
>
> * David Ahern <[email protected]> wrote:
>
>> By default (tip-perf-core) perf-stat enables a number of H/W counters:
>>
>> perf stat -- sleep 1
>
> did you mean to do 'perf stat -a sleep 1' ?

Adding the -a has its own problems. It blocks reading from fd=0 (stdin);
I have to type Ctrl-D to see the results. The FD array gets initialized
to 0 and some case is not erroring out or setting the fd.

perf stat -a -- sleep 1

Performance counter stats for 'sleep 1':

2008.600377 task-clock
545 context-switches
31 CPU-migrations
198 page-faults
101,938,898 cycles
140,734,186,560,480 stalled-cycles-frontend
0 stalled-cycles-backend
115,406,129 instructions

26,361,321 branches
1,068,804 branch-misses

1.003842524 seconds time elapsed

The stalled-cycles-frontend is off the chats, but shows a value.
Similarly, backed shows 0 versus <not counted>.


>
> 'perf stat sleep 1' in itself will only measure that very simple
> command - which does not do much.

Simple program for a simple test. I just wanted to know which counters
are available for a given processor.

Per your comment and Corey's I tried a program that does some work:

perf stat -- openssl speed aes

Performance counter stats for 'openssl speed aes':

44964.148608 task-clock
4,593 context-switches
48 CPU-migrations
487 page-faults
124,134,470,217 cycles
<not counted> stalled-cycles-frontend
<not counted> stalled-cycles-backend
319,440,401,206 instructions
6,710,107,937 branches
1,638,412 branch-misses

45.012092601 seconds time elapsed
This time I get data for 4 events -- which I did not get with sleep --
but I get <not counted> for the stalled cycles.

How is the <not counted> supposed to be interpreted?

Also, back to the 'perf stat -- sleep 1' example: If I run it with
individual -e <event> args I get values -- up to 3. Sure sleep is a
simple program, but some counter value comes out. What's magical about 3?

David


>
> Thanks,
>
> Ingo

2011-05-27 16:36:13

by David Ahern

[permalink] [raw]
Subject: Re: perf: h/w counters not counted and no error/info reported to user



On 05/27/2011 09:04 AM, David Ahern wrote:
> How is the <not counted> supposed to be interpreted?
>
> Also, back to the 'perf stat -- sleep 1' example: If I run it with
> individual -e <event> args I get values -- up to 3. Sure sleep is a
> simple program, but some counter value comes out. What's magical about 3?

Ok, I got it now. On some occasions for simple programs -- like sleep 1
-- the counter reads show:

cycles: 0 1066485 0

While the counter value is non-0, the time enabled and running values
are 0 and those latter two are used by stat to say "not-counted".

Unsupported events have 0's across the board. I'm working on a patch to
change <not counted> to <not supported> for events marked as unsupported.

David

2011-05-30 08:22:18

by Ingo Molnar

[permalink] [raw]
Subject: Re: perf: h/w counters not counted and no error/info reported to user


* David Ahern <[email protected]> wrote:

>
>
> On 05/27/2011 03:50 AM, Ingo Molnar wrote:
> >
> > * David Ahern <[email protected]> wrote:
> >
> >> By default (tip-perf-core) perf-stat enables a number of H/W counters:
> >>
> >> perf stat -- sleep 1
> >
> > did you mean to do 'perf stat -a sleep 1' ?
>
> Adding the -a has its own problems. It blocks reading from fd=0
> (stdin); I have to type Ctrl-D to see the results. The FD array
> gets initialized to 0 and some case is not erroring out or setting
> the fd.

Ok, that's a bug (and you sent the fix for it already) - weird that i
havent seen this before, i use perf stat -a sleep 1 type of analysis
all the time.

Thanks,

Ingo

2011-05-30 21:04:36

by Ashwin Chaugule

[permalink] [raw]
Subject: Re: perf: h/w counters not counted and no error/info reported to user

Hi David,

On Fri, May 27, 2011 at 12:36 PM, David Ahern <[email protected]> wrote:
>
> Ok, I got it now. On some occasions for simple programs -- like sleep 1
> -- the counter reads show:
>
> cycles: 0 1066485 0
>
> While the counter value is non-0, the time enabled and running values
> are 0 and those latter two are used by stat to say "not-counted".

I've seen time-enabled = 0 too, but I suspected that's because on my
kernel the sched_clock implementation was missing. (defaults to a
jiffies based timer)
I saw your patch for initializing the fd's to -1's, but I'm unable to
see how that makes the value of time-enabled non zero ?

Cheers,
Ashwin

2011-05-31 00:57:25

by David Ahern

[permalink] [raw]
Subject: Re: perf: h/w counters not counted and no error/info reported to user



On 05/30/2011 03:04 PM, Ashwin Chaugule wrote:
> Hi David,
>
> On Fri, May 27, 2011 at 12:36 PM, David Ahern <[email protected]> wrote:
>>
>> Ok, I got it now. On some occasions for simple programs -- like sleep 1
>> -- the counter reads show:
>>
>> cycles: 0 1066485 0
>>
>> While the counter value is non-0, the time enabled and running values
>> are 0 and those latter two are used by stat to say "not-counted".
>
> I've seen time-enabled = 0 too, but I suspected that's because on my
> kernel the sched_clock implementation was missing. (defaults to a
> jiffies based timer)

I think in the case of 'perf stat -- sleep 1' it runs so quickly that
sometimes it does not cross timer ticks and the time_enabled comes out 0.

> I saw your patch for initializing the fd's to -1's, but I'm unable to
> see how that makes the value of time-enabled non zero ?

It doesn't. That patch addresses unsupported counters in the default
list (e.g., -ddd on a T9550 Core2Duo). The first open fails (cpu 0,
thread 0) and the entire FD array is not reset to -1 (from its
initialization to 0).

David

>
> Cheers,
> Ashwin