Hi folks,
First let me state that I'm working with a Linux kernel which is a couple of
weeks old and came from Linus' post-2.6.31 release tree. It does contain the
name change from perf_counter to perf_event, for example.
I'm working on some optimizations of PAPI's use of perf_events. In particular I
want to speed up the reading of events. To do this, I am doing two things:
1) Not disabling and enabling the counters around reading the counters. In
other words, read them on the fly.
2) Use the PERF_FORMAT_GROUP option so that I can read up an entire group of
counters in one syscall.
This seems to be working fairly well, except it seems that sometimes the "time
running" and "time enabled" values that I read up are zero, even though the
count is non-zero. The case I'm looking at has three groups, and it's the third
group, containing a single event, which is getting the zero enabled/running values.
One experiment that I just now tried was to put the code back in that
disables/re-enables the counters around the code that reads the counters. This
seems to have fixed the problem, but causes a performance degradation, of course.
Is this expected behavior? Should I be able to read the counts, and times on
the fly like I want to be able to do?
Thanks for your consideration of this issue.
- Corey
Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
[email protected]
* Corey Ashford <[email protected]> wrote:
>> Btw., you might want to consider putting such testcases into a new
>> 'perf test' kind of utility (under tools/perf/builtin-test.c) that
>> does various tests such as reading a count and validating the
>> time-running/time-enabled values - so that we can see any bugs
>> directly.
>
> I'll see if I can put the test case in some sort of acceptable format
> like that. Maybe we can use this one test case as the starting point
> for builtin-test.c.
Yeah. Feel free to add just something minimal that solves your problem.
Others will add their testcases too i'm sure, once the basic command is
there.
Ingo
On 10/23/2009 01:57 PM, Ingo Molnar wrote:
>
> * Corey Ashford<[email protected]> wrote:
>
>>> Btw., you might want to consider putting such testcases into a new
>>> 'perf test' kind of utility (under tools/perf/builtin-test.c) that
>>> does various tests such as reading a count and validating the
>>> time-running/time-enabled values - so that we can see any bugs
>>> directly.
>>
>> I'll see if I can put the test case in some sort of acceptable format
>> like that. Maybe we can use this one test case as the starting point
>> for builtin-test.c.
>
> Yeah. Feel free to add just something minimal that solves your problem.
> Others will add their testcases too i'm sure, once the basic command is
> there.
>
> Ingo
Hi Ingo and Paul,
I followed the directions Ingo provided for getting the -tip tree and
then attempted to build the perf tool that's in it and it doesn't build.
I'm getting this error on a Power5 box:
../../lib/find_next_bit.c: In function ?ext2_swabp?:
../../lib/find_next_bit.c:168: warning: passing argument 1 of
?__swab64p? from incompatible pointer type
make: *** [util/find_next_bit.o] Error 1
I looked at the code and it looks ok to me. There's a cast from an
"unsigned long *" to a "unsigned long long *", but those types should be
of the same size (in -m64 mode), so I don't know why this is failing.
Some good news is that I have created a simple test case which
demonstrates the bug, but I haven't formulated it in the builtin-test.c
format yet, since I have been able to build perf.
If you toss the attached file into the perf source directory and compile
it as follows:
gcc -m64 -o petb perf_events_time_bug.c -lrt
it will show the error:
% ./petb
[0] = 1
[1] = 0 <- time enabled
[2] = 0 <- time running
[3] = 4559d1f8
[4] = 1
test failed!
perf_events_time_bug.c -lrt
If you compile it with:
gcc -DDO_DISABLE -m66 -o petb perf_events_time_bug.c -lrt
This will disable the event before reading it and then you will get the
correct times.
% ./petb
[0] = 1
[1] = 2a50c1c8 <- time enabled
[2] = 2a50c1c8 <- time running
[3] = 460dea97
[4] = 2
test passed!
--
Regards,
- Corey
Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
[email protected]
Anyone had a chance to look at this test case yet?
Regards,
- Corey
Corey Ashford wrote:
> On 10/23/2009 01:57 PM, Ingo Molnar wrote:
>>
>> * Corey Ashford<[email protected]> wrote:
>>
>>>> Btw., you might want to consider putting such testcases into a new
>>>> 'perf test' kind of utility (under tools/perf/builtin-test.c) that
>>>> does various tests such as reading a count and validating the
>>>> time-running/time-enabled values - so that we can see any bugs
>>>> directly.
>>>
>>> I'll see if I can put the test case in some sort of acceptable format
>>> like that. Maybe we can use this one test case as the starting point
>>> for builtin-test.c.
>>
>> Yeah. Feel free to add just something minimal that solves your problem.
>> Others will add their testcases too i'm sure, once the basic command is
>> there.
>>
>> Ingo
>
> Hi Ingo and Paul,
>
> I followed the directions Ingo provided for getting the -tip tree and
> then attempted to build the perf tool that's in it and it doesn't build.
> I'm getting this error on a Power5 box:
>
> ../../lib/find_next_bit.c: In function ?ext2_swabp?:
> ../../lib/find_next_bit.c:168: warning: passing argument 1 of
> ?__swab64p? from incompatible pointer type
> make: *** [util/find_next_bit.o] Error 1
>
> I looked at the code and it looks ok to me. There's a cast from an
> "unsigned long *" to a "unsigned long long *", but those types should be
> of the same size (in -m64 mode), so I don't know why this is failing.
>
> Some good news is that I have created a simple test case which
> demonstrates the bug, but I haven't formulated it in the builtin-test.c
> format yet, since I have been able to build perf.
>
> If you toss the attached file into the perf source directory and compile
> it as follows:
>
> gcc -m64 -o petb perf_events_time_bug.c -lrt
>
> it will show the error:
> % ./petb
> [0] = 1
> [1] = 0 <- time enabled
> [2] = 0 <- time running
> [3] = 4559d1f8
> [4] = 1
> test failed!
>
> perf_events_time_bug.c -lrt
>
> If you compile it with:
>
> gcc -DDO_DISABLE -m66 -o petb perf_events_time_bug.c -lrt
>
> This will disable the event before reading it and then you will get the
> correct times.
>
> % ./petb
> [0] = 1
> [1] = 2a50c1c8 <- time enabled
> [2] = 2a50c1c8 <- time running
> [3] = 460dea97
> [4] = 2
> test passed!
>
--
Regards,
- Corey
Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
503-578-3507
[email protected]
On Sun, Oct 25, 2009 at 07:37:09PM -0700, Corey Ashford wrote:
> I'm getting this error on a Power5 box:
>
> ../../lib/find_next_bit.c: In function ‘ext2_swabp’:
> ../../lib/find_next_bit.c:168: warning: passing argument 1 of
> ‘__swab64p’ from incompatible pointer type
> make: *** [util/find_next_bit.o] Error 1
>
> I looked at the code and it looks ok to me. There's a cast from an
> "unsigned long *" to a "unsigned long long *", but those types should be
> of the same size (in -m64 mode), so I don't know why this is failing.
Yeah, it's ok because we have the BITS_PER_LONG check. But gcc is not
aware of that and just consider the cast as breaking the strict aliasing
rules.
Can you tell me if the following patch works for you? (only for
this build error).
Thanks.
diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 800783d..167bc04 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -846,8 +846,9 @@ util/bitmap.o: ../../lib/bitmap.c PERF-CFLAGS
util/hweight.o: ../../lib/hweight.c PERF-CFLAGS
$(QUIET_CC)$(CC) -o util/hweight.o -c $(ALL_CFLAGS) -DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
+KFNB_FLAGS = `echo $(ALL_CFLAGS) | sed s/-Wstrict-aliasing=3//`
util/find_next_bit.o: ../../lib/find_next_bit.c PERF-CFLAGS
- $(QUIET_CC)$(CC) -o util/find_next_bit.o -c $(ALL_CFLAGS) -DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
+ $(QUIET_CC)$(CC) -o util/find_next_bit.o -c $(KFNB_FLAGS) -DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
perf-%$X: %.o $(PERFLIBS)
$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
Frederic Weisbecker wrote:
> On Sun, Oct 25, 2009 at 07:37:09PM -0700, Corey Ashford wrote:
>> I'm getting this error on a Power5 box:
>>
>> ../../lib/find_next_bit.c: In function ‘ext2_swabp’:
>> ../../lib/find_next_bit.c:168: warning: passing argument 1 of
>> ‘__swab64p’ from incompatible pointer type
>> make: *** [util/find_next_bit.o] Error 1
>>
>> I looked at the code and it looks ok to me. There's a cast from an
>> "unsigned long *" to a "unsigned long long *", but those types should be
>> of the same size (in -m64 mode), so I don't know why this is failing.
>
>
> Yeah, it's ok because we have the BITS_PER_LONG check. But gcc is not
> aware of that and just consider the cast as breaking the strict aliasing
> rules.
>
> Can you tell me if the following patch works for you? (only for
> this build error).
Thanks for looking into the build problem, Frederic. With your patch, I'm still
seeing the error:
% patch -p1 < ~/patch
patching file tools/perf/Makefile
Hunk #1 succeeded at 845 (offset -1 lines).
% cd tools/perf
~/linux.trees.git/tools/perf
% make
Makefile:472: No libdwarf.h found or old libdwarf.h found, disables dwarf
support. Please install libdwarf-dev/libdwarf-devel >= 20081231
CC util/find_next_bit.o
cc1: warnings being treated as errors
../../lib/find_next_bit.c: In function 'ext2_swabp':
../../lib/find_next_bit.c:168: warning: passing argument 1 of '__swab64p' from
incompatible pointer type
make: *** [util/find_next_bit.o] Error 1
elm3c4.beaverton.ibm.com:corey-13% cat ~/patch
diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 800783d..167bc04 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -846,8 +846,9 @@ util/bitmap.o: ../../lib/bitmap.c PERF-CFLAGS
util/hweight.o: ../../lib/hweight.c PERF-CFLAGS
$(QUIET_CC)$(CC) -o util/hweight.o -c $(ALL_CFLAGS)
-DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
+KFNB_FLAGS = `echo $(ALL_CFLAGS) | sed s/-Wstrict-aliasing=3//`
util/find_next_bit.o: ../../lib/find_next_bit.c PERF-CFLAGS
- $(QUIET_CC)$(CC) -o util/find_next_bit.o -c $(ALL_CFLAGS)
-DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
+ $(QUIET_CC)$(CC) -o util/find_next_bit.o -c $(KFNB_FLAGS)
-DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
perf-%$X: %.o $(PERFLIBS)
$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^)
$(LIBS)
>
>
> diff --git a/tools/perf/Makefile b/tools/perf/Makefile
> index 800783d..167bc04 100644
> --- a/tools/perf/Makefile
> +++ b/tools/perf/Makefile
> @@ -846,8 +846,9 @@ util/bitmap.o: ../../lib/bitmap.c PERF-CFLAGS
> util/hweight.o: ../../lib/hweight.c PERF-CFLAGS
> $(QUIET_CC)$(CC) -o util/hweight.o -c $(ALL_CFLAGS) -DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
>
> +KFNB_FLAGS = `echo $(ALL_CFLAGS) | sed s/-Wstrict-aliasing=3//`
> util/find_next_bit.o: ../../lib/find_next_bit.c PERF-CFLAGS
> - $(QUIET_CC)$(CC) -o util/find_next_bit.o -c $(ALL_CFLAGS) -DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
> + $(QUIET_CC)$(CC) -o util/find_next_bit.o -c $(KFNB_FLAGS) -DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
>
> perf-%$X: %.o $(PERFLIBS)
> $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
>
>
--
Regards,
- Corey
Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
503-578-3507
[email protected]