LinuxLists.cc - [PATCH 0/9] Performance counters for POWER

2009-01-09 10:41:42

Subject: [PATCH 0/9] Performance counters for POWER

The following series of patches extends Ingo and Thomas's performance
counter framework to add support for 64-bit POWER processors.
Currently I have the PPC970 family and POWER6 done.

The approach I have taken is to do the constraint checking and the
search through the space of alternative event codes as each group of
counters is added at the time a task is scheduled in. That means we
are potentially doing the search several times in a row, with
interrupts disabled. I think it will be OK since there are only a few
events that have alternatives (and not many of them), and the
constraint checking is fast since it is just simple integer
operations. However, one of the things I plan to do is to instrument
that code to find out how long it takes in the worst case. (If it
takes too long then I will need some major changes to the generic
code.)
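
[Editorial sketch, not taken from the patch series.] The search described above can be pictured as a small backtracking loop over each event's alternative encodings, with the constraint check reduced to bitmask arithmetic. The `struct event` layout, the `MAX_ALT` cap, and the `place()` and `bit_index()` helpers are all hypothetical names chosen for illustration; the actual POWER code may encode its constraints quite differently.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ALT 4 /* hypothetical cap on alternative encodings per event */

/* Hypothetical event description: alternative a can run on the counters
 * whose bits are set in pmc_mask[a] (bit i set = counter i is usable). */
struct event {
    int      n_alt;
    uint32_t pmc_mask[MAX_ALT];
};

/* Index of the single set bit in `bit`. */
static int bit_index(uint32_t bit)
{
    int i = 0;
    while (bit >>= 1)
        i++;
    return i;
}

/* Try to place ev[idx..n) on counters not yet marked in `used`.
 * On success choice[i] is the alternative picked for event i and pmc[i]
 * the counter it got. Returns 1 on success, 0 if nothing fits. */
static int place(const struct event *ev, size_t n, size_t idx,
                 uint32_t used, int *choice, int *pmc)
{
    if (idx == n)
        return 1;
    for (int a = 0; a < ev[idx].n_alt; a++) {
        /* Constraint check: plain integer operations on bitmasks. */
        uint32_t usable = ev[idx].pmc_mask[a] & ~used;
        while (usable) {
            uint32_t bit = usable & (~usable + 1u); /* lowest set bit */
            choice[idx] = a;
            pmc[idx] = bit_index(bit);
            if (place(ev, n, idx + 1, used | bit, choice, pmc))
                return 1;
            usable &= ~bit; /* backtrack and try the next counter */
        }
    }
    return 0;
}
```

Calling `place(events, n, 0, 0, choice, pmc)` starts the search with no counters in use. The worst case grows with the number of alternatives per event, which is why a small number of alternatives keeps the search cheap.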

This series is also available via git at:

git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters.git

in the master branch.

Paul.

2009-01-09 13:28:50

by Peter Zijlstra

Subject: Re: [PATCH 0/9] Performance counters for POWER

On Fri, 2009-01-09 at 21:40 +1100, Paul Mackerras wrote:
> The following series of patches extends Ingo and Thomas's performance
> counter framework to add support for 64-bit POWER processors.
> Currently I have the PPC970 family and POWER6 done.
>
> The approach I have taken is to do the constraint checking and the
> search through the space of alternative event codes as each group of
> counters is added at the time a task is scheduled in.

Hmm, the model I thought would make most sense for power and other
machines with such heavy constraints was that you'd compose a register
set when you create groups, and then when you RR the groups, you just
program the pre-computed sets.

The create code already has hooks to validate constraints -- so that you
cannot create a group that would never fit on the machine. If you use
this to generate and store a register set, you'd only need to program
them in the counter scheduler.

My current understanding of the counter scheduler is that it RR groups,
each schedule event throws out the last state, queuing whatever groups
it had at the tail, and then adds as many possible new groups from the
head of the list. Which by the above constraint checking is guaranteed
to be at least 1.
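
[Editorial sketch, not the kernel's code.] The round-robin behaviour described above can be modelled with a circular list and a head index. The `struct rr_queue` layout, the `NGROUPS_MAX` limit, and the idea of reducing each group to a plain count of counters it needs are assumptions made for illustration only; real constraints on POWER are richer than a simple count.

```c
#include <stddef.h>

#define NGROUPS_MAX 16 /* hypothetical limit for this sketch */

struct rr_queue {
    int    need[NGROUPS_MAX]; /* counters each group needs (assumed model) */
    size_t len;               /* number of groups in the list */
    size_t head;              /* index of the group at the head */
};

/* One scheduling event: take groups from the head while they still fit
 * into hw_counters, then advance head past them, which is equivalent to
 * requeuing them at the tail for the next event.
 * Returns how many groups were put on the hardware. */
static size_t rr_schedule(struct rr_queue *q, int hw_counters)
{
    size_t taken = 0;
    int free_counters = hw_counters;

    if (q->len == 0)
        return 0;

    while (taken < q->len) {
        int need = q->need[(q->head + taken) % q->len];
        if (need > free_counters)
            break; /* next group does not fit; stop here */
        free_counters -= need;
        taken++;
    }
    /* If every group was validated at creation time to fit on an empty
     * machine, taken is at least 1 and head always makes progress. */
    q->head = (q->head + taken) % q->len;
    return taken;
}
```

With groups needing 2, 1, 3 and 1 counters on a machine with 4 counters, the first event schedules the first two groups and the next event schedules the remaining two.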

With pre-computed register sets it would be hard to go beyond one group at
a time, so I imagine your current approach is more flexible, but
I worry about the cost.

> That means we
> are potentially doing the search several times in a row, with
> interrupts disabled. I think it will be OK since there are only a few
> events that have alternatives (and not many of them), and the
> constraint checking is fast since it is just simple integer
> operations.

> However, one of the things I plan to do is to instrument
> that code to find out how long it takes in the worst case. (If it
> takes too long then I will need some major changes to the generic
> code.)

Right, esp on high context switch rates it might dominate the machine.

2009-01-09 13:47:48

by Ingo Molnar

Subject: Re: [PATCH 0/9] Performance counters for POWER

* Paul Mackerras wrote:

> The following series of patches extends Ingo and Thomas's performance
> counter framework to add support for 64-bit POWER processors. Currently
> I have the PPC970 family and POWER6 done.

Cool stuff!

> The approach I have taken is to do the constraint checking and the
> search through the space of alternative event codes as each group of
> counters is added at the time a task is scheduled in. That means we are
> potentially doing the search several times in a row, with interrupts
> disabled. I think it will be OK since there are only a few events that
> have alternatives (and not many of them), and the constraint checking is
> fast since it is just simple integer operations. However, one of the
> things I plan to do is to instrument that code to find out how long it
> takes in the worst case. (If it takes too long then I will need some
> major changes to the generic code.)

Sounds like a very good approach to me. I think the core code wants to be
optimistic towards the non-presence of scheduling constraints. So as
hardware improves and evolves [which we all hope it does], so will
hopefully the constraint related overhead become smaller.

> This series is also available via git at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters.git
>
> in the master branch.

Great work Paul!

Do timec.c and kerneltop.c work fine for you by any chance? If yes, could
you send us some sample output that you get with them on your power
testbox(es)?

Also, would this be the right moment for me to pull from you?

Your modifications to kernel/perf_counter.c are all fixes and sensible
extensions, and i expect the x86 side should continue to work just fine,
so i'd like to pull this ASAP :-)

Ingo

2009-01-09 23:45:46

by Paul Mackerras

Subject: Re: [PATCH 0/9] Performance counters for POWER

Ingo Molnar writes:

> Do timec.c and kerneltop.c work fine for you by any chance? If yes, could
> you send us some sample output that you get with them on your power
> testbox(es)?

timec.c works with some minor modifications. I'll post my version in
a separate message. The main difference is that syscall() on powerpc
returns -1 on error and sets errno, rather than returning a negative
error number. I also added a way to specify raw events.
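
[Editorial sketch; the actual timec.c changes are not shown in this thread.] One way a test program could cope with both return conventions is to fold them into a single "negative error number on failure" form. The helper name `normalize_syscall_ret()` is hypothetical, and the sketch assumes the caller sets errno to 0 immediately before making the system call, so that a stale errno is never mistaken for a fresh failure.

```c
#include <errno.h>

/* Fold both conventions into "negative error number on failure":
 *  - a wrapper that returns -1 and sets errno (as described for powerpc);
 *  - a wrapper that already returns the negative error number.
 * Assumption: the caller cleared errno right before the system call. */
static long normalize_syscall_ret(long ret)
{
    if (ret == -1 && errno != 0)
        return -errno; /* -1/errno style: convert to -ERRNO */
    return ret;        /* success value, or already a negative error */
}
```

A caller would then write `errno = 0; ret = normalize_syscall_ret(syscall(...));` and test `ret < 0` regardless of architecture.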

I haven't looked in detail at kerneltop.c yet. I have done a small
test program that exercises interrupt counters though.

> Also, would this be the right moment for me to pull from you?

Actually, I have discovered some bugs, so I'll need to do new versions
of 7/9 and 8/9. I'll let you know when I have redone the git tree.

Paul.

2009-01-10 02:00:59

by Paul Mackerras

Subject: Re: [PATCH 0/9] Performance counters for POWER

Peter Zijlstra writes:

> Hmm, the model I thought would make most sense for power and other
> machines with such heavy constraints was that you'd compose a register
> set when you create groups, and then when you RR the groups, you just
> program the pre-computed sets.

Groups are user-created entities, and in practice (at least so far)
most groups consist of just a single counter. I don't want to build
in a limitation that you can only count one thing at a time unless you
construct a multi-counter group. (There is no easy way to combine the
register settings for two groups, short of working out the register
settings for the combined set of events from scratch.)

It would be possible to have an entity that describes a set of groups
(i.e. the result of a scheduling decision) and precompute register
settings for that. That's the kind of thing I was alluding to when I
talked about major changes to the generic code. But that entity is a
distinct thing from the current notion of a "group".

> Right, esp on high context switch rates it might dominate the machine.

Currently the constraint check/alternative search seems to take a
fraction of a microsecond, so I'm hopeful it'll be OK. We'll see.

Paul.

2009-01-10 05:51:29

by Paul Mackerras

Subject: Re: [PATCH 0/9] Performance counters for POWER

Ingo Molnar writes:

> Also, would this be the right moment for me to pull from you?

I fixed the bugs and pushed out the updated commits, so yes please do
the pull. Note that I did a forced update, so fetch again if you
fetched previously.

Paul.

2009-01-11 01:44:47

by Ingo Molnar

Subject: Re: [PATCH 0/9] Performance counters for POWER

2009-01-11 01:48:44

by Ingo Molnar

Subject: Re: [PATCH 0/9] Performance counters for POWER

* Paul Mackerras wrote:

> Ingo Molnar writes:
>
> > Do timec.c and kerneltop.c work fine for you by any chance? If yes, could
> > you send us some sample output that you get with them on your power
> > testbox(es)?
>
> timec.c works with some minor modifications. I'll post my version in a
> separate message. The main difference is that syscall() on powerpc
> returns -1 on error and sets errno, rather than returning a negative
> error number. I also added a way to specify raw events.

thanks - i've picked up your version and uploaded it to:

http://redhat.com/~mingo/perfcounters/timec.c

Ingo