2009-12-01 12:03:24

by Jamie Iles

Subject: Perf events/ARM

Hi,

I'm looking at adding support for the hardware performance counters in ARMv6
using the new perf events framework. I have a simple setup that uses the
counters on their own, but wrt the perf events framework:

- what are the requirements of set_perf_event_pending() and
perf_event_do_pending()? As far as I can tell from sparc/x86/powerpc,
set_perf_event_pending() triggers an interrupt that then calls
perf_event_do_pending(). Does perf_event_do_pending need to run in
interrupt context or could I use a soft IRQ if platforms don't have a
spare IRQ?

- ARM does not have proper support for atomic64's. Other than
performance, would there be any known problems with using the generic
spinlocked atomic64's?

Thanks,

Jamie


2009-12-01 14:31:19

by Ingo Molnar

Subject: Re: Perf events/ARM


* Jamie Iles <[email protected]> wrote:

> Hi,
>
> I'm looking at adding support for the hardware performance counters in ARMv6
> using the new perf events framework. I have a simple setup that uses the
> counters on their own, but wrt the perf events framework:
>
> - what are the requirements of set_perf_event_pending() and
> perf_event_do_pending()? As far as I can tell from sparc/x86/powerpc,
> set_perf_event_pending() triggers an interrupt that then calls
> perf_event_do_pending(). Does perf_event_do_pending need to run in
> interrupt context or could I use a soft IRQ if platforms don't have a
> spare IRQ?

softirq would be fine too i suspect - but then you need to increase the
buffering of perf_pending_head, as multiple hardirqs could hit before
the softirq processing has finished.

As that gets complex quick, an acceptable first-order approach would be
to just ignore those lost events and run it from a softirq - i _think_
everything should be OK.

> - ARM does not have proper support for atomic64's. Other than
> performance, would there be any known problems with using the generic
> spinlocked atomic64's?

Not a problem at all. Even performance-wise they are pretty nice - Paul
has done a nice job hashing it along 16 spinlocks - so for small SMP
systems there should be no global cacheline bounce.

Ingo

2009-12-01 14:40:43

by Peter Zijlstra

Subject: Re: Perf events/ARM

On Tue, 2009-12-01 at 15:31 +0100, Ingo Molnar wrote:
> * Jamie Iles <[email protected]> wrote:
>
> > Hi,
> >
> > I'm looking at adding support for the hardware performance counters in ARMv6
> > using the new perf events framework. I have a simple setup that uses the
> > counters on their own, but wrt the perf events framework:
> >
> > - what are the requirements of set_perf_event_pending() and
> > perf_event_do_pending()? As far as I can tell from sparc/x86/powerpc,
> > set_perf_event_pending() triggers an interrupt that then calls
> > perf_event_do_pending(). Does perf_event_do_pending need to run in
> > interrupt context or could I use a soft IRQ if platforms don't have a
> > spare IRQ?
>
> softirq would be fine too i suspect - but then you need to increase the
> buffering of perf_pending_head, as multiple hardirqs could hit before
> the softirq processing has finished.
>
> As that gets complex quick, an acceptable first-order approach would be
> to just ignore those lost events and run it from a softirq - i _think_
> everything should be OK.

Things like wakeups and ->event_limit might get delayed.

Delayed wakeups can be mitigated by larger buffers, delayed disable on
->event_limit is not something you can fix up.

Does your PMU generate regular interrupts or actual NMIs? If it's normal
interrupts you can simply call perf_event_do_pending() at the
pmu-interrupt tail.

x86 does a self-ipi to get from NMI context into IRQ context as fast as
possible, simply because you cannot do very much from NMI context.

> > - ARM does not have proper support for atomic64's. Other than
> > performance, would there be any known problems with using the generic
> > spinlocked atomic64's?
>
> Not a problem at all. Even performance-wise they are pretty nice - Paul
> has done a nice job hashing it along 16 spinlocks - so for small SMP
> systems there should be no global cacheline bounce.

Depends, again if your PMU generates NMIs a spinlock'ed version won't
work.
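
[Archive note] Why a spinlocked atomic64 cannot be used from an NMI can be shown with a small userspace simulation. All names are hypothetical and nothing here is kernel code. An NMI can arrive while the interrupted code already holds the lock on the same CPU; a handler that spins on that lock would never return, because the holder cannot run until the handler finishes. The sketch uses a trylock in the simulated handler so that the failure is reported instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Single lock word standing in for a spinlock; 0 = free, 1 = held. */
static atomic_int sketch_nmi_lock;

static bool sketch_nmi_trylock(void)
{
    return atomic_exchange_explicit(&sketch_nmi_lock, 1,
                                    memory_order_acquire) == 0;
}

static void sketch_nmi_unlock(void)
{
    atomic_store_explicit(&sketch_nmi_lock, 0, memory_order_release);
}

/*
 * Simulated NMI handler that wants to update a lock-protected counter.
 * Returns false where a real spinning acquire would deadlock.
 */
bool sketch_nmi_try_update(void)
{
    if (!sketch_nmi_trylock())
        return false;   /* lock is held by the code we interrupted */
    sketch_nmi_unlock();
    return true;
}

/*
 * Simulated normal-context update that is "interrupted" by the NMI
 * handler while it holds the lock. Returns what the handler saw.
 */
bool sketch_update_hit_by_nmi(void)
{
    bool got = sketch_nmi_trylock();
    assert(got);        /* nobody else holds the lock in this sketch */
    (void)got;
    bool nmi_ok = sketch_nmi_try_update();  /* the NMI lands here */
    sketch_nmi_unlock();
    return nmi_ok;
}
```

With regular (maskable) interrupts the problem does not arise, provided the lock is taken with interrupts disabled, since the handler then cannot run inside the critical section.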

2009-12-01 14:51:50

by Jamie Iles

Subject: Re: Perf events/ARM

Pete, Ingo,

On Tue, Dec 01, 2009 at 03:40:44PM +0100, Peter Zijlstra wrote:
> Things like wakeups and ->event_limit might get delayed.
>
> Delayed wakeups can be mitigated by larger buffers, delayed disable on
> ->event_limit is not something you can fix up.
>
> Does your PMU generate regular interrupts or actual NMIs? If it's normal
> interrupts you can simply call perf_event_do_pending() at the
> pmu-interrupt tail.

The PMU generates regular interrupts, and on the platform I'm working on
these go to a regular vectored interrupt controller, all sharing the same
vector. So I'll add a call to perf_event_do_pending() at the tail of the
PMU IRQ handler and leave set_perf_event_pending() as a no-op.

Thanks,

Jamie
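
[Archive note] The approach settled on in this thread, draining pending work at the tail of the PMU interrupt handler while set_perf_event_pending() does nothing, might be sketched as the userspace simulation below. The sketch_ names, the overflow bitmask and the simple counters standing in for the pending list are all assumptions made for illustration; the real ARM handler and perf core interfaces are not reproduced here.

```c
/* Simulated state: entries waiting to be processed, and entries processed. */
static unsigned int sketch_pending;
static unsigned int sketch_processed;

/*
 * Stand-in for set_perf_event_pending(): deliberately a no-op, because
 * the handler below always drains the pending work before it returns.
 */
void sketch_set_perf_event_pending(void)
{
}

/* Queue one unit of deferred work (e.g. a wakeup) and "kick" processing. */
static void sketch_queue_pending(void)
{
    sketch_pending++;
    sketch_set_perf_event_pending();
}

/* Stand-in for perf_event_do_pending(): process everything queued so far. */
void sketch_perf_event_do_pending(void)
{
    while (sketch_pending > 0) {
        sketch_pending--;
        sketch_processed++;
    }
}

/*
 * Simulated PMU interrupt handler. 'overflow_mask' has one bit set per
 * counter that overflowed. Returns the number of counters handled.
 */
unsigned int sketch_pmu_irq_handler(unsigned int overflow_mask)
{
    unsigned int handled = 0;

    for (; overflow_mask != 0; overflow_mask >>= 1) {
        if (overflow_mask & 1u) {
            sketch_queue_pending();  /* overflow handling defers some work */
            handled++;
        }
    }

    /* Tail call: we are in regular IRQ context, so run the work now. */
    sketch_perf_event_do_pending();
    return handled;
}
```

This only works because the PMU raises ordinary interrupts. With NMIs the tail call would run in NMI context, which is why x86 uses a self-IPI instead.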