2007-02-13 18:50:37

by Stephane Eranian

Subject: 2.6.20 new perfmon code base + libpfm + pfmon

Hello,

I have released another version of the perfmon new code base package.
This version of the kernel patch is relative to 2.6.20.

This new kernel patch includes the following new features and
bug fixes:
- first cut at supporting Oprofile on i386 and x86-64 architectures
- several internal interface simplifications
- various MIPS updates (Phil Mucci/Manoj Ekbote)
- various PPC32 updates (Phil Mucci)
- fix bug in set switching with a single set
- fix bug in pfm_restart() for per-thread mode with blocking notification

Unfortunately, this release does not build for PowerPC due to a problem with the
TIF_* flags. Perfmon adds 2 new flags, which makes the TIF flags use more than 16 bits,
and that causes problems with some assembly instructions in entry_64.S. Hopefully,
this will be fixed in the next release.
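
A minimal sketch of why crossing the 16-bit boundary can matter, assuming the affected instructions encode the flag mask in a 16-bit immediate field (the announcement does not say which instructions are involved). The flag numbers and the ex_* helpers below are hypothetical, not the real TIF_* definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag numbers, for illustration only (not the real TIF_* values). */
#define EX_TIF_EXISTING_LAST 15 /* last flag that still fits in the low 16 bits */
#define EX_TIF_NEW_A         16 /* first of the two hypothetical new flags */
#define EX_TIF_NEW_B         17 /* second hypothetical new flag */

/* Build a single-bit mask from a flag number (bit must be below 32). */
static inline uint32_t ex_flag_mask(unsigned int bit)
{
    return UINT32_C(1) << bit;
}

/* True if the mask can be encoded in a 16-bit unsigned immediate field. */
static inline bool ex_fits_imm16(uint32_t mask)
{
    return (mask & ~UINT32_C(0xffff)) == 0;
}
```

Under that assumption, a test on flag 15 can still use a single immediate-form instruction, while a test on flag 16 or 17 needs the mask loaded into a register first or a different instruction.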

To make Oprofile work, you need a modified user-level Oprofile package. I have made
a first pass at modifying 0.9.2 to work on Perfmon 2.3 (and v2.0 for IA-64) for
the following processors: AMD Opteron, P6, Core Duo, Core 2 Duo, P4. The modified
package is available as an alpha release at:

ftp://ftp.hpl.hp.com/pub/linux-ia64/oprof-perfmon2-070122.diff

I have also released a new libpfm, libpfm-3.2-070206, with lots of
changes. Here are some of the most important ones:
- Full Intel Core 2 Duo event table (Thanks to Dan Terpstra for his help)
- Full event table for P6 and Pentium M (Dan Terpstra)
- various MIPS updates (Phil Mucci)
- improved Montecito event->counter assignment routine
- rewritten P6 event->counter assignment routine
- extended detect_unavail_pmcs() code to handle pmds
- extended whichpmu.c example to show implemented vs. available registers
- possibility to force a libpfm PMU different from the host PMU
- several bug fixes to library and examples

Also a new version of pfmon, pfmon-3.2-070206, with a few changes:
- rewrote the automatic sampling buffer sizing based on
the resource limits (rlimits) and perfmon global settings
- Intel Core 2 Duo event listing now shows if event supports PEBS
- fixed important memory leak that showed up when monitoring across fork/pthread

You can get the packages and more detailed changelogs on our web site:

http://perfmon2.sf.net (go to Project Files)

Enjoy,

PS: home page not yet updated due to a problem connecting via ssh to SF.net today.

--

-Stephane


2007-02-13 21:48:57

by William Cohen

Subject: Re: [perfmon] 2.6.20 new perfmon code base + libpfm + pfmon

Stephane Eranian wrote:
> Hello,
>
> I have released another version of the perfmon new code base package.
> This version of the kernel patch is relative to 2.6.20.
>
> This new kernel patch includes the following new features and
> bug fixes:
> - first cut at supporting Oprofile on i386 and x86-64 architectures
> - several internal interface simplifications
> - various MIPS updates (Phil Mucci/Manoj Ekbote)
> - various PPC32 updates (Phil Mucci)
> - fix bug in set switching with a single set
> - fix bug in pfm_restart() for per-thread mode with blocking notification
>
> Unfortunately, this release does not build for PowerPC due to a problem with the
> TIF_* flags. Perfmon adds 2 new flags which make the TIF now use more than 16 bits which
> causes problem with some assembly instructions in entry_64.S. Hopefully,
> this will be fixed in the next release.
>
> To make Oprofile work, you need a modified user level Oprofile package. I have made
> a first pass at modifying 0.9.2 to work on Perfmon 2.3 (and v2.0 for IA-64) for
> the following processors: AMD Opteron, P6, Core Duo, Core 2 Duo, P4. The modified
> package is available as Alpha at:
>
> ftp://ftp.hpl.hp.com/pub/linux-ia64/oprof-perfmon2-070122.diff

Hello Stephane,

The oprofile patch should be made against the oprofile cvs rather than the 0.9.2
tarball. Some of the files that the patch touches are generated by
autogen.sh.

The oprofile patch doesn't build if things are configured without the
"--enable-perfmon2".

gcc -W -Wall -fno-common -Wdeclaration-after-statement -fno-omit-frame-pointer
-g -O2 -o oprofiled init.o oprofiled.o opd_stats.o opd_sfile.o opd_kernel.o
opd_trans.o opd_cookie.o opd_events.o opd_mangling.o opd_perfmon.o
opd_perfmon_22.o opd_perfmon_compat.o opd_anon.o liblegacy/liblegacy.a
../libabi/libabi.a ../libdb/libodb.a ../libop/libop.a ../libutil/libutil.a
-lpopt -liberty -ldl
opd_perfmon.o: In function `perfmon_init':
/home/wcohen/research/profiling/oprofile/oprofile-0.9.2-perfmon2/daemon/opd_perfmon.c:384:
undefined reference to `do_perfmon_init'
collect2: ld returned 1 exit status

-Will

2007-02-13 22:15:16

by Andrew Morton

Subject: Re: 2.6.20 new perfmon code base + libpfm + pfmon

> On Tue, 13 Feb 2007 10:48:39 -0800 Stephane Eranian <[email protected]> wrote:
> I have released another version of the perfmon new code base package.

Can we have a bug push to get this merged up please?

2007-02-13 22:20:47

by Andi Kleen

Subject: Re: 2.6.20 new perfmon code base + libpfm + pfmon

Andrew Morton <[email protected]> writes:

> > On Tue, 13 Feb 2007 10:48:39 -0800 Stephane Eranian <[email protected]> wrote:
> > I have released another version of the perfmon new code base package.
>
> Can we have a bug push to get this merged up please?

Yes, there certainly seems to be user interest in this.

I've been merging the x86 specific infrastructure Stephane sent.
So hopefully the basics are there already.

The big open question was still the review of the syscall interface.
Probably needs some determined reviewers.

I did a review of some of the basic low level code some time ago;
there were some issues but I believe they are probably all resolved
by now (but I haven't verified that recently)

-Andi

2007-02-14 01:31:05

by Chuck Ebbert

Subject: Re: 2.6.20 new perfmon code base + libpfm + pfmon

Andrew Morton wrote:
>> On Tue, 13 Feb 2007 10:48:39 -0800 Stephane Eranian <[email protected]> wrote:
>> I have released another version of the perfmon new code base package.
>
> Can we have a bug push to get this merged up please?

You mean "big" push? :)

FWIW I ran 2.6.17.8 kernels for weeks with the perfmon
patches applied and had no problems, so it does seem
reasonably stable to me.

2007-02-14 18:31:26

by Stephane Eranian

Subject: Re: 2.6.20 new perfmon code base + libpfm + pfmon

Andrew,

On Tue, Feb 13, 2007 at 02:05:33PM -0800, Andrew Morton wrote:
> > On Tue, 13 Feb 2007 10:48:39 -0800 Stephane Eranian <[email protected]> wrote:
> > I have released another version of the perfmon new code base package.
>
> Can we have a bug push to get this merged up please?

Could you please indicate the procedure to do this?

Do you "just" need a patch against your latest -mm tree?

Thanks.

--
-Stephane

2007-02-14 18:47:24

by Andrew Morton

Subject: Re: 2.6.20 new perfmon code base + libpfm + pfmon

On Wed, 14 Feb 2007 10:29:32 -0800
Stephane Eranian <[email protected]> wrote:

> On Tue, Feb 13, 2007 at 02:05:33PM -0800, Andrew Morton wrote:
> > > On Tue, 13 Feb 2007 10:48:39 -0800 Stephane Eranian <[email protected]> wrote:
> > > I have released another version of the perfmon new code base package.
> >
> > Can we have a bug push to get this merged up please?
>
> Could you please indicate the procedure to do this?

re-post the patches for review-and-possible-integration, as you did last
time, I guess.

> Do you "just" need a patch against your latest -mm tree?

You should only raise patches against -mm if there's a special need to do
that (usually: they're against something which is only in -mm). Patches
against 2.6.21-rc1 should suit.

2007-02-14 19:33:45

by William Cohen

Subject: Re: [perfmon] 2.6.20 new perfmon code base + libpfm + pfmon

William Cohen wrote:
> Hello Stephane,
>
> The oprofile patch should be made against the oprofile cvs rather than
> the 0.9.2 tarball. There are some files that the patch touches that are
> created by the autogen.sh.
>
> The oprofile patch doesn't build if things are configured without the
> "--enable-perfmon2".
>
> gcc -W -Wall -fno-common -Wdeclaration-after-statement
> -fno-omit-frame-pointer -g -O2 -o oprofiled init.o oprofiled.o
> opd_stats.o opd_sfile.o opd_kernel.o opd_trans.o opd_cookie.o
> opd_events.o opd_mangling.o opd_perfmon.o opd_perfmon_22.o
> opd_perfmon_compat.o opd_anon.o liblegacy/liblegacy.a ../libabi/libabi.a
> ../libdb/libodb.a ../libop/libop.a ../libutil/libutil.a -lpopt -liberty
> -ldl
> opd_perfmon.o: In function `perfmon_init':
> /home/wcohen/research/profiling/oprofile/oprofile-0.9.2-perfmon2/daemon/opd_perfmon.c:384:
> undefined reference to `do_perfmon_init'
> collect2: ld returned 1 exit status
>
> -Will

Hi Stephane,

I tweaked the oprofile patch a bit so that it applies to the oprofile cvs
repository and builds with and without being configured with --enable-perfmon2.
The patch is attached.

Things that will need to be done to the patch:

- handle the case where perfmon pmd/pmc registers are unavailable
  Is the method being used going to work for systemwide perfmon?
  (create a thread-local context, then collect the unavailable regs)
  What happens if another thread is using a register that is marked
  available in the current thread?
- handle naming differences between oprofile events and perfmon2
- general cleanup

-Will


Attachments:
oprof-perfmon2-cvs4.diff (33.37 kB)

2007-02-14 23:05:40

by Stephane Eranian

Subject: Re: [perfmon] 2.6.20 new perfmon code base + libpfm + pfmon

Will,

On Wed, Feb 14, 2007 at 12:05:31PM -0500, William Cohen wrote:

> >The oprofile patch should be made against the oprofile cvs rather than
> >the 0.9.2 tarball. There are some files that the patch touches that are
> >created by the autogen.sh.
> >
> >The oprofile patch doesn't build if things are configured without the
> >"--enable-perfmon2".
> >
> >gcc -W -Wall -fno-common -Wdeclaration-after-statement
> >-fno-omit-frame-pointer -g -O2 -o oprofiled init.o oprofiled.o
> >opd_stats.o opd_sfile.o opd_kernel.o opd_trans.o opd_cookie.o
> >opd_events.o opd_mangling.o opd_perfmon.o opd_perfmon_22.o
> >opd_perfmon_compat.o opd_anon.o liblegacy/liblegacy.a ../libabi/libabi.a
> >../libdb/libodb.a ../libop/libop.a ../libutil/libutil.a -lpopt -liberty
> >-ldl
> >opd_perfmon.o: In function `perfmon_init':
> >/home/wcohen/research/profiling/oprofile/oprofile-0.9.2-perfmon2/daemon/opd_perfmon.c:384:
> >undefined reference to `do_perfmon_init'
> >collect2: ld returned 1 exit status
> >
> >-Will
>
> Hi Stephane,
>
> I tweaked the oprofile patch a bit so that it applies to the oprofile cvs
> repository and builds with and without being configured with
> --enable-perfmon2. The patch is attached.
>
> Things that will need to be done to the patch:
>
> -handle case where perfmon pmd/pmc registers are unavailable
> Is the method being used going to work for systemwide perfmon?
> create a thread local context then collect unavail regs.

As of now, mutual exclusion is enforced between per-thread and system-wide modes.
The unavailable-register support is used to make the application aware that the whole
PMU may not be available, for whatever reason. For instance, today this is used when the
NMI watchdog is active: the unavailable mask will report that one counter is not available.

This will be generalized once we add a finer-grained PMU register allocator underneath
perfmon (to be used by perfmon and NMI, for instance). The way I envision this is that
you create the context, query what is available, assign events -> counters, then load the
context onto a thread or CPU. That last call could fail if some of the registers used have
become unavailable, in which case you need to go through the procedure again.
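
A minimal sketch of that query/assign/load/retry procedure, assuming a simulated PMU whose free counters are tracked in a bitmask. All ex_* names are hypothetical and are not part of the perfmon interface; context creation is left out and only the later steps are modeled:

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_NUM_COUNTERS 4 /* hypothetical counter count */

/* Simulated PMU state: bit i set means counter i is currently free. */
struct ex_pmu {
    uint32_t avail_mask;
};

/* Query which counters are available right now. */
uint32_t ex_query_avail(const struct ex_pmu *pmu)
{
    return pmu->avail_mask;
}

/* Pick nevents (> 0) counters out of avail; returns the chosen mask,
 * or 0 if there are not enough free counters. */
uint32_t ex_assign(uint32_t avail, unsigned int nevents)
{
    uint32_t chosen = 0;

    for (unsigned int i = 0; i < EX_NUM_COUNTERS && nevents > 0; i++) {
        if (avail & (UINT32_C(1) << i)) {
            chosen |= UINT32_C(1) << i;
            nevents--;
        }
    }
    return nevents == 0 ? chosen : 0;
}

/* Load fails if any chosen counter became unavailable since the query. */
bool ex_load(struct ex_pmu *pmu, uint32_t chosen)
{
    if ((pmu->avail_mask & chosen) != chosen)
        return false;
    pmu->avail_mask &= ~chosen; /* counters are now in use */
    return true;
}

/* Whole procedure: on a failed load, query and assign again. */
uint32_t ex_setup(struct ex_pmu *pmu, unsigned int nevents, unsigned int max_tries)
{
    for (unsigned int t = 0; t < max_tries; t++) {
        uint32_t chosen = ex_assign(ex_query_avail(pmu), nevents);

        if (chosen == 0)
            return 0; /* not enough free counters, retrying cannot help */
        if (ex_load(pmu, chosen))
            return chosen;
    }
    return 0;
}
```

In this single-threaded simulation the load cannot actually race with another user; the retry loop only shows where a real caller would go back to the query step.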

> What happens if another thread using a register that is marked
> available in the current thread?
> -handle naming differences between oprofile events and perfmon2

This is already handled in the code today. Because I did not want to change
the event description table nor the higher-level event -> counter logic of OProfile,
the mapping from OProfile counter -> Perfmon PMC/PMD is done AFTER. With such a setup,
it is hard to deal with detection of unavailable registers once inside the perfmon code.
I think querying the unavailable registers needs to happen BEFORE event -> counter
assignment. But that means that we need to provide a converter from Perfmon PMD ->
OProfile counters, which is something I have not yet done.

Thanks for the patch.

--
-Stephane

2007-02-15 09:01:07

by Stephane Eranian

Subject: Re: 2.6.20 new perfmon code base + libpfm + pfmon

Andi,

On Wed, Feb 14, 2007 at 12:20:56AM +0100, Andi Kleen wrote:
> Andrew Morton <[email protected]> writes:
>
> > > On Tue, 13 Feb 2007 10:48:39 -0800 Stephane Eranian <[email protected]> wrote:
> > > I have released another version of the perfmon new code base package.
> >
> > Can we have a bug push to get this merged up please?
>
> Yes, there certainly seems to be user interest in this.
>
> I've been merging the x86 specific infrastructure Stephane sent.
> So hopefully the basics are there already.
>
Yes, almost everything is in there now. Tony Luck told me he has integrated
the idle notifier for IA-64. I saw that the i386 version of the notifier
was recently integrated as well. So I think that for 2.6.21 we'll have
everything we need for i386, x86-64 and ia64. On MIPS and PowerPC,
a few things are still missing but they should be fixed soon.

On x86-64 and i386, the one last thing I would need that you do not already
have is for the NMI handler for architectural perfmon to switch from PERFCTR0
to PERFCTR1. This would allow certain events to be measured while the NMI
watchdog is active. This is needed on Intel Core-based processors where
certain events can ONLY be measured by PERFCTR0. The CPU_CLK_UNHALTED event
used by the watchdog can be measured by any counter.
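
A minimal sketch of the constraint described above, assuming a two-counter PMU where each event carries a mask of the counters that can measure it. The ex_* names and the two-counter layout are hypothetical and for illustration only:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-counter PMU: one bit per counter. */
#define EX_CTR0 (UINT32_C(1) << 0)
#define EX_CTR1 (UINT32_C(1) << 1)

/* An event can be measured if at least one counter it supports
 * is not reserved (for instance by the NMI watchdog). */
bool ex_can_measure(uint32_t event_ctr_mask, uint32_t reserved_mask)
{
    return (event_ctr_mask & ~reserved_mask) != 0;
}
```

With the watchdog reserving EX_CTR0, an event restricted to EX_CTR0 cannot be measured; with the watchdog on EX_CTR1 it can, and an event that works on any counter is measurable in both cases.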

I have attached the x86-64 patch for this. I can submit the i386 version
as well.

> The big open question was still the review of the syscall interface.
> Probably needs some determined reviewers.
>
Not a problem.

> I did a review of some of the basic low level code some time ago;
> there were some issues but I believe they are probably all resolved
> by now (but I haven't verified that recently)
>
Yes, all the changes and fixes you and Andrew had requested have been made.


changelog:
- for architectural perfmon support, switch the NMI watchdog from PERFCTR0 to
  PERFCTR1. This frees PERFCTR0, which is the only counter supported for certain
  events on Intel Core-based processors.

signed-off-by: stephane eranian <[email protected]>

diff --exclude=.git -urp linux-2.6.20.base/arch/x86_64/kernel/nmi.c linux-2.6.20/arch/x86_64/kernel/nmi.c
--- linux-2.6.20.base/arch/x86_64/kernel/nmi.c 2007-02-05 00:31:52.000000000 -0800
+++ linux-2.6.20/arch/x86_64/kernel/nmi.c 2007-02-09 09:44:29.000000000 -0800
@@ -275,7 +275,7 @@ int __init check_nmi_watchdog (void)
* 32nd bit should be 1, for 33.. to be 1.
* Find the appropriate nmi_hz
*/
- if (wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0 &&
+ if (wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR1 &&
((u64)cpu_khz * 1000) > 0x7fffffffULL) {
nmi_hz = ((u64)cpu_khz * 1000) / 0x7fffffffUL + 1;
}
@@ -615,8 +615,8 @@ static int setup_intel_arch_watchdog(voi
(ebx & ARCH_PERFMON_UNHALTED_CORE_CYCLES_PRESENT))
goto fail;

- perfctr_msr = MSR_ARCH_PERFMON_PERFCTR0;
- evntsel_msr = MSR_ARCH_PERFMON_EVENTSEL0;
+ perfctr_msr = MSR_ARCH_PERFMON_PERFCTR1;
+ evntsel_msr = MSR_ARCH_PERFMON_EVENTSEL1;

if (!reserve_perfctr_nmi(perfctr_msr))
goto fail;
@@ -855,7 +855,7 @@ int __kprobes nmi_watchdog_tick(struct p
dummy &= ~P4_CCCR_OVF;
wrmsrl(wd->cccr_msr, dummy);
apic_write(APIC_LVTPC, APIC_DM_NMI);
- } else if (wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR0) {
+ } else if (wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR1) {
/*
* ArchPerfom/Core Duo needs to re-unmask
* the apic vector