2008-11-26 08:42:14

by Stephane Eranian

[permalink] [raw]
Subject: [patch 00/24] perfmon: introduction

[Repost with the correct cc: list]

Hello,

This series of patches implements the perfmon interface which provides access
to the hardware performance counters of modern processors. In particular,
this version supports per-thread counting for all AMD64 processors, and
recent Intel processors with architectural PMU (Core Duo, Core 2, Atom,
and Core i7).

It does not supersede Oprofile in this first implementation. Oprofile
and perfmon can be compiled in the same kernel, but only one can have
an active session at a time.

This implementation takes into account the various comments received from
previous reviews on LKML. This is a much simplified version compared to
the fully featured version maintained as a GIT tree on kernel.org.

This new version, named perfmon3, uses only 5 system calls (instead of 12).
Each call was carefully designed to allow for future extensions.

Full documentation is available in Documentation/perfmon.txt and is provided
by one of the patches.

Once this basic set of perfmon functionalities is upstream, we will build
on it and add other features such as support for sampling, event set
multiplexing and other processor architectures.

Updated versions of libpfm and pfmon able to support both perfmon2 and
perfmon3 are available in the CVS repository on the perfmon web site, check
out: http://perfmon2.sf.net.

The patch series is against 2.6.28-rc6.

Please consider adding this patch series for 2.6.29 and send me any
comments you may have on the code.

Thanks to all the people who have contributed to this effort.

S.Eranian
--


2008-11-26 10:44:54

by Paul Mackerras

[permalink] [raw]
Subject: Re: [patch 00/24] perfmon: introduction

[email protected] writes:

> This new version, named perfmon3, uses only 5 system calls (instead of 12).
> Each call was carefully designed to allow for future extensions.

I notice that the userspace stuff in CVS assumes perfmon3 has 7 system
calls, the 5 added here plus pfm_create_sets and pfm_getinfo_sets.
Are you planning to add those two later (in which case we should
reserve the numbers now), or are you going to implement the
functionality of those two in pfm_write and pfm_read?

Regards,
Paul.

2008-11-26 13:38:49

by Stephane Eranian

[permalink] [raw]
Subject: Re: [patch 00/24] perfmon: introduction

Hello Paul,

On Wed, Nov 26, 2008 at 11:05 AM, Paul Mackerras <[email protected]> wrote:
> [email protected] writes:
>
>> This new version, named perfmon3, uses only 5 system calls (instead of 12).
>> Each call was carefully designed to allow for future extensions.
>
> I notice that the userspace stuff in CVS assumes perfmon3 has 7 system
> calls, the 5 added here plus pfm_create_sets and pfm_getinfo_sets.
> Are you planning to add those two later (in which case we should
> reserve the numbers now), or are you going to implement the
> functionality of those two in pfm_write and pfm_read?
>

Yes, the LKML patchset represents the very minimal functionality of perfmon3.
Libpfm/pfmon are coded to support the fully-featured version to help
with testing
and evaluations.

The 2 systems calls you are referring to go with event set and multiplexing, an
advanced feature which I intend to submit to LKML later. That's why I
did not add
those two calls immediately. People would have argued that the calls are
not used.

Furthermore, there is still some discussions as to how best to add the
set functionality
to the syscall API. We need to be able to:
- create new sets
- retrieve runtime execution about sets, e.g., number of
activations, timeout left
- and possibility delete sets

I don't think we can reserve syscall numbers in advance. OTOH, the
perfmon3 syscalls
don't necessarily need to be contiguous, unless people tell me this is
a rule used by
the kernel developers.

2008-11-26 14:16:20

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch 00/24] perfmon: introduction


please remove all this debug crap from the patches:

+ PFM_DBG_ovfl("pfm_arch_write_pmc(0x%lx, 0x%Lx)",
+ PFM_DBG_ovfl("pfm_arch_write_pmd(0x%lx, 0x%Lx)",
+ PFM_DBG_ovfl("pfm_arch_read_pmd(0x%lx) = 0x%Lx",
+ PFM_DBG_ovfl("pfm_arch_read_pmc(0x%lx) = 0x%016Lx",
+ PFM_DBG("clear cr4.pce");
+ PFM_DBG("set cr4.pce");
+ PFM_DBG_ovfl("state=%d", ctx->state);
+ PFM_DBG_ovfl("no ctx");
+ PFM_DBG_ovfl("no ovfl");
+ PFM_DBG("LTVPC=0x%lx using_nmi=%d",
+ PFM_DBG("pmc%d(%s) already used", i, d->desc);
+ PFM_DBG("nlost=%d info_flags=0x%x\n", nlost, pmu_info->flags);
+ PFM_DBG("pmd%d(%s) already used", i, d->desc);
+ PFM_DBG("pmc%u released", i);
+ PFM_DBG("pmd%u released", i);
+ PFM_DBG("acquired Northbridge event access globally");
+ PFM_DBG("global NorthBridge event conflict");
+ PFM_DBG("released NorthBridge events globally");
+ PFM_DBG("global=0x%llx set to 0x%llx",
+ PFM_DBG("global_ctrl restored to 0x%llx\n",
+#define _PFM_DBG(lm, f, x...) \
+#define PFM_DBG(f, x...) _PFM_DBG(0x1, f, ##x)
+#define PFM_DBG_ovfl(f, x...) _PFM_DBG(0x2, f, ##x)
+#define PFM_DBG(f, x...) do {} while (0)
+#define PFM_DBG_ovfl(f, x...) do {} while (0)
+ PFM_DBG("ctx_task=[%d] ctx_state=%d is_system=%d",
+ PFM_DBG("pid=%d", task->pid);
+ PFM_DBG("load_pid=%d has a context "
+ PFM_DBG("novfls=%u", num_ovfls);
+ PFM_DBG("pmd%u val=0x%llx",
+ PFM_DBG("ctx_state=%d task [%d]",
+ PFM_DBG("released ownership");
+ PFM_DBG("state=%d is_self=%d", ctx->state, ctx->flags.is_self);
+ PFM_DBG("[%d] has no ctx", current->pid);
+ PFM_DBG("work_type=%d", type);
+ PFM_DBG("unkown type=%d", type);
+ PFM_DBG("context is zombie, bailing out");
+ PFM_DBG("free ctx @0x%p", ctx);
+ PFM_DBG("user group not allowed to create a task context");
+ PFM_DBG("pmc%u=0x%llx",
+ PFM_DBG("alloc ctx @0x%p", ctx);
+ PFM_DBG("no usable PMU registers");
+ PFM_DBG("flags=0x%x fd=%d", ctx_flags, fd);
+ PFM_DBG("state=%d", state);
+ PFM_DBG("zombie ctx for [%d]", ctx->task->pid);
+ PFM_DBG("called filp=%p", filp);
+ PFM_DBG("pfm_file_ops");
+ PFM_DBG("pfm_read called");
+ PFM_DBG("pfm_write called");
+ PFM_DBG("pfm_ioctl called");
+ PFM_DBG("new inode ino=%ld @%p", inode->i_ino, inode);
+ PFM_DBG_ovfl("pmd%u ovfl=%s new=0x%llx old=0x%llx "
+ PFM_DBG_ovfl("intr_pmds=0x%llx npend=%u ip=%p u_pmds=0x%llx",
+ PFM_DBG_ovfl("ctx is zombie, converted to spurious");
+ PFM_DBG_ovfl("no ctx");
+ PFM_DBG_ovfl("spurious: not owned by current task");
+ PFM_DBG_ovfl("spurious: monitoring non active");
+ PFM_DBG_ovfl("no npend_ovfls");
+ PFM_DBG("pmu_acquired=%d", pfm_pmu_acquired);
+ PFM_DBG("regs_all.pmcs=0x%llx",
+ PFM_DBG("PMU acquired: %u PMCs, %u PMDs, %u counters",
+ PFM_DBG("pmu_acquired=%d", pfm_pmu_acquired);
+ PFM_DBG("PMU released");
+ PFM_DBG("in thread=%u",
+ PFM_DBG("out thread=%u ret=%d",
+ PFM_DBG("in thread=%u",
+ PFM_DBG("out thread=%u",
+ PFM_DBG("in sys=%u task=%u",
+ PFM_DBG("already some system-wide sessions");
+ PFM_DBG("%u conflicting thread_sessions",
+ PFM_DBG("out sys=%u task=%u",
+ PFM_DBG("in sys=%u task=%u",
+ PFM_DBG("out sys=%u task=%u",
+ PFM_DBG("u_pmds=0x%llx nu_pmds=%u u_pmcs=0x%llx nu_pmcs=%u",
+ PFM_DBG("pmd%u is not available", cnum);
+ PFM_DBG("pmd%u=0x%llx a_pmu=%d "
+ PFM_DBG("pmc%u is not available", cnum);
+ PFM_DBG("pmc%u=0x%llx a_pmu=%d "
+ PFM_DBG("pmd%u is not implemented/unaccessible", cnum);
+ PFM_DBG("pmd%u cannot read, because not used", cnum);
+ PFM_DBG("pmd%u=0x%llx ",
+ PFM_DBG("cannot attach to kernel thread [%d]", task->pid);
+ PFM_DBG("cannot attach to zombie/dead task [%d]", task->pid);
+ PFM_DBG("task not found %d", pid);
+ PFM_DBG("may_attach=%d check_attach=%d", ret1, ret);
+ PFM_DBG("state=%d check_mask=0x%x task=[%d]",
+ PFM_DBG("state=%d, cmd needs context unloaded", state);
+ PFM_DBG("old_state=%d new_state=%d",
+ PFM_DBG("ret=%d",ret);
+ PFM_DBG("argument too big %zu max=%zu",
+ PFM_DBG("invalid fd %d", fd);
+ PFM_DBG("fd %d not related to perfmon", fd);
+ PFM_DBG("invalid type=%d", type);
+ PFM_DBG("invalid size=%zu for type=%d", sz, type);
+ PFM_DBG("sz=%zu sz_type=%zu count=%zu",
+ PFM_DBG("flags=0x%x sif=%p", flags, ureq);
+ PFM_DBG("no flags accepted yet");
+ PFM_DBG("fd=%d flags=0x%x type=%d req=%p sz=%zu",
+ PFM_DBG("no flags defined");
+ PFM_DBG("invalid type=%d", type);
+ PFM_DBG("fd=%d flags=0x%x type=%d req=%p sz=%zu",
+ PFM_DBG("no flags defined");
+ PFM_DBG("invalid type=%d", type);
+ PFM_DBG("fd=%d uflags=0x%x state=0x%x", fd, uflags, state);
+ PFM_DBG("no flags defined");
+ PFM_DBG("invalid state=0x%x", state);
+ PFM_DBG("fd=%d uflags=0x%x target=%d", fd, uflags, target);
+ PFM_DBG("invalid flags");

they are used way too frequently and obscures the structure and
purpose of the code. We wouldnt mind in a driver but this is core
kernel code.

Ingo

2008-11-26 14:20:22

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch 00/24] perfmon: introduction


A generic request (i'll continue to post my specific remarks to the
subject patches):

before submitting patches to lkml, please run them through
scripts/checkpatch.pl. Your current lot introduces an excessive number
of 69 (!) new errors+warnings:

total: 45 errors, 24 warnings, 7394 lines checked

Thanks,

Ingo

-------------------->
ERROR: need space after that ',' (ctx:VxV)
#403: FILE: arch/x86/include/asm/mach-default/entry_arch.h:37:
+BUILD_INTERRUPT(pmu_interrupt,LOCAL_PERFMON_VECTOR)
^

WARNING: line over 80 characters
#770: FILE: arch/x86/include/asm/perfmon_kern.h:320:
+static inline void pfm_arch_ovfl_reset_pmd(struct pfm_context *ctx, unsigned int cnum)

WARNING: line over 80 characters
#784: FILE: arch/x86/include/asm/perfmon_kern.h:334:
+static inline int pfm_arch_context_create(struct pfm_context *ctx, u32 ctx_flags)

WARNING: line over 80 characters
#802: FILE: arch/x86/include/asm/perfmon_kern.h:352:
+int pfm_arch_ctxswout_thread(struct task_struct *task, struct pfm_context *ctx);

ERROR: use tabs not spaces
#1059: FILE: arch/x86/kernel/process_32.c:465:
+ ^Iif (test_tsk_thread_flag(prev_p, TIF_PERFMON_CTXSW))$

ERROR: use tabs not spaces
#1060: FILE: arch/x86/kernel/process_32.c:466:
+ ^I^Ipfm_ctxsw_out(prev_p, next_p);$

ERROR: use tabs not spaces
#1138: FILE: arch/x86/kernel/signal_32.c:689:
+ ^I/* process perfmon asynchronous work (e.g. block thread or reset) */$

ERROR: use tabs not spaces
#1139: FILE: arch/x86/kernel/signal_32.c:690:
+ ^Iif (thread_info_flags & _TIF_PERFMON_WORK)$

ERROR: use tabs not spaces
#1140: FILE: arch/x86/kernel/signal_32.c:691:
+ ^I^Ipfm_handle_work(regs);$

ERROR: use tabs not spaces
#1161: FILE: arch/x86/kernel/signal_64.c:489:
+ ^I/* process perfmon asynchronous work (e.g. block thread or reset) */$

ERROR: use tabs not spaces
#1162: FILE: arch/x86/kernel/signal_64.c:490:
+ ^Iif (thread_info_flags & _TIF_PERFMON_WORK)$

ERROR: use tabs not spaces
#1163: FILE: arch/x86/kernel/signal_64.c:491:
+ ^I^Ipfm_handle_work(regs);$

WARNING: line over 80 characters
#1969: FILE: arch/x86/perfmon/perfmon_amd64.c:70:
+/* pmc0 */ PMC_D(PFM_REG_I64, "PERFSEL0", PFM_K8_VAL, PFM_K8_RSVD, PFM_K8_NO64, MSR_K7_EVNTSEL0),

WARNING: line over 80 characters
#1970: FILE: arch/x86/perfmon/perfmon_amd64.c:71:
+/* pmc1 */ PMC_D(PFM_REG_I64, "PERFSEL1", PFM_K8_VAL, PFM_K8_RSVD, PFM_K8_NO64, MSR_K7_EVNTSEL1),

WARNING: line over 80 characters
#1971: FILE: arch/x86/perfmon/perfmon_amd64.c:72:
+/* pmc2 */ PMC_D(PFM_REG_I64, "PERFSEL2", PFM_K8_VAL, PFM_K8_RSVD, PFM_K8_NO64, MSR_K7_EVNTSEL2),

WARNING: line over 80 characters
#1972: FILE: arch/x86/perfmon/perfmon_amd64.c:73:
+/* pmc3 */ PMC_D(PFM_REG_I64, "PERFSEL3", PFM_K8_VAL, PFM_K8_RSVD, PFM_K8_NO64, MSR_K7_EVNTSEL3),

ERROR: use tabs not spaces
#2289: FILE: arch/x86/perfmon/perfmon_amd64.c:390:
+^I^I set->used_pmcs,$

ERROR: use tabs not spaces
#2290: FILE: arch/x86/perfmon/perfmon_amd64.c:391:
+^I^I enable_mask,$

ERROR: use tabs not spaces
#2291: FILE: arch/x86/perfmon/perfmon_amd64.c:392:
+^I^I max_enable);$

ERROR: need space after that ',' (ctx:VxV)
#2328: FILE: arch/x86/perfmon/perfmon_amd64.c:429:
+ pfm_arch_bv_set_bit(i,set->povfl_pmds);
^

ERROR: need consistent spacing around '+' (ctx:VxW)
#2563: FILE: arch/x86/perfmon/perfmon_intel_arch.c:176:
+ max_enable = i+ 1;
^

ERROR: use tabs not spaces
#2748: FILE: arch/x86/perfmon/perfmon_intel_arch.c:361:
+ ^I/*$

ERROR: need spaces around that '=' (ctx:VxV)
#2936: FILE: arch/x86/perfmon/perfmon_intel_arch.c:549:
+ for (i=0; i < 16; i++) {
^

ERROR: need spaces around that '=' (ctx:VxV)
#2941: FILE: arch/x86/perfmon/perfmon_intel_arch.c:554:
+ for (i=16; i < PFM_IA_MAX_PMDS; i++) {
^

WARNING: line over 80 characters
#3260: FILE: include/linux/perfmon_kern.h:133:
+ struct pfm_regdesc regs; /* registers available to context */

WARNING: line over 80 characters
#3286: FILE: include/linux/perfmon_kern.h:159:
+ if (unlikely((pfm_controls.debug & lm) && printk_ratelimit())) { \

WARNING: printk() should include KERN_ facility level
#3287: FILE: include/linux/perfmon_kern.h:160:
+ printk("perfmon: %s.%d: CPU%d [%d]: " f "\n", \

ERROR: use tabs not spaces
#3565: FILE: include/linux/sched.h:1356:
+ ^Istruct pfm_context *pfm_context;$

WARNING: line over 80 characters
#3581: FILE: include/linux/syscalls.h:630:
+ char __user *f, void __user *uarg, size_t uarg_size);

WARNING: line over 80 characters
#3583: FILE: include/linux/syscalls.h:632:
+asmlinkage long sys_pfm_write(int fd, int flags, int type, void __user *arg, size_t s);

WARNING: line over 80 characters
#3584: FILE: include/linux/syscalls.h:633:
+asmlinkage long sys_pfm_read(int fd, int flags, int type, void __user *arg, size_t s);

ERROR: use tabs not spaces
#3865: FILE: perfmon/perfmon_attach.c:100:
+ ^I * link context to task$

ERROR: use tabs not spaces
#3866: FILE: perfmon/perfmon_attach.c:101:
+ ^I */$

ERROR: use tabs not spaces
#3897: FILE: perfmon/perfmon_attach.c:132:
+ ^I^I * on UP, we may have to push out the PMU$

ERROR: use tabs not spaces
#3898: FILE: perfmon/perfmon_attach.c:133:
+ ^I^I * state of the last monitored thread$

ERROR: use tabs not spaces
#3899: FILE: perfmon/perfmon_attach.c:134:
+ ^I^I */$

ERROR: use tabs not spaces
#3922: FILE: perfmon/perfmon_attach.c:157:
+ ^I * will cause switch_to() to invoke PMU$

ERROR: use tabs not spaces
#3923: FILE: perfmon/perfmon_attach.c:158:
+ ^I * context switch code$

ERROR: use tabs not spaces
#3924: FILE: perfmon/perfmon_attach.c:159:
+ ^I */$

ERROR: "foo * bar" should be "foo *bar"
#5014: FILE: perfmon/perfmon_file.c:245:
+ struct inode * inode;

ERROR: use tabs not spaces
#5145: FILE: perfmon/perfmon_init.c:65:
+ ^Iif (pfm_init_sysfs())$

ERROR: use tabs not spaces
#5146: FILE: perfmon/perfmon_init.c:66:
+ ^I^Igoto error_disable;$

WARNING: line over 80 characters
#5687: FILE: perfmon/perfmon_pmu.c:215:
+ memset(&pfm_pmu_conf->regs_all, 0, sizeof(struct pfm_regdesc));

ERROR: use tabs not spaces
#5690: FILE: perfmon/perfmon_pmu.c:218:
+^I^I^I^I ^I unavail_pmcs,$

WARNING: line over 80 characters
#5694: FILE: perfmon/perfmon_pmu.c:222:
+ (unsigned long long)pfm_pmu_conf->regs_all.pmcs[0]);

WARNING: line over 80 characters
#6194: FILE: perfmon/perfmon_rw.c:84:
+static inline int update_changes(struct pfm_context *ctx, struct pfm_event_set *set,

ERROR: need a space before the open parenthesis '('
#6219: FILE: perfmon/perfmon_rw.c:109:
+ for(p = 0; n; n--, p = q+1) {

ERROR: need space before that '-' (ctx:VxV)
#6768: FILE: perfmon/perfmon_syscalls.c:204:
+ state, check_mask, task ? task->pid:-1);
^

ERROR: need space after that ',' (ctx:VxV)
#6859: FILE: perfmon/perfmon_syscalls.c:295:
+ PFM_DBG("ret=%d",ret);
^

ERROR: need a space before the open parenthesis '('
#6987: FILE: perfmon/perfmon_syscalls.c:423:
+ switch(type) {

ERROR: need a space before the open parenthesis '('
#7081: FILE: perfmon/perfmon_syscalls.c:517:
+ switch(type) {

WARNING: kfree(NULL) is safe this check is probabally not required
#7100: FILE: perfmon/perfmon_syscalls.c:536:
+ if (fptr)
+ kfree(fptr);

ERROR: need a space before the open parenthesis '('
#7145: FILE: perfmon/perfmon_syscalls.c:581:
+ switch(type) {

WARNING: kfree(NULL) is safe this check is probabally not required
#7160: FILE: perfmon/perfmon_syscalls.c:596:
+ if (fptr)
+ kfree(fptr);

ERROR: need a space before the open parenthesis '('
#7180: FILE: perfmon/perfmon_syscalls.c:616:
+ switch(state) {

ERROR: use tabs not spaces
#7258: FILE: perfmon/perfmon_syscalls.c:694:
+ ^I * handle detach in a separate function$

ERROR: use tabs not spaces
#7259: FILE: perfmon/perfmon_syscalls.c:695:
+ ^I */$

ERROR: use tabs not spaces
#7276: FILE: perfmon/perfmon_syscalls.c:712:
+ ^Iif (target != current->pid) {$

WARNING: line over 80 characters
#7394: FILE: perfmon/perfmon_sysfs.c:84:
+static ssize_t pfm_controls_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)

WARNING: line over 80 characters
#7398: FILE: perfmon/perfmon_sysfs.c:88:
+ return snprintf(buf, PAGE_SIZE, "%u.%u\n", PFM_VERSION_MAJ, PFM_VERSION_MIN);

WARNING: line over 80 characters
#7407: FILE: perfmon/perfmon_sysfs.c:97:
+ return snprintf(buf, PAGE_SIZE, "%d\n", pfm_controls.task_group);

WARNING: line over 80 characters
#7410: FILE: perfmon/perfmon_sysfs.c:100:
+ return snprintf(buf, PAGE_SIZE, "%zu\n", pfm_controls.arg_mem_max);

WARNING: line over 80 characters
#7415: FILE: perfmon/perfmon_sysfs.c:105:
+static ssize_t pfm_controls_store(struct kobject *kobj, struct kobj_attribute *attr,

ERROR: use tabs not spaces
#7416: FILE: perfmon/perfmon_sysfs.c:106:
+^I^I^I ^I const char *buf, size_t count)$

WARNING: line over 80 characters
#7518: FILE: perfmon/perfmon_sysfs.c:208:
+static ssize_t pfm_pmu_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)

ERROR: use tabs not spaces
#7635: FILE: perfmon/perfmon_sysfs.c:325:
+ ^I * dynamic allocation happens on pfm_kernel_kobj,$

ERROR: use tabs not spaces
#7636: FILE: perfmon/perfmon_sysfs.c:326:
+ ^I * but a release callback is attached$

ERROR: use tabs not spaces
#7637: FILE: perfmon/perfmon_sysfs.c:327:
+ ^I */$

ERROR: Missing Signed-off-by: line(s)

total: 45 errors, 24 warnings, 7394 lines checked

Your patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.