2013-08-14 12:42:48

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 00/15] perf tools: some fixes and tweaks

Hi

Here are some fixes and tweaks to perf tools (version 11).

Changes in V11:
perf tools: re-implement debug print function for linking python/perf.so
New patch split from "perf tools: add debug prints"
perf tools: add debug prints
Changed to use graph_dotted_line
Moved python/link changes to separate patch
Added Namhyung's Ack
perf tools: change machine__findnew_thread() to set thread pid
Added explanation of main thread lookup to commit message
perf tools: tidy up sample parsing overflow checking
Added Jiri's Ack
perf tools: remove unnecessary callchain validation
Added Namhyung's Ack
perf tools: remove references to struct ip_event
Added Namhyung's Ack
perf tools: move perf_evlist__config() to a new source file
Added Namhyung's Ack
perf tools: add missing 'abi' member to 'struct regs_dump'
New patch
perf tools: expand perf_event__synthesize_sample()
Adjusted for new 'abi' member of 'struct regs_dump'
perf tools: add a function to calculate sample event size
Adjusted for new 'abi' member of 'struct regs_dump'
perf tools: add a sample parsing test
Adjusted for new 'abi' member of 'struct regs_dump'

Changes in V10:
Re-based on:
perf/core branch of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
commit 207353be1574027c151609a31167ac5919a056c8: perf tools: Remove filter parameter of thread__find_addr_map()

perf tools: add debug prints
Changed to use macros
perf tools: add pid to struct thread
Added David's Ack
perf tools: change machine__findnew_thread() to set thread pid
Added David's Ack
perf tools: remove references to struct ip_event
Also remove from hists_link.c
perf tools: move struct ip_event
Dropped since it is covered by "perf tools: remove references to struct ip_event"
perf tools: add a sample parsing test
Fix missing sample comparison for PERF_SAMPLE_READ

Changes in V9:
perf: Update perf_event_type documentation
Dropped since its been applied
perf tools: tidy up sample parsing overflow checking
Changed to use a single overflow function
Updated for PERF_SAMPLE_READ
perf: make events stream always parsable
Added more about sample parsing to the commit message
perf tools: add support for PERF_SAMPLE_IDENTFIER
When selecting PERF_SAMPLE_IDENTFIER, ensure PERF_SAMPLE_ID
is deselected
perf tools: expand perf_event__synthesize_sample()
Updated for PERF_SAMPLE_READ
perf tools: add a function to calculate sample event size
Updated for PERF_SAMPLE_READ
perf tools: add a sample parsing test
Updated for PERF_SAMPLE_READ

Changes in V8:
perf tools: add debug prints
Fixed Python link errors
perf tools: move perf_evlist__config() to a new source file
New Patch to avoid Python link errors
perf tools: add support for PERF_SAMPLE_IDENTFIER
Adjustments due to patch above

Changes in V7:
perf: Update perf_event_type documentation
Proposed new patch from Peter Zijlstra
perf: make events stream always parsable
Adjustments due to patch above
perf tools: tidy up sample parsing overflow checking
Change to a single overflow function
Amend comment
perf tools: add a function to calculate sample event size
New patch
perf tools: add a sample parsing test
Amended to use sample event size calculation

Changes in V6:
Some checkpatch fixes

perf: make events stream always parsable
Add sample format comments

Changes in V5:
Re-based to Arnaldo's tree and dropped already applied patches:
perf tools: remove unused parameter
perf tools: fix missing tool parameter
perf tools: fix missing 'finished_round'
perf tools: fix parse_events_terms() segfault on error path
perf tools: fix new_term() missing free on error path
perf tools: add const specifier to perf_pmu__find name parameter
perf tools: tidy duplicated munmap code
perf tools: validate perf event header size

perf tools: add debug prints
Changed to perf_event_attr__fprintf()
perf tools: add pid to struct thread
Always set the pid, even if a pid is already set
perf tools: change machine__findnew_thread() to set thread pid
Replaces: perf tools: change "machine" functions to set thread pid
perf tools: add support for PERF_SAMPLE_IDENTFIER
Only use PERF_SAMPLE_IDENTFIER if sample types are different
perf tools: expand perf_event__synthesize_sample()
New patch in preparation of a sample parsing test
perf tools: add a sample parsing test
New patch

Changes in V4:
I added kernel support for matching sample types via
PERF_SAMPLE_IDENTIFIER. perf tools support for that required
first fixing some other things.

perf tools: fix parse_events_terms() freeing local variable on error path
Dropped - covered by David Ahern
perf tools: struct thread has a tid not a pid
Added ack by David Ahern
perf tools: add pid to struct thread
Remove unused function
perf tools: fix missing increment in sample parsing
New patch
perf tools: tidy up sample parsing overflow checking
New patch
perf tools: remove unnecessary callchain validation
New patch
perf tools: remove references to struct ip_event
New patch
perf tools: move struct ip_event
New patch
perf: make events stream always parsable
New patch
perf tools: add support for PERF_SAMPLE_IDENTFIER
New patch

Changes in V3:
perf tools: add pid to struct thread
Split into 2 patches
perf tools: fix ppid in thread__fork()
Dropped for now

Changes in V2:
perf tools: fix missing tool parameter
Fixed one extra occurrence
perf tools: fix parse_events_terms() freeing local variable on error path
Made "freeing" code into a new function
perf tools: validate perf event header size
Corrected byte-swapping
perf tools: allow non-matching sample types
Added comments
Fixed id_pos calculation
id_pos/is_pos updated whenever sample_type changes
Removed perf_evlist__sample_type()
Added __perf_evlist__combined_sample_type()
Added perf_evlist__combined_sample_type()
Added perf_evlist__make_sample_types_compatible()
Added ack's to patches acked by Jiri Olsa


Adrian Hunter (15):
perf tools: re-implement debug print function for linking python/perf.so
perf tools: add debug prints
perf tools: allow non-matching sample types
perf tools: add pid to struct thread
perf tools: change machine__findnew_thread() to set thread pid
perf tools: tidy up sample parsing overflow checking
perf tools: remove unnecessary callchain validation
perf tools: remove references to struct ip_event
perf: make events stream always parsable
perf tools: move perf_evlist__config() to a new source file
perf tools: add support for PERF_SAMPLE_IDENTFIER
perf tools: add missing 'abi' member to 'struct regs_dump'
perf tools: expand perf_event__synthesize_sample()
perf tools: add a function to calculate sample event size
perf tools: add a sample parsing test

include/uapi/linux/perf_event.h | 27 ++-
kernel/events/core.c | 11 +-
tools/perf/Makefile | 2 +
tools/perf/builtin-inject.c | 8 +-
tools/perf/builtin-kmem.c | 3 +-
tools/perf/builtin-kvm.c | 2 +-
tools/perf/builtin-lock.c | 3 +-
tools/perf/builtin-mem.c | 2 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-sched.c | 20 +-
tools/perf/builtin-script.c | 3 +-
tools/perf/builtin-top.c | 11 +-
tools/perf/builtin-trace.c | 12 +-
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/code-reading.c | 4 +-
tools/perf/tests/hists_link.c | 23 +-
tools/perf/tests/mmap-basic.c | 2 +-
tools/perf/tests/sample-parsing.c | 316 +++++++++++++++++++++++++
tools/perf/tests/tests.h | 1 +
tools/perf/util/build-id.c | 11 +-
tools/perf/util/callchain.c | 8 -
tools/perf/util/callchain.h | 5 -
tools/perf/util/event.c | 5 +-
tools/perf/util/event.h | 34 ++-
tools/perf/util/evlist.c | 146 ++++++++++--
tools/perf/util/evlist.h | 10 +-
tools/perf/util/evsel.c | 471 +++++++++++++++++++++++++++++++++-----
tools/perf/util/evsel.h | 13 +-
tools/perf/util/machine.c | 37 ++-
tools/perf/util/machine.h | 3 +-
tools/perf/util/python.c | 20 ++
tools/perf/util/record.c | 111 +++++++++
tools/perf/util/session.c | 32 +--
tools/perf/util/thread.c | 3 +-
tools/perf/util/thread.h | 3 +-
35 files changed, 1176 insertions(+), 192 deletions(-)
create mode 100644 tools/perf/tests/sample-parsing.c
create mode 100644 tools/perf/util/record.c


Regards
Adrian


2013-08-14 12:42:51

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 01/15] perf tools: re-implement debug print function for linking python/perf.so

python/perf.so links a subset of objects. Re-implement 'verbose'
and 'eprintf' so they (and consequently 'pr_debug') can be used
in objects linked into pythin/perf.so. Note 'eprintf' must be
re-implemented because the full version links the browser ui.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/python.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 925e0c3..381f4fd 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -8,6 +8,26 @@
#include "cpumap.h"
#include "thread_map.h"

+/*
+ * Support debug printing even though util/debug.c is not linked. That means
+ * implementing 'verbose' and 'eprintf'.
+ */
+int verbose;
+
+int eprintf(int level, const char *fmt, ...)
+{
+ va_list args;
+ int ret = 0;
+
+ if (verbose >= level) {
+ va_start(args, fmt);
+ ret = vfprintf(stderr, fmt, args);
+ va_end(args);
+ }
+
+ return ret;
+}
+
/* Define PyVarObject_HEAD_INIT for python 2.5 */
#ifndef PyVarObject_HEAD_INIT
# define PyVarObject_HEAD_INIT(type, size) PyObject_HEAD_INIT(type) size,
--
1.7.11.7

2013-08-14 12:42:57

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 02/15] perf tools: add debug prints

It is useful to see the arguments to perf_event_open
and whether the perf events ring buffer was mmapped
per-cpu or per-thread. That information will now be
displayed when verbose is 2 i.e option -vv

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Namhyung Kim <[email protected]>

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/evlist.c | 3 +++
tools/perf/util/evsel.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 69 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c7d111f..1f5105a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -14,6 +14,7 @@
#include "target.h"
#include "evlist.h"
#include "evsel.h"
+#include "debug.h"
#include <unistd.h>

#include "parse-events.h"
@@ -486,6 +487,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist, int prot, int m
int nr_cpus = cpu_map__nr(evlist->cpus);
int nr_threads = thread_map__nr(evlist->threads);

+ pr_debug2("perf event ring buffer mmapped per cpu\n");
for (cpu = 0; cpu < nr_cpus; cpu++) {
int output = -1;

@@ -524,6 +526,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist, int prot, in
int thread;
int nr_threads = thread_map__nr(evlist->threads);

+ pr_debug2("perf event ring buffer mmapped per thread\n");
for (thread = 0; thread < nr_threads; thread++) {
int output = -1;

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 960394e..ad5f701 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -21,6 +21,7 @@
#include "thread_map.h"
#include "target.h"
#include "perf_regs.h"
+#include "debug.h"

static struct {
bool sample_id_all;
@@ -861,6 +862,65 @@ static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
return fd;
}

+#define __PRINT_ATTR(fmt, cast, field) \
+ fprintf(fp, " %-19s "fmt"\n", #field, cast attr->field)
+
+#define PRINT_ATTR_U32(field) __PRINT_ATTR("%u" , , field)
+#define PRINT_ATTR_X32(field) __PRINT_ATTR("%#x", , field)
+#define PRINT_ATTR_U64(field) __PRINT_ATTR("%" PRIu64, (uint64_t), field)
+#define PRINT_ATTR_X64(field) __PRINT_ATTR("%#"PRIx64, (uint64_t), field)
+
+#define PRINT_ATTR2N(name1, field1, name2, field2) \
+ fprintf(fp, " %-19s %u %-19s %u\n", \
+ name1, attr->field1, name2, attr->field2)
+
+#define PRINT_ATTR2(field1, field2) \
+ PRINT_ATTR2N(#field1, field1, #field2, field2)
+
+static size_t perf_event_attr__fprintf(struct perf_event_attr *attr, FILE *fp)
+{
+ size_t ret = 0;
+
+ ret += fprintf(fp, "%.60s\n", graph_dotted_line);
+ ret += fprintf(fp, "perf_event_attr:\n");
+
+ ret += PRINT_ATTR_U32(type);
+ ret += PRINT_ATTR_U32(size);
+ ret += PRINT_ATTR_X64(config);
+ ret += PRINT_ATTR_U64(sample_period);
+ ret += PRINT_ATTR_U64(sample_freq);
+ ret += PRINT_ATTR_X64(sample_type);
+ ret += PRINT_ATTR_X64(read_format);
+
+ ret += PRINT_ATTR2(disabled, inherit);
+ ret += PRINT_ATTR2(pinned, exclusive);
+ ret += PRINT_ATTR2(exclude_user, exclude_kernel);
+ ret += PRINT_ATTR2(exclude_hv, exclude_idle);
+ ret += PRINT_ATTR2(mmap, comm);
+ ret += PRINT_ATTR2(freq, inherit_stat);
+ ret += PRINT_ATTR2(enable_on_exec, task);
+ ret += PRINT_ATTR2(watermark, precise_ip);
+ ret += PRINT_ATTR2(mmap_data, sample_id_all);
+ ret += PRINT_ATTR2(exclude_host, exclude_guest);
+ ret += PRINT_ATTR2N("excl.callchain_kern", exclude_callchain_kernel,
+ "excl.callchain_user", exclude_callchain_user);
+
+ ret += PRINT_ATTR_U32(wakeup_events);
+ ret += PRINT_ATTR_U32(wakeup_watermark);
+ ret += PRINT_ATTR_X32(bp_type);
+ ret += PRINT_ATTR_X64(bp_addr);
+ ret += PRINT_ATTR_X64(config1);
+ ret += PRINT_ATTR_U64(bp_len);
+ ret += PRINT_ATTR_X64(config2);
+ ret += PRINT_ATTR_X64(branch_sample_type);
+ ret += PRINT_ATTR_X64(sample_regs_user);
+ ret += PRINT_ATTR_U32(sample_stack_user);
+
+ ret += fprintf(fp, "%.60s\n", graph_dotted_line);
+
+ return ret;
+}
+
static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads)
{
@@ -884,6 +944,9 @@ retry_sample_id:
if (perf_missing_features.sample_id_all)
evsel->attr.sample_id_all = 0;

+ if (verbose >= 2)
+ perf_event_attr__fprintf(&evsel->attr, stderr);
+
for (cpu = 0; cpu < cpus->nr; cpu++) {

for (thread = 0; thread < threads->nr; thread++) {
@@ -894,6 +957,9 @@ retry_sample_id:

group_fd = get_group_fd(evsel, cpu, thread);

+ pr_debug2("perf_event_open: ");
+ pr_debug2("pid %d cpu %d group_fd %d flags %#lx\n",
+ pid, cpus->map[cpu], group_fd, flags);
FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
pid,
cpus->map[cpu],
--
1.7.11.7

2013-08-14 12:43:05

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 04/15] perf tools: add pid to struct thread

Record pid on struct thread. The member is named 'pid_'
to avoid confusion with the 'tid' member which was previously
named 'pid'.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: David Ahern <[email protected]>
---
tools/perf/util/machine.c | 16 +++++++++++-----
tools/perf/util/thread.c | 3 ++-
tools/perf/util/thread.h | 3 ++-
3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 4514e7e..40d92f4 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -253,7 +253,8 @@ void machines__set_id_hdr_size(struct machines *machines, u16 id_hdr_size)
return;
}

-static struct thread *__machine__findnew_thread(struct machine *machine, pid_t tid,
+static struct thread *__machine__findnew_thread(struct machine *machine,
+ pid_t pid, pid_t tid,
bool create)
{
struct rb_node **p = &machine->threads.rb_node;
@@ -265,8 +266,11 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
* so most of the time we dont have to look up
* the full rbtree:
*/
- if (machine->last_match && machine->last_match->tid == tid)
+ if (machine->last_match && machine->last_match->tid == tid) {
+ if (pid && pid != machine->last_match->pid_)
+ machine->last_match->pid_ = pid;
return machine->last_match;
+ }

while (*p != NULL) {
parent = *p;
@@ -274,6 +278,8 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t

if (th->tid == tid) {
machine->last_match = th;
+ if (pid && pid != th->pid_)
+ th->pid_ = pid;
return th;
}

@@ -286,7 +292,7 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
if (!create)
return NULL;

- th = thread__new(tid);
+ th = thread__new(pid, tid);
if (th != NULL) {
rb_link_node(&th->rb_node, parent, p);
rb_insert_color(&th->rb_node, &machine->threads);
@@ -298,12 +304,12 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t

struct thread *machine__findnew_thread(struct machine *machine, pid_t tid)
{
- return __machine__findnew_thread(machine, tid, true);
+ return __machine__findnew_thread(machine, 0, tid, true);
}

struct thread *machine__find_thread(struct machine *machine, pid_t tid)
{
- return __machine__findnew_thread(machine, tid, false);
+ return __machine__findnew_thread(machine, 0, tid, false);
}

int machine__process_comm_event(struct machine *machine, union perf_event *event)
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 6feeb88..e3d4a55 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -7,12 +7,13 @@
#include "util.h"
#include "debug.h"

-struct thread *thread__new(pid_t tid)
+struct thread *thread__new(pid_t pid, pid_t tid)
{
struct thread *self = zalloc(sizeof(*self));

if (self != NULL) {
map_groups__init(&self->mg);
+ self->pid_ = pid;
self->tid = tid;
self->ppid = -1;
self->comm = malloc(32);
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 13c62c9..927b37f 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -12,6 +12,7 @@ struct thread {
struct list_head node;
};
struct map_groups mg;
+ pid_t pid_; /* Not all tools update this */
pid_t tid;
pid_t ppid;
char shortname[3];
@@ -24,7 +25,7 @@ struct thread {

struct machine;

-struct thread *thread__new(pid_t tid);
+struct thread *thread__new(pid_t pid, pid_t tid);
void thread__delete(struct thread *self);

int thread__set_comm(struct thread *self, const char *comm);
--
1.7.11.7

2013-08-14 12:43:09

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 05/15] perf tools: change machine__findnew_thread() to set thread pid

Add a new parameter for 'pid' to machine__findnew_thread().
Change callers to pass 'pid' when it is known.

Note that callers sometimes want to find the main thread
which has the memory maps. The main thread has tid == pid
so the usage in that case is:

machine__findnew_thread(machine, pid, pid)

whereas the usage to find the specific thread is:

machine__findnew_thread(machine, pid, tid)

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: David Ahern <[email protected]>
---
tools/perf/builtin-inject.c | 2 +-
tools/perf/builtin-kmem.c | 3 ++-
tools/perf/builtin-kvm.c | 2 +-
tools/perf/builtin-lock.c | 3 ++-
tools/perf/builtin-sched.c | 20 +++++++++++---------
tools/perf/builtin-script.c | 3 ++-
tools/perf/builtin-trace.c | 12 +++++++++---
tools/perf/tests/code-reading.c | 4 ++--
tools/perf/tests/hists_link.c | 3 ++-
tools/perf/util/build-id.c | 7 +++++--
tools/perf/util/event.c | 3 ++-
tools/perf/util/machine.c | 23 ++++++++++++++++-------
tools/perf/util/machine.h | 3 ++-
tools/perf/util/session.c | 2 +-
14 files changed, 58 insertions(+), 32 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 1d8de2e..0d4ae1d 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -198,7 +198,7 @@ static int perf_event__inject_buildid(struct perf_tool *tool,

cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;

- thread = machine__findnew_thread(machine, event->ip.pid);
+ thread = machine__findnew_thread(machine, event->ip.pid, event->ip.pid);
if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
event->header.type);
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index b49f5c5..c324778 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -305,7 +305,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
+ event->ip.pid);

if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index fa2f3d7..373094d 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -815,7 +815,7 @@ static int process_sample_event(struct perf_tool *tool,
if (skip_sample(kvm, sample))
return 0;

- thread = machine__findnew_thread(machine, sample->tid);
+ thread = machine__findnew_thread(machine, sample->pid, sample->tid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
event->header.type);
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 76543a4..ee33ba2 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -805,7 +805,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, sample->tid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->tid);

if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index f809cc7..d8c51b2 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -724,8 +724,10 @@ static int replay_fork_event(struct perf_sched *sched,
{
struct thread *child, *parent;

- child = machine__findnew_thread(machine, event->fork.tid);
- parent = machine__findnew_thread(machine, event->fork.ptid);
+ child = machine__findnew_thread(machine, event->fork.pid,
+ event->fork.tid);
+ parent = machine__findnew_thread(machine, event->fork.ppid,
+ event->fork.ptid);

if (child == NULL || parent == NULL) {
pr_debug("thread does not exist on fork event: child %p, parent %p\n",
@@ -934,8 +936,8 @@ static int latency_switch_event(struct perf_sched *sched,
return -1;
}

- sched_out = machine__findnew_thread(machine, prev_pid);
- sched_in = machine__findnew_thread(machine, next_pid);
+ sched_out = machine__findnew_thread(machine, 0, prev_pid);
+ sched_in = machine__findnew_thread(machine, 0, next_pid);

out_events = thread_atoms_search(&sched->atom_root, sched_out, &sched->cmp_pid);
if (!out_events) {
@@ -978,7 +980,7 @@ static int latency_runtime_event(struct perf_sched *sched,
{
const u32 pid = perf_evsel__intval(evsel, sample, "pid");
const u64 runtime = perf_evsel__intval(evsel, sample, "runtime");
- struct thread *thread = machine__findnew_thread(machine, pid);
+ struct thread *thread = machine__findnew_thread(machine, 0, pid);
struct work_atoms *atoms = thread_atoms_search(&sched->atom_root, thread, &sched->cmp_pid);
u64 timestamp = sample->time;
int cpu = sample->cpu;
@@ -1016,7 +1018,7 @@ static int latency_wakeup_event(struct perf_sched *sched,
if (!success)
return 0;

- wakee = machine__findnew_thread(machine, pid);
+ wakee = machine__findnew_thread(machine, 0, pid);
atoms = thread_atoms_search(&sched->atom_root, wakee, &sched->cmp_pid);
if (!atoms) {
if (thread_atoms_insert(sched, wakee))
@@ -1070,7 +1072,7 @@ static int latency_migrate_task_event(struct perf_sched *sched,
if (sched->profile_cpu == -1)
return 0;

- migrant = machine__findnew_thread(machine, pid);
+ migrant = machine__findnew_thread(machine, 0, pid);
atoms = thread_atoms_search(&sched->atom_root, migrant, &sched->cmp_pid);
if (!atoms) {
if (thread_atoms_insert(sched, migrant))
@@ -1289,8 +1291,8 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel,
return -1;
}

- sched_out = machine__findnew_thread(machine, prev_pid);
- sched_in = machine__findnew_thread(machine, next_pid);
+ sched_out = machine__findnew_thread(machine, 0, prev_pid);
+ sched_in = machine__findnew_thread(machine, 0, next_pid);

sched->curr_thread[this_cpu] = sched_in;

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 2ad9d5b..d82712f 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -501,7 +501,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct machine *machine)
{
struct addr_location al;
- struct thread *thread = machine__findnew_thread(machine, event->ip.tid);
+ struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
+ event->ip.tid);

if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index da7ae01..2b6cd59 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -302,7 +302,9 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel,
char *msg;
void *args;
size_t printed = 0;
- struct thread *thread = machine__findnew_thread(&trace->host, sample->tid);
+ struct thread *thread = machine__findnew_thread(&trace->host,
+ sample->pid,
+ sample->tid);
struct syscall *sc = trace__syscall_info(trace, evsel, sample);
struct thread_trace *ttrace = thread__trace(thread);

@@ -345,7 +347,9 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel,
{
int ret;
u64 duration = 0;
- struct thread *thread = machine__findnew_thread(&trace->host, sample->tid);
+ struct thread *thread = machine__findnew_thread(&trace->host,
+ sample->pid,
+ sample->tid);
struct thread_trace *ttrace = thread__trace(thread);
struct syscall *sc = trace__syscall_info(trace, evsel, sample);

@@ -398,7 +402,9 @@ static int trace__sched_stat_runtime(struct trace *trace, struct perf_evsel *evs
{
u64 runtime = perf_evsel__intval(evsel, sample, "runtime");
double runtime_ms = (double)runtime / NSEC_PER_MSEC;
- struct thread *thread = machine__findnew_thread(&trace->host, sample->tid);
+ struct thread *thread = machine__findnew_thread(&trace->host,
+ sample->pid,
+ sample->tid);
struct thread_trace *ttrace = thread__trace(thread);

if (ttrace == NULL)
diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index eec1421..34696ca 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -257,7 +257,7 @@ static int process_sample_event(struct machine *machine,
return -1;
}

- thread = machine__findnew_thread(machine, sample.pid);
+ thread = machine__findnew_thread(machine, sample.pid, sample.pid);
if (!thread) {
pr_debug("machine__findnew_thread failed\n");
return -1;
@@ -447,7 +447,7 @@ static int do_test_code_reading(bool try_kcore)
goto out_err;
}

- thread = machine__findnew_thread(machine, pid);
+ thread = machine__findnew_thread(machine, pid, pid);
if (!thread) {
pr_debug("machine__findnew_thread failed\n");
goto out_err;
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 50bfb01..87f9f72 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -88,7 +88,8 @@ static struct machine *setup_fake_machine(struct machines *machines)
for (i = 0; i < ARRAY_SIZE(fake_threads); i++) {
struct thread *thread;

- thread = machine__findnew_thread(machine, fake_threads[i].pid);
+ thread = machine__findnew_thread(machine, fake_threads[i].pid,
+ fake_threads[i].pid);
if (thread == NULL)
goto out;

diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 5295625..0f9d27a 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -24,7 +24,8 @@ int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
{
struct addr_location al;
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
+ event->ip.pid);

if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
@@ -47,7 +48,9 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused,
__maybe_unused,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, event->fork.tid);
+ struct thread *thread = machine__findnew_thread(machine,
+ event->fork.pid,
+ event->fork.tid);

dump_printf("(%d:%d):(%d:%d)\n", event->fork.pid, event->fork.tid,
event->fork.ppid, event->fork.ptid);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 49713ae..61cecf9 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -686,7 +686,8 @@ int perf_event__preprocess_sample(const union perf_event *event,
struct perf_sample *sample)
{
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
+ event->ip.pid);

if (thread == NULL)
return -1;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 40d92f4..7a934ea 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -32,7 +32,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
return -ENOMEM;

if (pid != HOST_KERNEL_ID) {
- struct thread *thread = machine__findnew_thread(machine, pid);
+ struct thread *thread = machine__findnew_thread(machine, 0,
+ pid);
char comm[64];

if (thread == NULL)
@@ -302,9 +303,10 @@ static struct thread *__machine__findnew_thread(struct machine *machine,
return th;
}

-struct thread *machine__findnew_thread(struct machine *machine, pid_t tid)
+struct thread *machine__findnew_thread(struct machine *machine, pid_t pid,
+ pid_t tid)
{
- return __machine__findnew_thread(machine, 0, tid, true);
+ return __machine__findnew_thread(machine, pid, tid, true);
}

struct thread *machine__find_thread(struct machine *machine, pid_t tid)
@@ -314,7 +316,9 @@ struct thread *machine__find_thread(struct machine *machine, pid_t tid)

int machine__process_comm_event(struct machine *machine, union perf_event *event)
{
- struct thread *thread = machine__findnew_thread(machine, event->comm.tid);
+ struct thread *thread = machine__findnew_thread(machine,
+ event->comm.pid,
+ event->comm.tid);

if (dump_trace)
perf_event__fprintf_comm(event, stdout);
@@ -1012,7 +1016,8 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
return 0;
}

- thread = machine__findnew_thread(machine, event->mmap.pid);
+ thread = machine__findnew_thread(machine, event->mmap.pid,
+ event->mmap.pid);
if (thread == NULL)
goto out_problem;

@@ -1039,8 +1044,12 @@ out_problem:

int machine__process_fork_event(struct machine *machine, union perf_event *event)
{
- struct thread *thread = machine__findnew_thread(machine, event->fork.tid);
- struct thread *parent = machine__findnew_thread(machine, event->fork.ptid);
+ struct thread *thread = machine__findnew_thread(machine,
+ event->fork.pid,
+ event->fork.tid);
+ struct thread *parent = machine__findnew_thread(machine,
+ event->fork.ppid,
+ event->fork.ptid);

if (dump_trace)
perf_event__fprintf_task(event, stdout);
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 603ffba..0df925b 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -106,7 +106,8 @@ static inline bool machine__is_host(struct machine *machine)
return machine ? machine->pid == HOST_KERNEL_ID : false;
}

-struct thread *machine__findnew_thread(struct machine *machine, pid_t tid);
+struct thread *machine__findnew_thread(struct machine *machine, pid_t pid,
+ pid_t tid);

size_t machine__fprintf(struct machine *machine, FILE *fp);

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 0e7ae17..da0a216 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1099,7 +1099,7 @@ void perf_event_header__bswap(struct perf_event_header *self)

struct thread *perf_session__findnew(struct perf_session *session, pid_t pid)
{
- return machine__findnew_thread(&session->machines.host, pid);
+ return machine__findnew_thread(&session->machines.host, 0, pid);
}

static struct thread *perf_session__register_idle_thread(struct perf_session *self)
--
1.7.11.7

2013-08-14 12:43:18

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 07/15] perf tools: remove unnecessary callchain validation

Now that the sample parsing correctly checks data sizes
there is no reason for it to be done again for callchains.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
---
tools/perf/util/callchain.c | 8 --------
tools/perf/util/callchain.h | 5 -----
tools/perf/util/session.c | 20 --------------------
3 files changed, 33 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 4fee33b..482f680 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -21,14 +21,6 @@

__thread struct callchain_cursor callchain_cursor;

-bool ip_callchain__valid(struct ip_callchain *chain,
- const union perf_event *event)
-{
- unsigned int chain_size = event->header.size;
- chain_size -= (unsigned long)&event->ip.__more_data - (unsigned long)event;
- return chain->nr * sizeof(u64) <= chain_size;
-}
-
#define chain_for_each_child(child, parent) \
list_for_each_entry(child, &parent->children, siblings)

diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 812d5a0..2b585bc 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -109,11 +109,6 @@ int callchain_append(struct callchain_root *root,
int callchain_merge(struct callchain_cursor *cursor,
struct callchain_root *dst, struct callchain_root *src);

-struct ip_callchain;
-union perf_event;
-
-bool ip_callchain__valid(struct ip_callchain *chain,
- const union perf_event *event);
/*
* Initialize a cursor before adding entries inside, but keep
* the previously allocated entries as a cache.
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index da0a216..1d9ae82 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -997,22 +997,6 @@ static int perf_session_deliver_event(struct perf_session *session,
}
}

-static int perf_session__preprocess_sample(struct perf_session *session,
- union perf_event *event, struct perf_sample *sample)
-{
- if (event->header.type != PERF_RECORD_SAMPLE ||
- !sample->callchain)
- return 0;
-
- if (!ip_callchain__valid(sample->callchain, event)) {
- pr_debug("call-chain problem with event, skipping it.\n");
- ++session->stats.nr_invalid_chains;
- session->stats.total_invalid_chains += sample->period;
- return -EINVAL;
- }
- return 0;
-}
-
static int perf_session__process_user_event(struct perf_session *session, union perf_event *event,
struct perf_tool *tool, u64 file_offset)
{
@@ -1075,10 +1059,6 @@ static int perf_session__process_event(struct perf_session *session,
if (ret)
return ret;

- /* Preprocess sample records - precheck callchains */
- if (perf_session__preprocess_sample(session, event, &sample))
- return 0;
-
if (tool->ordered_samples) {
ret = perf_session_queue_event(session, event, &sample,
file_offset);
--
1.7.11.7

2013-08-14 12:43:24

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 09/15] perf: make events stream always parsable

The event stream is not always parsable because the format of a sample
is dependent on the sample_type of the selected event. When there
is more than one selected event and the sample_types are not the
same then parsing becomes problematic. A sample can be matched to its
selected event using the ID that is allocated when the event is opened.
Unfortunately, to get the ID from the sample means first parsing it.

This patch adds a new sample format bit PERF_SAMPLE_IDENTIFER that puts
the ID at a fixed position so that the ID can be retrieved without
parsing the sample. For sample events, that is the first position
immediately after the header. For non-sample events, that is the last
position.

In this respect parsing samples requires that the sample_type and ID
values are recorded. For example, perf tools records struct perf_event_attr
and the IDs within the perf.data file. Those must be read first
before it is possible to parse samples found later in the perf.data file.

Signed-off-by: Adrian Hunter <[email protected]>
---
include/uapi/linux/perf_event.h | 27 ++++++++++++++++++++-------
kernel/events/core.c | 11 ++++++++++-
2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 62c25a2..42cb7b6 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -134,8 +134,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_STACK_USER = 1U << 13,
PERF_SAMPLE_WEIGHT = 1U << 14,
PERF_SAMPLE_DATA_SRC = 1U << 15,
+ PERF_SAMPLE_IDENTIFIER = 1U << 16,

- PERF_SAMPLE_MAX = 1U << 16, /* non-ABI */
+ PERF_SAMPLE_MAX = 1U << 17, /* non-ABI */
};

/*
@@ -492,12 +493,12 @@ enum perf_event_type {
/*
* If perf_event_attr.sample_id_all is set then all event types will
* have the sample_type selected fields related to where/when
- * (identity) an event took place (TID, TIME, ID, CPU, STREAM_ID)
- * described in PERF_RECORD_SAMPLE below, it will be stashed just after
- * the perf_event_header and the fields already present for the existing
- * fields, i.e. at the end of the payload. That way a newer perf.data
- * file will be supported by older perf tools, with these new optional
- * fields being ignored.
+ * (identity) an event took place (TID, TIME, ID, STREAM_ID, CPU,
+ * IDENTIFIER) described in PERF_RECORD_SAMPLE below, it will be stashed
+ * just after the perf_event_header and the fields already present for
+ * the existing fields, i.e. at the end of the payload. That way a newer
+ * perf.data file will be supported by older perf tools, with these new
+ * optional fields being ignored.
*
* struct sample_id {
* { u32 pid, tid; } && PERF_SAMPLE_TID
@@ -505,7 +506,12 @@ enum perf_event_type {
* { u64 id; } && PERF_SAMPLE_ID
* { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
* { u32 cpu, res; } && PERF_SAMPLE_CPU
+ * { u64 id; } && PERF_SAMPLE_IDENTIFIER
* } && perf_event_attr::sample_id_all
+ *
+ * Note that PERF_SAMPLE_IDENTIFIER duplicates PERF_SAMPLE_ID. The
+ * advantage of PERF_SAMPLE_IDENTIFIER is that its position is fixed
+ * relative to header.size.
*/

/*
@@ -594,6 +600,13 @@ enum perf_event_type {
* struct {
* struct perf_event_header header;
*
+ * #
+ * # Note that PERF_SAMPLE_IDENTIFIER duplicates PERF_SAMPLE_ID.
+ * # The advantage of PERF_SAMPLE_IDENTIFIER is that its position
+ * # is fixed relative to header.
+ * #
+ *
+ * { u64 id; } && PERF_SAMPLE_IDENTIFIER
* { u64 ip; } && PERF_SAMPLE_IP
* { u32 pid, tid; } && PERF_SAMPLE_TID
* { u64 time; } && PERF_SAMPLE_TIME
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e82e700..776e17d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1213,6 +1213,9 @@ static void perf_event__id_header_size(struct perf_event *event)
if (sample_type & PERF_SAMPLE_TIME)
size += sizeof(data->time);

+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ size += sizeof(data->id);
+
if (sample_type & PERF_SAMPLE_ID)
size += sizeof(data->id);

@@ -4284,7 +4287,7 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
if (sample_type & PERF_SAMPLE_TIME)
data->time = perf_clock();

- if (sample_type & PERF_SAMPLE_ID)
+ if (sample_type & (PERF_SAMPLE_ID | PERF_SAMPLE_IDENTIFIER))
data->id = primary_event_id(event);

if (sample_type & PERF_SAMPLE_STREAM_ID)
@@ -4323,6 +4326,9 @@ static void __perf_event__output_id_sample(struct perf_output_handle *handle,

if (sample_type & PERF_SAMPLE_CPU)
perf_output_put(handle, data->cpu_entry);
+
+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ perf_output_put(handle, data->id);
}

void perf_event__output_id_sample(struct perf_event *event,
@@ -4436,6 +4442,9 @@ void perf_output_sample(struct perf_output_handle *handle,

perf_output_put(handle, *header);

+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ perf_output_put(handle, data->id);
+
if (sample_type & PERF_SAMPLE_IP)
perf_output_put(handle, data->ip);

--
1.7.11.7

2013-08-14 12:43:38

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 13/15] perf tools: expand perf_event__synthesize_sample()

Expand perf_event__synthesize_sample() to handle all
sample format bits.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-inject.c | 4 +-
tools/perf/util/event.h | 1 +
tools/perf/util/evsel.c | 95 ++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 97 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index ffacd46..9b336fd 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -301,7 +301,9 @@ found:
sample_sw.period = sample->period;
sample_sw.time = sample->time;
perf_event__synthesize_sample(event_sw, evsel->attr.sample_type,
- &sample_sw, false);
+ evsel->attr.sample_regs_user,
+ evsel->attr.read_format, &sample_sw,
+ false);
build_id__mark_dso_hit(tool, event_sw, &sample_sw, evsel, machine);
return perf_event__repipe(tool, event_sw, &sample_sw, machine);
}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 6bb5d7c..84f704f 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -246,6 +246,7 @@ int perf_event__preprocess_sample(const union perf_event *self,
const char *perf_event__name(unsigned int id);

int perf_event__synthesize_sample(union perf_event *event, u64 type,
+ u64 sample_regs_user, u64 read_format,
const struct perf_sample *sample,
bool swapped);

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 81c170f..4c97f36 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1389,7 +1389,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
}

if (type & PERF_SAMPLE_REGS_USER) {
- /* First u64 tells us if we have any regs in sample. */
OVERFLOW_CHECK_u64(array);
data->user_regs.abi = *array;
array++;
@@ -1440,11 +1439,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
}

int perf_event__synthesize_sample(union perf_event *event, u64 type,
+ u64 sample_regs_user, u64 read_format,
const struct perf_sample *sample,
bool swapped)
{
u64 *array;
-
+ size_t sz;
/*
* used for cross-endian analysis. See git commit 65014ab3
* for why this goofiness is needed.
@@ -1517,6 +1517,97 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
array++;
}

+ if (type & PERF_SAMPLE_READ) {
+ if (read_format & PERF_FORMAT_GROUP)
+ *array = sample->read.group.nr;
+ else
+ *array = sample->read.one.value;
+ array++;
+
+ if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) {
+ *array = sample->read.time_enabled;
+ array++;
+ }
+
+ if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING) {
+ *array = sample->read.time_running;
+ array++;
+ }
+
+ /* PERF_FORMAT_ID is forced for PERF_SAMPLE_READ */
+ if (read_format & PERF_FORMAT_GROUP) {
+ sz = sample->read.group.nr *
+ sizeof(struct sample_read_value);
+ memcpy(array, sample->read.group.values, sz);
+ array = (void *)array + sz;
+ } else {
+ *array = sample->read.one.id;
+ array++;
+ }
+ }
+
+ if (type & PERF_SAMPLE_CALLCHAIN) {
+ sz = (sample->callchain->nr + 1) * sizeof(u64);
+ memcpy(array, sample->callchain, sz);
+ array = (void *)array + sz;
+ }
+
+ if (type & PERF_SAMPLE_RAW) {
+ u.val32[0] = sample->raw_size;
+ if (WARN_ONCE(swapped,
+ "Endianness of raw data not corrected!\n")) {
+ /*
+ * Inverse of what is done in perf_evsel__parse_sample
+ */
+ u.val32[0] = bswap_32(u.val32[0]);
+ u.val32[1] = bswap_32(u.val32[1]);
+ u.val64 = bswap_64(u.val64);
+ }
+ *array = u.val64;
+ array = (void *)array + sizeof(u32);
+
+ memcpy(array, sample->raw_data, sample->raw_size);
+ array = (void *)array + sample->raw_size;
+ }
+
+ if (type & PERF_SAMPLE_BRANCH_STACK) {
+ sz = sample->branch_stack->nr * sizeof(struct branch_entry);
+ sz += sizeof(u64);
+ memcpy(array, sample->branch_stack, sz);
+ array = (void *)array + sz;
+ }
+
+ if (type & PERF_SAMPLE_REGS_USER) {
+ if (sample->user_regs.abi) {
+ *array++ = sample->user_regs.abi;
+ sz = hweight_long(sample_regs_user) * sizeof(u64);
+ memcpy(array, sample->user_regs.regs, sz);
+ array = (void *)array + sz;
+ } else {
+ *array++ = 0;
+ }
+ }
+
+ if (type & PERF_SAMPLE_STACK_USER) {
+ sz = sample->user_stack.size;
+ *array++ = sz;
+ if (sz) {
+ memcpy(array, sample->user_stack.data, sz);
+ array = (void *)array + sz;
+ *array++ = sz;
+ }
+ }
+
+ if (type & PERF_SAMPLE_WEIGHT) {
+ *array = sample->weight;
+ array++;
+ }
+
+ if (type & PERF_SAMPLE_DATA_SRC) {
+ *array = sample->data_src;
+ array++;
+ }
+
return 0;
}

--
1.7.11.7

2013-08-14 12:43:32

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 11/15] perf tools: add support for PERF_SAMPLE_IDENTFIER

Enable parsing of samples with sample format bit
PERF_SAMPLE_IDENTFIER. In addition, if the kernel supports
it, prefer it to selecting PERF_SAMPLE_ID thereby avoiding
the need to force compatible sample types.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/tests/mmap-basic.c | 2 +-
tools/perf/util/event.h | 9 +++--
tools/perf/util/evlist.c | 5 ++-
tools/perf/util/evlist.h | 2 +
tools/perf/util/evsel.c | 45 +++++++++++++++++----
tools/perf/util/evsel.h | 3 +-
tools/perf/util/record.c | 94 ++++++++++++++++++++++++++++++++++++++++---
7 files changed, 141 insertions(+), 19 deletions(-)

diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index 5b1b5ab..c4185b9 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -72,7 +72,7 @@ int test__basic_mmap(void)
}

evsels[i]->attr.wakeup_events = 1;
- perf_evsel__set_sample_id(evsels[i]);
+ perf_evsel__set_sample_id(evsels[i], false);

perf_evlist__add(evlist, evsels[i]);

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index de4bfe0..b0921fe 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -53,7 +53,8 @@ struct read_event {
(PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
- PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
+ PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD | \
+ PERF_SAMPLE_IDENTIFIER)

/*
* Events have compatible sample types if the following bits all have the same
@@ -61,13 +62,15 @@ struct read_event {
* events the order is: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME,
* PERF_SAMPLE_ADDR, PERF_SAMPLE_ID. For non-sample events the sample members
* are accessed in reverse order. The order is: PERF_SAMPLE_ID,
- * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU.
+ * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU. PERF_SAMPLE_IDENTIFIER is added for
+ * completeness but it should not be used with PERF_SAMPLE_ID. Sample types
+ * that include PERF_SAMPLE_IDENTIFIER are always compatible.
*/
#define PERF_COMPAT_MASK \
(PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
- PERF_SAMPLE_CPU)
+ PERF_SAMPLE_CPU | PERF_SAMPLE_IDENTIFIER)

struct sample_event {
struct perf_event_header header;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e734926..bf8155c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -56,7 +56,7 @@ struct perf_evlist *perf_evlist__new(void)
* Events with compatible sample types all have the same id_pos
* and is_pos. For convenience, put a copy on evlist.
*/
-static void perf_evlist__set_id_pos(struct perf_evlist *evlist)
+void perf_evlist__set_id_pos(struct perf_evlist *evlist)
{
struct perf_evsel *first = perf_evlist__first(evlist);

@@ -848,6 +848,9 @@ u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist)

if (sample_type & PERF_SAMPLE_CPU)
size += sizeof(data->cpu) * 2;
+
+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ size += sizeof(data->id);
out:
return size;
}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1019758..3eb2c13 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -89,7 +89,9 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *self, int idx);
int perf_evlist__open(struct perf_evlist *evlist);
void perf_evlist__close(struct perf_evlist *evlist);

+void perf_evlist__set_id_pos(struct perf_evlist *evlist);
void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist);
+bool perf_can_sample_identifier(void);
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts);

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index e63b82e..4b2819b 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -50,13 +50,17 @@ int __perf_evsel__sample_size(u64 sample_type)
* __perf_evsel__calc_id_pos - calculate id_pos.
* @sample_type: sample type
*
- * This function returns the position of the event id (PERF_SAMPLE_ID) in a
- * sample event i.e. in the array of struct sample_event.
+ * This function returns the position of the event id (PERF_SAMPLE_ID or
+ * PERF_SAMPLE_IDENTIFIER) in a sample event i.e. in the array of struct
+ * sample_event.
*/
static int __perf_evsel__calc_id_pos(u64 sample_type)
{
int idx = 0;

+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ return 0;
+
if (!(sample_type & PERF_SAMPLE_ID))
return -1;

@@ -80,13 +84,16 @@ static int __perf_evsel__calc_id_pos(u64 sample_type)
* @sample_type: sample type
*
* This function returns the position (counting backwards) of the event id
- * (PERF_SAMPLE_ID) in a non-sample event i.e. if sample_id_all is used there is
- * an id sample appended to non-sample events.
+ * (PERF_SAMPLE_ID or PERF_SAMPLE_IDENTIFIER) in a non-sample event i.e. if
+ * sample_id_all is used there is an id sample appended to non-sample events.
*/
static int __perf_evsel__calc_is_pos(u64 sample_type)
{
int idx = 1;

+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ return 1;
+
if (!(sample_type & PERF_SAMPLE_ID))
return -1;

@@ -135,9 +142,15 @@ void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
}
}

-void perf_evsel__set_sample_id(struct perf_evsel *evsel)
+void perf_evsel__set_sample_id(struct perf_evsel *evsel,
+ bool can_sample_identifier)
{
- perf_evsel__set_sample_bit(evsel, ID);
+ if (can_sample_identifier) {
+ perf_evsel__reset_sample_bit(evsel, ID);
+ perf_evsel__set_sample_bit(evsel, IDENTIFIER);
+ } else {
+ perf_evsel__set_sample_bit(evsel, ID);
+ }
evsel->attr.read_format |= PERF_FORMAT_ID;
}

@@ -570,7 +583,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
* We need ID even in case of single event, because
* PERF_SAMPLE_READ process ID specific data.
*/
- perf_evsel__set_sample_id(evsel);
+ perf_evsel__set_sample_id(evsel, false);

/*
* Apply group format only if we belong to group
@@ -1123,6 +1136,11 @@ static int perf_evsel__parse_id_sample(const struct perf_evsel *evsel,
array += ((event->header.size -
sizeof(event->header)) / sizeof(u64)) - 1;

+ if (type & PERF_SAMPLE_IDENTIFIER) {
+ sample->id = *array;
+ array--;
+ }
+
if (type & PERF_SAMPLE_CPU) {
u.val64 = *array;
if (swapped) {
@@ -1219,6 +1237,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
if (evsel->sample_size + sizeof(event->header) > event->header.size)
return -EFAULT;

+ data->id = -1ULL;
+ if (type & PERF_SAMPLE_IDENTIFIER) {
+ data->id = *array;
+ array++;
+ }
+
if (type & PERF_SAMPLE_IP) {
data->ip = *array;
array++;
@@ -1249,7 +1273,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
array++;
}

- data->id = -1ULL;
if (type & PERF_SAMPLE_ID) {
data->id = *array;
array++;
@@ -1431,6 +1454,11 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,

array = event->sample.array;

+ if (type & PERF_SAMPLE_IDENTIFIER) {
+ *array = sample->id;
+ array++;
+ }
+
if (type & PERF_SAMPLE_IP) {
*array = sample->ip;
array++;
@@ -1619,6 +1647,7 @@ static int sample_type__fprintf(FILE *fp, bool *first, u64 value)
bit_name(READ), bit_name(CALLCHAIN), bit_name(ID), bit_name(CPU),
bit_name(PERIOD), bit_name(STREAM_ID), bit_name(RAW),
bit_name(BRANCH_STACK), bit_name(REGS_USER), bit_name(STACK_USER),
+ bit_name(IDENTIFIER),
{ .name = NULL, }
};
#undef bit_name
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 3056f4f..97e7a35 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -152,7 +152,8 @@ void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
#define perf_evsel__reset_sample_bit(evsel, bit) \
__perf_evsel__reset_sample_bit(evsel, PERF_SAMPLE_##bit)

-void perf_evsel__set_sample_id(struct perf_evsel *evsel);
+void perf_evsel__set_sample_id(struct perf_evsel *evsel,
+ bool use_sample_identifier);

int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
const char *filter);
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 67291e4..6e01bf2 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -1,11 +1,83 @@
#include "evlist.h"
#include "evsel.h"
#include "cpumap.h"
+#include "parse-events.h"
+
+typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
+
+static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
+{
+ struct perf_evlist *evlist;
+ struct perf_evsel *evsel;
+ int err = -EAGAIN, fd;
+
+ evlist = perf_evlist__new();
+ if (!evlist)
+ return -ENOMEM;
+
+ if (parse_events(evlist, str))
+ goto out_delete;
+
+ evsel = perf_evlist__first(evlist);
+
+ fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+ if (fd < 0)
+ goto out_delete;
+ close(fd);
+
+ fn(evsel);
+
+ fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+ if (fd < 0) {
+ if (errno == EINVAL)
+ err = -EINVAL;
+ goto out_delete;
+ }
+ close(fd);
+ err = 0;
+
+out_delete:
+ perf_evlist__delete(evlist);
+ return err;
+}
+
+static bool perf_probe_api(setup_probe_fn_t fn)
+{
+ const char *try[] = {"cycles:u", "instructions:u", "cpu-clock", NULL};
+ struct cpu_map *cpus;
+ int cpu, ret, i = 0;
+
+ cpus = cpu_map__new(NULL);
+ if (!cpus)
+ return false;
+ cpu = cpus->map[0];
+ cpu_map__delete(cpus);
+
+ do {
+ ret = perf_do_probe_api(fn, cpu, try[i++]);
+ if (!ret)
+ return true;
+ } while (ret == -EAGAIN && try[i]);
+
+ return false;
+}
+
+static void perf_probe_sample_identifier(struct perf_evsel *evsel)
+{
+ evsel->attr.sample_type |= PERF_SAMPLE_IDENTIFIER;
+}
+
+bool perf_can_sample_identifier(void)
+{
+ return perf_probe_api(perf_probe_sample_identifier);
+}

void perf_evlist__config(struct perf_evlist *evlist,
- struct perf_record_opts *opts)
+ struct perf_record_opts *opts)
{
struct perf_evsel *evsel;
+ bool use_sample_identifier = false;
+
/*
* Set the evsel leader links before we configure attributes,
* since some might depend on this info.
@@ -16,12 +88,24 @@ void perf_evlist__config(struct perf_evlist *evlist,
if (evlist->cpus->map[0] < 0)
opts->no_inherit = true;

- list_for_each_entry(evsel, &evlist->entries, node) {
+ list_for_each_entry(evsel, &evlist->entries, node)
perf_evsel__config(evsel, opts);

- if (evlist->nr_entries > 1)
- perf_evsel__set_sample_id(evsel);
+ if (evlist->nr_entries > 1) {
+ struct perf_evsel *first = perf_evlist__first(evlist);
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ if (evsel->attr.sample_type == first->attr.sample_type)
+ continue;
+ use_sample_identifier = perf_can_sample_identifier();
+ break;
+ }
+ list_for_each_entry(evsel, &evlist->entries, node)
+ perf_evsel__set_sample_id(evsel, use_sample_identifier);
}

- perf_evlist__make_sample_types_compatible(evlist);
+ if (use_sample_identifier)
+ perf_evlist__set_id_pos(evlist);
+ else
+ perf_evlist__make_sample_types_compatible(evlist);
}
--
1.7.11.7

2013-08-14 12:43:44

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 14/15] perf tools: add a function to calculate sample event size

Add perf_event__sample_event_size() which can be used when
synthesizing sample events to determine how big the resulting
event will be, and therefore how much memory to allocate.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/event.h | 2 ++
tools/perf/util/evsel.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 94 insertions(+)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 84f704f..a452504 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -245,6 +245,8 @@ int perf_event__preprocess_sample(const union perf_event *self,

const char *perf_event__name(unsigned int id);

+size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
+ u64 sample_regs_user, u64 read_format);
int perf_event__synthesize_sample(union perf_event *event, u64 type,
u64 sample_regs_user, u64 read_format,
const struct perf_sample *sample,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4c97f36..0785c3b 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1438,6 +1438,98 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
return 0;
}

+size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
+ u64 sample_regs_user, u64 read_format)
+{
+ size_t sz, result = sizeof(struct sample_event);
+
+ if (type & PERF_SAMPLE_IDENTIFIER)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_IP)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_TID)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_TIME)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_ADDR)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_ID)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_STREAM_ID)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_CPU)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_PERIOD)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_READ) {
+ result += sizeof(u64);
+ if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
+ result += sizeof(u64);
+ if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
+ result += sizeof(u64);
+ /* PERF_FORMAT_ID is forced for PERF_SAMPLE_READ */
+ if (read_format & PERF_FORMAT_GROUP) {
+ sz = sample->read.group.nr *
+ sizeof(struct sample_read_value);
+ result += sz;
+ } else {
+ result += sizeof(u64);
+ }
+ }
+
+ if (type & PERF_SAMPLE_CALLCHAIN) {
+ sz = (sample->callchain->nr + 1) * sizeof(u64);
+ result += sz;
+ }
+
+ if (type & PERF_SAMPLE_RAW) {
+ result += sizeof(u32);
+ result += sample->raw_size;
+ }
+
+ if (type & PERF_SAMPLE_BRANCH_STACK) {
+ sz = sample->branch_stack->nr * sizeof(struct branch_entry);
+ sz += sizeof(u64);
+ result += sz;
+ }
+
+ if (type & PERF_SAMPLE_REGS_USER) {
+ if (sample->user_regs.abi) {
+ result += sizeof(u64);
+ sz = hweight_long(sample_regs_user) * sizeof(u64);
+ result += sz;
+ } else {
+ result += sizeof(u64);
+ }
+ }
+
+ if (type & PERF_SAMPLE_STACK_USER) {
+ sz = sample->user_stack.size;
+ result += sizeof(u64);
+ if (sz) {
+ result += sz;
+ result += sizeof(u64);
+ }
+ }
+
+ if (type & PERF_SAMPLE_WEIGHT)
+ result += sizeof(u64);
+
+ if (type & PERF_SAMPLE_DATA_SRC)
+ result += sizeof(u64);
+
+ return result;
+}
+
int perf_event__synthesize_sample(union perf_event *event, u64 type,
u64 sample_regs_user, u64 read_format,
const struct perf_sample *sample,
--
1.7.11.7

2013-08-14 12:43:49

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 15/15] perf tools: add a sample parsing test

Add a test that checks that sample parsing is correctly
implemented.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/Makefile | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/sample-parsing.c | 316 ++++++++++++++++++++++++++++++++++++++
tools/perf/tests/tests.h | 1 +
4 files changed, 322 insertions(+)
create mode 100644 tools/perf/tests/sample-parsing.c

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 868358c..a069578 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -439,6 +439,7 @@ PERFLIBS = $(LIB_FILE) $(LIBLK) $(LIBTRACEEVENT)
ifneq ($(OUTPUT),)
CFLAGS += -I$(OUTPUT)
endif
+LIB_OBJS += $(OUTPUT)tests/sample-parsing.o

ifdef NO_LIBELF
EXTLIBS := $(filter-out -lelf,$(EXTLIBS))
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index f5af192..8ad9415 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -104,6 +104,10 @@ static struct test {
.func = test__code_reading,
},
{
+ .desc = "Test sample parsing",
+ .func = test__sample_parsing,
+ },
+ {
.func = NULL,
},
};
diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
new file mode 100644
index 0000000..77f598d
--- /dev/null
+++ b/tools/perf/tests/sample-parsing.c
@@ -0,0 +1,316 @@
+#include <stdbool.h>
+#include <inttypes.h>
+
+#include "util.h"
+#include "event.h"
+#include "evsel.h"
+
+#include "tests.h"
+
+#define COMP(m) do { \
+ if (s1->m != s2->m) { \
+ pr_debug("Samples differ at '"#m"'\n"); \
+ return false; \
+ } \
+} while (0)
+
+#define MCOMP(m) do { \
+ if (memcmp(&s1->m, &s2->m, sizeof(s1->m))) { \
+ pr_debug("Samples differ at '"#m"'\n"); \
+ return false; \
+ } \
+} while (0)
+
+static bool samples_same(const struct perf_sample *s1,
+ const struct perf_sample *s2, u64 type, u64 regs_user,
+ u64 read_format)
+{
+ size_t i;
+
+ if (type & PERF_SAMPLE_IDENTIFIER)
+ COMP(id);
+
+ if (type & PERF_SAMPLE_IP)
+ COMP(ip);
+
+ if (type & PERF_SAMPLE_TID) {
+ COMP(pid);
+ COMP(tid);
+ }
+
+ if (type & PERF_SAMPLE_TIME)
+ COMP(time);
+
+ if (type & PERF_SAMPLE_ADDR)
+ COMP(addr);
+
+ if (type & PERF_SAMPLE_ID)
+ COMP(id);
+
+ if (type & PERF_SAMPLE_STREAM_ID)
+ COMP(stream_id);
+
+ if (type & PERF_SAMPLE_CPU)
+ COMP(cpu);
+
+ if (type & PERF_SAMPLE_PERIOD)
+ COMP(period);
+
+ if (type & PERF_SAMPLE_READ) {
+ if (read_format & PERF_FORMAT_GROUP)
+ COMP(read.group.nr);
+ else
+ COMP(read.one.value);
+ if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
+ COMP(read.time_enabled);
+ if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
+ COMP(read.time_running);
+ /* PERF_FORMAT_ID is forced for PERF_SAMPLE_READ */
+ if (read_format & PERF_FORMAT_GROUP) {
+ for (i = 0; i < s1->read.group.nr; i++)
+ MCOMP(read.group.values[i]);
+ } else {
+ COMP(read.one.id);
+ }
+ }
+
+ if (type & PERF_SAMPLE_CALLCHAIN) {
+ COMP(callchain->nr);
+ for (i = 0; i < s1->callchain->nr; i++)
+ COMP(callchain->ips[i]);
+ }
+
+ if (type & PERF_SAMPLE_RAW) {
+ COMP(raw_size);
+ if (memcmp(s1->raw_data, s2->raw_data, s1->raw_size)) {
+ pr_debug("Samples differ at 'raw_data'\n");
+ return false;
+ }
+ }
+
+ if (type & PERF_SAMPLE_BRANCH_STACK) {
+ COMP(branch_stack->nr);
+ for (i = 0; i < s1->branch_stack->nr; i++)
+ MCOMP(branch_stack->entries[i]);
+ }
+
+ if (type & PERF_SAMPLE_REGS_USER) {
+ size_t sz = hweight_long(regs_user) * sizeof(u64);
+
+ COMP(user_regs.abi);
+ if (s1->user_regs.abi &&
+ (!s1->user_regs.regs || !s2->user_regs.regs ||
+ memcmp(s1->user_regs.regs, s2->user_regs.regs, sz))) {
+ pr_debug("Samples differ at 'user_regs'\n");
+ return false;
+ }
+ }
+
+ if (type & PERF_SAMPLE_STACK_USER) {
+ COMP(user_stack.size);
+ if (memcmp(s1->user_stack.data, s1->user_stack.data,
+ s1->user_stack.size)) {
+ pr_debug("Samples differ at 'user_stack'\n");
+ return false;
+ }
+ }
+
+ if (type & PERF_SAMPLE_WEIGHT)
+ COMP(weight);
+
+ if (type & PERF_SAMPLE_DATA_SRC)
+ COMP(data_src);
+
+ return true;
+}
+
+static int do_test(u64 sample_type, u64 sample_regs_user, u64 read_format)
+{
+ struct perf_evsel evsel = {
+ .needs_swap = false,
+ .attr = {
+ .sample_type = sample_type,
+ .sample_regs_user = sample_regs_user,
+ .read_format = read_format,
+ },
+ };
+ union perf_event *event;
+ union {
+ struct ip_callchain callchain;
+ u64 data[64];
+ } callchain = {
+ /* 3 ips */
+ .data = {3, 201, 202, 203},
+ };
+ union {
+ struct branch_stack branch_stack;
+ u64 data[64];
+ } branch_stack = {
+ /* 1 branch_entry */
+ .data = {1, 211, 212, 213},
+ };
+ u64 user_regs[64];
+ const u64 raw_data[] = {0x123456780a0b0c0dULL, 0x1102030405060708ULL};
+ const u64 data[] = {0x2211443366558877ULL, 0, 0xaabbccddeeff4321ULL};
+ struct perf_sample sample = {
+ .ip = 101,
+ .pid = 102,
+ .tid = 103,
+ .time = 104,
+ .addr = 105,
+ .id = 106,
+ .stream_id = 107,
+ .period = 108,
+ .weight = 109,
+ .cpu = 110,
+ .raw_size = sizeof(raw_data),
+ .data_src = 111,
+ .raw_data = (void *)raw_data,
+ .callchain = &callchain.callchain,
+ .branch_stack = &branch_stack.branch_stack,
+ .user_regs = {
+ .abi = PERF_SAMPLE_REGS_ABI_64,
+ .regs = user_regs,
+ },
+ .user_stack = {
+ .size = sizeof(data),
+ .data = (void *)data,
+ },
+ .read = {
+ .time_enabled = 0x030a59d664fca7deULL,
+ .time_running = 0x011b6ae553eb98edULL,
+ },
+ };
+ struct sample_read_value values[] = {{1, 5}, {9, 3}, {2, 7}, {6, 4},};
+ struct perf_sample sample_out;
+ size_t i, sz, bufsz;
+ int err, ret = -1;
+
+ for (i = 0; i < sizeof(user_regs); i++)
+ *(i + (u8 *)user_regs) = i & 0xfe;
+
+ if (read_format & PERF_FORMAT_GROUP) {
+ sample.read.group.nr = 4;
+ sample.read.group.values = values;
+ } else {
+ sample.read.one.value = 0x08789faeb786aa87ULL;
+ sample.read.one.id = 99;
+ }
+
+ sz = perf_event__sample_event_size(&sample, sample_type,
+ sample_regs_user, read_format);
+ bufsz = sz + 4096; /* Add a bit for overrun checking */
+ event = malloc(bufsz);
+ if (!event) {
+ pr_debug("malloc failed\n");
+ return -1;
+ }
+
+ memset(event, 0xff, bufsz);
+ event->header.type = PERF_RECORD_SAMPLE;
+ event->header.misc = 0;
+ event->header.size = sz;
+
+ err = perf_event__synthesize_sample(event, sample_type,
+ sample_regs_user, read_format,
+ &sample, false);
+ if (err) {
+ pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
+ "perf_event__synthesize_sample", sample_type, err);
+ goto out_free;
+ }
+
+ /* The data does not contain 0xff so we use that to check the size */
+ for (i = bufsz; i > 0; i--) {
+ if (*(i - 1 + (u8 *)event) != 0xff)
+ break;
+ }
+ if (i != sz) {
+ pr_debug("Event size mismatch: actual %zu vs expected %zu\n",
+ i, sz);
+ goto out_free;
+ }
+
+ evsel.sample_size = __perf_evsel__sample_size(sample_type);
+
+ err = perf_evsel__parse_sample(&evsel, event, &sample_out);
+ if (err) {
+ pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
+ "perf_evsel__parse_sample", sample_type, err);
+ goto out_free;
+ }
+
+ if (!samples_same(&sample, &sample_out, sample_type,
+ sample_regs_user, read_format)) {
+ pr_debug("parsing failed for sample_type %#"PRIx64"\n",
+ sample_type);
+ goto out_free;
+ }
+
+ ret = 0;
+out_free:
+ free(event);
+ if (ret && read_format)
+ pr_debug("read_format %#"PRIx64"\n", read_format);
+ return ret;
+}
+
+/**
+ * test__sample_parsing - test sample parsing.
+ *
+ * This function implements a test that synthesizes a sample event, parses it
+ * and then checks that the parsed sample matches the original sample. The test
+ * checks sample format bits separately and together. If the test passes %0 is
+ * returned, otherwise %-1 is returned.
+ */
+int test__sample_parsing(void)
+{
+ const u64 rf[] = {4, 5, 6, 7, 12, 13, 14, 15};
+ u64 sample_type;
+ u64 sample_regs_user;
+ size_t i;
+ int err;
+
+ /*
+ * Fail the test if it has not been updated when new sample format bits
+ * were added.
+ */
+ if (PERF_SAMPLE_MAX > PERF_SAMPLE_IDENTIFIER << 1) {
+ pr_debug("sample format has changed - test needs updating\n");
+ return -1;
+ }
+
+ /* Test each sample format bit separately */
+ for (sample_type = 1; sample_type != PERF_SAMPLE_MAX;
+ sample_type <<= 1) {
+ /* Test read_format variations */
+ if (sample_type == PERF_SAMPLE_READ) {
+ for (i = 0; i < ARRAY_SIZE(rf); i++) {
+ err = do_test(sample_type, 0, rf[i]);
+ if (err)
+ return err;
+ }
+ continue;
+ }
+
+ if (sample_type == PERF_SAMPLE_REGS_USER)
+ sample_regs_user = 0x3fff;
+ else
+ sample_regs_user = 0;
+
+ err = do_test(sample_type, sample_regs_user, 0);
+ if (err)
+ return err;
+ }
+
+ /* Test all sample format bits together */
+ sample_type = PERF_SAMPLE_MAX - 1;
+ sample_regs_user = 0x3fff;
+ for (i = 0; i < ARRAY_SIZE(rf); i++) {
+ err = do_test(sample_type, sample_regs_user, rf[i]);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index c748f53..83d5b71 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -37,5 +37,6 @@ int test__task_exit(void);
int test__sw_clock_freq(void);
int test__perf_time_to_tsc(void);
int test__code_reading(void);
+int test__sample_parsing(void);

#endif /* TESTS_H */
--
1.7.11.7

2013-08-14 12:44:30

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 12/15] perf tools: add missing 'abi' member to 'struct regs_dump'

And store the parsed value there. Note that the 'abi' is
0 (no registers), 1 (32-bit registers) or 2 (64-bit registers),
but the registers are anyway copied one-by-one as 64-bit
values onto the event i.e. see 'perf_output_sample_regs()'

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/event.h | 1 +
tools/perf/util/evsel.c | 7 +++----
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index b0921fe..6bb5d7c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -78,6 +78,7 @@ struct sample_event {
};

struct regs_dump {
+ u64 abi;
u64 *regs;
};

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4b2819b..81c170f 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1389,13 +1389,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
}

if (type & PERF_SAMPLE_REGS_USER) {
- u64 avail;
-
/* First u64 tells us if we have any regs in sample. */
OVERFLOW_CHECK_u64(array);
- avail = *array++;
+ data->user_regs.abi = *array;
+ array++;

- if (avail) {
+ if (data->user_regs.abi) {
u64 regs_user = evsel->attr.sample_regs_user;

sz = hweight_long(regs_user) * sizeof(u64);
--
1.7.11.7

2013-08-14 12:44:47

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 10/15] perf tools: move perf_evlist__config() to a new source file

perf_evlist__config() must be moved to a separate source
file to avoid Python link errors when adding support for
PERF_SAMPLE_IDENTIFIER.

It is appropriate to do this because perf_evlist__config()
is a helper function for event recording. It is used by
tools to apply recording options to perf_evlist. It is
not used by the Python API.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
---
tools/perf/Makefile | 1 +
tools/perf/util/evlist.c | 24 ------------------------
tools/perf/util/record.c | 27 +++++++++++++++++++++++++++
3 files changed, 28 insertions(+), 24 deletions(-)
create mode 100644 tools/perf/util/record.c

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index e0d3d9f..868358c 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -360,6 +360,7 @@ LIB_OBJS += $(OUTPUT)util/rblist.o
LIB_OBJS += $(OUTPUT)util/intlist.o
LIB_OBJS += $(OUTPUT)util/vdso.o
LIB_OBJS += $(OUTPUT)util/stat.o
+LIB_OBJS += $(OUTPUT)util/record.o

LIB_OBJS += $(OUTPUT)ui/setup.o
LIB_OBJS += $(OUTPUT)ui/helpline.o
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 2e5c0b7..e734926 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -89,30 +89,6 @@ void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist)
perf_evlist__set_id_pos(evlist);
}

-void perf_evlist__config(struct perf_evlist *evlist,
- struct perf_record_opts *opts)
-{
- struct perf_evsel *evsel;
- /*
- * Set the evsel leader links before we configure attributes,
- * since some might depend on this info.
- */
- if (opts->group)
- perf_evlist__set_leader(evlist);
-
- if (evlist->cpus->map[0] < 0)
- opts->no_inherit = true;
-
- list_for_each_entry(evsel, &evlist->entries, node) {
- perf_evsel__config(evsel, opts);
-
- if (evlist->nr_entries > 1)
- perf_evsel__set_sample_id(evsel);
- }
-
- perf_evlist__make_sample_types_compatible(evlist);
-}
-
static void perf_evlist__purge(struct perf_evlist *evlist)
{
struct perf_evsel *pos, *n;
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
new file mode 100644
index 0000000..67291e4
--- /dev/null
+++ b/tools/perf/util/record.c
@@ -0,0 +1,27 @@
+#include "evlist.h"
+#include "evsel.h"
+#include "cpumap.h"
+
+void perf_evlist__config(struct perf_evlist *evlist,
+ struct perf_record_opts *opts)
+{
+ struct perf_evsel *evsel;
+ /*
+ * Set the evsel leader links before we configure attributes,
+ * since some might depend on this info.
+ */
+ if (opts->group)
+ perf_evlist__set_leader(evlist);
+
+ if (evlist->cpus->map[0] < 0)
+ opts->no_inherit = true;
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ perf_evsel__config(evsel, opts);
+
+ if (evlist->nr_entries > 1)
+ perf_evsel__set_sample_id(evsel);
+ }
+
+ perf_evlist__make_sample_types_compatible(evlist);
+}
--
1.7.11.7

2013-08-14 12:43:15

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 06/15] perf tools: tidy up sample parsing overflow checking

The size of data retrieved from a sample event must be
validated to ensure it does not go past the end of the
event. That was being done sporadically and without
considering integer overflows.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
---
tools/perf/util/evsel.c | 112 ++++++++++++++++++++++++++++++------------------
1 file changed, 71 insertions(+), 41 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 565ab55..12812f7 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1166,24 +1166,30 @@ static int perf_evsel__parse_id_sample(const struct perf_evsel *evsel,
return 0;
}

-static bool sample_overlap(const union perf_event *event,
- const void *offset, u64 size)
+static inline bool overflow(const void *endp, u16 max_size, const void *offset,
+ u64 size)
{
- const void *base = event;
+ return size > max_size || offset + size > endp;
+}

- if (offset + size > base + event->header.size)
- return true;
+#define OVERFLOW_CHECK(offset, size, max_size) \
+ do { \
+ if (overflow(endp, (max_size), (offset), (size))) \
+ return -EFAULT; \
+ } while (0)

- return false;
-}
+#define OVERFLOW_CHECK_u64(offset) \
+ OVERFLOW_CHECK(offset, sizeof(u64), sizeof(u64))

int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
struct perf_sample *data)
{
u64 type = evsel->attr.sample_type;
- u64 regs_user = evsel->attr.sample_regs_user;
bool swapped = evsel->needs_swap;
const u64 *array;
+ u16 max_size = event->header.size;
+ const void *endp = (void *)event + max_size;
+ u64 sz;

/*
* used for cross-endian analysis. See git commit 65014ab3
@@ -1205,6 +1211,11 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,

array = event->sample.array;

+ /*
+ * The evsel's sample_size is based on PERF_SAMPLE_MASK which includes
+ * up to PERF_SAMPLE_PERIOD. After that overflow() must be used to
+ * check the format does not go past the end of the event.
+ */
if (evsel->sample_size + sizeof(event->header) > event->header.size)
return -EFAULT;

@@ -1270,6 +1281,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
if (type & PERF_SAMPLE_READ) {
u64 read_format = evsel->attr.read_format;

+ OVERFLOW_CHECK_u64(array);
if (read_format & PERF_FORMAT_GROUP)
data->read.group.nr = *array;
else
@@ -1278,41 +1290,51 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
array++;

if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) {
+ OVERFLOW_CHECK_u64(array);
data->read.time_enabled = *array;
array++;
}

if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING) {
+ OVERFLOW_CHECK_u64(array);
data->read.time_running = *array;
array++;
}

/* PERF_FORMAT_ID is forced for PERF_SAMPLE_READ */
if (read_format & PERF_FORMAT_GROUP) {
- data->read.group.values = (struct sample_read_value *) array;
- array = (void *) array + data->read.group.nr *
- sizeof(struct sample_read_value);
+ const u64 max_group_nr = UINT64_MAX /
+ sizeof(struct sample_read_value);
+
+ if (data->read.group.nr > max_group_nr)
+ return -EFAULT;
+ sz = data->read.group.nr *
+ sizeof(struct sample_read_value);
+ OVERFLOW_CHECK(array, sz, max_size);
+ data->read.group.values =
+ (struct sample_read_value *)array;
+ array = (void *)array + sz;
} else {
+ OVERFLOW_CHECK_u64(array);
data->read.one.id = *array;
array++;
}
}

if (type & PERF_SAMPLE_CALLCHAIN) {
- if (sample_overlap(event, array, sizeof(data->callchain->nr)))
- return -EFAULT;
-
- data->callchain = (struct ip_callchain *)array;
+ const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);

- if (sample_overlap(event, array, data->callchain->nr))
+ OVERFLOW_CHECK_u64(array);
+ data->callchain = (struct ip_callchain *)array++;
+ if (data->callchain->nr > max_callchain_nr)
return -EFAULT;
-
- array += 1 + data->callchain->nr;
+ sz = data->callchain->nr * sizeof(u64);
+ OVERFLOW_CHECK(array, sz, max_size);
+ array = (void *)array + sz;
}

if (type & PERF_SAMPLE_RAW) {
- const u64 *pdata;
-
+ OVERFLOW_CHECK_u64(array);
u.val64 = *array;
if (WARN_ONCE(swapped,
"Endianness of raw data not corrected!\n")) {
@@ -1321,65 +1343,73 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
u.val32[0] = bswap_32(u.val32[0]);
u.val32[1] = bswap_32(u.val32[1]);
}
-
- if (sample_overlap(event, array, sizeof(u32)))
- return -EFAULT;
-
data->raw_size = u.val32[0];
- pdata = (void *) array + sizeof(u32);
-
- if (sample_overlap(event, pdata, data->raw_size))
- return -EFAULT;
+ array = (void *)array + sizeof(u32);

- data->raw_data = (void *) pdata;
-
- array = (void *)array + data->raw_size + sizeof(u32);
+ OVERFLOW_CHECK(array, data->raw_size, max_size);
+ data->raw_data = (void *)array;
+ array = (void *)array + data->raw_size;
}

if (type & PERF_SAMPLE_BRANCH_STACK) {
- u64 sz;
+ const u64 max_branch_nr = UINT64_MAX /
+ sizeof(struct branch_entry);

- data->branch_stack = (struct branch_stack *)array;
- array++; /* nr */
+ OVERFLOW_CHECK_u64(array);
+ data->branch_stack = (struct branch_stack *)array++;

+ if (data->branch_stack->nr > max_branch_nr)
+ return -EFAULT;
sz = data->branch_stack->nr * sizeof(struct branch_entry);
- sz /= sizeof(u64);
- array += sz;
+ OVERFLOW_CHECK(array, sz, max_size);
+ array = (void *)array + sz;
}

if (type & PERF_SAMPLE_REGS_USER) {
+ u64 avail;
+
/* First u64 tells us if we have any regs in sample. */
- u64 avail = *array++;
+ OVERFLOW_CHECK_u64(array);
+ avail = *array++;

if (avail) {
+ u64 regs_user = evsel->attr.sample_regs_user;
+
+ sz = hweight_long(regs_user) * sizeof(u64);
+ OVERFLOW_CHECK(array, sz, max_size);
data->user_regs.regs = (u64 *)array;
- array += hweight_long(regs_user);
+ array = (void *)array + sz;
}
}

if (type & PERF_SAMPLE_STACK_USER) {
- u64 size = *array++;
+ OVERFLOW_CHECK_u64(array);
+ sz = *array++;

data->user_stack.offset = ((char *)(array - 1)
- (char *) event);

- if (!size) {
+ if (!sz) {
data->user_stack.size = 0;
} else {
+ OVERFLOW_CHECK(array, sz, max_size);
data->user_stack.data = (char *)array;
- array += size / sizeof(*array);
+ array = (void *)array + sz;
+ OVERFLOW_CHECK_u64(array);
data->user_stack.size = *array++;
}
}

data->weight = 0;
if (type & PERF_SAMPLE_WEIGHT) {
+ OVERFLOW_CHECK_u64(array);
data->weight = *array;
array++;
}

data->data_src = PERF_MEM_DATA_SRC_NONE;
if (type & PERF_SAMPLE_DATA_SRC) {
+ OVERFLOW_CHECK_u64(array);
data->data_src = *array;
array++;
}
--
1.7.11.7

2013-08-14 12:45:41

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 08/15] perf tools: remove references to struct ip_event

struct ip_event assumes fixed positions for ip, pid
and tid. That is no longer true with the addition of
PERF_SAMPLE_IDENTIFIER. The information is anyway in
struct sample, so use that instead.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-inject.c | 4 ++--
tools/perf/builtin-kmem.c | 4 ++--
tools/perf/builtin-mem.c | 2 +-
tools/perf/builtin-script.c | 4 ++--
tools/perf/builtin-top.c | 11 ++++++-----
tools/perf/tests/hists_link.c | 20 ++++++++------------
tools/perf/util/build-id.c | 8 ++++----
tools/perf/util/event.c | 6 +++---
tools/perf/util/event.h | 11 -----------
tools/perf/util/evsel.c | 4 ++--
tools/perf/util/session.c | 8 +++++---
11 files changed, 35 insertions(+), 47 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 0d4ae1d..ffacd46 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -198,7 +198,7 @@ static int perf_event__inject_buildid(struct perf_tool *tool,

cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;

- thread = machine__findnew_thread(machine, event->ip.pid, event->ip.pid);
+ thread = machine__findnew_thread(machine, sample->pid, sample->pid);
if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
event->header.type);
@@ -206,7 +206,7 @@ static int perf_event__inject_buildid(struct perf_tool *tool,
}

thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
- event->ip.ip, &al);
+ sample->ip, &al);

if (al.map != NULL) {
if (!al.map->dso->hit) {
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index c324778..c2dff9c 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -305,8 +305,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
- event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->pid);

if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 706a1fa..791b432 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -94,7 +94,7 @@ dump_raw_samples(struct perf_tool *tool,
symbol_conf.field_sep,
sample->tid,
symbol_conf.field_sep,
- event->ip.ip,
+ sample->ip,
symbol_conf.field_sep,
sample->addr,
symbol_conf.field_sep,
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index d82712f..93a34ce 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -501,8 +501,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct machine *machine)
{
struct addr_location al;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
- event->ip.tid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->tid);

if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index e37521f..2122141 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -689,7 +689,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
{
struct perf_top *top = container_of(tool, struct perf_top, tool);
struct symbol *parent = NULL;
- u64 ip = event->ip.ip;
+ u64 ip = sample->ip;
struct addr_location al;
int err;

@@ -699,10 +699,10 @@ static void perf_event__process_sample(struct perf_tool *tool,
if (!seen)
seen = intlist__new(NULL);

- if (!intlist__has_entry(seen, event->ip.pid)) {
+ if (!intlist__has_entry(seen, sample->pid)) {
pr_err("Can't find guest [%d]'s kernel information\n",
- event->ip.pid);
- intlist__add(seen, event->ip.pid);
+ sample->pid);
+ intlist__add(seen, sample->pid);
}
return;
}
@@ -836,7 +836,8 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
break;
case PERF_RECORD_MISC_GUEST_KERNEL:
++top->guest_kernel_samples;
- machine = perf_session__find_machine(session, event->ip.pid);
+ machine = perf_session__find_machine(session,
+ sample.pid);
break;
case PERF_RECORD_MISC_GUEST_USER:
++top->guest_us_samples;
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 87f9f72..4228ffc 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -211,15 +211,13 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
list_for_each_entry(evsel, &evlist->entries, node) {
for (k = 0; k < ARRAY_SIZE(fake_common_samples); k++) {
const union perf_event event = {
- .ip = {
- .header = {
- .misc = PERF_RECORD_MISC_USER,
- },
- .pid = fake_common_samples[k].pid,
- .ip = fake_common_samples[k].ip,
+ .header = {
+ .misc = PERF_RECORD_MISC_USER,
},
};

+ sample.pid = fake_common_samples[k].pid;
+ sample.ip = fake_common_samples[k].ip;
if (perf_event__preprocess_sample(&event, machine, &al,
&sample) < 0)
goto out;
@@ -235,15 +233,13 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)

for (k = 0; k < ARRAY_SIZE(fake_samples[i]); k++) {
const union perf_event event = {
- .ip = {
- .header = {
- .misc = PERF_RECORD_MISC_USER,
- },
- .pid = fake_samples[i][k].pid,
- .ip = fake_samples[i][k].ip,
+ .header = {
+ .misc = PERF_RECORD_MISC_USER,
},
};

+ sample.pid = fake_samples[i][k].pid;
+ sample.ip = fake_samples[i][k].ip;
if (perf_event__preprocess_sample(&event, machine, &al,
&sample) < 0)
goto out;
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 0f9d27a..fb58409 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -18,14 +18,14 @@

int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
union perf_event *event,
- struct perf_sample *sample __maybe_unused,
+ struct perf_sample *sample,
struct perf_evsel *evsel __maybe_unused,
struct machine *machine)
{
struct addr_location al;
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
- event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->pid);

if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
@@ -34,7 +34,7 @@ int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
}

thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
- event->ip.ip, &al);
+ sample->ip, &al);

if (al.map != NULL)
al.map->dso->hit = 1;
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 61cecf9..8d51f21 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -686,8 +686,8 @@ int perf_event__preprocess_sample(const union perf_event *event,
struct perf_sample *sample)
{
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
- event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->pid);

if (thread == NULL)
return -1;
@@ -709,7 +709,7 @@ int perf_event__preprocess_sample(const union perf_event *event,
machine__create_kernel_maps(machine);

thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
- event->ip.ip, al);
+ sample->ip, al);
dump_printf(" ...... dso: %s\n",
al->map ? al->map->dso->long_name :
al->level == 'H' ? "[hypervisor]" : "<not found>");
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index f6c45fd..de4bfe0 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -8,16 +8,6 @@
#include "map.h"
#include "build-id.h"

-/*
- * PERF_SAMPLE_IP | PERF_SAMPLE_TID | *
- */
-struct ip_event {
- struct perf_event_header header;
- u64 ip;
- u32 pid, tid;
- unsigned char __more_data[];
-};
-
struct mmap_event {
struct perf_event_header header;
u32 pid, tid;
@@ -180,7 +170,6 @@ struct tracing_data_event {

union perf_event {
struct perf_event_header header;
- struct ip_event ip;
struct mmap_event mmap;
struct comm_event comm;
struct fork_event fork;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 12812f7..e63b82e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1220,7 +1220,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
return -EFAULT;

if (type & PERF_SAMPLE_IP) {
- data->ip = event->ip.ip;
+ data->ip = *array;
array++;
}

@@ -1432,7 +1432,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
array = event->sample.array;

if (type & PERF_SAMPLE_IP) {
- event->ip.ip = sample->ip;
+ *array = sample->ip;
array++;
}

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 1d9ae82..07642a7 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -840,7 +840,8 @@ static void dump_sample(struct perf_evsel *evsel, union perf_event *event,

static struct machine *
perf_session__find_machine_for_cpumode(struct perf_session *session,
- union perf_event *event)
+ union perf_event *event,
+ struct perf_sample *sample)
{
const u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;

@@ -852,7 +853,7 @@ static struct machine *
if (event->header.type == PERF_RECORD_MMAP)
pid = event->mmap.pid;
else
- pid = event->ip.pid;
+ pid = sample->pid;

return perf_session__findnew_machine(session, pid);
}
@@ -958,7 +959,8 @@ static int perf_session_deliver_event(struct perf_session *session,
hists__inc_nr_events(&evsel->hists, event->header.type);
}

- machine = perf_session__find_machine_for_cpumode(session, event);
+ machine = perf_session__find_machine_for_cpumode(session, event,
+ sample);

switch (event->header.type) {
case PERF_RECORD_SAMPLE:
--
1.7.11.7

2013-08-14 12:43:01

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V11 03/15] perf tools: allow non-matching sample types

Sample types need not be identical to determine
the sample id from the event. Only the position
of the sample id needs to be the same.

Compatible sample types are ones in which the bits
defined by PERF_COMPAT_MASK are the same.
'perf_evlist__config()' forces sample types to be
compatible on that basis.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-report.c | 2 +-
tools/perf/util/event.h | 14 +++++
tools/perf/util/evlist.c | 136 ++++++++++++++++++++++++++++++++++++++++++--
tools/perf/util/evlist.h | 8 ++-
tools/perf/util/evsel.c | 64 ++++++++++++++++++++-
tools/perf/util/evsel.h | 10 ++++
tools/perf/util/session.c | 4 +-
7 files changed, 228 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 958a56a..9725aa3 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -365,7 +365,7 @@ static int process_read_event(struct perf_tool *tool,
static int perf_report__setup_sample_type(struct perf_report *rep)
{
struct perf_session *self = rep->session;
- u64 sample_type = perf_evlist__sample_type(self->evlist);
+ u64 sample_type = perf_evlist__combined_sample_type(self->evlist);

if (!self->fd_pipe && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
if (sort__has_parent) {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 15db071..f6c45fd 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -65,6 +65,20 @@ struct read_event {
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)

+/*
+ * Events have compatible sample types if the following bits all have the same
+ * value. This is because the order of sample members is fixed. For sample
+ * events the order is: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME,
+ * PERF_SAMPLE_ADDR, PERF_SAMPLE_ID. For non-sample events the sample members
+ * are accessed in reverse order. The order is: PERF_SAMPLE_ID,
+ * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU.
+ */
+#define PERF_COMPAT_MASK \
+ (PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
+ PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
+ PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
+ PERF_SAMPLE_CPU)
+
struct sample_event {
struct perf_event_header header;
u64 array[];
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 1f5105a..2e5c0b7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -49,6 +49,46 @@ struct perf_evlist *perf_evlist__new(void)
return evlist;
}

+/**
+ * perf_evlist__set_id_pos - set the positions of event ids.
+ * @evlist: selected event list
+ *
+ * Events with compatible sample types all have the same id_pos
+ * and is_pos. For convenience, put a copy on evlist.
+ */
+static void perf_evlist__set_id_pos(struct perf_evlist *evlist)
+{
+ struct perf_evsel *first = perf_evlist__first(evlist);
+
+ evlist->id_pos = first->id_pos;
+ evlist->is_pos = first->is_pos;
+}
+
+/**
+ * perf_evlist__make_sample_types_compatible - make sample types compatible.
+ * @evlist: selected event list
+ *
+ * Events with compatible sample types all have the same id_pos and is_pos.
+ * This can be achieved by matching the bits of PERF_COMPAT_MASK.
+ */
+void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel;
+ u64 compat = 0;
+
+ list_for_each_entry(evsel, &evlist->entries, node)
+ compat |= evsel->attr.sample_type & PERF_COMPAT_MASK;
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ evsel->attr.sample_type |= compat;
+ evsel->sample_size =
+ __perf_evsel__sample_size(evsel->attr.sample_type);
+ perf_evsel__calc_id_pos(evsel);
+ }
+
+ perf_evlist__set_id_pos(evlist);
+}
+
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts)
{
@@ -69,6 +109,8 @@ void perf_evlist__config(struct perf_evlist *evlist,
if (evlist->nr_entries > 1)
perf_evsel__set_sample_id(evsel);
}
+
+ perf_evlist__make_sample_types_compatible(evlist);
}

static void perf_evlist__purge(struct perf_evlist *evlist)
@@ -102,6 +144,7 @@ void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
{
list_add_tail(&entry->node, &evlist->entries);
++evlist->nr_entries;
+ perf_evlist__set_id_pos(evlist);
}

void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
@@ -110,6 +153,7 @@ void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
{
list_splice_tail(list, &evlist->entries);
evlist->nr_entries += nr_entries;
+ perf_evlist__set_id_pos(evlist);
}

void __perf_evlist__set_leader(struct list_head *list)
@@ -371,6 +415,55 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)
return NULL;
}

+static int perf_evlist__event2id(struct perf_evlist *evlist,
+ union perf_event *event, u64 *id)
+{
+ const u64 *array = event->sample.array;
+ ssize_t n;
+
+ n = (event->header.size - sizeof(event->header)) >> 3;
+
+ if (event->header.type == PERF_RECORD_SAMPLE) {
+ if (evlist->id_pos >= n)
+ return -1;
+ *id = array[evlist->id_pos];
+ } else {
+ if (evlist->is_pos >= n)
+ return -1;
+ n -= evlist->is_pos;
+ *id = array[n];
+ }
+ return 0;
+}
+
+static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
+ union perf_event *event)
+{
+ struct hlist_head *head;
+ struct perf_sample_id *sid;
+ int hash;
+ u64 id;
+
+ if (evlist->nr_entries == 1 || evlist->matching_sample_types)
+ return perf_evlist__first(evlist);
+
+ if (perf_evlist__event2id(evlist, event, &id))
+ return NULL;
+
+ /* Synthesized events have an id of zero */
+ if (!id)
+ return perf_evlist__first(evlist);
+
+ hash = hash_64(id, PERF_EVLIST__HLIST_BITS);
+ head = &evlist->heads[hash];
+
+ hlist_for_each_entry(sid, head, node) {
+ if (sid->id == id)
+ return sid->evsel;
+ }
+ return NULL;
+}
+
union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
{
struct perf_mmap *md = &evlist->mmap[idx];
@@ -682,19 +775,49 @@ int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter)
bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
{
struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
+ bool ok = true;

list_for_each_entry_continue(pos, &evlist->entries, node) {
- if (first->attr.sample_type != pos->attr.sample_type)
+ if (first->attr.sample_type != pos->attr.sample_type) {
+ ok = false;
+ break;
+ }
+ }
+
+ if (ok) {
+ evlist->matching_sample_types = true;
+ return true;
+ }
+
+ if (evlist->id_pos < 0 || evlist->is_pos < 0)
+ return false;
+
+ list_for_each_entry(pos, &evlist->entries, node) {
+ if (pos->id_pos != evlist->id_pos ||
+ pos->is_pos != evlist->is_pos)
return false;
}

return true;
}

-u64 perf_evlist__sample_type(struct perf_evlist *evlist)
+u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist)
{
- struct perf_evsel *first = perf_evlist__first(evlist);
- return first->attr.sample_type;
+ struct perf_evsel *evsel;
+
+ if (evlist->combined_sample_type)
+ return evlist->combined_sample_type;
+
+ list_for_each_entry(evsel, &evlist->entries, node)
+ evlist->combined_sample_type |= evsel->attr.sample_type;
+
+ return evlist->combined_sample_type;
+}
+
+u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist)
+{
+ evlist->combined_sample_type = 0;
+ return __perf_evlist__combined_sample_type(evlist);
}

bool perf_evlist__valid_read_format(struct perf_evlist *evlist)
@@ -907,7 +1030,10 @@ int perf_evlist__start_workload(struct perf_evlist *evlist)
int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
struct perf_sample *sample)
{
- struct perf_evsel *evsel = perf_evlist__first(evlist);
+ struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
+
+ if (!evsel)
+ return -EFAULT;
return perf_evsel__parse_sample(evsel, event, sample);
}

diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 327abab..1019758 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -32,11 +32,15 @@ struct perf_evlist {
int nr_fds;
int nr_mmaps;
int mmap_len;
+ int id_pos;
+ int is_pos;
+ u64 combined_sample_type;
struct {
int cork_fd;
pid_t pid;
} workload;
bool overwrite;
+ bool matching_sample_types;
struct perf_mmap *mmap;
struct pollfd *pollfd;
struct thread_map *threads;
@@ -85,6 +89,7 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *self, int idx);
int perf_evlist__open(struct perf_evlist *evlist);
void perf_evlist__close(struct perf_evlist *evlist);

+void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist);
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts);

@@ -121,7 +126,8 @@ void __perf_evlist__set_leader(struct list_head *list);
void perf_evlist__set_leader(struct perf_evlist *evlist);

u64 perf_evlist__read_format(struct perf_evlist *evlist);
-u64 perf_evlist__sample_type(struct perf_evlist *evlist);
+u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist);
+u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist);
bool perf_evlist__sample_id_all(struct perf_evlist *evlist);
u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ad5f701..565ab55 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -30,7 +30,7 @@ static struct {

#define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))

-static int __perf_evsel__sample_size(u64 sample_type)
+int __perf_evsel__sample_size(u64 sample_type)
{
u64 mask = sample_type & PERF_SAMPLE_MASK;
int size = 0;
@@ -46,6 +46,65 @@ static int __perf_evsel__sample_size(u64 sample_type)
return size;
}

+/**
+ * __perf_evsel__calc_id_pos - calculate id_pos.
+ * @sample_type: sample type
+ *
+ * This function returns the position of the event id (PERF_SAMPLE_ID) in a
+ * sample event i.e. in the array of struct sample_event.
+ */
+static int __perf_evsel__calc_id_pos(u64 sample_type)
+{
+ int idx = 0;
+
+ if (!(sample_type & PERF_SAMPLE_ID))
+ return -1;
+
+ if (sample_type & PERF_SAMPLE_IP)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_TID)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_TIME)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_ADDR)
+ idx += 1;
+
+ return idx;
+}
+
+/**
+ * __perf_evsel__calc_is_pos - calculate is_pos.
+ * @sample_type: sample type
+ *
+ * This function returns the position (counting backwards) of the event id
+ * (PERF_SAMPLE_ID) in a non-sample event i.e. if sample_id_all is used there is
+ * an id sample appended to non-sample events.
+ */
+static int __perf_evsel__calc_is_pos(u64 sample_type)
+{
+ int idx = 1;
+
+ if (!(sample_type & PERF_SAMPLE_ID))
+ return -1;
+
+ if (sample_type & PERF_SAMPLE_CPU)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_STREAM_ID)
+ idx += 1;
+
+ return idx;
+}
+
+void perf_evsel__calc_id_pos(struct perf_evsel *evsel)
+{
+ evsel->id_pos = __perf_evsel__calc_id_pos(evsel->attr.sample_type);
+ evsel->is_pos = __perf_evsel__calc_is_pos(evsel->attr.sample_type);
+}
+
void hists__init(struct hists *hists)
{
memset(hists, 0, sizeof(*hists));
@@ -62,6 +121,7 @@ void __perf_evsel__set_sample_bit(struct perf_evsel *evsel,
if (!(evsel->attr.sample_type & bit)) {
evsel->attr.sample_type |= bit;
evsel->sample_size += sizeof(u64);
+ perf_evsel__calc_id_pos(evsel);
}
}

@@ -71,6 +131,7 @@ void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
if (evsel->attr.sample_type & bit) {
evsel->attr.sample_type &= ~bit;
evsel->sample_size -= sizeof(u64);
+ perf_evsel__calc_id_pos(evsel);
}
}

@@ -89,6 +150,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
INIT_LIST_HEAD(&evsel->node);
hists__init(&evsel->hists);
evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
+ perf_evsel__calc_id_pos(evsel);
}

struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr, int idx)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 532a5f9..3056f4f 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -48,6 +48,11 @@ struct perf_sample_id {
* @name - Can be set to retain the original event name passed by the user,
* so that when showing results in tools such as 'perf stat', we
* show the name used, not some alias.
+ * @id_pos: the position of the event id (PERF_SAMPLE_ID) in a sample event
+ * i.e. in the array of struct sample_event
+ * @is_pos: the position (counting backwards) of the event id (PERF_SAMPLE_ID)
+ * in a non-sample event i.e. if sample_id_all is used there is an id
+ * sample appended to non-sample events
*/
struct perf_evsel {
struct list_head node;
@@ -74,6 +79,8 @@ struct perf_evsel {
} handler;
struct cpu_map *cpus;
unsigned int sample_size;
+ int id_pos;
+ int is_pos;
bool supported;
bool needs_swap;
/* parse modifier helper */
@@ -104,6 +111,9 @@ void perf_evsel__delete(struct perf_evsel *evsel);
void perf_evsel__config(struct perf_evsel *evsel,
struct perf_record_opts *opts);

+int __perf_evsel__sample_size(u64 sample_type);
+void perf_evsel__calc_id_pos(struct perf_evsel *evsel);
+
bool perf_evsel__is_cache_op_valid(u8 type, u8 op);

#define PERF_EVSEL__MAX_ALIASES 8
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index de16a77..0e7ae17 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -739,7 +739,7 @@ static void perf_session__print_tstamp(struct perf_session *session,
union perf_event *event,
struct perf_sample *sample)
{
- u64 sample_type = perf_evlist__sample_type(session->evlist);
+ u64 sample_type = __perf_evlist__combined_sample_type(session->evlist);

if (event->header.type != PERF_RECORD_SAMPLE &&
!perf_evlist__sample_id_all(session->evlist)) {
@@ -1001,7 +1001,7 @@ static int perf_session__preprocess_sample(struct perf_session *session,
union perf_event *event, struct perf_sample *sample)
{
if (event->header.type != PERF_RECORD_SAMPLE ||
- !(perf_evlist__sample_type(session->evlist) & PERF_SAMPLE_CALLCHAIN))
+ !sample->callchain)
return 0;

if (!ip_callchain__valid(sample->callchain, event)) {
--
1.7.11.7

2013-08-14 12:54:38

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V11 09/15] perf: make events stream always parsable

On 14/08/13 15:48, Adrian Hunter wrote:
> The event stream is not always parsable because the format of a sample
> is dependent on the sample_type of the selected event. When there
> is more than one selected event and the sample_types are not the
> same then parsing becomes problematic. A sample can be matched to its
> selected event using the ID that is allocated when the event is opened.
> Unfortunately, to get the ID from the sample means first parsing it.
>
> This patch adds a new sample format bit PERF_SAMPLE_IDENTIFER that puts
> the ID at a fixed position so that the ID can be retrieved without
> parsing the sample. For sample events, that is the first position
> immediately after the header. For non-sample events, that is the last
> position.
>
> In this respect parsing samples requires that the sample_type and ID
> values are recorded. For example, perf tools records struct perf_event_attr
> and the IDs within the perf.data file. Those must be read first
> before it is possible to parse samples found later in the perf.data file.
>
> Signed-off-by: Adrian Hunter <[email protected]>

Forgot to add Peter's ack which is here:

http://marc.info/?l=linux-kernel&m=137629757013526&w=2

2013-08-14 16:37:54

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH V11 12/15] perf tools: add missing 'abi' member to 'struct regs_dump'

On Wed, Aug 14, 2013 at 03:48:34PM +0300, Adrian Hunter wrote:
> And store the parsed value there. Note that the 'abi' is
> 0 (no registers), 1 (32-bit registers) or 2 (64-bit registers),
> but the registers are anyway copied one-by-one as 64-bit
> values onto the event i.e. see 'perf_output_sample_regs()'
>
> Signed-off-by: Adrian Hunter <[email protected]>

Acked-by: Jiri Olsa <[email protected]>

2013-08-14 16:40:17

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH V11 13/15] perf tools: expand perf_event__synthesize_sample()

On Wed, Aug 14, 2013 at 03:48:35PM +0300, Adrian Hunter wrote:

SNIP

> index 81c170f..4c97f36 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1389,7 +1389,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
> }
>
> if (type & PERF_SAMPLE_REGS_USER) {
> - /* First u64 tells us if we have any regs in sample. */

;-)

> OVERFLOW_CHECK_u64(array);
> data->user_regs.abi = *array;

SNIP

> +
> + if (type & PERF_SAMPLE_REGS_USER) {
> + if (sample->user_regs.abi) {
> + *array++ = sample->user_regs.abi;
> + sz = hweight_long(sample_regs_user) * sizeof(u64);
> + memcpy(array, sample->user_regs.regs, sz);
> + array = (void *)array + sz;
> + } else {
> + *array++ = 0;
> + }
> + }

Acked-by: Jiri Olsa <[email protected]>

2013-08-16 18:41:12

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH V11 03/15] perf tools: allow non-matching sample types

Em Wed, Aug 14, 2013 at 03:48:25PM +0300, Adrian Hunter escreveu:
> Sample types need not be identical to determine
> the sample id from the event. Only the position
> of the sample id needs to be the same.
>
> Compatible sample types are ones in which the bits
> defined by PERF_COMPAT_MASK are the same.
> 'perf_evlist__config()' forces sample types to be
> compatible on that basis.
>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/builtin-report.c | 2 +-
> tools/perf/util/event.h | 14 +++++
> tools/perf/util/evlist.c | 136 ++++++++++++++++++++++++++++++++++++++++++--
> tools/perf/util/evlist.h | 8 ++-
> tools/perf/util/evsel.c | 64 ++++++++++++++++++++-
> tools/perf/util/evsel.h | 10 ++++
> tools/perf/util/session.c | 4 +-
> 7 files changed, 228 insertions(+), 10 deletions(-)
>
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 958a56a..9725aa3 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -365,7 +365,7 @@ static int process_read_event(struct perf_tool *tool,
> static int perf_report__setup_sample_type(struct perf_report *rep)
> {
> struct perf_session *self = rep->session;
> - u64 sample_type = perf_evlist__sample_type(self->evlist);
> + u64 sample_type = perf_evlist__combined_sample_type(self->evlist);
>
> if (!self->fd_pipe && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
> if (sort__has_parent) {
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index 15db071..f6c45fd 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -65,6 +65,20 @@ struct read_event {
> PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
> PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
>
> +/*
> + * Events have compatible sample types if the following bits all have the same
> + * value. This is because the order of sample members is fixed. For sample
> + * events the order is: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME,
> + * PERF_SAMPLE_ADDR, PERF_SAMPLE_ID. For non-sample events the sample members
> + * are accessed in reverse order. The order is: PERF_SAMPLE_ID,
> + * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU.
> + */
> +#define PERF_COMPAT_MASK \
> + (PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
> + PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
> + PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
> + PERF_SAMPLE_CPU)
> +
> struct sample_event {
> struct perf_event_header header;
> u64 array[];
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 1f5105a..2e5c0b7 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -49,6 +49,46 @@ struct perf_evlist *perf_evlist__new(void)
> return evlist;
> }
>
> +/**
> + * perf_evlist__set_id_pos - set the positions of event ids.
> + * @evlist: selected event list
> + *
> + * Events with compatible sample types all have the same id_pos
> + * and is_pos. For convenience, put a copy on evlist.
> + */
> +static void perf_evlist__set_id_pos(struct perf_evlist *evlist)
> +{
> + struct perf_evsel *first = perf_evlist__first(evlist);
> +
> + evlist->id_pos = first->id_pos;
> + evlist->is_pos = first->is_pos;
> +}
> +
> +/**
> + * perf_evlist__make_sample_types_compatible - make sample types compatible.
> + * @evlist: selected event list
> + *
> + * Events with compatible sample types all have the same id_pos and is_pos.
> + * This can be achieved by matching the bits of PERF_COMPAT_MASK.
> + */
> +void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist)
> +{
> + struct perf_evsel *evsel;
> + u64 compat = 0;
> +
> + list_for_each_entry(evsel, &evlist->entries, node)
> + compat |= evsel->attr.sample_type & PERF_COMPAT_MASK;
> +
> + list_for_each_entry(evsel, &evlist->entries, node) {
> + evsel->attr.sample_type |= compat;
> + evsel->sample_size =
> + __perf_evsel__sample_size(evsel->attr.sample_type);
> + perf_evsel__calc_id_pos(evsel);
> + }
> +
> + perf_evlist__set_id_pos(evlist);
> +}
> +
> void perf_evlist__config(struct perf_evlist *evlist,
> struct perf_record_opts *opts)
> {
> @@ -69,6 +109,8 @@ void perf_evlist__config(struct perf_evlist *evlist,
> if (evlist->nr_entries > 1)
> perf_evsel__set_sample_id(evsel);
> }
> +
> + perf_evlist__make_sample_types_compatible(evlist);
> }
>
> static void perf_evlist__purge(struct perf_evlist *evlist)
> @@ -102,6 +144,7 @@ void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
> {
> list_add_tail(&entry->node, &evlist->entries);
> ++evlist->nr_entries;
> + perf_evlist__set_id_pos(evlist);

So we repeatedly call this, that will set it to the same element (we add
to the tail)), its not a problem, but wouldn't it be clearer as:

if (!evlist->nr_entries++)
perf_evlist__set_id_pos(evlist);

> }
>
> void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
> @@ -110,6 +153,7 @@ void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
> {
> list_splice_tail(list, &evlist->entries);
> evlist->nr_entries += nr_entries;
> + perf_evlist__set_id_pos(evlist);

Ditto.

> }
>
> void __perf_evlist__set_leader(struct list_head *list)
> @@ -371,6 +415,55 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)
> return NULL;
> }
>
> +static int perf_evlist__event2id(struct perf_evlist *evlist,
> + union perf_event *event, u64 *id)
> +{
> + const u64 *array = event->sample.array;
> + ssize_t n;
> +
> + n = (event->header.size - sizeof(event->header)) >> 3;
> +
> + if (event->header.type == PERF_RECORD_SAMPLE) {
> + if (evlist->id_pos >= n)
> + return -1;
> + *id = array[evlist->id_pos];
> + } else {
> + if (evlist->is_pos >= n)
> + return -1;
> + n -= evlist->is_pos;
> + *id = array[n];
> + }
> + return 0;
> +}
> +
> +static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
> + union perf_event *event)
> +{
> + struct hlist_head *head;
> + struct perf_sample_id *sid;
> + int hash;
> + u64 id;
> +
> + if (evlist->nr_entries == 1 || evlist->matching_sample_types)
> + return perf_evlist__first(evlist);

So this doesn't really maps to the evsel with PERF_SAMPLE_ID, but to a
evsel that has a sample_type that is the same as whatever evsel maps to
what is in PERF_SAMPLE_ID, right?

I think we should use a better name for this function, lets see its
usage...

> + if (perf_evlist__event2id(evlist, event, &id))
> + return NULL;
> +
> + /* Synthesized events have an id of zero */
> + if (!id)
> + return perf_evlist__first(evlist);
> +
> + hash = hash_64(id, PERF_EVLIST__HLIST_BITS);
> + head = &evlist->heads[hash];
> +
> + hlist_for_each_entry(sid, head, node) {
> + if (sid->id == id)
> + return sid->evsel;
> + }
> + return NULL;
> +}
> +
> union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
> {
> struct perf_mmap *md = &evlist->mmap[idx];
> @@ -682,19 +775,49 @@ int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter)
> bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
> {
> struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
> + bool ok = true;
>
> list_for_each_entry_continue(pos, &evlist->entries, node) {
> - if (first->attr.sample_type != pos->attr.sample_type)
> + if (first->attr.sample_type != pos->attr.sample_type) {
> + ok = false;
> + break;
> + }
> + }
> +
> + if (ok) {
> + evlist->matching_sample_types = true;
> + return true;
> + }
> +

What about:

evlist->matching_sample_types = true;

list_for_each_entry_continue(pos, &evlist->entries, node) {
if (first->attr.sample_type != pos->attr.sample_type) {
evlist->matching_sample_types = false;
break;
}
}

if (evlist->matching_sample_types)
return true;

> + if (evlist->id_pos < 0 || evlist->is_pos < 0)
> + return false;

Where do we set this?

If this is the case why don't we test it as the first step in this
perf_evlist__valid_sample_type() function?

Humm, probably if matching_sample_types is true the values of these
variables are not used at all?

> + list_for_each_entry(pos, &evlist->entries, node) {
> + if (pos->id_pos != evlist->id_pos ||
> + pos->is_pos != evlist->is_pos)
> return false;
> }
>
> return true;
> }
>
> -u64 perf_evlist__sample_type(struct perf_evlist *evlist)
> +u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist)
> {
> - struct perf_evsel *first = perf_evlist__first(evlist);
> - return first->attr.sample_type;
> + struct perf_evsel *evsel;
> +
> + if (evlist->combined_sample_type)
> + return evlist->combined_sample_type;
> +
> + list_for_each_entry(evsel, &evlist->entries, node)
> + evlist->combined_sample_type |= evsel->attr.sample_type;
> +
> + return evlist->combined_sample_type;
> +}
> +
> +u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist)
> +{
> + evlist->combined_sample_type = 0;
> + return __perf_evlist__combined_sample_type(evlist);
> }
>
> bool perf_evlist__valid_read_format(struct perf_evlist *evlist)
> @@ -907,7 +1030,10 @@ int perf_evlist__start_workload(struct perf_evlist *evlist)
> int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
> struct perf_sample *sample)
> {
> - struct perf_evsel *evsel = perf_evlist__first(evlist);
> + struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
> +
> + if (!evsel)
> + return -EFAULT;

Ok, here, for an evlist with three events with the same sample_type
we'll always get the first event, so:

Can't we have the same sample_type but different sample_regs_user, thus
different sample_size and then this:

if (type & PERF_SAMPLE_REGS_USER) {
/* First u64 tells us if we have any regs in sample. */
u64 avail = *array++;

if (avail) {
data->user_regs.regs = (u64 *)array;
array += hweight_long(regs_user);
}
}

could break if the first event asked for more registers to be dumped per
sample?

I.e. that optimization to return the first entry needs to look at all
the evsels sample_regs_user?

> return perf_evsel__parse_sample(evsel, event, sample);
> }
>
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 327abab..1019758 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -32,11 +32,15 @@ struct perf_evlist {
> int nr_fds;
> int nr_mmaps;
> int mmap_len;
> + int id_pos;
> + int is_pos;
> + u64 combined_sample_type;
> struct {
> int cork_fd;
> pid_t pid;
> } workload;
> bool overwrite;
> + bool matching_sample_types;
> struct perf_mmap *mmap;
> struct pollfd *pollfd;
> struct thread_map *threads;
> @@ -85,6 +89,7 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *self, int idx);
> int perf_evlist__open(struct perf_evlist *evlist);
> void perf_evlist__close(struct perf_evlist *evlist);
>
> +void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist);
> void perf_evlist__config(struct perf_evlist *evlist,
> struct perf_record_opts *opts);
>
> @@ -121,7 +126,8 @@ void __perf_evlist__set_leader(struct list_head *list);
> void perf_evlist__set_leader(struct perf_evlist *evlist);
>
> u64 perf_evlist__read_format(struct perf_evlist *evlist);
> -u64 perf_evlist__sample_type(struct perf_evlist *evlist);
> +u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist);
> +u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist);
> bool perf_evlist__sample_id_all(struct perf_evlist *evlist);
> u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index ad5f701..565ab55 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -30,7 +30,7 @@ static struct {
>
> #define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
>
> -static int __perf_evsel__sample_size(u64 sample_type)
> +int __perf_evsel__sample_size(u64 sample_type)
> {
> u64 mask = sample_type & PERF_SAMPLE_MASK;
> int size = 0;
> @@ -46,6 +46,65 @@ static int __perf_evsel__sample_size(u64 sample_type)
> return size;
> }
>
> +/**
> + * __perf_evsel__calc_id_pos - calculate id_pos.
> + * @sample_type: sample type
> + *
> + * This function returns the position of the event id (PERF_SAMPLE_ID) in a
> + * sample event i.e. in the array of struct sample_event.
> + */
> +static int __perf_evsel__calc_id_pos(u64 sample_type)
> +{
> + int idx = 0;
> +
> + if (!(sample_type & PERF_SAMPLE_ID))
> + return -1;
> +
> + if (sample_type & PERF_SAMPLE_IP)
> + idx += 1;
> +
> + if (sample_type & PERF_SAMPLE_TID)
> + idx += 1;
> +
> + if (sample_type & PERF_SAMPLE_TIME)
> + idx += 1;
> +
> + if (sample_type & PERF_SAMPLE_ADDR)
> + idx += 1;
> +
> + return idx;
> +}
> +
> +/**
> + * __perf_evsel__calc_is_pos - calculate is_pos.
> + * @sample_type: sample type
> + *
> + * This function returns the position (counting backwards) of the event id
> + * (PERF_SAMPLE_ID) in a non-sample event i.e. if sample_id_all is used there is
> + * an id sample appended to non-sample events.
> + */
> +static int __perf_evsel__calc_is_pos(u64 sample_type)
> +{
> + int idx = 1;
> +
> + if (!(sample_type & PERF_SAMPLE_ID))
> + return -1;
> +
> + if (sample_type & PERF_SAMPLE_CPU)
> + idx += 1;
> +
> + if (sample_type & PERF_SAMPLE_STREAM_ID)
> + idx += 1;
> +
> + return idx;
> +}
> +
> +void perf_evsel__calc_id_pos(struct perf_evsel *evsel)
> +{
> + evsel->id_pos = __perf_evsel__calc_id_pos(evsel->attr.sample_type);
> + evsel->is_pos = __perf_evsel__calc_is_pos(evsel->attr.sample_type);
> +}
> +
> void hists__init(struct hists *hists)
> {
> memset(hists, 0, sizeof(*hists));
> @@ -62,6 +121,7 @@ void __perf_evsel__set_sample_bit(struct perf_evsel *evsel,
> if (!(evsel->attr.sample_type & bit)) {
> evsel->attr.sample_type |= bit;
> evsel->sample_size += sizeof(u64);
> + perf_evsel__calc_id_pos(evsel);
> }
> }
>
> @@ -71,6 +131,7 @@ void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
> if (evsel->attr.sample_type & bit) {
> evsel->attr.sample_type &= ~bit;
> evsel->sample_size -= sizeof(u64);
> + perf_evsel__calc_id_pos(evsel);
> }
> }
>
> @@ -89,6 +150,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
> INIT_LIST_HEAD(&evsel->node);
> hists__init(&evsel->hists);
> evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
> + perf_evsel__calc_id_pos(evsel);
> }
>
> struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr, int idx)
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 532a5f9..3056f4f 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -48,6 +48,11 @@ struct perf_sample_id {
> * @name - Can be set to retain the original event name passed by the user,
> * so that when showing results in tools such as 'perf stat', we
> * show the name used, not some alias.
> + * @id_pos: the position of the event id (PERF_SAMPLE_ID) in a sample event
> + * i.e. in the array of struct sample_event
> + * @is_pos: the position (counting backwards) of the event id (PERF_SAMPLE_ID)
> + * in a non-sample event i.e. if sample_id_all is used there is an id
> + * sample appended to non-sample events
> */
> struct perf_evsel {
> struct list_head node;
> @@ -74,6 +79,8 @@ struct perf_evsel {
> } handler;
> struct cpu_map *cpus;
> unsigned int sample_size;
> + int id_pos;
> + int is_pos;
> bool supported;
> bool needs_swap;
> /* parse modifier helper */
> @@ -104,6 +111,9 @@ void perf_evsel__delete(struct perf_evsel *evsel);
> void perf_evsel__config(struct perf_evsel *evsel,
> struct perf_record_opts *opts);
>
> +int __perf_evsel__sample_size(u64 sample_type);
> +void perf_evsel__calc_id_pos(struct perf_evsel *evsel);
> +
> bool perf_evsel__is_cache_op_valid(u8 type, u8 op);
>
> #define PERF_EVSEL__MAX_ALIASES 8
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index de16a77..0e7ae17 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -739,7 +739,7 @@ static void perf_session__print_tstamp(struct perf_session *session,
> union perf_event *event,
> struct perf_sample *sample)
> {
> - u64 sample_type = perf_evlist__sample_type(session->evlist);
> + u64 sample_type = __perf_evlist__combined_sample_type(session->evlist);
>
> if (event->header.type != PERF_RECORD_SAMPLE &&
> !perf_evlist__sample_id_all(session->evlist)) {
> @@ -1001,7 +1001,7 @@ static int perf_session__preprocess_sample(struct perf_session *session,
> union perf_event *event, struct perf_sample *sample)
> {
> if (event->header.type != PERF_RECORD_SAMPLE ||
> - !(perf_evlist__sample_type(session->evlist) & PERF_SAMPLE_CALLCHAIN))
> + !sample->callchain)
> return 0;
>
> if (!ip_callchain__valid(sample->callchain, event)) {
> --
> 1.7.11.7

2013-08-18 19:04:25

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V11 03/15] perf tools: allow non-matching sample types

On 16/08/2013 9:41 p.m., Arnaldo Carvalho de Melo wrote:
> Em Wed, Aug 14, 2013 at 03:48:25PM +0300, Adrian Hunter escreveu:
>> Sample types need not be identical to determine
>> the sample id from the event. Only the position
>> of the sample id needs to be the same.
>>
>> Compatible sample types are ones in which the bits
>> defined by PERF_COMPAT_MASK are the same.
>> 'perf_evlist__config()' forces sample types to be
>> compatible on that basis.
>>
>> Signed-off-by: Adrian Hunter <[email protected]>
>> ---
>> tools/perf/builtin-report.c | 2 +-
>> tools/perf/util/event.h | 14 +++++
>> tools/perf/util/evlist.c | 136 ++++++++++++++++++++++++++++++++++++++++++--
>> tools/perf/util/evlist.h | 8 ++-
>> tools/perf/util/evsel.c | 64 ++++++++++++++++++++-
>> tools/perf/util/evsel.h | 10 ++++
>> tools/perf/util/session.c | 4 +-
>> 7 files changed, 228 insertions(+), 10 deletions(-)
>>
>> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
>> index 958a56a..9725aa3 100644
>> --- a/tools/perf/builtin-report.c
>> +++ b/tools/perf/builtin-report.c
>> @@ -365,7 +365,7 @@ static int process_read_event(struct perf_tool *tool,
>> static int perf_report__setup_sample_type(struct perf_report *rep)
>> {
>> struct perf_session *self = rep->session;
>> - u64 sample_type = perf_evlist__sample_type(self->evlist);
>> + u64 sample_type = perf_evlist__combined_sample_type(self->evlist);
>>
>> if (!self->fd_pipe && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
>> if (sort__has_parent) {
>> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
>> index 15db071..f6c45fd 100644
>> --- a/tools/perf/util/event.h
>> +++ b/tools/perf/util/event.h
>> @@ -65,6 +65,20 @@ struct read_event {
>> PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
>> PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
>>
>> +/*
>> + * Events have compatible sample types if the following bits all have the same
>> + * value. This is because the order of sample members is fixed. For sample
>> + * events the order is: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME,
>> + * PERF_SAMPLE_ADDR, PERF_SAMPLE_ID. For non-sample events the sample members
>> + * are accessed in reverse order. The order is: PERF_SAMPLE_ID,
>> + * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU.
>> + */
>> +#define PERF_COMPAT_MASK \
>> + (PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
>> + PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
>> + PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
>> + PERF_SAMPLE_CPU)
>> +
>> struct sample_event {
>> struct perf_event_header header;
>> u64 array[];
>> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
>> index 1f5105a..2e5c0b7 100644
>> --- a/tools/perf/util/evlist.c
>> +++ b/tools/perf/util/evlist.c
>> @@ -49,6 +49,46 @@ struct perf_evlist *perf_evlist__new(void)
>> return evlist;
>> }
>>
>> +/**
>> + * perf_evlist__set_id_pos - set the positions of event ids.
>> + * @evlist: selected event list
>> + *
>> + * Events with compatible sample types all have the same id_pos
>> + * and is_pos. For convenience, put a copy on evlist.
>> + */
>> +static void perf_evlist__set_id_pos(struct perf_evlist *evlist)
>> +{
>> + struct perf_evsel *first = perf_evlist__first(evlist);
>> +
>> + evlist->id_pos = first->id_pos;
>> + evlist->is_pos = first->is_pos;
>> +}
>> +
>> +/**
>> + * perf_evlist__make_sample_types_compatible - make sample types compatible.
>> + * @evlist: selected event list
>> + *
>> + * Events with compatible sample types all have the same id_pos and is_pos.
>> + * This can be achieved by matching the bits of PERF_COMPAT_MASK.
>> + */
>> +void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist)
>> +{
>> + struct perf_evsel *evsel;
>> + u64 compat = 0;
>> +
>> + list_for_each_entry(evsel, &evlist->entries, node)
>> + compat |= evsel->attr.sample_type & PERF_COMPAT_MASK;
>> +
>> + list_for_each_entry(evsel, &evlist->entries, node) {
>> + evsel->attr.sample_type |= compat;
>> + evsel->sample_size =
>> + __perf_evsel__sample_size(evsel->attr.sample_type);
>> + perf_evsel__calc_id_pos(evsel);
>> + }
>> +
>> + perf_evlist__set_id_pos(evlist);
>> +}
>> +
>> void perf_evlist__config(struct perf_evlist *evlist,
>> struct perf_record_opts *opts)
>> {
>> @@ -69,6 +109,8 @@ void perf_evlist__config(struct perf_evlist *evlist,
>> if (evlist->nr_entries > 1)
>> perf_evsel__set_sample_id(evsel);
>> }
>> +
>> + perf_evlist__make_sample_types_compatible(evlist);
>> }
>>
>> static void perf_evlist__purge(struct perf_evlist *evlist)
>> @@ -102,6 +144,7 @@ void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
>> {
>> list_add_tail(&entry->node, &evlist->entries);
>> ++evlist->nr_entries;
>> + perf_evlist__set_id_pos(evlist);
>
> So we repeatedly call this, that will set it to the same element (we add
> to the tail)), its not a problem, but wouldn't it be clearer as:
>
> if (!evlist->nr_entries++)
> perf_evlist__set_id_pos(evlist);

OK

>
>> }
>>
>> void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
>> @@ -110,6 +153,7 @@ void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
>> {
>> list_splice_tail(list, &evlist->entries);
>> evlist->nr_entries += nr_entries;
>> + perf_evlist__set_id_pos(evlist);
>
> Ditto.

OK

>
>> }
>>
>> void __perf_evlist__set_leader(struct list_head *list)
>> @@ -371,6 +415,55 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)
>> return NULL;
>> }
>>
>> +static int perf_evlist__event2id(struct perf_evlist *evlist,
>> + union perf_event *event, u64 *id)
>> +{
>> + const u64 *array = event->sample.array;
>> + ssize_t n;
>> +
>> + n = (event->header.size - sizeof(event->header)) >> 3;
>> +
>> + if (event->header.type == PERF_RECORD_SAMPLE) {
>> + if (evlist->id_pos >= n)
>> + return -1;
>> + *id = array[evlist->id_pos];
>> + } else {
>> + if (evlist->is_pos >= n)
>> + return -1;
>> + n -= evlist->is_pos;
>> + *id = array[n];
>> + }
>> + return 0;
>> +}
>> +
>> +static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
>> + union perf_event *event)
>> +{
>> + struct hlist_head *head;
>> + struct perf_sample_id *sid;
>> + int hash;
>> + u64 id;
>> +
>> + if (evlist->nr_entries == 1 || evlist->matching_sample_types)
>> + return perf_evlist__first(evlist);
>
> So this doesn't really maps to the evsel with PERF_SAMPLE_ID, but to a
> evsel that has a sample_type that is the same as whatever evsel maps to
> what is in PERF_SAMPLE_ID, right?
>
> I think we should use a better name for this function, lets see its
> usage...

I will get rid of evlist->matching_sample_types

>
>> + if (perf_evlist__event2id(evlist, event, &id))
>> + return NULL;
>> +
>> + /* Synthesized events have an id of zero */
>> + if (!id)
>> + return perf_evlist__first(evlist);
>> +
>> + hash = hash_64(id, PERF_EVLIST__HLIST_BITS);
>> + head = &evlist->heads[hash];
>> +
>> + hlist_for_each_entry(sid, head, node) {
>> + if (sid->id == id)
>> + return sid->evsel;
>> + }
>> + return NULL;
>> +}
>> +
>> union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
>> {
>> struct perf_mmap *md = &evlist->mmap[idx];
>> @@ -682,19 +775,49 @@ int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter)
>> bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
>> {
>> struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
>> + bool ok = true;
>>
>> list_for_each_entry_continue(pos, &evlist->entries, node) {
>> - if (first->attr.sample_type != pos->attr.sample_type)
>> + if (first->attr.sample_type != pos->attr.sample_type) {
>> + ok = false;
>> + break;
>> + }
>> + }
>> +
>> + if (ok) {
>> + evlist->matching_sample_types = true;
>> + return true;
>> + }
>> +
>
> What about:
>
> evlist->matching_sample_types = true;
>
> list_for_each_entry_continue(pos, &evlist->entries, node) {
> if (first->attr.sample_type != pos->attr.sample_type) {
> evlist->matching_sample_types = false;
> break;
> }
> }
>
> if (evlist->matching_sample_types)
> return true;
>

I will get rid of evlist->matching_sample_types

>> + if (evlist->id_pos < 0 || evlist->is_pos < 0)
>> + return false;
>
> Where do we set this?

id_pos and is_pos are calculated by __perf_evsel__calc_id_pos() and
__perf_evsel__calc_is_pos(). -1 means there was no PERF_SAMPLE_ID.

>
> If this is the case why don't we test it as the first step in this
> perf_evlist__valid_sample_type() function?
>
> Humm, probably if matching_sample_types is true the values of these
> variables are not used at all?

Yes, but if I drop matching_sample_types it will be the first thing checked.

>
>> + list_for_each_entry(pos, &evlist->entries, node) {
>> + if (pos->id_pos != evlist->id_pos ||
>> + pos->is_pos != evlist->is_pos)
>> return false;
>> }
>>
>> return true;
>> }
>>
>> -u64 perf_evlist__sample_type(struct perf_evlist *evlist)
>> +u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist)
>> {
>> - struct perf_evsel *first = perf_evlist__first(evlist);
>> - return first->attr.sample_type;
>> + struct perf_evsel *evsel;
>> +
>> + if (evlist->combined_sample_type)
>> + return evlist->combined_sample_type;
>> +
>> + list_for_each_entry(evsel, &evlist->entries, node)
>> + evlist->combined_sample_type |= evsel->attr.sample_type;
>> +
>> + return evlist->combined_sample_type;
>> +}
>> +
>> +u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist)
>> +{
>> + evlist->combined_sample_type = 0;
>> + return __perf_evlist__combined_sample_type(evlist);
>> }
>>
>> bool perf_evlist__valid_read_format(struct perf_evlist *evlist)
>> @@ -907,7 +1030,10 @@ int perf_evlist__start_workload(struct perf_evlist *evlist)
>> int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
>> struct perf_sample *sample)
>> {
>> - struct perf_evsel *evsel = perf_evlist__first(evlist);
>> + struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
>> +
>> + if (!evsel)
>> + return -EFAULT;
>
> Ok, here, for an evlist with three events with the same sample_type
> we'll always get the first event, so:
>
> Can't we have the same sample_type but different sample_regs_user, thus
> different sample_size and then this:
>
> if (type & PERF_SAMPLE_REGS_USER) {
> /* First u64 tells us if we have any regs in sample. */
> u64 avail = *array++;
>
> if (avail) {
> data->user_regs.regs = (u64 *)array;
> array += hweight_long(regs_user);
> }
> }
>
> could break if the first event asked for more registers to be dumped per
> sample?
>
> I.e. that optimization to return the first entry needs to look at all
> the evsels sample_regs_user?
>

That is how it works now - the first evsel is used for parsing. But
yes, it is better to remove evlist->matching_sample_types.

2013-08-21 13:39:57

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH V11 09/15] perf: make events stream always parsable

On Wed, Aug 14, 2013 at 3:00 PM, Adrian Hunter <[email protected]> wrote:
> On 14/08/13 15:48, Adrian Hunter wrote:
>> The event stream is not always parsable because the format of a sample
>> is dependent on the sample_type of the selected event. When there
>> is more than one selected event and the sample_types are not the
>> same then parsing becomes problematic. A sample can be matched to its
>> selected event using the ID that is allocated when the event is opened.
>> Unfortunately, to get the ID from the sample means first parsing it.
>>
>> This patch adds a new sample format bit PERF_SAMPLE_IDENTIFER that puts
>> the ID at a fixed position so that the ID can be retrieved without
>> parsing the sample. For sample events, that is the first position
>> immediately after the header. For non-sample events, that is the last
>> position.
>>
>> In this respect parsing samples requires that the sample_type and ID
>> values are recorded. For example, perf tools records struct perf_event_attr
>> and the IDs within the perf.data file. Those must be read first
>> before it is possible to parse samples found later in the perf.data file.
>>
>> Signed-off-by: Adrian Hunter <[email protected]>
>
> Forgot to add Peter's ack which is here:
>
> http://marc.info/?l=linux-kernel&m=137629757013526&w=2

Tested this patch via libpfm4 and per-event branch-stack.
Works well.

Tested-by: Stephane Eranian <[email protected]>

Subject: [tip:perf/core] perf tools: Re-implement debug print function for linking python/perf.so

Commit-ID: 8afb4c018e21c882c8fad196772ef74d494185e2
Gitweb: http://git.kernel.org/tip/8afb4c018e21c882c8fad196772ef74d494185e2
Author: Adrian Hunter <[email protected]>
AuthorDate: Wed, 14 Aug 2013 15:48:23 +0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Fri, 16 Aug 2013 17:17:58 -0300

perf tools: Re-implement debug print function for linking python/perf.so

The python/perf.so python binding links a subset of objects.

Re-implement 'verbose' and 'eprintf' so they (and consequently
'pr_debug') can be used in objects linked into pythin/perf.so.

Note 'eprintf' must be re-implemented because the full version links the
browser ui.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/python.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 925e0c3..381f4fd 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -8,6 +8,26 @@
#include "cpumap.h"
#include "thread_map.h"

+/*
+ * Support debug printing even though util/debug.c is not linked. That means
+ * implementing 'verbose' and 'eprintf'.
+ */
+int verbose;
+
+int eprintf(int level, const char *fmt, ...)
+{
+ va_list args;
+ int ret = 0;
+
+ if (verbose >= level) {
+ va_start(args, fmt);
+ ret = vfprintf(stderr, fmt, args);
+ va_end(args);
+ }
+
+ return ret;
+}
+
/* Define PyVarObject_HEAD_INIT for python 2.5 */
#ifndef PyVarObject_HEAD_INIT
# define PyVarObject_HEAD_INIT(type, size) PyObject_HEAD_INIT(type) size,

Subject: [tip:perf/core] perf tools: Add debug prints

Commit-ID: e3e1a54fce81ee045dd152deb5435b136cb0b75f
Gitweb: http://git.kernel.org/tip/e3e1a54fce81ee045dd152deb5435b136cb0b75f
Author: Adrian Hunter <[email protected]>
AuthorDate: Wed, 14 Aug 2013 15:48:24 +0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Fri, 16 Aug 2013 17:17:58 -0300

perf tools: Add debug prints

It is useful to see the arguments to perf_event_open and whether the
perf events ring buffer was mmapped per-cpu or per-thread.

That information will now be displayed when verbose is 2 i.e option -vv.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ fixup trivial conflict with fcb14f7 ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evlist.c | 3 +++
tools/perf/util/evsel.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index c7d111f..1f5105a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -14,6 +14,7 @@
#include "target.h"
#include "evlist.h"
#include "evsel.h"
+#include "debug.h"
#include <unistd.h>

#include "parse-events.h"
@@ -486,6 +487,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist, int prot, int m
int nr_cpus = cpu_map__nr(evlist->cpus);
int nr_threads = thread_map__nr(evlist->threads);

+ pr_debug2("perf event ring buffer mmapped per cpu\n");
for (cpu = 0; cpu < nr_cpus; cpu++) {
int output = -1;

@@ -524,6 +526,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist, int prot, in
int thread;
int nr_threads = thread_map__nr(evlist->threads);

+ pr_debug2("perf event ring buffer mmapped per thread\n");
for (thread = 0; thread < nr_threads; thread++) {
int output = -1;

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a29c8d0..47cbe1e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -22,6 +22,7 @@
#include "thread_map.h"
#include "target.h"
#include "perf_regs.h"
+#include "debug.h"

static struct {
bool sample_id_all;
@@ -862,6 +863,65 @@ static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
return fd;
}

+#define __PRINT_ATTR(fmt, cast, field) \
+ fprintf(fp, " %-19s "fmt"\n", #field, cast attr->field)
+
+#define PRINT_ATTR_U32(field) __PRINT_ATTR("%u" , , field)
+#define PRINT_ATTR_X32(field) __PRINT_ATTR("%#x", , field)
+#define PRINT_ATTR_U64(field) __PRINT_ATTR("%" PRIu64, (uint64_t), field)
+#define PRINT_ATTR_X64(field) __PRINT_ATTR("%#"PRIx64, (uint64_t), field)
+
+#define PRINT_ATTR2N(name1, field1, name2, field2) \
+ fprintf(fp, " %-19s %u %-19s %u\n", \
+ name1, attr->field1, name2, attr->field2)
+
+#define PRINT_ATTR2(field1, field2) \
+ PRINT_ATTR2N(#field1, field1, #field2, field2)
+
+static size_t perf_event_attr__fprintf(struct perf_event_attr *attr, FILE *fp)
+{
+ size_t ret = 0;
+
+ ret += fprintf(fp, "%.60s\n", graph_dotted_line);
+ ret += fprintf(fp, "perf_event_attr:\n");
+
+ ret += PRINT_ATTR_U32(type);
+ ret += PRINT_ATTR_U32(size);
+ ret += PRINT_ATTR_X64(config);
+ ret += PRINT_ATTR_U64(sample_period);
+ ret += PRINT_ATTR_U64(sample_freq);
+ ret += PRINT_ATTR_X64(sample_type);
+ ret += PRINT_ATTR_X64(read_format);
+
+ ret += PRINT_ATTR2(disabled, inherit);
+ ret += PRINT_ATTR2(pinned, exclusive);
+ ret += PRINT_ATTR2(exclude_user, exclude_kernel);
+ ret += PRINT_ATTR2(exclude_hv, exclude_idle);
+ ret += PRINT_ATTR2(mmap, comm);
+ ret += PRINT_ATTR2(freq, inherit_stat);
+ ret += PRINT_ATTR2(enable_on_exec, task);
+ ret += PRINT_ATTR2(watermark, precise_ip);
+ ret += PRINT_ATTR2(mmap_data, sample_id_all);
+ ret += PRINT_ATTR2(exclude_host, exclude_guest);
+ ret += PRINT_ATTR2N("excl.callchain_kern", exclude_callchain_kernel,
+ "excl.callchain_user", exclude_callchain_user);
+
+ ret += PRINT_ATTR_U32(wakeup_events);
+ ret += PRINT_ATTR_U32(wakeup_watermark);
+ ret += PRINT_ATTR_X32(bp_type);
+ ret += PRINT_ATTR_X64(bp_addr);
+ ret += PRINT_ATTR_X64(config1);
+ ret += PRINT_ATTR_U64(bp_len);
+ ret += PRINT_ATTR_X64(config2);
+ ret += PRINT_ATTR_X64(branch_sample_type);
+ ret += PRINT_ATTR_X64(sample_regs_user);
+ ret += PRINT_ATTR_U32(sample_stack_user);
+
+ ret += fprintf(fp, "%.60s\n", graph_dotted_line);
+
+ return ret;
+}
+
static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads)
{
@@ -886,6 +946,9 @@ retry_sample_id:
if (perf_missing_features.sample_id_all)
evsel->attr.sample_id_all = 0;

+ if (verbose >= 2)
+ perf_event_attr__fprintf(&evsel->attr, stderr);
+
for (cpu = 0; cpu < cpus->nr; cpu++) {

for (thread = 0; thread < threads->nr; thread++) {
@@ -895,8 +958,10 @@ retry_sample_id:
pid = threads->map[thread];

group_fd = get_group_fd(evsel, cpu, thread);
-
retry_open:
+ pr_debug2("perf_event_open: pid %d cpu %d group_fd %d flags %#lx\n",
+ pid, cpus->map[cpu], group_fd, flags);
+
FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
pid,
cpus->map[cpu],