Hi
Here are some fixes and tweaks to perf tools (version 5).
Changes in V5:
Re-based to Arnaldo's tree and dropped already applied patches:
perf tools: remove unused parameter
perf tools: fix missing tool parameter
perf tools: fix missing 'finished_round'
perf tools: fix parse_events_terms() segfault on error path
perf tools: fix new_term() missing free on error path
perf tools: add const specifier to perf_pmu__find name parameter
perf tools: tidy duplicated munmap code
perf tools: validate perf event header size
perf tools: add debug prints
Changed to perf_event_attr__fprintf()
perf tools: add pid to struct thread
Always set the pid, even if a pid is already set
perf tools: change machine__findnew_thread() to set thread pid
Replaces: perf tools: change "machine" functions to set thread pid
perf tools: add support for PERF_SAMPLE_IDENTFIER
Only use PERF_SAMPLE_IDENTFIER if sample types are different
perf tools: expand perf_event__synthesize_sample()
New patch in preparation of a sample parsing test
perf tools: add a sample parsing test
New patch
Changes in V4:
I added kernel support for matching sample types via
PERF_SAMPLE_IDENTIFIER. perf tools support for that required
first fixing some other things.
perf tools: fix parse_events_terms() freeing local variable on error path
Dropped - covered by David Ahern
perf tools: struct thread has a tid not a pid
Added ack by David Ahern
perf tools: add pid to struct thread
Remove unused function
perf tools: fix missing increment in sample parsing
New patch
perf tools: tidy up sample parsing overflow checking
New patch
perf tools: remove unnecessary callchain validation
New patch
perf tools: remove references to struct ip_event
New patch
perf tools: move struct ip_event
New patch
perf: make events stream always parsable
New patch
perf tools: add support for PERF_SAMPLE_IDENTFIER
New patch
Changes in V3:
perf tools: add pid to struct thread
Split into 2 patches
perf tools: fix ppid in thread__fork()
Dropped for now
Changes in V2:
perf tools: fix missing tool parameter
Fixed one extra occurrence
perf tools: fix parse_events_terms() freeing local variable on error path
Made "freeing" code into a new function
perf tools: validate perf event header size
Corrected byte-swapping
perf tools: allow non-matching sample types
Added comments
Fixed id_pos calculation
id_pos/is_pos updated whenever sample_type changes
Removed perf_evlist__sample_type()
Added __perf_evlist__combined_sample_type()
Added perf_evlist__combined_sample_type()
Added perf_evlist__make_sample_types_compatible()
Added ack's to patches acked by Jiri Olsa
Adrian Hunter (12):
perf tools: add debug prints
perf tools: allow non-matching sample types
perf tools: add pid to struct thread
perf tools: change machine__findnew_thread() to set thread pid
perf tools: tidy up sample parsing overflow checking
perf tools: remove unnecessary callchain validation
perf tools: remove references to struct ip_event
perf tools: move struct ip_event
perf: make events stream always parsable
perf tools: add support for PERF_SAMPLE_IDENTFIER
perf tools: expand perf_event__synthesize_sample()
perf tools: add a sample parsing test
include/uapi/linux/perf_event.h | 3 +-
kernel/events/core.c | 11 +-
tools/perf/Makefile | 1 +
tools/perf/builtin-inject.c | 7 +-
tools/perf/builtin-kmem.c | 3 +-
tools/perf/builtin-kvm.c | 3 +-
tools/perf/builtin-lock.c | 3 +-
tools/perf/builtin-mem.c | 2 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-sched.c | 17 +-
tools/perf/builtin-script.c | 3 +-
tools/perf/builtin-top.c | 10 +-
tools/perf/builtin-trace.c | 12 +-
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/hists_link.c | 30 +++-
tools/perf/tests/mmap-basic.c | 2 +-
tools/perf/tests/sample-parsing.c | 231 ++++++++++++++++++++++++++
tools/perf/tests/tests.h | 1 +
tools/perf/util/build-id.c | 11 +-
tools/perf/util/callchain.c | 8 -
tools/perf/util/callchain.h | 5 -
tools/perf/util/event.c | 5 +-
tools/perf/util/event.h | 31 ++--
tools/perf/util/evlist.c | 230 +++++++++++++++++++++++++-
tools/perf/util/evlist.h | 9 +-
tools/perf/util/evsel.c | 330 ++++++++++++++++++++++++++++++++------
tools/perf/util/evsel.h | 13 +-
tools/perf/util/machine.c | 37 +++--
tools/perf/util/machine.h | 3 +-
tools/perf/util/session.c | 35 ++--
tools/perf/util/thread.c | 3 +-
tools/perf/util/thread.h | 3 +-
32 files changed, 909 insertions(+), 159 deletions(-)
create mode 100644 tools/perf/tests/sample-parsing.c
Regards
Adrian
Sample types need not be identical to determine
the sample id from the event. Only the position
of the sample id needs to be the same.
Compatible sample types are ones in which the bits
defined by PERF_COMPAT_MASK are the same.
'perf_evlist__config()' forces sample types to be
compatible on that basis.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-report.c | 2 +-
tools/perf/util/event.h | 14 +++++
tools/perf/util/evlist.c | 135 ++++++++++++++++++++++++++++++++++++++++++--
tools/perf/util/evlist.h | 8 ++-
tools/perf/util/evsel.c | 64 ++++++++++++++++++++-
tools/perf/util/evsel.h | 10 ++++
tools/perf/util/session.c | 8 ++-
7 files changed, 230 insertions(+), 11 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 9a7e54d..7601278 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -367,7 +367,7 @@ static int process_read_event(struct perf_tool *tool,
static int perf_report__setup_sample_type(struct perf_report *rep)
{
struct perf_session *self = rep->session;
- u64 sample_type = perf_evlist__sample_type(self->evlist);
+ u64 sample_type = perf_evlist__combined_sample_type(self->evlist);
if (!self->fd_pipe && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
if (sort__has_parent) {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 1813895..3aef78c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -65,6 +65,20 @@ struct read_event {
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
+/*
+ * Events have compatible sample types if the following bits all have the same
+ * value. This is because the order of sample members is fixed. For sample
+ * events the order is: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME,
+ * PERF_SAMPLE_ADDR, PERF_SAMPLE_ID. For non-sample events the sample members
+ * are accessed in reverse order. The order is: PERF_SAMPLE_ID,
+ * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU.
+ */
+#define PERF_COMPAT_MASK \
+ (PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
+ PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
+ PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
+ PERF_SAMPLE_CPU)
+
struct sample_event {
struct perf_event_header header;
u64 array[];
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 2b77b33..ffa4afb 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -49,6 +49,45 @@ struct perf_evlist *perf_evlist__new(void)
return evlist;
}
+/**
+ * perf_evlist__set_id_pos - set the positions of event ids.
+ * @evlist: selected event list
+ *
+ * Events with compatible sample types all have the same id_pos
+ * and is_pos. For convenience, put a copy on evlist.
+ */
+static void perf_evlist__set_id_pos(struct perf_evlist *evlist)
+{
+ struct perf_evsel *first = perf_evlist__first(evlist);
+
+ evlist->id_pos = first->id_pos;
+ evlist->is_pos = first->is_pos;
+}
+
+/**
+ * perf_evlist__make_sample_types_compatible - make sample types compatible.
+ * @evlist: selected event list
+ *
+ * Events with compatible sample types all have the same id_pos and is_pos.
+ * This can be achieved by matching the bits of PERF_COMPAT_MASK.
+ */
+void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel;
+ u64 compat = 0;
+
+ list_for_each_entry(evsel, &evlist->entries, node)
+ compat |= evsel->attr.sample_type & PERF_COMPAT_MASK;
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ evsel->attr.sample_type |= compat;
+ evsel->sample_size = __perf_evsel__sample_size(evsel->attr.sample_type);
+ perf_evsel__calc_id_pos(evsel);
+ }
+
+ perf_evlist__set_id_pos(evlist);
+}
+
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts)
{
@@ -69,6 +108,8 @@ void perf_evlist__config(struct perf_evlist *evlist,
if (evlist->nr_entries > 1)
perf_evsel__set_sample_id(evsel);
}
+
+ perf_evlist__make_sample_types_compatible(evlist);
}
static void perf_evlist__purge(struct perf_evlist *evlist)
@@ -102,6 +143,7 @@ void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
{
list_add_tail(&entry->node, &evlist->entries);
++evlist->nr_entries;
+ perf_evlist__set_id_pos(evlist);
}
void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
@@ -110,6 +152,7 @@ void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
{
list_splice_tail(list, &evlist->entries);
evlist->nr_entries += nr_entries;
+ perf_evlist__set_id_pos(evlist);
}
void __perf_evlist__set_leader(struct list_head *list)
@@ -339,6 +382,55 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)
return NULL;
}
+static int perf_evlist__event2id(struct perf_evlist *evlist,
+ union perf_event *event, u64 *id)
+{
+ const u64 *array = event->sample.array;
+ ssize_t n;
+
+ n = (event->header.size - sizeof(event->header)) >> 3;
+
+ if (event->header.type == PERF_RECORD_SAMPLE) {
+ if (evlist->id_pos >= n)
+ return -1;
+ *id = array[evlist->id_pos];
+ } else {
+ if (evlist->is_pos >= n)
+ return -1;
+ n -= evlist->is_pos;
+ *id = array[n];
+ }
+ return 0;
+}
+
+static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
+ union perf_event *event)
+{
+ struct hlist_head *head;
+ struct perf_sample_id *sid;
+ int hash;
+ u64 id;
+
+ if (evlist->nr_entries == 1 || evlist->matching_sample_types)
+ return perf_evlist__first(evlist);
+
+ if (perf_evlist__event2id(evlist, event, &id))
+ return NULL;
+
+ /* Synthesized events have an id of zero */
+ if (!id)
+ return perf_evlist__first(evlist);
+
+ hash = hash_64(id, PERF_EVLIST__HLIST_BITS);
+ head = &evlist->heads[hash];
+
+ hlist_for_each_entry(sid, head, node) {
+ if (sid->id == id)
+ return sid->evsel;
+ }
+ return NULL;
+}
+
union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
{
struct perf_mmap *md = &evlist->mmap[idx];
@@ -650,19 +742,49 @@ int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter)
bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
{
struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
+ bool ok = true;
list_for_each_entry_continue(pos, &evlist->entries, node) {
- if (first->attr.sample_type != pos->attr.sample_type)
+ if (first->attr.sample_type != pos->attr.sample_type) {
+ ok = false;
+ break;
+ }
+ }
+
+ if (ok) {
+ evlist->matching_sample_types = true;
+ return true;
+ }
+
+ if (evlist->id_pos < 0 || evlist->is_pos < 0)
+ return false;
+
+ list_for_each_entry(pos, &evlist->entries, node) {
+ if (pos->id_pos != evlist->id_pos ||
+ pos->is_pos != evlist->is_pos)
return false;
}
return true;
}
-u64 perf_evlist__sample_type(struct perf_evlist *evlist)
+u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist)
{
- struct perf_evsel *first = perf_evlist__first(evlist);
- return first->attr.sample_type;
+ struct perf_evsel *evsel;
+
+ if (evlist->combined_sample_type)
+ return evlist->combined_sample_type;
+
+ list_for_each_entry(evsel, &evlist->entries, node)
+ evlist->combined_sample_type |= evsel->attr.sample_type;
+
+ return evlist->combined_sample_type;
+}
+
+u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist)
+{
+ evlist->combined_sample_type = 0;
+ return __perf_evlist__combined_sample_type(evlist);
}
u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist)
@@ -856,7 +978,10 @@ int perf_evlist__start_workload(struct perf_evlist *evlist)
int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
struct perf_sample *sample)
{
- struct perf_evsel *evsel = perf_evlist__first(evlist);
+ struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
+
+ if (!evsel)
+ return -EFAULT;
return perf_evsel__parse_sample(evsel, event, sample);
}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 0583d36..b1be475 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -32,11 +32,15 @@ struct perf_evlist {
int nr_fds;
int nr_mmaps;
int mmap_len;
+ int id_pos;
+ int is_pos;
+ u64 combined_sample_type;
struct {
int cork_fd;
pid_t pid;
} workload;
bool overwrite;
+ bool matching_sample_types;
struct perf_mmap *mmap;
struct pollfd *pollfd;
struct thread_map *threads;
@@ -83,6 +87,7 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *self, int idx);
int perf_evlist__open(struct perf_evlist *evlist);
void perf_evlist__close(struct perf_evlist *evlist);
+void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist);
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts);
@@ -118,7 +123,8 @@ int perf_evlist__apply_filters(struct perf_evlist *evlist);
void __perf_evlist__set_leader(struct list_head *list);
void perf_evlist__set_leader(struct perf_evlist *evlist);
-u64 perf_evlist__sample_type(struct perf_evlist *evlist);
+u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist);
+u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist);
bool perf_evlist__sample_id_all(struct perf_evlist *evlist);
u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0b9c4fd..724b75a 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -30,7 +30,7 @@ static struct {
#define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))
-static int __perf_evsel__sample_size(u64 sample_type)
+int __perf_evsel__sample_size(u64 sample_type)
{
u64 mask = sample_type & PERF_SAMPLE_MASK;
int size = 0;
@@ -46,6 +46,65 @@ static int __perf_evsel__sample_size(u64 sample_type)
return size;
}
+/**
+ * __perf_evsel__calc_id_pos - calculate id_pos.
+ * @sample_type: sample type
+ *
+ * This function returns the position of the event id (PERF_SAMPLE_ID) in a
+ * sample event i.e. in the array of struct sample_event.
+ */
+static int __perf_evsel__calc_id_pos(u64 sample_type)
+{
+ int idx = 0;
+
+ if (!(sample_type & PERF_SAMPLE_ID))
+ return -1;
+
+ if (sample_type & PERF_SAMPLE_IP)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_TID)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_TIME)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_ADDR)
+ idx += 1;
+
+ return idx;
+}
+
+/**
+ * __perf_evsel__calc_is_pos - calculate is_pos.
+ * @sample_type: sample type
+ *
+ * This function returns the position (counting backwards) of the event id
+ * (PERF_SAMPLE_ID) in a non-sample event i.e. if sample_id_all is used there is
+ * an id sample appended to non-sample events.
+ */
+static int __perf_evsel__calc_is_pos(u64 sample_type)
+{
+ int idx = 1;
+
+ if (!(sample_type & PERF_SAMPLE_ID))
+ return -1;
+
+ if (sample_type & PERF_SAMPLE_CPU)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_STREAM_ID)
+ idx += 1;
+
+ return idx;
+}
+
+void perf_evsel__calc_id_pos(struct perf_evsel *evsel)
+{
+ evsel->id_pos = __perf_evsel__calc_id_pos(evsel->attr.sample_type);
+ evsel->is_pos = __perf_evsel__calc_is_pos(evsel->attr.sample_type);
+}
+
void hists__init(struct hists *hists)
{
memset(hists, 0, sizeof(*hists));
@@ -62,6 +121,7 @@ void __perf_evsel__set_sample_bit(struct perf_evsel *evsel,
if (!(evsel->attr.sample_type & bit)) {
evsel->attr.sample_type |= bit;
evsel->sample_size += sizeof(u64);
+ perf_evsel__calc_id_pos(evsel);
}
}
@@ -71,6 +131,7 @@ void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
if (evsel->attr.sample_type & bit) {
evsel->attr.sample_type &= ~bit;
evsel->sample_size -= sizeof(u64);
+ perf_evsel__calc_id_pos(evsel);
}
}
@@ -89,6 +150,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
INIT_LIST_HEAD(&evsel->node);
hists__init(&evsel->hists);
evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
+ perf_evsel__calc_id_pos(evsel);
}
struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr, int idx)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 3f156cc..c6d616c 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -45,6 +45,11 @@ struct perf_sample_id {
* @name - Can be set to retain the original event name passed by the user,
* so that when showing results in tools such as 'perf stat', we
* show the name used, not some alias.
+ * @id_pos: the position of the event id (PERF_SAMPLE_ID) in a sample event
+ * i.e. in the array of struct sample_event
+ * @is_pos: the position (counting backwards) of the event id (PERF_SAMPLE_ID)
+ * in a non-sample event i.e. if sample_id_all is used there is an id
+ * sample appended to non-sample events
*/
struct perf_evsel {
struct list_head node;
@@ -71,6 +76,8 @@ struct perf_evsel {
} handler;
struct cpu_map *cpus;
unsigned int sample_size;
+ int id_pos;
+ int is_pos;
bool supported;
bool needs_swap;
/* parse modifier helper */
@@ -100,6 +107,9 @@ void perf_evsel__delete(struct perf_evsel *evsel);
void perf_evsel__config(struct perf_evsel *evsel,
struct perf_record_opts *opts);
+int __perf_evsel__sample_size(u64 sample_type);
+void perf_evsel__calc_id_pos(struct perf_evsel *evsel);
+
bool perf_evsel__is_cache_op_valid(u8 type, u8 op);
#define PERF_EVSEL__MAX_ALIASES 8
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 1eb58ee..c0eb9aa 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -743,7 +743,7 @@ static void perf_session__print_tstamp(struct perf_session *session,
union perf_event *event,
struct perf_sample *sample)
{
- u64 sample_type = perf_evlist__sample_type(session->evlist);
+ u64 sample_type = __perf_evlist__combined_sample_type(session->evlist);
if (event->header.type != PERF_RECORD_SAMPLE &&
!perf_evlist__sample_id_all(session->evlist)) {
@@ -902,7 +902,7 @@ static int perf_session__preprocess_sample(struct perf_session *session,
union perf_event *event, struct perf_sample *sample)
{
if (event->header.type != PERF_RECORD_SAMPLE ||
- !(perf_evlist__sample_type(session->evlist) & PERF_SAMPLE_CALLCHAIN))
+ !sample->callchain)
return 0;
if (!ip_callchain__valid(sample->callchain, event)) {
@@ -1304,7 +1304,9 @@ int perf_session__process_events(struct perf_session *self,
bool perf_session__has_traces(struct perf_session *session, const char *msg)
{
- if (!(perf_evlist__sample_type(session->evlist) & PERF_SAMPLE_RAW)) {
+ u64 sample_type = perf_evlist__combined_sample_type(session->evlist);
+
+ if (!(sample_type & PERF_SAMPLE_RAW)) {
pr_err("No trace sample to read. Did you call 'perf %s'?\n", msg);
return false;
}
--
1.7.11.7
It is useful to see the arguments to perf_event_open
and whether the perf events ring buffer was mmapped
per-cpu or per-thread. That information will now be
displayed when verbose is 2 i.e option -vv
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
---
tools/perf/util/evlist.c | 3 +++
tools/perf/util/evsel.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 42ea4e9..2b77b33 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -14,6 +14,7 @@
#include "target.h"
#include "evlist.h"
#include "evsel.h"
+#include "debug.h"
#include <unistd.h>
#include "parse-events.h"
@@ -454,6 +455,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist, int prot, int m
int nr_cpus = cpu_map__nr(evlist->cpus);
int nr_threads = thread_map__nr(evlist->threads);
+ pr_debug2("perf event ring buffer mmapped per cpu\n");
for (cpu = 0; cpu < nr_cpus; cpu++) {
int output = -1;
@@ -492,6 +494,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist, int prot, in
int thread;
int nr_threads = thread_map__nr(evlist->threads);
+ pr_debug2("perf event ring buffer mmapped per thread\n");
for (thread = 0; thread < nr_threads; thread++) {
int output = -1;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a635461..0b9c4fd 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -21,6 +21,7 @@
#include "thread_map.h"
#include "target.h"
#include "perf_regs.h"
+#include "debug.h"
static struct {
bool sample_id_all;
@@ -817,6 +818,58 @@ static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
return fd;
}
+static size_t perf_event_attr__fprintf(struct perf_event_attr *attr, FILE *fp)
+{
+ size_t ret;
+
+ ret = fprintf(fp, "------------------------------------------------------------\n");
+ ret += fprintf(fp, "perf_event_attr:\n");
+ ret += fprintf(fp, " type %u\n", attr->type);
+ ret += fprintf(fp, " size %u\n", attr->size);
+ ret += fprintf(fp, " config %#"PRIx64"\n", (uint64_t)attr->config);
+ ret += fprintf(fp, " sample_period %"PRIu64"\n", (uint64_t)attr->sample_period);
+ ret += fprintf(fp, " sample_freq %"PRIu64"\n", (uint64_t)attr->sample_freq);
+ ret += fprintf(fp, " sample_type %#"PRIx64"\n", (uint64_t)attr->sample_type);
+ ret += fprintf(fp, " read_format %#"PRIx64"\n", (uint64_t)attr->read_format);
+
+ ret += fprintf(fp, " disabled %u ", attr->disabled);
+ ret += fprintf(fp, "inherit %u\n", attr->inherit);
+ ret += fprintf(fp, " pinned %u ", attr->pinned);
+ ret += fprintf(fp, "exclusive %u\n", attr->exclusive);
+ ret += fprintf(fp, " exclude_user %u ", attr->exclude_user);
+ ret += fprintf(fp, "exclude_kernel %u\n", attr->exclude_kernel);
+ ret += fprintf(fp, " exclude_hv %u ", attr->exclude_hv);
+ ret += fprintf(fp, "exclude_idle %u\n", attr->exclude_idle);
+ ret += fprintf(fp, " mmap %u ", attr->mmap);
+ ret += fprintf(fp, "comm %u\n", attr->comm);
+ ret += fprintf(fp, " freq %u ", attr->freq);
+ ret += fprintf(fp, "inherit_stat %u\n", attr->inherit_stat);
+ ret += fprintf(fp, " enable_on_exec %u ", attr->enable_on_exec);
+ ret += fprintf(fp, "task %u\n", attr->task);
+ ret += fprintf(fp, " watermark %u ", attr->watermark);
+ ret += fprintf(fp, "precise_ip %u\n", attr->precise_ip);
+ ret += fprintf(fp, " mmap_data %u ", attr->mmap_data);
+ ret += fprintf(fp, "sample_id_all %u\n", attr->sample_id_all);
+ ret += fprintf(fp, " exclude_host %u ", attr->exclude_host);
+ ret += fprintf(fp, "exclude_guest %u\n", attr->exclude_guest);
+ ret += fprintf(fp, " excl.callchain.kern %u ", attr->exclude_callchain_kernel);
+ ret += fprintf(fp, "excl.callchain.user %u\n", attr->exclude_callchain_user);
+
+ ret += fprintf(fp, " wakeup_events %u\n", attr->wakeup_events);
+ ret += fprintf(fp, " wakeup_watermark %u\n", attr->wakeup_watermark);
+ ret += fprintf(fp, " bp_type %#x\n", attr->bp_type);
+ ret += fprintf(fp, " bp_addr %#"PRIx64"\n", (uint64_t)attr->bp_addr);
+ ret += fprintf(fp, " config1 %#"PRIx64"\n", (uint64_t)attr->config1);
+ ret += fprintf(fp, " bp_len %"PRIu64"\n", (uint64_t)attr->bp_len);
+ ret += fprintf(fp, " config2 %#"PRIx64"\n", (uint64_t)attr->config2);
+ ret += fprintf(fp, " branch_sample_type %#"PRIx64"\n", (uint64_t)attr->branch_sample_type);
+ ret += fprintf(fp, " sample_regs_user %#"PRIx64"\n", (uint64_t)attr->sample_regs_user);
+ ret += fprintf(fp, " sample_stack_user %u\n", attr->sample_stack_user);
+ ret += fprintf(fp, "------------------------------------------------------------\n");
+
+ return ret;
+}
+
static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads)
{
@@ -840,6 +893,9 @@ retry_sample_id:
if (perf_missing_features.sample_id_all)
evsel->attr.sample_id_all = 0;
+ if (verbose >= 2)
+ perf_event_attr__fprintf(&evsel->attr, stderr);
+
for (cpu = 0; cpu < cpus->nr; cpu++) {
for (thread = 0; thread < threads->nr; thread++) {
@@ -850,6 +906,8 @@ retry_sample_id:
group_fd = get_group_fd(evsel, cpu, thread);
+ pr_debug2("perf_event_open: pid %d cpu %d group_fd %d flags %#lx\n",
+ pid, cpus->map[cpu], group_fd, flags);
FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
pid,
cpus->map[cpu],
--
1.7.11.7
Record pid on struct thread. The member is named 'pid_'
to avoid confusion with the 'tid' member which was previously
named 'pid'.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/machine.c | 16 +++++++++++-----
tools/perf/util/thread.c | 3 ++-
tools/perf/util/thread.h | 3 ++-
3 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index f9f9d63..67ba572 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -233,7 +233,8 @@ void machines__set_id_hdr_size(struct machines *machines, u16 id_hdr_size)
return;
}
-static struct thread *__machine__findnew_thread(struct machine *machine, pid_t tid,
+static struct thread *__machine__findnew_thread(struct machine *machine,
+ pid_t pid, pid_t tid,
bool create)
{
struct rb_node **p = &machine->threads.rb_node;
@@ -245,8 +246,11 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
* so most of the time we dont have to look up
* the full rbtree:
*/
- if (machine->last_match && machine->last_match->tid == tid)
+ if (machine->last_match && machine->last_match->tid == tid) {
+ if (pid && pid != machine->last_match->pid_)
+ machine->last_match->pid_ = pid;
return machine->last_match;
+ }
while (*p != NULL) {
parent = *p;
@@ -254,6 +258,8 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
if (th->tid == tid) {
machine->last_match = th;
+ if (pid && pid != th->pid_)
+ th->pid_ = pid;
return th;
}
@@ -266,7 +272,7 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
if (!create)
return NULL;
- th = thread__new(tid);
+ th = thread__new(pid, tid);
if (th != NULL) {
rb_link_node(&th->rb_node, parent, p);
rb_insert_color(&th->rb_node, &machine->threads);
@@ -278,12 +284,12 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
struct thread *machine__findnew_thread(struct machine *machine, pid_t tid)
{
- return __machine__findnew_thread(machine, tid, true);
+ return __machine__findnew_thread(machine, 0, tid, true);
}
struct thread *machine__find_thread(struct machine *machine, pid_t tid)
{
- return __machine__findnew_thread(machine, tid, false);
+ return __machine__findnew_thread(machine, 0, tid, false);
}
int machine__process_comm_event(struct machine *machine, union perf_event *event)
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 6feeb88..e3d4a55 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -7,12 +7,13 @@
#include "util.h"
#include "debug.h"
-struct thread *thread__new(pid_t tid)
+struct thread *thread__new(pid_t pid, pid_t tid)
{
struct thread *self = zalloc(sizeof(*self));
if (self != NULL) {
map_groups__init(&self->mg);
+ self->pid_ = pid;
self->tid = tid;
self->ppid = -1;
self->comm = malloc(32);
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 0fe1f9c..fc464bc 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -12,6 +12,7 @@ struct thread {
struct list_head node;
};
struct map_groups mg;
+ pid_t pid_; /* Not all tools update this */
pid_t tid;
pid_t ppid;
char shortname[3];
@@ -24,7 +25,7 @@ struct thread {
struct machine;
-struct thread *thread__new(pid_t tid);
+struct thread *thread__new(pid_t pid, pid_t tid);
void thread__delete(struct thread *self);
int thread__set_comm(struct thread *self, const char *comm);
--
1.7.11.7
The size of data retrieved from a sample event must be
validated to ensure it does not go past the end of the
event. That was being done sporadically and without
considering integer overflows.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/evsel.c | 101 ++++++++++++++++++++++++++++++------------------
1 file changed, 63 insertions(+), 38 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 724b75a..febff21 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1114,24 +1114,38 @@ static int perf_evsel__parse_id_sample(const struct perf_evsel *evsel,
return 0;
}
-static bool sample_overlap(const union perf_event *event,
- const void *offset, u64 size)
+static inline bool overflow_one(const void *endp, const void *offset)
{
- const void *base = event;
-
- if (offset + size > base + event->header.size)
- return true;
+ return offset + sizeof(u64) > endp;
+}
- return false;
+static inline bool overflow(const void *endp, u16 max_size, const void *offset,
+ u64 size)
+{
+ return size > max_size || offset + size > endp;
}
+#define OVERFLOW_CHECK_ONE(offset) \
+ do { \
+ if (overflow_one(endp, (offset))) \
+ return -EFAULT; \
+ } while (0)
+
+#define OVERFLOW_CHECK(offset, size) \
+ do { \
+ if (overflow(endp, max_size, (offset), (size))) \
+ return -EFAULT; \
+ } while (0)
+
int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
struct perf_sample *data)
{
u64 type = evsel->attr.sample_type;
- u64 regs_user = evsel->attr.sample_regs_user;
bool swapped = evsel->needs_swap;
const u64 *array;
+ u16 max_size = event->header.size;
+ const void *endp = (void *)event + max_size;
+ u64 sz;
/*
* used for cross-endian analysis. See git commit 65014ab3
@@ -1153,6 +1167,11 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
array = event->sample.array;
+ /*
+ * sample_size is based on PERF_SAMPLE_MASK which includes up to
+ * PERF_SAMPLE_PERIOD. After that overflow_one() or overflow() must be
+ * used to check the format does not go past the end of the event.
+ */
if (evsel->sample_size + sizeof(event->header) > event->header.size)
return -EFAULT;
@@ -1221,20 +1240,19 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
}
if (type & PERF_SAMPLE_CALLCHAIN) {
- if (sample_overlap(event, array, sizeof(data->callchain->nr)))
- return -EFAULT;
-
- data->callchain = (struct ip_callchain *)array;
+ const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);
- if (sample_overlap(event, array, data->callchain->nr))
+ OVERFLOW_CHECK_ONE(array);
+ data->callchain = (struct ip_callchain *)array++;
+ if (data->callchain->nr > max_callchain_nr)
return -EFAULT;
-
- array += 1 + data->callchain->nr;
+ sz = data->callchain->nr * sizeof(u64);
+ OVERFLOW_CHECK(array, sz);
+ array = (void*)array + sz;
}
if (type & PERF_SAMPLE_RAW) {
- const u64 *pdata;
-
+ OVERFLOW_CHECK_ONE(array);
u.val64 = *array;
if (WARN_ONCE(swapped,
"Endianness of raw data not corrected!\n")) {
@@ -1243,65 +1261,72 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
u.val32[0] = bswap_32(u.val32[0]);
u.val32[1] = bswap_32(u.val32[1]);
}
-
- if (sample_overlap(event, array, sizeof(u32)))
- return -EFAULT;
-
data->raw_size = u.val32[0];
- pdata = (void *) array + sizeof(u32);
+ array = (void *)array + sizeof(u32);
- if (sample_overlap(event, pdata, data->raw_size))
- return -EFAULT;
-
- data->raw_data = (void *) pdata;
-
- array = (void *)array + data->raw_size + sizeof(u32);
+ OVERFLOW_CHECK(array, data->raw_size);
+ data->raw_data = (void *)array;
+ array = (void *)array + data->raw_size;
}
if (type & PERF_SAMPLE_BRANCH_STACK) {
- u64 sz;
+ const u64 max_branch_nr = UINT64_MAX / sizeof(struct branch_entry);
- data->branch_stack = (struct branch_stack *)array;
- array++; /* nr */
+ OVERFLOW_CHECK_ONE(array);
+ data->branch_stack = (struct branch_stack *)array++;
+ if (data->branch_stack->nr > max_branch_nr)
+ return -EFAULT;
sz = data->branch_stack->nr * sizeof(struct branch_entry);
- sz /= sizeof(u64);
- array += sz;
+ OVERFLOW_CHECK(array, sz);
+ array = (void *)array + sz;
}
if (type & PERF_SAMPLE_REGS_USER) {
+ u64 avail;
+
/* First u64 tells us if we have any regs in sample. */
- u64 avail = *array++;
+ OVERFLOW_CHECK_ONE(array);
+ avail = *array++;
if (avail) {
+ u64 regs_user = evsel->attr.sample_regs_user;
+
+ sz = hweight_long(regs_user) * sizeof(u64);
+ OVERFLOW_CHECK(array, sz);
data->user_regs.regs = (u64 *)array;
- array += hweight_long(regs_user);
+ array = (void *)array + sz;
}
}
if (type & PERF_SAMPLE_STACK_USER) {
- u64 size = *array++;
+ OVERFLOW_CHECK_ONE(array);
+ sz = *array++;
data->user_stack.offset = ((char *)(array - 1)
- (char *) event);
- if (!size) {
+ if (!sz) {
data->user_stack.size = 0;
} else {
+ OVERFLOW_CHECK(array, sz);
data->user_stack.data = (char *)array;
- array += size / sizeof(*array);
+ array = (void *)array + sz;
+ OVERFLOW_CHECK_ONE(array);
data->user_stack.size = *array++;
}
}
data->weight = 0;
if (type & PERF_SAMPLE_WEIGHT) {
+ OVERFLOW_CHECK_ONE(array);
data->weight = *array;
array++;
}
data->data_src = PERF_MEM_DATA_SRC_NONE;
if (type & PERF_SAMPLE_DATA_SRC) {
+ OVERFLOW_CHECK_ONE(array);
data->data_src = *array;
array++;
}
--
1.7.11.7
Enable parsing of samples with sample format bit
PERF_SAMPLE_IDENTFIER. In addition, if the kernel supports
it, prefer it to selecting PERF_SAMPLE_ID thereby avoiding
the need to force compatible sample types.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/tests/mmap-basic.c | 2 +-
tools/perf/util/event.h | 9 +++--
tools/perf/util/evlist.c | 94 +++++++++++++++++++++++++++++++++++++++++--
tools/perf/util/evlist.h | 1 +
tools/perf/util/evsel.c | 41 +++++++++++++++----
tools/perf/util/evsel.h | 3 +-
6 files changed, 134 insertions(+), 16 deletions(-)
diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index 5b1b5ab..c4185b9 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -72,7 +72,7 @@ int test__basic_mmap(void)
}
evsels[i]->attr.wakeup_events = 1;
- perf_evsel__set_sample_id(evsels[i]);
+ perf_evsel__set_sample_id(evsels[i], false);
perf_evlist__add(evlist, evsels[i]);
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index a7b2245..ce2a92c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -53,7 +53,8 @@ struct read_event {
(PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
- PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
+ PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD | \
+ PERF_SAMPLE_IDENTIFIER)
/*
* Events have compatible sample types if the following bits all have the same
@@ -61,13 +62,15 @@ struct read_event {
* events the order is: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME,
* PERF_SAMPLE_ADDR, PERF_SAMPLE_ID. For non-sample events the sample members
* are accessed in reverse order. The order is: PERF_SAMPLE_ID,
- * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU.
+ * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU. PERF_SAMPLE_IDENTIFIER is added for
+ * completeness but it should not be used with PERF_SAMPLE_ID. Sample types
+ * that include PERF_SAMPLE_IDENTIFIER are always compatible.
*/
#define PERF_COMPAT_MASK \
(PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
- PERF_SAMPLE_CPU)
+ PERF_SAMPLE_CPU | PERF_SAMPLE_IDENTIFIER)
struct sample_event {
struct perf_event_header header;
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ffa4afb..42735db 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -88,10 +88,81 @@ void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist)
perf_evlist__set_id_pos(evlist);
}
+typedef void (*setup_probe_fn_t)(struct perf_evsel *evsel);
+
+static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
+{
+ struct perf_evlist *evlist;
+ struct perf_evsel *evsel;
+ int err = -EAGAIN, fd;
+
+ evlist = perf_evlist__new();
+ if (!evlist)
+ return -ENOMEM;
+
+ if (parse_events(evlist, str))
+ goto out_delete;
+
+ evsel = perf_evlist__first(evlist);
+
+ fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+ if (fd < 0)
+ goto out_delete;
+ close(fd);
+
+ fn(evsel);
+
+ fd = sys_perf_event_open(&evsel->attr, -1, cpu, -1, 0);
+ if (fd < 0) {
+ if (errno == EINVAL)
+ err = -EINVAL;
+ goto out_delete;
+ }
+ close(fd);
+ err = 0;
+
+out_delete:
+ perf_evlist__delete(evlist);
+ return err;
+}
+
+static bool perf_probe_api(setup_probe_fn_t fn)
+{
+ const char *try[] = {"cycles:u", "instructions:u", "cpu-clock", NULL};
+ struct cpu_map *cpus;
+ int cpu, ret, i = 0;
+
+ cpus = cpu_map__new(NULL);
+ if (!cpus)
+ return false;
+ cpu = cpus->map[0];
+ cpu_map__delete(cpus);
+
+ do {
+ ret = perf_do_probe_api(fn, cpu, try[i++]);
+ if (!ret)
+ return true;
+ } while (ret == -EAGAIN && try[i]);
+
+ return false;
+}
+
+static void perf_probe_sample_identifier(struct perf_evsel *evsel)
+{
+ evsel->attr.sample_type |= PERF_SAMPLE_IDENTIFIER;
+}
+
+bool perf_can_sample_identifier(void)
+{
+ return perf_probe_api(perf_probe_sample_identifier);
+}
+
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts)
{
struct perf_evsel *evsel;
+ bool use_sample_identifier = false;
+
/*
* Set the evsel leader links before we configure attributes,
* since some might depend on this info.
@@ -102,14 +173,26 @@ void perf_evlist__config(struct perf_evlist *evlist,
if (evlist->cpus->map[0] < 0)
opts->no_inherit = true;
- list_for_each_entry(evsel, &evlist->entries, node) {
+ list_for_each_entry(evsel, &evlist->entries, node)
perf_evsel__config(evsel, opts);
- if (evlist->nr_entries > 1)
- perf_evsel__set_sample_id(evsel);
+ if (evlist->nr_entries > 1) {
+ struct perf_evsel *first = perf_evlist__first(evlist);
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ if (evsel->attr.sample_type == first->attr.sample_type)
+ continue;
+ use_sample_identifier = perf_can_sample_identifier();
+ break;
+ }
+ list_for_each_entry(evsel, &evlist->entries, node)
+ perf_evsel__set_sample_id(evsel, use_sample_identifier);
}
- perf_evlist__make_sample_types_compatible(evlist);
+ if (use_sample_identifier)
+ perf_evlist__set_id_pos(evlist);
+ else
+ perf_evlist__make_sample_types_compatible(evlist);
}
static void perf_evlist__purge(struct perf_evlist *evlist)
@@ -813,6 +896,9 @@ u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist)
if (sample_type & PERF_SAMPLE_CPU)
size += sizeof(data->cpu) * 2;
+
+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ size += sizeof(data->id);
out:
return size;
}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index b1be475..9b767e6 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -88,6 +88,7 @@ int perf_evlist__open(struct perf_evlist *evlist);
void perf_evlist__close(struct perf_evlist *evlist);
void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist);
+bool perf_can_sample_identifier(void);
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts);
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dfdd6cb..4940f65 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -50,13 +50,17 @@ int __perf_evsel__sample_size(u64 sample_type)
* __perf_evsel__calc_id_pos - calculate id_pos.
* @sample_type: sample type
*
- * This function returns the position of the event id (PERF_SAMPLE_ID) in a
- * sample event i.e. in the array of struct sample_event.
+ * This function returns the position of the event id (PERF_SAMPLE_ID or
+ * PERF_SAMPLE_IDENTIFIER) in a sample event i.e. in the array of struct
+ * sample_event.
*/
static int __perf_evsel__calc_id_pos(u64 sample_type)
{
int idx = 0;
+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ return 0;
+
if (!(sample_type & PERF_SAMPLE_ID))
return -1;
@@ -80,13 +84,16 @@ static int __perf_evsel__calc_id_pos(u64 sample_type)
* @sample_type: sample type
*
* This function returns the position (counting backwards) of the event id
- * (PERF_SAMPLE_ID) in a non-sample event i.e. if sample_id_all is used there is
- * an id sample appended to non-sample events.
+ * (PERF_SAMPLE_ID or PERF_SAMPLE_IDENTIFIER) in a non-sample event i.e. if
+ * sample_id_all is used there is an id sample appended to non-sample events.
*/
static int __perf_evsel__calc_is_pos(u64 sample_type)
{
int idx = 1;
+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ return 1;
+
if (!(sample_type & PERF_SAMPLE_ID))
return -1;
@@ -135,9 +142,13 @@ void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
}
}
-void perf_evsel__set_sample_id(struct perf_evsel *evsel)
+void perf_evsel__set_sample_id(struct perf_evsel *evsel,
+ bool can_sample_identifier)
{
- perf_evsel__set_sample_bit(evsel, ID);
+ if (can_sample_identifier)
+ perf_evsel__set_sample_bit(evsel, IDENTIFIER);
+ else
+ perf_evsel__set_sample_bit(evsel, ID);
evsel->attr.read_format |= PERF_FORMAT_ID;
}
@@ -1071,6 +1082,11 @@ static int perf_evsel__parse_id_sample(const struct perf_evsel *evsel,
array += ((event->header.size -
sizeof(event->header)) / sizeof(u64)) - 1;
+ if (type & PERF_SAMPLE_IDENTIFIER) {
+ sample->id = *array;
+ array--;
+ }
+
if (type & PERF_SAMPLE_CPU) {
u.val64 = *array;
if (swapped) {
@@ -1175,6 +1191,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
if (evsel->sample_size + sizeof(event->header) > event->header.size)
return -EFAULT;
+ data->id = -1ULL;
+ if (type & PERF_SAMPLE_IDENTIFIER) {
+ data->id = *array;
+ array++;
+ }
+
if (type & PERF_SAMPLE_IP) {
data->ip = *array;
array++;
@@ -1205,7 +1227,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
array++;
}
- data->id = -1ULL;
if (type & PERF_SAMPLE_ID) {
data->id = *array;
array++;
@@ -1348,6 +1369,11 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
array = event->sample.array;
+ if (type & PERF_SAMPLE_IDENTIFIER) {
+ *array = sample->id;
+ array++;
+ }
+
if (type & PERF_SAMPLE_IP) {
*array = sample->ip;
array++;
@@ -1536,6 +1562,7 @@ static int sample_type__fprintf(FILE *fp, bool *first, u64 value)
bit_name(READ), bit_name(CALLCHAIN), bit_name(ID), bit_name(CPU),
bit_name(PERIOD), bit_name(STREAM_ID), bit_name(RAW),
bit_name(BRANCH_STACK), bit_name(REGS_USER), bit_name(STACK_USER),
+ bit_name(IDENTIFIER),
{ .name = NULL, }
};
#undef bit_name
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index c6d616c..bca8e5f 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -148,7 +148,8 @@ void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
#define perf_evsel__reset_sample_bit(evsel, bit) \
__perf_evsel__reset_sample_bit(evsel, PERF_SAMPLE_##bit)
-void perf_evsel__set_sample_id(struct perf_evsel *evsel);
+void perf_evsel__set_sample_id(struct perf_evsel *evsel,
+ bool use_sample_identifier);
int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
const char *filter);
--
1.7.11.7
struct ip_event assumes fixeed positions for ip, pid
and tid. That is no longer true with the addition of
PERF_SAMPLE_IDENTIFIER. The information is anyway in
struct sample, so use that instead.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-inject.c | 4 ++--
tools/perf/builtin-kmem.c | 4 ++--
tools/perf/builtin-mem.c | 2 +-
tools/perf/builtin-script.c | 4 ++--
tools/perf/builtin-top.c | 10 +++++-----
tools/perf/tests/hists_link.c | 4 ++++
tools/perf/util/build-id.c | 8 ++++----
tools/perf/util/event.c | 6 +++---
tools/perf/util/evsel.c | 4 ++--
tools/perf/util/session.c | 7 ++++---
10 files changed, 29 insertions(+), 24 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index c467eac..da0cc13 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -204,7 +204,7 @@ static int perf_event__inject_buildid(struct perf_tool *tool,
cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- thread = machine__findnew_thread(machine, event->ip.pid, event->ip.pid);
+ thread = machine__findnew_thread(machine, sample->pid, sample->pid);
if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
event->header.type);
@@ -212,7 +212,7 @@ static int perf_event__inject_buildid(struct perf_tool *tool,
}
thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
- event->ip.ip, &al);
+ sample->ip, &al);
if (al.map != NULL) {
if (!al.map->dso->hit) {
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index c324778..c2dff9c 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -305,8 +305,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
- event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->pid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index a8ff6d2..4274680 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -96,7 +96,7 @@ dump_raw_samples(struct perf_tool *tool,
symbol_conf.field_sep,
sample->tid,
symbol_conf.field_sep,
- event->ip.ip,
+ sample->ip,
symbol_conf.field_sep,
sample->addr,
symbol_conf.field_sep,
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 338b5a4..51ba51c 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -480,8 +480,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct machine *machine)
{
struct addr_location al;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
- event->ip.tid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->tid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index bbf4635..88daba7 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -690,7 +690,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
{
struct perf_top *top = container_of(tool, struct perf_top, tool);
struct symbol *parent = NULL;
- u64 ip = event->ip.ip;
+ u64 ip = sample->ip;
struct addr_location al;
int err;
@@ -700,10 +700,10 @@ static void perf_event__process_sample(struct perf_tool *tool,
if (!seen)
seen = intlist__new(NULL);
- if (!intlist__has_entry(seen, event->ip.pid)) {
+ if (!intlist__has_entry(seen, sample->pid)) {
pr_err("Can't find guest [%d]'s kernel information\n",
- event->ip.pid);
- intlist__add(seen, event->ip.pid);
+ sample->pid);
+ intlist__add(seen, sample->pid);
}
return;
}
@@ -838,7 +838,7 @@ static void perf_top__mmap_read_idx(struct perf_top *top, int idx)
break;
case PERF_RECORD_MISC_GUEST_KERNEL:
++top->guest_kernel_samples;
- machine = perf_session__find_machine(session, event->ip.pid);
+ machine = perf_session__find_machine(session, sample.pid);
break;
case PERF_RECORD_MISC_GUEST_USER:
++top->guest_us_samples;
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 5a178d5..7e0ca15 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -220,6 +220,8 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
},
};
+ sample.pid = ip_event.ip.pid;
+ sample.ip = ip_event.ip.ip;
if (perf_event__preprocess_sample(&event, machine, &al,
&sample, 0) < 0)
goto out;
@@ -244,6 +246,8 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
},
};
+ sample.pid = ip_event.ip.pid;
+ sample.ip = ip_event.ip.ip;
if (perf_event__preprocess_sample(&event, machine, &al,
&sample, 0) < 0)
goto out;
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 0f9d27a..fb58409 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -18,14 +18,14 @@
int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
union perf_event *event,
- struct perf_sample *sample __maybe_unused,
+ struct perf_sample *sample,
struct perf_evsel *evsel __maybe_unused,
struct machine *machine)
{
struct addr_location al;
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
- event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->pid);
if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
@@ -34,7 +34,7 @@ int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
}
thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
- event->ip.ip, &al);
+ sample->ip, &al);
if (al.map != NULL)
al.map->dso->hit = 1;
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index ec494a3..6861d19 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -677,8 +677,8 @@ int perf_event__preprocess_sample(const union perf_event *event,
symbol_filter_t filter)
{
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
- event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->pid);
if (thread == NULL)
return -1;
@@ -700,7 +700,7 @@ int perf_event__preprocess_sample(const union perf_event *event,
machine__create_kernel_maps(machine);
thread__find_addr_map(thread, machine, cpumode, MAP__FUNCTION,
- event->ip.ip, al);
+ sample->ip, al);
dump_printf(" ...... dso: %s\n",
al->map ? al->map->dso->long_name :
al->level == 'H' ? "[hypervisor]" : "<not found>");
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index febff21..dfdd6cb 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1176,7 +1176,7 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
return -EFAULT;
if (type & PERF_SAMPLE_IP) {
- data->ip = event->ip.ip;
+ data->ip = *array;
array++;
}
@@ -1349,7 +1349,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
array = event->sample.array;
if (type & PERF_SAMPLE_IP) {
- event->ip.ip = sample->ip;
+ *array = sample->ip;
array++;
}
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index a0e427b..5a8221d 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -811,7 +811,8 @@ static void dump_sample(struct perf_evsel *evsel, union perf_event *event,
static struct machine *
perf_session__find_machine_for_cpumode(struct perf_session *session,
- union perf_event *event)
+ union perf_event *event,
+ struct perf_sample *sample)
{
const u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
@@ -823,7 +824,7 @@ static struct machine *
if (event->header.type == PERF_RECORD_MMAP)
pid = event->mmap.pid;
else
- pid = event->ip.pid;
+ pid = sample->pid;
return perf_session__findnew_machine(session, pid);
}
@@ -860,7 +861,7 @@ static int perf_session_deliver_event(struct perf_session *session,
hists__inc_nr_events(&evsel->hists, event->header.type);
}
- machine = perf_session__find_machine_for_cpumode(session, event);
+ machine = perf_session__find_machine_for_cpumode(session, event, sample);
switch (event->header.type) {
case PERF_RECORD_SAMPLE:
--
1.7.11.7
Expand perf_event__synthesize_sample() to handle all
sample format bits.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-inject.c | 3 +-
tools/perf/util/event.h | 1 +
tools/perf/util/evsel.c | 70 ++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 72 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index da0cc13..459bf20 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -307,7 +307,8 @@ found:
sample_sw.period = sample->period;
sample_sw.time = sample->time;
perf_event__synthesize_sample(event_sw, evsel->attr.sample_type,
- &sample_sw, false);
+ evsel->attr.sample_regs_user, &sample_sw,
+ false);
build_id__mark_dso_hit(tool, event_sw, &sample_sw, evsel, machine);
return perf_event__repipe(tool, event_sw, &sample_sw, machine);
}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index ce2a92c..e00f922 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -228,6 +228,7 @@ int perf_event__preprocess_sample(const union perf_event *self,
const char *perf_event__name(unsigned int id);
int perf_event__synthesize_sample(union perf_event *event, u64 type,
+ u64 sample_regs_user,
const struct perf_sample *sample,
bool swapped);
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4940f65..ed5d767 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1356,11 +1356,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
}
int perf_event__synthesize_sample(union perf_event *event, u64 type,
+ u64 sample_regs_user,
const struct perf_sample *sample,
bool swapped)
{
u64 *array;
-
+ size_t sz;
/*
* used for cross-endian analysis. See git commit 65014ab3
* for why this goofiness is needed.
@@ -1433,6 +1434,73 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
array++;
}
+ if (type & PERF_SAMPLE_READ) {
+ fprintf(stderr, "PERF_SAMPLE_READ is unsupported for now\n");
+ return -1;
+ }
+
+ if (type & PERF_SAMPLE_CALLCHAIN) {
+ sz = (sample->callchain->nr + 1) * sizeof(u64);
+ memcpy(array, sample->callchain, sz);
+ array = (void*)array + sz;
+ }
+
+ if (type & PERF_SAMPLE_RAW) {
+ u.val32[0] = sample->raw_size;
+ if (WARN_ONCE(swapped,
+ "Endianness of raw data not corrected!\n")) {
+ /*
+ * Inverse of what is done in perf_evsel__parse_sample
+ */
+ u.val32[0] = bswap_32(u.val32[0]);
+ u.val32[1] = bswap_32(u.val32[1]);
+ u.val64 = bswap_64(u.val64);
+ }
+ *array = u.val64;
+ array = (void *)array + sizeof(u32);
+
+ memcpy(array, sample->raw_data, sample->raw_size);
+ array = (void *)array + sample->raw_size;
+ }
+
+ if (type & PERF_SAMPLE_BRANCH_STACK) {
+ sz = sample->branch_stack->nr * sizeof(struct branch_entry);
+ sz += sizeof(u64);
+ memcpy(array, sample->branch_stack, sz);
+ array = (void *)array + sz;
+ }
+
+ if (type & PERF_SAMPLE_REGS_USER) {
+ if (sample->user_regs.regs && sample_regs_user) {
+ *array++ = sample_regs_user;
+ sz = hweight_long(sample_regs_user) * sizeof(u64);
+ memcpy(array, sample->user_regs.regs, sz);
+ array = (void *)array + sz;
+ } else {
+ *array++ = 0;
+ }
+ }
+
+ if (type & PERF_SAMPLE_STACK_USER) {
+ sz = sample->user_stack.size;
+ *array++ = sz;
+ if (sz) {
+ memcpy(array, sample->user_stack.data, sz);
+ array = (void *)array + sz;
+ *array++ = sz;
+ }
+ }
+
+ if (type & PERF_SAMPLE_WEIGHT) {
+ *array = sample->weight;
+ array++;
+ }
+
+ if (type & PERF_SAMPLE_DATA_SRC) {
+ *array = sample->data_src;
+ array++;
+ }
+
return 0;
}
--
1.7.11.7
Add a test that checks that sample parsing is correctly
implemented.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/Makefile | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/sample-parsing.c | 231 ++++++++++++++++++++++++++++++++++++++
tools/perf/tests/tests.h | 1 +
4 files changed, 237 insertions(+)
create mode 100644 tools/perf/tests/sample-parsing.c
diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 5b7c6db..fe20130 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -389,6 +389,7 @@ LIB_OBJS += $(OUTPUT)tests/bp_signal.o
LIB_OBJS += $(OUTPUT)tests/bp_signal_overflow.o
LIB_OBJS += $(OUTPUT)tests/task-exit.o
LIB_OBJS += $(OUTPUT)tests/sw-clock.o
+LIB_OBJS += $(OUTPUT)tests/sample-parsing.o
BUILTIN_OBJS += $(OUTPUT)builtin-annotate.o
BUILTIN_OBJS += $(OUTPUT)builtin-bench.o
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 35b45f1466..5ee3933 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -94,6 +94,10 @@ static struct test {
.func = test__sw_clock_freq,
},
{
+ .desc = "Test sample parsing",
+ .func = test__sample_parsing,
+ },
+ {
.func = NULL,
},
};
diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
new file mode 100644
index 0000000..b9859f6
--- /dev/null
+++ b/tools/perf/tests/sample-parsing.c
@@ -0,0 +1,231 @@
+#include <stdbool.h>
+#include <inttypes.h>
+
+#include "event.h"
+#include "evsel.h"
+
+#include "tests.h"
+
+#define COMP(m) \
+ while (s1->m != s2->m) { \
+ pr_debug("Samples differ at '"#m"'\n"); \
+ return false; \
+ }
+
+#define MCOMP(m) \
+ while (memcmp(&s1->m, &s2->m, sizeof(s1->m))) { \
+ pr_debug("Samples differ at '"#m"'\n"); \
+ return false; \
+ }
+
+static bool samples_same(const struct perf_sample *s1,
+ const struct perf_sample *s2, u64 type, u64 regs_user)
+{
+ size_t i;
+
+ if (type & PERF_SAMPLE_IDENTIFIER)
+ COMP(id);
+
+ if (type & PERF_SAMPLE_IP)
+ COMP(ip);
+
+ if (type & PERF_SAMPLE_TID) {
+ COMP(pid);
+ COMP(tid);
+ }
+
+ if (type & PERF_SAMPLE_TIME)
+ COMP(time);
+
+ if (type & PERF_SAMPLE_ADDR)
+ COMP(addr);
+
+ if (type & PERF_SAMPLE_ID)
+ COMP(id);
+
+ if (type & PERF_SAMPLE_STREAM_ID)
+ COMP(stream_id);
+
+ if (type & PERF_SAMPLE_CPU)
+ COMP(cpu);
+
+ if (type & PERF_SAMPLE_PERIOD)
+ COMP(period);
+
+ if (type & PERF_SAMPLE_CALLCHAIN) {
+ COMP(callchain->nr);
+ for (i = 0; i < s1->callchain->nr; i++)
+ COMP(callchain->ips[i]);
+ }
+
+ if (type & PERF_SAMPLE_RAW) {
+ COMP(raw_size);
+ if (memcmp(s1->raw_data, s2->raw_data, s1->raw_size)) {
+ pr_debug("Samples differ at 'raw_data'\n");
+ return false;
+ }
+ }
+
+ if (type & PERF_SAMPLE_BRANCH_STACK) {
+ COMP(branch_stack->nr);
+ for (i = 0; i < s1->branch_stack->nr; i++)
+ MCOMP(branch_stack->entries[i]);
+ }
+
+ if (type & PERF_SAMPLE_REGS_USER) {
+ size_t sz = hweight_long(regs_user) * sizeof(u64);
+
+ if ((sz && (!s1->user_regs.regs || !s2->user_regs.regs)) ||
+ memcmp(s1->user_regs.regs, s2->user_regs.regs, sz)) {
+ pr_debug("Samples differ at 'user_regs'\n");
+ return false;
+ }
+ }
+
+ if (type & PERF_SAMPLE_STACK_USER) {
+ COMP(user_stack.size);
+ if (memcmp(s1->user_stack.data, s1->user_stack.data, s1->user_stack.size)) {
+ pr_debug("Samples differ at 'user_stack'\n");
+ return false;
+ }
+ }
+
+ if (type & PERF_SAMPLE_WEIGHT)
+ COMP(weight);
+
+ if (type & PERF_SAMPLE_DATA_SRC)
+ COMP(data_src);
+
+ return true;
+}
+
+static int do_test(u64 sample_type, u64 sample_regs_user)
+{
+ struct perf_evsel evsel = {
+ .needs_swap = false,
+ .attr = {
+ .sample_type = sample_type,
+ .sample_regs_user = sample_regs_user,
+ },
+ };
+ union perf_event event = {
+ .header = {
+ .type = PERF_RECORD_SAMPLE,
+ .size = sizeof(union perf_event),
+ },
+ };
+ union {
+ struct ip_callchain callchain;
+ u64 data[64];
+ } callchain = {
+ /* 3 ips */
+ .data = {3, 201, 202, 203},
+ };
+ union {
+ struct branch_stack branch_stack;
+ u64 data[64];
+ } branch_stack = {
+ /* 1 branch_entry */
+ .data = {1, 211, 212, 213},
+ };
+ u64 user_regs[64];
+ const u64 raw_data[] = {0x123456780a0b0c0dULL, 0x1102030405060708ULL};
+ const u64 data[] = {0x2211443366558877ULL, 0, 0xaabbccddeeff4321ULL};
+ const struct perf_sample sample = {
+ .ip = 101,
+ .pid = 102,
+ .tid = 103,
+ .time = 104,
+ .addr = 105,
+ .id = 106,
+ .stream_id = 107,
+ .period = 108,
+ .weight = 109,
+ .cpu = 110,
+ .raw_size = sizeof(raw_data),
+ .data_src = 111,
+ .raw_data = (void *)raw_data,
+ .callchain = &callchain.callchain,
+ .branch_stack = &branch_stack.branch_stack,
+ .user_regs = {
+ .regs = user_regs,
+ },
+ .user_stack = {
+ .size = sizeof(data),
+ .data = (void *)data,
+ },
+ };
+ struct perf_sample sample_out;
+ size_t i;
+ int err;
+
+ for (i = 0; i < sizeof(user_regs); i++)
+ *(u8 *)user_regs = i;
+
+ err = perf_event__synthesize_sample(&event, sample_type, sample_regs_user, &sample, false);
+ if (err) {
+ pr_debug("perf_event__synthesize_sample failed for sample_type %#"PRIx64", error %d\n", sample_type, err);
+ return -1;
+ }
+
+ evsel.sample_size = __perf_evsel__sample_size(sample_type);
+
+ err = perf_evsel__parse_sample(&evsel, &event, &sample_out);
+ if (err) {
+ pr_debug("perf_evsel__parse_sample failed for sample_type %#"PRIx64", error %d\n", sample_type, err);
+ return -1;
+ }
+
+ if (!samples_same(&sample, &sample_out, sample_type, sample_regs_user)) {
+ pr_debug("parsing failed for sample_type %#"PRIx64"\n", sample_type);
+ return -1;
+ }
+
+ return 0;
+}
+
+/**
+ * test__sample_parsing - test sample parsing.
+ *
+ * This function implements a test that synthesizes a sample event, parses it
+ * and then checks that the parsed sample matches the original sample. The test
+ * checks sample format bits separately and together. If the test passes %0 is
+ * returned, otherwise %-1 is returned.
+ */
+int test__sample_parsing(void)
+{
+ u64 sample_type;
+ u64 sample_regs_user;
+ int err;
+
+ /*
+ * Fail the test if it has not been updated when new sample format bits
+ * were added.
+ */
+ if (PERF_SAMPLE_MAX > PERF_SAMPLE_IDENTIFIER << 1) {
+ pr_debug("sample format has changed - this test needs updating\n");
+ return -1;
+ }
+
+ /* Test each sample format bit separately */
+ for (sample_type = 1; sample_type != PERF_SAMPLE_MAX; sample_type <<= 1) {
+ if (sample_type == PERF_SAMPLE_READ)
+ continue;
+ if (sample_type == PERF_SAMPLE_REGS_USER)
+ sample_regs_user = 0x3fff;
+ else
+ sample_regs_user = 0;
+ err = do_test(sample_type, sample_regs_user);
+ if (err)
+ return err;
+ }
+
+ /* Test all sample format bits together */
+ sample_type = (PERF_SAMPLE_MAX - 1) & (~PERF_SAMPLE_READ);
+ sample_regs_user = 0x3fff;
+ err = do_test(sample_type, sample_regs_user);
+ if (err)
+ return err;
+
+ return 0;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 07a92f9..90e3056 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -35,5 +35,6 @@ int test__bp_signal(void);
int test__bp_signal_overflow(void);
int test__task_exit(void);
int test__sw_clock_freq(void);
+int test__sample_parsing(void);
#endif /* TESTS_H */
--
1.7.11.7
The event stream is not always parsable because the format of a sample
is dependent on the sample_type of the selected event. When there
is more than one selected event and the sample_types are not the
same then parsing becomes problematic. A sample can be matched to its
selected event using the ID that is allocated when the event is opened.
Unfortunately, to get the ID from the sample means first parsing it.
This patch adds a new sample format bit PERF_SAMPLE_IDENTIFER that puts
the ID at a fixed position so that the ID can be retrieved without
parsing the sample. For sample events, that is the first position
immediately after the header. For non-sample events, that is the last
position.
Signed-off-by: Adrian Hunter <[email protected]>
---
include/uapi/linux/perf_event.h | 3 ++-
kernel/events/core.c | 11 ++++++++++-
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 0b1df41..6bb217e 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -134,8 +134,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_STACK_USER = 1U << 13,
PERF_SAMPLE_WEIGHT = 1U << 14,
PERF_SAMPLE_DATA_SRC = 1U << 15,
+ PERF_SAMPLE_IDENTIFIER = 1U << 16,
- PERF_SAMPLE_MAX = 1U << 16, /* non-ABI */
+ PERF_SAMPLE_MAX = 1U << 17, /* non-ABI */
};
/*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1833bc5..9908d3d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1203,6 +1203,9 @@ static void perf_event__id_header_size(struct perf_event *event)
if (sample_type & PERF_SAMPLE_TIME)
size += sizeof(data->time);
+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ size += sizeof(data->id);
+
if (sample_type & PERF_SAMPLE_ID)
size += sizeof(data->id);
@@ -4229,7 +4232,7 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
if (sample_type & PERF_SAMPLE_TIME)
data->time = perf_clock();
- if (sample_type & PERF_SAMPLE_ID)
+ if (sample_type & (PERF_SAMPLE_ID | PERF_SAMPLE_IDENTIFIER))
data->id = primary_event_id(event);
if (sample_type & PERF_SAMPLE_STREAM_ID)
@@ -4268,6 +4271,9 @@ static void __perf_event__output_id_sample(struct perf_output_handle *handle,
if (sample_type & PERF_SAMPLE_CPU)
perf_output_put(handle, data->cpu_entry);
+
+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ perf_output_put(handle, data->id);
}
void perf_event__output_id_sample(struct perf_event *event,
@@ -4380,6 +4386,9 @@ void perf_output_sample(struct perf_output_handle *handle,
perf_output_put(handle, *header);
+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ perf_output_put(handle, data->id);
+
if (sample_type & PERF_SAMPLE_IP)
perf_output_put(handle, data->ip);
--
1.7.11.7
struct ip_event assumes fixed positions for ip, pid
and tid. That is no longer true with the addition of
PERF_SAMPLE_IDENTIFIER.
struct ip_event is no longer used except by hists_link.c.
Move it there.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/tests/hists_link.c | 23 +++++++++++++++++++----
tools/perf/util/event.h | 11 -----------
2 files changed, 19 insertions(+), 15 deletions(-)
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 7e0ca15..cfdbfd9 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -195,6 +195,19 @@ static struct sample fake_samples[][5] = {
},
};
+/* PERF_SAMPLE_IP | PERF_SAMPLE_TID | * but not PERF_SAMPLE_IDENTIFIER */
+struct ip_event {
+ struct perf_event_header header;
+ u64 ip;
+ u32 pid, tid;
+ unsigned char __more_data[];
+};
+
+union perf_ip_event {
+ struct ip_event ip;
+ union perf_event event;
+};
+
static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
{
struct perf_evsel *evsel;
@@ -210,7 +223,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
*/
list_for_each_entry(evsel, &evlist->entries, node) {
for (k = 0; k < ARRAY_SIZE(fake_common_samples); k++) {
- const union perf_event event = {
+ const union perf_ip_event ip_event = {
.ip = {
.header = {
.misc = PERF_RECORD_MISC_USER,
@@ -219,10 +232,11 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
.ip = fake_common_samples[k].ip,
},
};
+ const union perf_event *event = &ip_event.event;
sample.pid = ip_event.ip.pid;
sample.ip = ip_event.ip.ip;
- if (perf_event__preprocess_sample(&event, machine, &al,
+ if (perf_event__preprocess_sample(event, machine, &al,
&sample, 0) < 0)
goto out;
@@ -236,7 +250,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
}
for (k = 0; k < ARRAY_SIZE(fake_samples[i]); k++) {
- const union perf_event event = {
+ const union perf_ip_event ip_event = {
.ip = {
.header = {
.misc = PERF_RECORD_MISC_USER,
@@ -245,10 +259,11 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
.ip = fake_samples[i][k].ip,
},
};
+ const union perf_event *event = &ip_event.event;
sample.pid = ip_event.ip.pid;
sample.ip = ip_event.ip.ip;
- if (perf_event__preprocess_sample(&event, machine, &al,
+ if (perf_event__preprocess_sample(event, machine, &al,
&sample, 0) < 0)
goto out;
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 3aef78c..a7b2245 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -8,16 +8,6 @@
#include "map.h"
#include "build-id.h"
-/*
- * PERF_SAMPLE_IP | PERF_SAMPLE_TID | *
- */
-struct ip_event {
- struct perf_event_header header;
- u64 ip;
- u32 pid, tid;
- unsigned char __more_data[];
-};
-
struct mmap_event {
struct perf_event_header header;
u32 pid, tid;
@@ -162,7 +152,6 @@ struct tracing_data_event {
union perf_event {
struct perf_event_header header;
- struct ip_event ip;
struct mmap_event mmap;
struct comm_event comm;
struct fork_event fork;
--
1.7.11.7
Add a new parameter for 'pid' to machine__findnew_thread().
Change callers to pass 'pid' when it is known.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-inject.c | 2 +-
tools/perf/builtin-kmem.c | 3 ++-
tools/perf/builtin-kvm.c | 3 ++-
tools/perf/builtin-lock.c | 3 ++-
tools/perf/builtin-sched.c | 17 +++++++++--------
tools/perf/builtin-script.c | 3 ++-
tools/perf/builtin-trace.c | 12 +++++++++---
tools/perf/tests/hists_link.c | 3 ++-
tools/perf/util/build-id.c | 7 +++++--
tools/perf/util/event.c | 3 ++-
tools/perf/util/machine.c | 23 ++++++++++++++++-------
tools/perf/util/machine.h | 3 ++-
tools/perf/util/session.c | 2 +-
13 files changed, 55 insertions(+), 29 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index ad1296c..c467eac 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -204,7 +204,7 @@ static int perf_event__inject_buildid(struct perf_tool *tool,
cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- thread = machine__findnew_thread(machine, event->ip.pid);
+ thread = machine__findnew_thread(machine, event->ip.pid, event->ip.pid);
if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
event->header.type);
diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index b49f5c5..c324778 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -305,7 +305,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
+ event->ip.pid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 24b78ae..ecbbec8 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -691,7 +691,8 @@ static int process_sample_event(struct perf_tool *tool,
struct perf_evsel *evsel,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, sample->tid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->tid);
struct perf_kvm_stat *kvm = container_of(tool, struct perf_kvm_stat,
tool);
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 76543a4..ee33ba2 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -805,7 +805,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct perf_evsel *evsel,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, sample->tid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->tid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 948183a..49593a0 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -934,8 +934,8 @@ static int latency_switch_event(struct perf_sched *sched,
return -1;
}
- sched_out = machine__findnew_thread(machine, prev_pid);
- sched_in = machine__findnew_thread(machine, next_pid);
+ sched_out = machine__findnew_thread(machine, 0, prev_pid);
+ sched_in = machine__findnew_thread(machine, 0, next_pid);
out_events = thread_atoms_search(&sched->atom_root, sched_out, &sched->cmp_pid);
if (!out_events) {
@@ -978,7 +978,7 @@ static int latency_runtime_event(struct perf_sched *sched,
{
const u32 pid = perf_evsel__intval(evsel, sample, "pid");
const u64 runtime = perf_evsel__intval(evsel, sample, "runtime");
- struct thread *thread = machine__findnew_thread(machine, pid);
+ struct thread *thread = machine__findnew_thread(machine, 0, pid);
struct work_atoms *atoms = thread_atoms_search(&sched->atom_root, thread, &sched->cmp_pid);
u64 timestamp = sample->time;
int cpu = sample->cpu;
@@ -1016,7 +1016,7 @@ static int latency_wakeup_event(struct perf_sched *sched,
if (!success)
return 0;
- wakee = machine__findnew_thread(machine, pid);
+ wakee = machine__findnew_thread(machine, 0, pid);
atoms = thread_atoms_search(&sched->atom_root, wakee, &sched->cmp_pid);
if (!atoms) {
if (thread_atoms_insert(sched, wakee))
@@ -1070,7 +1070,7 @@ static int latency_migrate_task_event(struct perf_sched *sched,
if (sched->profile_cpu == -1)
return 0;
- migrant = machine__findnew_thread(machine, pid);
+ migrant = machine__findnew_thread(machine, 0, pid);
atoms = thread_atoms_search(&sched->atom_root, migrant, &sched->cmp_pid);
if (!atoms) {
if (thread_atoms_insert(sched, migrant))
@@ -1289,8 +1289,8 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel,
return -1;
}
- sched_out = machine__findnew_thread(machine, prev_pid);
- sched_in = machine__findnew_thread(machine, next_pid);
+ sched_out = machine__findnew_thread(machine, 0, prev_pid);
+ sched_in = machine__findnew_thread(machine, 0, next_pid);
sched->curr_thread[this_cpu] = sched_in;
@@ -1425,7 +1425,8 @@ static int perf_sched__process_tracepoint_sample(struct perf_tool *tool __maybe_
struct perf_evsel *evsel,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, sample->tid);
+ struct thread *thread = machine__findnew_thread(machine, sample->pid,
+ sample->tid);
int err = 0;
if (thread == NULL) {
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 3de8979..338b5a4 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -480,7 +480,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct machine *machine)
{
struct addr_location al;
- struct thread *thread = machine__findnew_thread(machine, event->ip.tid);
+ struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
+ event->ip.tid);
if (thread == NULL) {
pr_debug("problem processing %d event, skipping it.\n",
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 0e4b67f..f8bc748 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -301,7 +301,9 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel,
char *msg;
void *args;
size_t printed = 0;
- struct thread *thread = machine__findnew_thread(&trace->host, sample->tid);
+ struct thread *thread = machine__findnew_thread(&trace->host,
+ sample->pid,
+ sample->tid);
struct syscall *sc = trace__syscall_info(trace, evsel, sample);
struct thread_trace *ttrace = thread__trace(thread);
@@ -344,7 +346,9 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel,
{
int ret;
u64 duration = 0;
- struct thread *thread = machine__findnew_thread(&trace->host, sample->tid);
+ struct thread *thread = machine__findnew_thread(&trace->host,
+ sample->pid,
+ sample->tid);
struct thread_trace *ttrace = thread__trace(thread);
struct syscall *sc = trace__syscall_info(trace, evsel, sample);
@@ -397,7 +401,9 @@ static int trace__sched_stat_runtime(struct trace *trace, struct perf_evsel *evs
{
u64 runtime = perf_evsel__intval(evsel, sample, "runtime");
double runtime_ms = (double)runtime / NSEC_PER_MSEC;
- struct thread *thread = machine__findnew_thread(&trace->host, sample->tid);
+ struct thread *thread = machine__findnew_thread(&trace->host,
+ sample->pid,
+ sample->tid);
struct thread_trace *ttrace = thread__trace(thread);
if (ttrace == NULL)
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 89085a9..5a178d5 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -88,7 +88,8 @@ static struct machine *setup_fake_machine(struct machines *machines)
for (i = 0; i < ARRAY_SIZE(fake_threads); i++) {
struct thread *thread;
- thread = machine__findnew_thread(machine, fake_threads[i].pid);
+ thread = machine__findnew_thread(machine, fake_threads[i].pid,
+ fake_threads[i].pid);
if (thread == NULL)
goto out;
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 5295625..0f9d27a 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -24,7 +24,8 @@ int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
{
struct addr_location al;
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
+ event->ip.pid);
if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
@@ -47,7 +48,9 @@ static int perf_event__exit_del_thread(struct perf_tool *tool __maybe_unused,
__maybe_unused,
struct machine *machine)
{
- struct thread *thread = machine__findnew_thread(machine, event->fork.tid);
+ struct thread *thread = machine__findnew_thread(machine,
+ event->fork.pid,
+ event->fork.tid);
dump_printf("(%d:%d):(%d:%d)\n", event->fork.pid, event->fork.tid,
event->fork.ppid, event->fork.ptid);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 9541270..ec494a3 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -677,7 +677,8 @@ int perf_event__preprocess_sample(const union perf_event *event,
symbol_filter_t filter)
{
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
- struct thread *thread = machine__findnew_thread(machine, event->ip.pid);
+ struct thread *thread = machine__findnew_thread(machine, event->ip.pid,
+ event->ip.pid);
if (thread == NULL)
return -1;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 67ba572..16b84ac 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -30,7 +30,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
return -ENOMEM;
if (pid != HOST_KERNEL_ID) {
- struct thread *thread = machine__findnew_thread(machine, pid);
+ struct thread *thread = machine__findnew_thread(machine, 0,
+ pid);
char comm[64];
if (thread == NULL)
@@ -282,9 +283,10 @@ static struct thread *__machine__findnew_thread(struct machine *machine,
return th;
}
-struct thread *machine__findnew_thread(struct machine *machine, pid_t tid)
+struct thread *machine__findnew_thread(struct machine *machine, pid_t pid,
+ pid_t tid)
{
- return __machine__findnew_thread(machine, 0, tid, true);
+ return __machine__findnew_thread(machine, pid, tid, true);
}
struct thread *machine__find_thread(struct machine *machine, pid_t tid)
@@ -294,7 +296,9 @@ struct thread *machine__find_thread(struct machine *machine, pid_t tid)
int machine__process_comm_event(struct machine *machine, union perf_event *event)
{
- struct thread *thread = machine__findnew_thread(machine, event->comm.tid);
+ struct thread *thread = machine__findnew_thread(machine,
+ event->comm.pid,
+ event->comm.tid);
if (dump_trace)
perf_event__fprintf_comm(event, stdout);
@@ -975,7 +979,8 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
return 0;
}
- thread = machine__findnew_thread(machine, event->mmap.pid);
+ thread = machine__findnew_thread(machine, event->mmap.pid,
+ event->mmap.pid);
if (thread == NULL)
goto out_problem;
@@ -1002,8 +1007,12 @@ out_problem:
int machine__process_fork_event(struct machine *machine, union perf_event *event)
{
- struct thread *thread = machine__findnew_thread(machine, event->fork.tid);
- struct thread *parent = machine__findnew_thread(machine, event->fork.ptid);
+ struct thread *thread = machine__findnew_thread(machine,
+ event->fork.pid,
+ event->fork.tid);
+ struct thread *parent = machine__findnew_thread(machine,
+ event->fork.ppid,
+ event->fork.ptid);
if (dump_trace)
perf_event__fprintf_task(event, stdout);
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 5bb6244..604be6b 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -101,7 +101,8 @@ static inline bool machine__is_host(struct machine *machine)
return machine ? machine->pid == HOST_KERNEL_ID : false;
}
-struct thread *machine__findnew_thread(struct machine *machine, pid_t tid);
+struct thread *machine__findnew_thread(struct machine *machine, pid_t pid,
+ pid_t tid);
size_t machine__fprintf(struct machine *machine, FILE *fp);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c0eb9aa..aaa7d36 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1002,7 +1002,7 @@ void perf_event_header__bswap(struct perf_event_header *self)
struct thread *perf_session__findnew(struct perf_session *session, pid_t pid)
{
- return machine__findnew_thread(&session->machines.host, pid);
+ return machine__findnew_thread(&session->machines.host, 0, pid);
}
static struct thread *perf_session__register_idle_thread(struct perf_session *self)
--
1.7.11.7
Now that the sample parsing correctly checks data sizes
there is no reason for it to be done again for callchains.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/callchain.c | 8 --------
tools/perf/util/callchain.h | 5 -----
tools/perf/util/session.c | 20 --------------------
3 files changed, 33 deletions(-)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 42b6a63..024162a 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -20,14 +20,6 @@
__thread struct callchain_cursor callchain_cursor;
-bool ip_callchain__valid(struct ip_callchain *chain,
- const union perf_event *event)
-{
- unsigned int chain_size = event->header.size;
- chain_size -= (unsigned long)&event->ip.__more_data - (unsigned long)event;
- return chain->nr * sizeof(u64) <= chain_size;
-}
-
#define chain_for_each_child(child, parent) \
list_for_each_entry(child, &parent->children, siblings)
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 3ee9f67..988c1aa 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -103,11 +103,6 @@ int callchain_append(struct callchain_root *root,
int callchain_merge(struct callchain_cursor *cursor,
struct callchain_root *dst, struct callchain_root *src);
-struct ip_callchain;
-union perf_event;
-
-bool ip_callchain__valid(struct ip_callchain *chain,
- const union perf_event *event);
/*
* Initialize a cursor before adding entries inside, but keep
* the previously allocated entries as a cache.
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index aaa7d36..a0e427b 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -898,22 +898,6 @@ static int perf_session_deliver_event(struct perf_session *session,
}
}
-static int perf_session__preprocess_sample(struct perf_session *session,
- union perf_event *event, struct perf_sample *sample)
-{
- if (event->header.type != PERF_RECORD_SAMPLE ||
- !sample->callchain)
- return 0;
-
- if (!ip_callchain__valid(sample->callchain, event)) {
- pr_debug("call-chain problem with event, skipping it.\n");
- ++session->stats.nr_invalid_chains;
- session->stats.total_invalid_chains += sample->period;
- return -EINVAL;
- }
- return 0;
-}
-
static int perf_session__process_user_event(struct perf_session *session, union perf_event *event,
struct perf_tool *tool, u64 file_offset)
{
@@ -978,10 +962,6 @@ static int perf_session__process_event(struct perf_session *session,
if (ret)
return ret;
- /* Preprocess sample records - precheck callchains */
- if (perf_session__preprocess_sample(session, event, &sample))
- return 0;
-
if (tool->ordered_samples) {
ret = perf_session_queue_event(session, event, &sample,
file_offset);
--
1.7.11.7
On 7/11/13 7:12 AM, Adrian Hunter wrote:
> The event stream is not always parsable because the format of a sample
> is dependent on the sample_type of the selected event. When there
> is more than one selected event and the sample_types are not the
> same then parsing becomes problematic. A sample can be matched to its
> selected event using the ID that is allocated when the event is opened.
> Unfortunately, to get the ID from the sample means first parsing it.
Here's an alternative suggestion -- one that does not involve changing
the kernel API or requiring a common denominator in sample_type options.
perf handles event streams through an mmap which can be directly tied to
an evsel (a single event) when the mmap is created. ie., when events are
read we know exactly which evsel they correspond to. (See
perf_evlist__mmap_per_cpu and perf_evlist__mmap_per_thread and add
struct perf_evsel *evsel entry to struct perf_mmap).
Commands like perf-record can inject a user event into the stream and
hence the data file every time the evsel changes while walking all of
the mmap's reading events -- very similar to the way finished round is
done. The event would only contain a perf_event_header which is 8 bytes
so this does not add a lot to a data file. As an optimization the evsel
event could only be injected if the sample_types differ.
Live commands would just use the evsel connected to the mmap -- no
lookups needed which would simplify things a bit processing the events.
In short, the information to associate event streams to an event (evsel)
is currently available -- it's just being discarded in the many layers.
I'll try to whip up some code that implements this in the next few days.
David
On 7/11/13 9:43 AM, David Ahern wrote:
> On 7/11/13 7:12 AM, Adrian Hunter wrote:
>> The event stream is not always parsable because the format of a sample
>> is dependent on the sample_type of the selected event. When there
>> is more than one selected event and the sample_types are not the
>> same then parsing becomes problematic. A sample can be matched to its
>> selected event using the ID that is allocated when the event is opened.
>> Unfortunately, to get the ID from the sample means first parsing it.
>
> Here's an alternative suggestion -- one that does not involve changing
> the kernel API or requiring a common denominator in sample_type options.
>
> perf handles event streams through an mmap which can be directly tied to
> an evsel (a single event) when the mmap is created. ie., when events are
> read we know exactly which evsel they correspond to. (See
> perf_evlist__mmap_per_cpu and perf_evlist__mmap_per_thread and add
> struct perf_evsel *evsel entry to struct perf_mmap).
Read that code a bit too quickly. All events for a thread are dropped
into the same buffer so as currently organized there is not an easy
correlation.
David
>
> Commands like perf-record can inject a user event into the stream and
> hence the data file every time the evsel changes while walking all of
> the mmap's reading events -- very similar to the way finished round is
> done. The event would only contain a perf_event_header which is 8 bytes
> so this does not add a lot to a data file. As an optimization the evsel
> event could only be injected if the sample_types differ.
>
> Live commands would just use the evsel connected to the mmap -- no
> lookups needed which would simplify things a bit processing the events.
>
> In short, the information to associate event streams to an event (evsel)
> is currently available -- it's just being discarded in the many layers.
>
> I'll try to whip up some code that implements this in the next few days.
>
> David
>
On 11/07/13 18:43, David Ahern wrote:
> On 7/11/13 7:12 AM, Adrian Hunter wrote:
>> The event stream is not always parsable because the format of a sample
>> is dependent on the sample_type of the selected event. When there
>> is more than one selected event and the sample_types are not the
>> same then parsing becomes problematic. A sample can be matched to its
>> selected event using the ID that is allocated when the event is opened.
>> Unfortunately, to get the ID from the sample means first parsing it.
>
> Here's an alternative suggestion -- one that does not involve changing the
> kernel API or requiring a common denominator in sample_type options.
The kernel API is designed to be extensible. Extending it in a way that is
perhaps unexpected but nevertheless backward compatible, is an appropriate
solution.
>
> perf handles event streams through an mmap which can be directly tied to an
> evsel (a single event) when the mmap is created. ie., when events are read
> we know exactly which evsel they correspond to. (See
> perf_evlist__mmap_per_cpu and perf_evlist__mmap_per_thread and add struct
> perf_evsel *evsel entry to struct perf_mmap).
>
> Commands like perf-record can inject a user event into the stream and hence
> the data file every time the evsel changes while walking all of the mmap's
> reading events -- very similar to the way finished round is done. The event
> would only contain a perf_event_header which is 8 bytes so this does not add
> a lot to a data file. As an optimization the evsel event could only be
> injected if the sample_types differ.
>
> Live commands would just use the evsel connected to the mmap -- no lookups
> needed which would simplify things a bit processing the events.
>
> In short, the information to associate event streams to an event (evsel) is
> currently available -- it's just being discarded in the many layers.
>
> I'll try to whip up some code that implements this in the next few days.
>
> David
>
>
>
On Thu, Jul 11, 2013 at 04:12:18PM +0300, Adrian Hunter wrote:
> The event stream is not always parsable because the format of a sample
> is dependent on the sample_type of the selected event. When there
> is more than one selected event and the sample_types are not the
> same then parsing becomes problematic. A sample can be matched to its
> selected event using the ID that is allocated when the event is opened.
> Unfortunately, to get the ID from the sample means first parsing it.
>
> This patch adds a new sample format bit PERF_SAMPLE_IDENTIFER that puts
> the ID at a fixed position so that the ID can be retrieved without
> parsing the sample. For sample events, that is the first position
> immediately after the header. For non-sample events, that is the last
> position.
There's events where this isn't a possible location; take PERF_RECORD_MMAP for
instance; the tail is the complete filename.
Why not always insert right after the header?
Also; you forgot to update include/uapi/linux/perf_event.h:enum perf_event_type
That's the format documentation.
On 12/07/13 12:56, Peter Zijlstra wrote:
> On Thu, Jul 11, 2013 at 04:12:18PM +0300, Adrian Hunter wrote:
>> The event stream is not always parsable because the format of a sample
>> is dependent on the sample_type of the selected event. When there
>> is more than one selected event and the sample_types are not the
>> same then parsing becomes problematic. A sample can be matched to its
>> selected event using the ID that is allocated when the event is opened.
>> Unfortunately, to get the ID from the sample means first parsing it.
>>
>> This patch adds a new sample format bit PERF_SAMPLE_IDENTIFER that puts
>> the ID at a fixed position so that the ID can be retrieved without
>> parsing the sample. For sample events, that is the first position
>> immediately after the header. For non-sample events, that is the last
>> position.
>
> There's events where this isn't a possible location; take PERF_RECORD_MMAP for
> instance; the tail is the complete filename.
PERF_RECORD_MMAP falls in the category I have called non-sample events.
Those events are appended with an ID sample. perf tools parses the ID
sample backwards from header.size. So the ID is at the last position
relative to header.size
>
> Why not always insert right after the header?
It is for sample events i.e. PERF_RECORD_SAMPLE
>
> Also; you forgot to update include/uapi/linux/perf_event.h:enum perf_event_type
> That's the format documentation.
>
OK
On Fri, Jul 12, 2013 at 03:56:01PM +0300, Adrian Hunter wrote:
> > There's events where this isn't a possible location; take PERF_RECORD_MMAP for
> > instance; the tail is the complete filename.
>
> PERF_RECORD_MMAP falls in the category I have called non-sample events.
> Those events are appended with an ID sample. perf tools parses the ID
> sample backwards from header.size. So the ID is at the last position
> relative to header.size
But why? Why make it different per PERF_RECORD type?
On 12/07/13 17:55, Peter Zijlstra wrote:
> On Fri, Jul 12, 2013 at 03:56:01PM +0300, Adrian Hunter wrote:
>>> There's events where this isn't a possible location; take PERF_RECORD_MMAP for
>>> instance; the tail is the complete filename.
>>
>> PERF_RECORD_MMAP falls in the category I have called non-sample events.
>> Those events are appended with an ID sample. perf tools parses the ID
>> sample backwards from header.size. So the ID is at the last position
>> relative to header.size
>
> But why? Why make it different per PERF_RECORD type?
There have always been two formats:
1. PERF_RECORD_SAMPLE as defined by perf_output_sample()
2. everything else as defined by __perf_event__output_id_sample()
The two formats are not the same, and there is no reason for them to be.
PERF_RECORD_SAMPLE is parsed forwards, so the ID is at the first position.
ID samples are parsed backwards, so the ID is at the last position (i.e. the
first position parsed).
On Mon, Jul 15, 2013 at 8:14 AM, Adrian Hunter <[email protected]> wrote:
> On 12/07/13 17:55, Peter Zijlstra wrote:
>> On Fri, Jul 12, 2013 at 03:56:01PM +0300, Adrian Hunter wrote:
>>>> There's events where this isn't a possible location; take PERF_RECORD_MMAP for
>>>> instance; the tail is the complete filename.
>>>
>>> PERF_RECORD_MMAP falls in the category I have called non-sample events.
>>> Those events are appended with an ID sample. perf tools parses the ID
>>> sample backwards from header.size. So the ID is at the last position
>>> relative to header.size
>>
>> But why? Why make it different per PERF_RECORD type?
>
> There have always been two formats:
>
> 1. PERF_RECORD_SAMPLE as defined by perf_output_sample()
>
> 2. everything else as defined by __perf_event__output_id_sample()
>
> The two formats are not the same, and there is no reason for them to be.
>
> PERF_RECORD_SAMPLE is parsed forwards, so the ID is at the first position.
>
> ID samples are parsed backwards, so the ID is at the last position (i.e. the
> first position parsed).
>
I am missing something here.
Why do we need an event ID for RECORD_MMAP records?
I understand those are requested by events, but do we care which one?
The information is global to the monitored process and not specific to an event.
On 15/07/13 14:53, Stephane Eranian wrote:
> On Mon, Jul 15, 2013 at 8:14 AM, Adrian Hunter <[email protected]> wrote:
>> On 12/07/13 17:55, Peter Zijlstra wrote:
>>> On Fri, Jul 12, 2013 at 03:56:01PM +0300, Adrian Hunter wrote:
>>>>> There's events where this isn't a possible location; take PERF_RECORD_MMAP for
>>>>> instance; the tail is the complete filename.
>>>>
>>>> PERF_RECORD_MMAP falls in the category I have called non-sample events.
>>>> Those events are appended with an ID sample. perf tools parses the ID
>>>> sample backwards from header.size. So the ID is at the last position
>>>> relative to header.size
>>>
>>> But why? Why make it different per PERF_RECORD type?
>>
>> There have always been two formats:
>>
>> 1. PERF_RECORD_SAMPLE as defined by perf_output_sample()
>>
>> 2. everything else as defined by __perf_event__output_id_sample()
>>
>> The two formats are not the same, and there is no reason for them to be.
>>
>> PERF_RECORD_SAMPLE is parsed forwards, so the ID is at the first position.
>>
>> ID samples are parsed backwards, so the ID is at the last position (i.e. the
>> first position parsed).
>>
> I am missing something here.
> Why do we need an event ID for RECORD_MMAP records?
> I understand those are requested by events, but do we care which one?
> The information is global to the monitored process and not specific to an event.
The ID sample has, for example, the time, so you still have to parse it -
which means you need the sample_type which means you need the id.
On 15/07/13 09:14, Adrian Hunter wrote:
> On 12/07/13 17:55, Peter Zijlstra wrote:
>> On Fri, Jul 12, 2013 at 03:56:01PM +0300, Adrian Hunter wrote:
>>>> There's events where this isn't a possible location; take PERF_RECORD_MMAP for
>>>> instance; the tail is the complete filename.
>>>
>>> PERF_RECORD_MMAP falls in the category I have called non-sample events.
>>> Those events are appended with an ID sample. perf tools parses the ID
>>> sample backwards from header.size. So the ID is at the last position
>>> relative to header.size
>>
>> But why? Why make it different per PERF_RECORD type?
>
> There have always been two formats:
>
> 1. PERF_RECORD_SAMPLE as defined by perf_output_sample()
>
> 2. everything else as defined by __perf_event__output_id_sample()
>
> The two formats are not the same, and there is no reason for them to be.
>
> PERF_RECORD_SAMPLE is parsed forwards, so the ID is at the first position.
>
> ID samples are parsed backwards, so the ID is at the last position (i.e. the
> first position parsed).
If you want the ID at the first position in the ID sample, it is do-able.
It means perf tools will have to be changed to calculate the variable start
position of the ID sample, and then parse the ID sample forwards from there.
Please advise.
On Thu, Jul 11, 2013 at 04:12:21PM +0300, Adrian Hunter wrote:
> Add a test that checks that sample parsing is correctly
> implemented.
nice!!! :)
>
> Signed-off-by: Adrian Hunter <[email protected]>
SNIP
> +static int do_test(u64 sample_type, u64 sample_regs_user)
> +{
> + struct perf_evsel evsel = {
> + .needs_swap = false,
> + .attr = {
> + .sample_type = sample_type,
> + .sample_regs_user = sample_regs_user,
> + },
> + };
> + union perf_event event = {
> + .header = {
> + .type = PERF_RECORD_SAMPLE,
> + .size = sizeof(union perf_event),
> + },
> + };
maybe I'm missing something but how does event.sample.array get
allocated? It's used in perf_event__synthesize_sample func.
do we need to use something like in the patch below..
thanks,
jirka
---
diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
index c6e6813..dcbd40d 100644
--- a/tools/perf/tests/sample-parsing.c
+++ b/tools/perf/tests/sample-parsing.c
@@ -111,12 +111,8 @@ static int do_test(u64 sample_type, u64 sample_regs_user)
.sample_regs_user = sample_regs_user,
},
};
- union perf_event event = {
- .header = {
- .type = PERF_RECORD_SAMPLE,
- .size = sizeof(union perf_event),
- },
- };
+ u64 buf[100];
+ union perf_event *event = (union perf_event *) buf;
union {
struct ip_callchain callchain;
u64 data[64];
@@ -162,10 +158,15 @@ static int do_test(u64 sample_type, u64 sample_regs_user)
size_t i;
int err;
+ event->sample.header = (struct perf_event_header) {
+ .type = PERF_RECORD_SAMPLE,
+ .size = sizeof(union perf_event),
+ };
+
for (i = 0; i < sizeof(user_regs); i++)
*(u8 *)user_regs = i;
- err = perf_event__synthesize_sample(&event, sample_type,
+ err = perf_event__synthesize_sample(event, sample_type,
sample_regs_user, &sample, false);
if (err) {
pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
@@ -175,7 +176,7 @@ static int do_test(u64 sample_type, u64 sample_regs_user)
evsel.sample_size = __perf_evsel__sample_size(sample_type);
- err = perf_evsel__parse_sample(&evsel, &event, &sample_out);
+ err = perf_evsel__parse_sample(&evsel, event, &sample_out);
if (err) {
pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
"perf_evsel__parse_sample", sample_type, err);
On Tue, Jul 16, 2013 at 09:49:31AM +0300, Adrian Hunter wrote:
> If you want the ID at the first position in the ID sample, it is do-able.
> It means perf tools will have to be changed to calculate the variable start
> position of the ID sample, and then parse the ID sample forwards from there.
> Please advise.
Oh urgh I see.. how about I merge something like the below first?
Re-reading your patch; you add something to struct sample_id; which means that
events which 'forget' to set perf_event_attr::sample_id_all while setting
PERF_SAMPLE_IDENTIFIER are 'funny'. Should we refuse them?
---
Subject: perf: Update perf_event_type documentation
Due to a discussion with Adrian I had a good look at the perf_event_type record
layout and found the documentation to be somewhat unclear.
Cc: Adrian Hunter <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
---
include/uapi/linux/perf_event.h | 21 +++++++++++++++++++--
kernel/events/core.c | 31 ++++++++++++++++---------------
2 files changed, 35 insertions(+), 17 deletions(-)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 0b1df41..335016c 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -478,6 +478,16 @@ enum perf_event_type {
* file will be supported by older perf tools, with these new optional
* fields being ignored.
*
+ * struct sample_id {
+ * { u32 pid, tid; } && PERF_SAMPLE_TID
+ * { u64 time; } && PERF_SAMPLE_TIME
+ * { u64 id; } && PERF_SAMPLE_ID
+ * { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
+ * { u32 cpu, res; } && PERF_SAMPLE_CPU
+ * } && perf_event_attr::sample_id_all
+ */
+
+ /*
* The MMAP events record the PROT_EXEC mappings so that we can
* correlate userspace IPs to code. They have the following structure:
*
@@ -488,7 +498,8 @@ enum perf_event_type {
* u64 addr;
* u64 len;
* u64 pgoff;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_MMAP = 1,
@@ -498,6 +509,7 @@ enum perf_event_type {
* struct perf_event_header header;
* u64 id;
* u64 lost;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_LOST = 2,
@@ -508,6 +520,7 @@ enum perf_event_type {
*
* u32 pid, tid;
* char comm[];
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_COMM = 3,
@@ -518,6 +531,7 @@ enum perf_event_type {
* u32 pid, ppid;
* u32 tid, ptid;
* u64 time;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_EXIT = 4,
@@ -528,6 +542,7 @@ enum perf_event_type {
* u64 time;
* u64 id;
* u64 stream_id;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_THROTTLE = 5,
@@ -539,6 +554,7 @@ enum perf_event_type {
* u32 pid, ppid;
* u32 tid, ptid;
* u64 time;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_FORK = 7,
@@ -549,6 +565,7 @@ enum perf_event_type {
* u32 pid, tid;
*
* struct read_format values;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_READ = 8,
@@ -596,7 +613,7 @@ enum perf_event_type {
* u64 dyn_size; } && PERF_SAMPLE_STACK_USER
*
* { u64 weight; } && PERF_SAMPLE_WEIGHT
- * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
+ * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
* };
*/
PERF_RECORD_SAMPLE = 9,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 708ab70..c9ef899 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4462,20 +4462,6 @@ void perf_output_sample(struct perf_output_handle *handle,
}
}
- if (!event->attr.watermark) {
- int wakeup_events = event->attr.wakeup_events;
-
- if (wakeup_events) {
- struct ring_buffer *rb = handle->rb;
- int events = local_inc_return(&rb->events);
-
- if (events >= wakeup_events) {
- local_sub(wakeup_events, &rb->events);
- local_inc(&rb->wakeup);
- }
- }
- }
-
if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
if (data->br_stack) {
size_t size;
@@ -4511,16 +4497,31 @@ void perf_output_sample(struct perf_output_handle *handle,
}
}
- if (sample_type & PERF_SAMPLE_STACK_USER)
+ if (sample_type & PERF_SAMPLE_STACK_USER) {
perf_output_sample_ustack(handle,
data->stack_user_size,
data->regs_user.regs);
+ }
if (sample_type & PERF_SAMPLE_WEIGHT)
perf_output_put(handle, data->weight);
if (sample_type & PERF_SAMPLE_DATA_SRC)
perf_output_put(handle, data->data_src.val);
+
+ if (!event->attr.watermark) {
+ int wakeup_events = event->attr.wakeup_events;
+
+ if (wakeup_events) {
+ struct ring_buffer *rb = handle->rb;
+ int events = local_inc_return(&rb->events);
+
+ if (events >= wakeup_events) {
+ local_sub(wakeup_events, &rb->events);
+ local_inc(&rb->wakeup);
+ }
+ }
+ }
}
void perf_prepare_sample(struct perf_event_header *header,
On Tue, Jul 16, 2013 at 5:09 PM, Peter Zijlstra <[email protected]> wrote:
> On Tue, Jul 16, 2013 at 09:49:31AM +0300, Adrian Hunter wrote:
>
>> If you want the ID at the first position in the ID sample, it is do-able.
>> It means perf tools will have to be changed to calculate the variable start
>> position of the ID sample, and then parse the ID sample forwards from there.
>> Please advise.
>
This is complicated and leads me to believe that a better way of doing this
is to leverage some bits in the header. This is what my proposal was. I did
not yet have a chance to post it. But you can use a bit in the header
to indicate
the presence of the eventID. That way, you do not have to relate to the
event attr->sample_type to determine how to parse the meta-data records
such as MMAP. When the bit is present in the header, the eventID is
to be found at -8 bytes form the end of the record.
> Oh urgh I see.. how about I merge something like the below first?
>
> Re-reading your patch; you add something to struct sample_id; which means that
> events which 'forget' to set perf_event_attr::sample_id_all while setting
> PERF_SAMPLE_IDENTIFIER are 'funny'. Should we refuse them?
>
>
> ---
> Subject: perf: Update perf_event_type documentation
>
> Due to a discussion with Adrian I had a good look at the perf_event_type record
> layout and found the documentation to be somewhat unclear.
>
> Cc: Adrian Hunter <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>
> ---
> include/uapi/linux/perf_event.h | 21 +++++++++++++++++++--
> kernel/events/core.c | 31 ++++++++++++++++---------------
> 2 files changed, 35 insertions(+), 17 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 0b1df41..335016c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -478,6 +478,16 @@ enum perf_event_type {
> * file will be supported by older perf tools, with these new optional
> * fields being ignored.
> *
> + * struct sample_id {
> + * { u32 pid, tid; } && PERF_SAMPLE_TID
> + * { u64 time; } && PERF_SAMPLE_TIME
> + * { u64 id; } && PERF_SAMPLE_ID
> + * { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
> + * { u32 cpu, res; } && PERF_SAMPLE_CPU
> + * } && perf_event_attr::sample_id_all
> + */
> +
> + /*
> * The MMAP events record the PROT_EXEC mappings so that we can
> * correlate userspace IPs to code. They have the following structure:
> *
> @@ -488,7 +498,8 @@ enum perf_event_type {
> * u64 addr;
> * u64 len;
> * u64 pgoff;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_MMAP = 1,
> @@ -498,6 +509,7 @@ enum perf_event_type {
> * struct perf_event_header header;
> * u64 id;
> * u64 lost;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_LOST = 2,
> @@ -508,6 +520,7 @@ enum perf_event_type {
> *
> * u32 pid, tid;
> * char comm[];
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_COMM = 3,
> @@ -518,6 +531,7 @@ enum perf_event_type {
> * u32 pid, ppid;
> * u32 tid, ptid;
> * u64 time;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_EXIT = 4,
> @@ -528,6 +542,7 @@ enum perf_event_type {
> * u64 time;
> * u64 id;
> * u64 stream_id;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_THROTTLE = 5,
> @@ -539,6 +554,7 @@ enum perf_event_type {
> * u32 pid, ppid;
> * u32 tid, ptid;
> * u64 time;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_FORK = 7,
> @@ -549,6 +565,7 @@ enum perf_event_type {
> * u32 pid, tid;
> *
> * struct read_format values;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_READ = 8,
> @@ -596,7 +613,7 @@ enum perf_event_type {
> * u64 dyn_size; } && PERF_SAMPLE_STACK_USER
> *
> * { u64 weight; } && PERF_SAMPLE_WEIGHT
> - * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
> + * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
> * };
> */
> PERF_RECORD_SAMPLE = 9,
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 708ab70..c9ef899 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4462,20 +4462,6 @@ void perf_output_sample(struct perf_output_handle *handle,
> }
> }
>
> - if (!event->attr.watermark) {
> - int wakeup_events = event->attr.wakeup_events;
> -
> - if (wakeup_events) {
> - struct ring_buffer *rb = handle->rb;
> - int events = local_inc_return(&rb->events);
> -
> - if (events >= wakeup_events) {
> - local_sub(wakeup_events, &rb->events);
> - local_inc(&rb->wakeup);
> - }
> - }
> - }
> -
> if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
> if (data->br_stack) {
> size_t size;
> @@ -4511,16 +4497,31 @@ void perf_output_sample(struct perf_output_handle *handle,
> }
> }
>
> - if (sample_type & PERF_SAMPLE_STACK_USER)
> + if (sample_type & PERF_SAMPLE_STACK_USER) {
> perf_output_sample_ustack(handle,
> data->stack_user_size,
> data->regs_user.regs);
> + }
>
> if (sample_type & PERF_SAMPLE_WEIGHT)
> perf_output_put(handle, data->weight);
>
> if (sample_type & PERF_SAMPLE_DATA_SRC)
> perf_output_put(handle, data->data_src.val);
> +
> + if (!event->attr.watermark) {
> + int wakeup_events = event->attr.wakeup_events;
> +
> + if (wakeup_events) {
> + struct ring_buffer *rb = handle->rb;
> + int events = local_inc_return(&rb->events);
> +
> + if (events >= wakeup_events) {
> + local_sub(wakeup_events, &rb->events);
> + local_inc(&rb->wakeup);
> + }
> + }
> + }
> }
>
> void perf_prepare_sample(struct perf_event_header *header,
>
On 16/07/13 15:48, Jiri Olsa wrote:
> On Thu, Jul 11, 2013 at 04:12:21PM +0300, Adrian Hunter wrote:
>> Add a test that checks that sample parsing is correctly
>> implemented.
>
> nice!!! :)
>
>>
>> Signed-off-by: Adrian Hunter <[email protected]>
>
> SNIP
>
>> +static int do_test(u64 sample_type, u64 sample_regs_user)
>> +{
>> + struct perf_evsel evsel = {
>> + .needs_swap = false,
>> + .attr = {
>> + .sample_type = sample_type,
>> + .sample_regs_user = sample_regs_user,
>> + },
>> + };
>> + union perf_event event = {
>> + .header = {
>> + .type = PERF_RECORD_SAMPLE,
>> + .size = sizeof(union perf_event),
>> + },
>> + };
>
> maybe I'm missing something but how does event.sample.array get
> allocated? It's used in perf_event__synthesize_sample func.
'union perf_event' is very big because of 'struct mmap_event' which contains
'char filename[PATH_MAX]' and PATH_MAX is 4096.
>
> do we need to use something like in the patch below..
Might as well calculate it correctly.
>
> thanks,
> jirka
>
> ---
> diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
> index c6e6813..dcbd40d 100644
> --- a/tools/perf/tests/sample-parsing.c
> +++ b/tools/perf/tests/sample-parsing.c
> @@ -111,12 +111,8 @@ static int do_test(u64 sample_type, u64 sample_regs_user)
> .sample_regs_user = sample_regs_user,
> },
> };
> - union perf_event event = {
> - .header = {
> - .type = PERF_RECORD_SAMPLE,
> - .size = sizeof(union perf_event),
> - },
> - };
> + u64 buf[100];
> + union perf_event *event = (union perf_event *) buf;
> union {
> struct ip_callchain callchain;
> u64 data[64];
> @@ -162,10 +158,15 @@ static int do_test(u64 sample_type, u64 sample_regs_user)
> size_t i;
> int err;
>
> + event->sample.header = (struct perf_event_header) {
> + .type = PERF_RECORD_SAMPLE,
> + .size = sizeof(union perf_event),
> + };
> +
> for (i = 0; i < sizeof(user_regs); i++)
> *(u8 *)user_regs = i;
>
> - err = perf_event__synthesize_sample(&event, sample_type,
> + err = perf_event__synthesize_sample(event, sample_type,
> sample_regs_user, &sample, false);
> if (err) {
> pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
> @@ -175,7 +176,7 @@ static int do_test(u64 sample_type, u64 sample_regs_user)
>
> evsel.sample_size = __perf_evsel__sample_size(sample_type);
>
> - err = perf_evsel__parse_sample(&evsel, &event, &sample_out);
> + err = perf_evsel__parse_sample(&evsel, event, &sample_out);
> if (err) {
> pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
> "perf_evsel__parse_sample", sample_type, err);
>
>
On 16/07/13 18:09, Peter Zijlstra wrote:
> On Tue, Jul 16, 2013 at 09:49:31AM +0300, Adrian Hunter wrote:
>
>> If you want the ID at the first position in the ID sample, it is do-able.
>> It means perf tools will have to be changed to calculate the variable start
>> position of the ID sample, and then parse the ID sample forwards from there.
>> Please advise.
>
> Oh urgh I see.. how about I merge something like the below first?
I put in my V7 series.
>
> Re-reading your patch; you add something to struct sample_id; which means that
> events which 'forget' to set perf_event_attr::sample_id_all while setting
> PERF_SAMPLE_IDENTIFIER are 'funny'. Should we refuse them?
No, that is OK.
>
>
> ---
> Subject: perf: Update perf_event_type documentation
>
> Due to a discussion with Adrian I had a good look at the perf_event_type record
> layout and found the documentation to be somewhat unclear.
>
> Cc: Adrian Hunter <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>
> ---
> include/uapi/linux/perf_event.h | 21 +++++++++++++++++++--
> kernel/events/core.c | 31 ++++++++++++++++---------------
> 2 files changed, 35 insertions(+), 17 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 0b1df41..335016c 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -478,6 +478,16 @@ enum perf_event_type {
> * file will be supported by older perf tools, with these new optional
> * fields being ignored.
> *
> + * struct sample_id {
> + * { u32 pid, tid; } && PERF_SAMPLE_TID
> + * { u64 time; } && PERF_SAMPLE_TIME
> + * { u64 id; } && PERF_SAMPLE_ID
> + * { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
> + * { u32 cpu, res; } && PERF_SAMPLE_CPU
> + * } && perf_event_attr::sample_id_all
> + */
> +
> + /*
> * The MMAP events record the PROT_EXEC mappings so that we can
> * correlate userspace IPs to code. They have the following structure:
> *
> @@ -488,7 +498,8 @@ enum perf_event_type {
> * u64 addr;
> * u64 len;
> * u64 pgoff;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_MMAP = 1,
> @@ -498,6 +509,7 @@ enum perf_event_type {
> * struct perf_event_header header;
> * u64 id;
> * u64 lost;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_LOST = 2,
> @@ -508,6 +520,7 @@ enum perf_event_type {
> *
> * u32 pid, tid;
> * char comm[];
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_COMM = 3,
> @@ -518,6 +531,7 @@ enum perf_event_type {
> * u32 pid, ppid;
> * u32 tid, ptid;
> * u64 time;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_EXIT = 4,
> @@ -528,6 +542,7 @@ enum perf_event_type {
> * u64 time;
> * u64 id;
> * u64 stream_id;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_THROTTLE = 5,
> @@ -539,6 +554,7 @@ enum perf_event_type {
> * u32 pid, ppid;
> * u32 tid, ptid;
> * u64 time;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_FORK = 7,
> @@ -549,6 +565,7 @@ enum perf_event_type {
> * u32 pid, tid;
> *
> * struct read_format values;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_READ = 8,
> @@ -596,7 +613,7 @@ enum perf_event_type {
> * u64 dyn_size; } && PERF_SAMPLE_STACK_USER
> *
> * { u64 weight; } && PERF_SAMPLE_WEIGHT
> - * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
> + * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
> * };
> */
> PERF_RECORD_SAMPLE = 9,
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 708ab70..c9ef899 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4462,20 +4462,6 @@ void perf_output_sample(struct perf_output_handle *handle,
> }
> }
>
> - if (!event->attr.watermark) {
> - int wakeup_events = event->attr.wakeup_events;
> -
> - if (wakeup_events) {
> - struct ring_buffer *rb = handle->rb;
> - int events = local_inc_return(&rb->events);
> -
> - if (events >= wakeup_events) {
> - local_sub(wakeup_events, &rb->events);
> - local_inc(&rb->wakeup);
> - }
> - }
> - }
> -
> if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
> if (data->br_stack) {
> size_t size;
> @@ -4511,16 +4497,31 @@ void perf_output_sample(struct perf_output_handle *handle,
> }
> }
>
> - if (sample_type & PERF_SAMPLE_STACK_USER)
> + if (sample_type & PERF_SAMPLE_STACK_USER) {
> perf_output_sample_ustack(handle,
> data->stack_user_size,
> data->regs_user.regs);
> + }
>
> if (sample_type & PERF_SAMPLE_WEIGHT)
> perf_output_put(handle, data->weight);
>
> if (sample_type & PERF_SAMPLE_DATA_SRC)
> perf_output_put(handle, data->data_src.val);
> +
> + if (!event->attr.watermark) {
> + int wakeup_events = event->attr.wakeup_events;
> +
> + if (wakeup_events) {
> + struct ring_buffer *rb = handle->rb;
> + int events = local_inc_return(&rb->events);
> +
> + if (events >= wakeup_events) {
> + local_sub(wakeup_events, &rb->events);
> + local_inc(&rb->wakeup);
> + }
> + }
> + }
> }
>
> void perf_prepare_sample(struct perf_event_header *header,
>
>
>
Commit-ID: a5cdd40c9877e9aba704c020fd65d26b5cfecf18
Gitweb: http://git.kernel.org/tip/a5cdd40c9877e9aba704c020fd65d26b5cfecf18
Author: Peter Zijlstra <[email protected]>
AuthorDate: Tue, 16 Jul 2013 17:09:07 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 23 Jul 2013 12:17:08 +0200
perf: Update perf_event_type documentation
Due to a discussion with Adrian I had a good look at the perf_event_type record
layout and found the documentation to be somewhat unclear.
Cc: Adrian Hunter <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/uapi/linux/perf_event.h | 18 +++++++++++++++++-
kernel/events/core.c | 31 ++++++++++++++++---------------
2 files changed, 33 insertions(+), 16 deletions(-)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 0b1df41..00d8274 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -478,6 +478,16 @@ enum perf_event_type {
* file will be supported by older perf tools, with these new optional
* fields being ignored.
*
+ * struct sample_id {
+ * { u32 pid, tid; } && PERF_SAMPLE_TID
+ * { u64 time; } && PERF_SAMPLE_TIME
+ * { u64 id; } && PERF_SAMPLE_ID
+ * { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
+ * { u32 cpu, res; } && PERF_SAMPLE_CPU
+ * } && perf_event_attr::sample_id_all
+ */
+
+ /*
* The MMAP events record the PROT_EXEC mappings so that we can
* correlate userspace IPs to code. They have the following structure:
*
@@ -498,6 +508,7 @@ enum perf_event_type {
* struct perf_event_header header;
* u64 id;
* u64 lost;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_LOST = 2,
@@ -508,6 +519,7 @@ enum perf_event_type {
*
* u32 pid, tid;
* char comm[];
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_COMM = 3,
@@ -518,6 +530,7 @@ enum perf_event_type {
* u32 pid, ppid;
* u32 tid, ptid;
* u64 time;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_EXIT = 4,
@@ -528,6 +541,7 @@ enum perf_event_type {
* u64 time;
* u64 id;
* u64 stream_id;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_THROTTLE = 5,
@@ -539,6 +553,7 @@ enum perf_event_type {
* u32 pid, ppid;
* u32 tid, ptid;
* u64 time;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_FORK = 7,
@@ -549,6 +564,7 @@ enum perf_event_type {
* u32 pid, tid;
*
* struct read_format values;
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_READ = 8,
@@ -596,7 +612,7 @@ enum perf_event_type {
* u64 dyn_size; } && PERF_SAMPLE_STACK_USER
*
* { u64 weight; } && PERF_SAMPLE_WEIGHT
- * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
+ * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
* };
*/
PERF_RECORD_SAMPLE = 9,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5e2bce9..1274114 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4462,20 +4462,6 @@ void perf_output_sample(struct perf_output_handle *handle,
}
}
- if (!event->attr.watermark) {
- int wakeup_events = event->attr.wakeup_events;
-
- if (wakeup_events) {
- struct ring_buffer *rb = handle->rb;
- int events = local_inc_return(&rb->events);
-
- if (events >= wakeup_events) {
- local_sub(wakeup_events, &rb->events);
- local_inc(&rb->wakeup);
- }
- }
- }
-
if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
if (data->br_stack) {
size_t size;
@@ -4511,16 +4497,31 @@ void perf_output_sample(struct perf_output_handle *handle,
}
}
- if (sample_type & PERF_SAMPLE_STACK_USER)
+ if (sample_type & PERF_SAMPLE_STACK_USER) {
perf_output_sample_ustack(handle,
data->stack_user_size,
data->regs_user.regs);
+ }
if (sample_type & PERF_SAMPLE_WEIGHT)
perf_output_put(handle, data->weight);
if (sample_type & PERF_SAMPLE_DATA_SRC)
perf_output_put(handle, data->data_src.val);
+
+ if (!event->attr.watermark) {
+ int wakeup_events = event->attr.wakeup_events;
+
+ if (wakeup_events) {
+ struct ring_buffer *rb = handle->rb;
+ int events = local_inc_return(&rb->events);
+
+ if (events >= wakeup_events) {
+ local_sub(wakeup_events, &rb->events);
+ local_inc(&rb->wakeup);
+ }
+ }
+ }
}
void perf_prepare_sample(struct perf_event_header *header,
On Tue, 23 Jul 2013, tip-bot for Peter Zijlstra wrote:
> Commit-ID: a5cdd40c9877e9aba704c020fd65d26b5cfecf18
> Gitweb: http://git.kernel.org/tip/a5cdd40c9877e9aba704c020fd65d26b5cfecf18
> Author: Peter Zijlstra <[email protected]>
> AuthorDate: Tue, 16 Jul 2013 17:09:07 +0200
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Tue, 23 Jul 2013 12:17:08 +0200
>
> perf: Update perf_event_type documentation
>
> Due to a discussion with Adrian I had a good look at the perf_event_type record
> layout and found the documentation to be somewhat unclear.
This code makes a lot of code changes considering it is "updating
documentation".
Also, will code following this "documentation" be backward compatible?
Meaning, if you have code that depends on the new fields you've added
and run on older kernels, will you get sane results? Or will the code
get garbage and/or segfault in the sample_id fields because you read
past the end of valid data?
> Cc: Adrian Hunter <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Ingo Molnar <[email protected]>
> ---
> include/uapi/linux/perf_event.h | 18 +++++++++++++++++-
> kernel/events/core.c | 31 ++++++++++++++++---------------
> 2 files changed, 33 insertions(+), 16 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 0b1df41..00d8274 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -478,6 +478,16 @@ enum perf_event_type {
> * file will be supported by older perf tools, with these new optional
> * fields being ignored.
> *
> + * struct sample_id {
> + * { u32 pid, tid; } && PERF_SAMPLE_TID
> + * { u64 time; } && PERF_SAMPLE_TIME
> + * { u64 id; } && PERF_SAMPLE_ID
> + * { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
> + * { u32 cpu, res; } && PERF_SAMPLE_CPU
> + * } && perf_event_attr::sample_id_all
> + */
> +
> + /*
> * The MMAP events record the PROT_EXEC mappings so that we can
> * correlate userspace IPs to code. They have the following structure:
> *
> @@ -498,6 +508,7 @@ enum perf_event_type {
> * struct perf_event_header header;
> * u64 id;
> * u64 lost;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_LOST = 2,
> @@ -508,6 +519,7 @@ enum perf_event_type {
> *
> * u32 pid, tid;
> * char comm[];
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_COMM = 3,
> @@ -518,6 +530,7 @@ enum perf_event_type {
> * u32 pid, ppid;
> * u32 tid, ptid;
> * u64 time;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_EXIT = 4,
> @@ -528,6 +541,7 @@ enum perf_event_type {
> * u64 time;
> * u64 id;
> * u64 stream_id;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_THROTTLE = 5,
> @@ -539,6 +553,7 @@ enum perf_event_type {
> * u32 pid, ppid;
> * u32 tid, ptid;
> * u64 time;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_FORK = 7,
> @@ -549,6 +564,7 @@ enum perf_event_type {
> * u32 pid, tid;
> *
> * struct read_format values;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_READ = 8,
> @@ -596,7 +612,7 @@ enum perf_event_type {
> * u64 dyn_size; } && PERF_SAMPLE_STACK_USER
> *
> * { u64 weight; } && PERF_SAMPLE_WEIGHT
> - * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
> + * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
> * };
> */
> PERF_RECORD_SAMPLE = 9,
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 5e2bce9..1274114 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4462,20 +4462,6 @@ void perf_output_sample(struct perf_output_handle *handle,
> }
> }
>
> - if (!event->attr.watermark) {
> - int wakeup_events = event->attr.wakeup_events;
> -
> - if (wakeup_events) {
> - struct ring_buffer *rb = handle->rb;
> - int events = local_inc_return(&rb->events);
> -
> - if (events >= wakeup_events) {
> - local_sub(wakeup_events, &rb->events);
> - local_inc(&rb->wakeup);
> - }
> - }
> - }
> -
> if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
> if (data->br_stack) {
> size_t size;
> @@ -4511,16 +4497,31 @@ void perf_output_sample(struct perf_output_handle *handle,
> }
> }
>
> - if (sample_type & PERF_SAMPLE_STACK_USER)
> + if (sample_type & PERF_SAMPLE_STACK_USER) {
> perf_output_sample_ustack(handle,
> data->stack_user_size,
> data->regs_user.regs);
> + }
>
> if (sample_type & PERF_SAMPLE_WEIGHT)
> perf_output_put(handle, data->weight);
>
> if (sample_type & PERF_SAMPLE_DATA_SRC)
> perf_output_put(handle, data->data_src.val);
> +
> + if (!event->attr.watermark) {
> + int wakeup_events = event->attr.wakeup_events;
> +
> + if (wakeup_events) {
> + struct ring_buffer *rb = handle->rb;
> + int events = local_inc_return(&rb->events);
> +
> + if (events >= wakeup_events) {
> + local_sub(wakeup_events, &rb->events);
> + local_inc(&rb->wakeup);
> + }
> + }
> + }
> }
>
> void perf_prepare_sample(struct perf_event_header *header,
On 24/07/13 20:54, Vince Weaver wrote:
> On Tue, 23 Jul 2013, tip-bot for Peter Zijlstra wrote:
>
>> Commit-ID: a5cdd40c9877e9aba704c020fd65d26b5cfecf18
>> Gitweb: http://git.kernel.org/tip/a5cdd40c9877e9aba704c020fd65d26b5cfecf18
>> Author: Peter Zijlstra <[email protected]>
>> AuthorDate: Tue, 16 Jul 2013 17:09:07 +0200
>> Committer: Ingo Molnar <[email protected]>
>> CommitDate: Tue, 23 Jul 2013 12:17:08 +0200
>>
>> perf: Update perf_event_type documentation
>>
>> Due to a discussion with Adrian I had a good look at the perf_event_type record
>> layout and found the documentation to be somewhat unclear.
>
> This code makes a lot of code changes considering it is "updating
> documentation".
>
> Also, will code following this "documentation" be backward compatible?
>
> Meaning, if you have code that depends on the new fields you've added
> and run on older kernels, will you get sane results? Or will the code
> get garbage and/or segfault in the sample_id fields because you read
> past the end of valid data?
This is not the patch that adds new sample format bits.
New sample format bits are always backward compatible in the sense that the
kernel perf_event_open system call will return an error if you try to use
sample format bits that it does not know about.
Applications will not spontaneously start using sample format bits they have
not been programmed for. If an application is enhanced to use a new sample
format bit then it must be ready to cope with errors on older kernels. With
respect to the sample format bit I tried to propose
(PERF_SAMPLE_IDENTIFIER), I modified perf tools to check if the kernel
supported it. Other applications that want to use PERF_SAMPLE_IDENTIFIER
should take the same approach.
New sample format bits are added from time to time such as
PERF_SAMPLE_WEIGHT and PERF_SAMPLE_DATA_SRC which were added in 3.10.
>
>
>> Cc: Adrian Hunter <[email protected]>
>> Signed-off-by: Peter Zijlstra <[email protected]>
>> Link: http://lkml.kernel.org/r/[email protected]
>> Signed-off-by: Ingo Molnar <[email protected]>
>> ---
>> include/uapi/linux/perf_event.h | 18 +++++++++++++++++-
>> kernel/events/core.c | 31 ++++++++++++++++---------------
>> 2 files changed, 33 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 0b1df41..00d8274 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -478,6 +478,16 @@ enum perf_event_type {
>> * file will be supported by older perf tools, with these new optional
>> * fields being ignored.
>> *
>> + * struct sample_id {
>> + * { u32 pid, tid; } && PERF_SAMPLE_TID
>> + * { u64 time; } && PERF_SAMPLE_TIME
>> + * { u64 id; } && PERF_SAMPLE_ID
>> + * { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
>> + * { u32 cpu, res; } && PERF_SAMPLE_CPU
>> + * } && perf_event_attr::sample_id_all
>> + */
>> +
>> + /*
>> * The MMAP events record the PROT_EXEC mappings so that we can
>> * correlate userspace IPs to code. They have the following structure:
>> *
>> @@ -498,6 +508,7 @@ enum perf_event_type {
>> * struct perf_event_header header;
>> * u64 id;
>> * u64 lost;
>> + * struct sample_id sample_id;
>> * };
>> */
>> PERF_RECORD_LOST = 2,
>> @@ -508,6 +519,7 @@ enum perf_event_type {
>> *
>> * u32 pid, tid;
>> * char comm[];
>> + * struct sample_id sample_id;
>> * };
>> */
>> PERF_RECORD_COMM = 3,
>> @@ -518,6 +530,7 @@ enum perf_event_type {
>> * u32 pid, ppid;
>> * u32 tid, ptid;
>> * u64 time;
>> + * struct sample_id sample_id;
>> * };
>> */
>> PERF_RECORD_EXIT = 4,
>> @@ -528,6 +541,7 @@ enum perf_event_type {
>> * u64 time;
>> * u64 id;
>> * u64 stream_id;
>> + * struct sample_id sample_id;
>> * };
>> */
>> PERF_RECORD_THROTTLE = 5,
>> @@ -539,6 +553,7 @@ enum perf_event_type {
>> * u32 pid, ppid;
>> * u32 tid, ptid;
>> * u64 time;
>> + * struct sample_id sample_id;
>> * };
>> */
>> PERF_RECORD_FORK = 7,
>> @@ -549,6 +564,7 @@ enum perf_event_type {
>> * u32 pid, tid;
>> *
>> * struct read_format values;
>> + * struct sample_id sample_id;
>> * };
>> */
>> PERF_RECORD_READ = 8,
>> @@ -596,7 +612,7 @@ enum perf_event_type {
>> * u64 dyn_size; } && PERF_SAMPLE_STACK_USER
>> *
>> * { u64 weight; } && PERF_SAMPLE_WEIGHT
>> - * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
>> + * { u64 data_src; } && PERF_SAMPLE_DATA_SRC
>> * };
>> */
>> PERF_RECORD_SAMPLE = 9,
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 5e2bce9..1274114 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -4462,20 +4462,6 @@ void perf_output_sample(struct perf_output_handle *handle,
>> }
>> }
>>
>> - if (!event->attr.watermark) {
>> - int wakeup_events = event->attr.wakeup_events;
>> -
>> - if (wakeup_events) {
>> - struct ring_buffer *rb = handle->rb;
>> - int events = local_inc_return(&rb->events);
>> -
>> - if (events >= wakeup_events) {
>> - local_sub(wakeup_events, &rb->events);
>> - local_inc(&rb->wakeup);
>> - }
>> - }
>> - }
>> -
>> if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
>> if (data->br_stack) {
>> size_t size;
>> @@ -4511,16 +4497,31 @@ void perf_output_sample(struct perf_output_handle *handle,
>> }
>> }
>>
>> - if (sample_type & PERF_SAMPLE_STACK_USER)
>> + if (sample_type & PERF_SAMPLE_STACK_USER) {
>> perf_output_sample_ustack(handle,
>> data->stack_user_size,
>> data->regs_user.regs);
>> + }
>>
>> if (sample_type & PERF_SAMPLE_WEIGHT)
>> perf_output_put(handle, data->weight);
>>
>> if (sample_type & PERF_SAMPLE_DATA_SRC)
>> perf_output_put(handle, data->data_src.val);
>> +
>> + if (!event->attr.watermark) {
>> + int wakeup_events = event->attr.wakeup_events;
>> +
>> + if (wakeup_events) {
>> + struct ring_buffer *rb = handle->rb;
>> + int events = local_inc_return(&rb->events);
>> +
>> + if (events >= wakeup_events) {
>> + local_sub(wakeup_events, &rb->events);
>> + local_inc(&rb->wakeup);
>> + }
>> + }
>> + }
>> }
>>
>> void perf_prepare_sample(struct perf_event_header *header,
>
>
On Wed, Jul 24, 2013 at 01:54:00PM -0400, Vince Weaver wrote:
> On Tue, 23 Jul 2013, tip-bot for Peter Zijlstra wrote:
>
> > Commit-ID: a5cdd40c9877e9aba704c020fd65d26b5cfecf18
> > Gitweb: http://git.kernel.org/tip/a5cdd40c9877e9aba704c020fd65d26b5cfecf18
> > Author: Peter Zijlstra <[email protected]>
> > AuthorDate: Tue, 16 Jul 2013 17:09:07 +0200
> > Committer: Ingo Molnar <[email protected]>
> > CommitDate: Tue, 23 Jul 2013 12:17:08 +0200
> >
> > perf: Update perf_event_type documentation
> >
> > Due to a discussion with Adrian I had a good look at the perf_event_type record
> > layout and found the documentation to be somewhat unclear.
>
> This code makes a lot of code changes considering it is "updating
> documentation".
>
> Also, will code following this "documentation" be backward compatible?
>
> Meaning, if you have code that depends on the new fields you've added
> and run on older kernels, will you get sane results? Or will the code
> get garbage and/or segfault in the sample_id fields because you read
> past the end of valid data?
I didn't add any fields, this is simply catching up with what was
already there. But yes, this is very much intended to be compatible.
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 5e2bce9..1274114 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -4462,20 +4462,6 @@ void perf_output_sample(struct perf_output_handle *handle,
> > }
> > }
> >
> > - if (!event->attr.watermark) {
> > - int wakeup_events = event->attr.wakeup_events;
> > -
> > - if (wakeup_events) {
> > - struct ring_buffer *rb = handle->rb;
> > - int events = local_inc_return(&rb->events);
> > -
> > - if (events >= wakeup_events) {
> > - local_sub(wakeup_events, &rb->events);
> > - local_inc(&rb->wakeup);
> > - }
> > - }
> > - }
> > -
> > if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
> > if (data->br_stack) {
> > size_t size;
> > @@ -4511,16 +4497,31 @@ void perf_output_sample(struct perf_output_handle *handle,
> > }
> > }
> >
> > - if (sample_type & PERF_SAMPLE_STACK_USER)
> > + if (sample_type & PERF_SAMPLE_STACK_USER) {
> > perf_output_sample_ustack(handle,
> > data->stack_user_size,
> > data->regs_user.regs);
> > + }
> >
> > if (sample_type & PERF_SAMPLE_WEIGHT)
> > perf_output_put(handle, data->weight);
> >
> > if (sample_type & PERF_SAMPLE_DATA_SRC)
> > perf_output_put(handle, data->data_src.val);
> > +
> > + if (!event->attr.watermark) {
> > + int wakeup_events = event->attr.wakeup_events;
> > +
> > + if (wakeup_events) {
> > + struct ring_buffer *rb = handle->rb;
> > + int events = local_inc_return(&rb->events);
> > +
> > + if (events >= wakeup_events) {
> > + local_sub(wakeup_events, &rb->events);
> > + local_inc(&rb->wakeup);
> > + }
> > + }
> > + }
> > }
I suppose you're 'complaining' about these changes? Yeah, maybe I should
have split them up in a separate patch.
I moved the watermark code from somewhere in the middle of actually
doing the output to the end (where it once was) and added a 'missing'
pair of curlies.
I noticed both oddities when looking at the code to verify the output
format which was updated in the other hunks.
It should be a NOP functionality wise.
On Thu, 25 Jul 2013, Peter Zijlstra wrote:
> I didn't add any fields, this is simply catching up with what was
> already there. But yes, this is very much intended to be compatible.
So those id fields have been there since 2.6.31 and just no one noticed
the header file comments were wrong? That's fine then, I just have a lot
of code and other documentation that needs to be updated then.
> I suppose you're 'complaining' about these changes? Yeah, maybe I should
> have split them up in a separate patch.
>
> ...
>
> It should be a NOP functionality wise.
yes, it looks like a NOP. I was just surprised to see code changes (even
trivial ones) when the commit message only said documentation update.
Vince
On Thu, 25 Jul 2013, Adrian Hunter wrote:
> On 24/07/13 20:54, Vince Weaver wrote:
> > On Tue, 23 Jul 2013, tip-bot for Peter Zijlstra wrote:
> >
> >> Commit-ID: a5cdd40c9877e9aba704c020fd65d26b5cfecf18
> >> Gitweb: http://git.kernel.org/tip/a5cdd40c9877e9aba704c020fd65d26b5cfecf18
> >> Author: Peter Zijlstra <[email protected]>
> >> AuthorDate: Tue, 16 Jul 2013 17:09:07 +0200
> >> Committer: Ingo Molnar <[email protected]>
> >> CommitDate: Tue, 23 Jul 2013 12:17:08 +0200
> >>
> >> perf: Update perf_event_type documentation
> >>
> >> Due to a discussion with Adrian I had a good look at the perf_event_type record
> >> layout and found the documentation to be somewhat unclear.
> ...
> This is not the patch that adds new sample format bits.
>
> New sample format bits are always backward compatible in the sense that the
> kernel perf_event_open system call will return an error if you try to use
> sample format bits that it does not know about.
Yes, I meant the addition of the sample_id fields to the existing sample
values. Though if Peter is right and those have always been there since
the beginning and just not documented then that won't break anything.
The commit message wasn't really as clear as it could be.
Vince
On 7/25/13 6:31 AM, Vince Weaver wrote:
> On Thu, 25 Jul 2013, Peter Zijlstra wrote:
>
>> I didn't add any fields, this is simply catching up with what was
>> already there. But yes, this is very much intended to be compatible.
>
> So those id fields have been there since 2.6.31 and just no one noticed
> the header file comments were wrong? That's fine then, I just have a lot
> of code and other documentation that needs to be updated then.
I believe those fields came in with the sample_id_all patch -
c980d1091810df13f21aabbce545fd98f545bbf7 - which is 2.6.37.
David
On Thu, Jul 25, 2013 at 08:34:51AM -0400, Vince Weaver wrote:
> Yes, I meant the addition of the sample_id fields to the existing sample
> values. Though if Peter is right and those have always been there since
> the beginning and just not documented then that won't break anything.
> The commit message wasn't really as clear as it could be.
/me writes "I shall write better Changelogs." a hundred times.
On Thu, 25 Jul 2013, Adrian Hunter wrote:
> >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> >> index 0b1df41..00d8274 100644
> >> --- a/include/uapi/linux/perf_event.h
> >> +++ b/include/uapi/linux/perf_event.h
> >> @@ -478,6 +478,16 @@ enum perf_event_type {
> >> * file will be supported by older perf tools, with these new optional
> >> * fields being ignored.
> >> *
> >> + * struct sample_id {
> >> + * { u32 pid, tid; } && PERF_SAMPLE_TID
> >> + * { u64 time; } && PERF_SAMPLE_TIME
> >> + * { u64 id; } && PERF_SAMPLE_ID
> >> + * { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
> >> + * { u32 cpu, res; } && PERF_SAMPLE_CPU
> >> + * } && perf_event_attr::sample_id_all
> >> + */
> >> +
a thing that personally bothers me are these imaginary struct definitions
added as part of the documentation that aren't actually available in the
public perf_event.h
I can see why it's done, but it can be confusing picking out in later
definitions which struct fields are real and which ones are conceptual.
> >> @@ -498,6 +508,7 @@ enum perf_event_type {
> >> * struct perf_event_header header;
> >> * u64 id;
> >> * u64 lost;
> >> + * struct sample_id sample_id;
> >> * };
This is what confused me; this documentation shows the sample_id
as always being there, but in reality that struct is only there
if perf_event_attr::sample_id_all is set.
It might be clearer
if you stuck the perf_event_attr::sample_id_all qualifier each
place you add the sample_id field.
And it's nice that the header holds a size field which ensures backward
compatability, I somehow had forgotten that with my initial complaint.
Vince
On Thu, 25 Jul 2013, Peter Zijlstra wrote:
> On Thu, Jul 25, 2013 at 08:34:51AM -0400, Vince Weaver wrote:
> > Yes, I meant the addition of the sample_id fields to the existing sample
> > values. Though if Peter is right and those have always been there since
> > the beginning and just not documented then that won't break anything.
> > The commit message wasn't really as clear as it could be.
>
> /me writes "I shall write better Changelogs." a hundred times.
yes, sorry, I shouldn't have been so grumpy about it.
After spending 45 minutes digging through the kernel source and other
documentation it all makes sense now. The existing manpage even does a
mildly accurate job of explaining the situation. It's just I have enough
trouble keeping up with the large number of recent perf related patches
without having to spend a lot of time decoding cryptic Changelogs.
Vince
On Thu, Jul 25, 2013 at 11:24:05PM -0400, Vince Weaver wrote:
> On Thu, 25 Jul 2013, Peter Zijlstra wrote:
>
> > On Thu, Jul 25, 2013 at 08:34:51AM -0400, Vince Weaver wrote:
> > > Yes, I meant the addition of the sample_id fields to the existing sample
> > > values. Though if Peter is right and those have always been there since
> > > the beginning and just not documented then that won't break anything.
> > > The commit message wasn't really as clear as it could be.
> >
> > /me writes "I shall write better Changelogs." a hundred times.
>
> yes, sorry, I shouldn't have been so grumpy about it.
I don't mind; I wish people would tell me my changelogs suck before they
get committed though; I can still fix them at that point.
It not only helps others, but also myself. I've on many occasion kicked
myself for not writing clearer changelogs when investigating something
months (or years) after I wrote the initial code.
The problem is that I mostly know what I mean when I write it and I'm
not very good at writing in the first place -- real changelogs often
take longer to write than the actual code. But I guess that's true for
many people.
Anyway.. I'm ever trying to write better changelogs, but alas there's
also always far more room for improvement :-)
On Thu, Jul 25, 2013 at 11:20:24PM -0400, Vince Weaver wrote:
>
> On Thu, 25 Jul 2013, Adrian Hunter wrote:
>
> > >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> > >> index 0b1df41..00d8274 100644
> > >> --- a/include/uapi/linux/perf_event.h
> > >> +++ b/include/uapi/linux/perf_event.h
> > >> @@ -478,6 +478,16 @@ enum perf_event_type {
> > >> * file will be supported by older perf tools, with these new optional
> > >> * fields being ignored.
> > >> *
> > >> + * struct sample_id {
> > >> + * { u32 pid, tid; } && PERF_SAMPLE_TID
> > >> + * { u64 time; } && PERF_SAMPLE_TIME
> > >> + * { u64 id; } && PERF_SAMPLE_ID
> > >> + * { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
> > >> + * { u32 cpu, res; } && PERF_SAMPLE_CPU
> > >> + * } && perf_event_attr::sample_id_all
> > >> + */
> > >> +
>
> a thing that personally bothers me are these imaginary struct definitions
> added as part of the documentation that aren't actually available in the
> public perf_event.h
>
> I can see why it's done, but it can be confusing picking out in later
> definitions which struct fields are real and which ones are conceptual.
Would it help if we changed the syntax to not look as much as real C
would?
> > >> @@ -498,6 +508,7 @@ enum perf_event_type {
> > >> * struct perf_event_header header;
> > >> * u64 id;
> > >> * u64 lost;
> > >> + * struct sample_id sample_id;
> > >> * };
>
> This is what confused me; this documentation shows the sample_id
> as always being there, but in reality that struct is only there
> if perf_event_attr::sample_id_all is set.
>
> It might be clearer
> if you stuck the perf_event_attr::sample_id_all qualifier each
> place you add the sample_id field.
Ah, I actually considered that but then got lazy and used the 0 sized
struct idea :/
On Fri, 26 Jul 2013, Peter Zijlstra wrote:
> On Thu, Jul 25, 2013 at 11:20:24PM -0400, Vince Weaver wrote:
> >
> > a thing that personally bothers me are these imaginary struct definitions
> > added as part of the documentation that aren't actually available in the
> > public perf_event.h
> >
> > I can see why it's done, but it can be confusing picking out in later
> > definitions which struct fields are real and which ones are conceptual.
>
> Would it help if we changed the syntax to not look as much as real C
> would?
I've been thinking and I can't really think of a clearer way to present
the layout. So I guess it's fine the way it is. Hopefully not many
people are stuck having to implement code based on header file comments
anyway.
> > It might be clearer
> > if you stuck the perf_event_attr::sample_id_all qualifier each
> > place you add the sample_id field.
>
> Ah, I actually considered that but then got lazy and used the 0 sized
> struct idea :/
It might just be me. For whatever reason the C parser in my head doesn't
handle GNU extensions like 0-sized structs.
Vince
On Tue, 23 Jul 2013, tip-bot for Peter Zijlstra wrote:
> Commit-ID: a5cdd40c9877e9aba704c020fd65d26b5cfecf18
> Gitweb: http://git.kernel.org/tip/a5cdd40c9877e9aba704c020fd65d26b5cfecf18
> Author: Peter Zijlstra <[email protected]>
> AuthorDate: Tue, 16 Jul 2013 17:09:07 +0200
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Tue, 23 Jul 2013 12:17:08 +0200
>
> perf: Update perf_event_type documentation
>
> Due to a discussion with Adrian I had a good look at the perf_event_type record
> layout and found the documentation to be somewhat unclear.
>
> Cc: Adrian Hunter <[email protected]>
> Signed-off-by: Peter Zijlstra <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Ingo Molnar <[email protected]>
> ---
> include/uapi/linux/perf_event.h | 18 +++++++++++++++++-
> kernel/events/core.c | 31 ++++++++++++++++---------------
> 2 files changed, 33 insertions(+), 16 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 0b1df41..00d8274 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -478,6 +478,16 @@ enum perf_event_type {
> * file will be supported by older perf tools, with these new optional
> * fields being ignored.
> *
> + * struct sample_id {
> + * { u32 pid, tid; } && PERF_SAMPLE_TID
> + * { u64 time; } && PERF_SAMPLE_TIME
> + * { u64 id; } && PERF_SAMPLE_ID
> + * { u64 stream_id;} && PERF_SAMPLE_STREAM_ID
> + * { u32 cpu, res; } && PERF_SAMPLE_CPU
> + * } && perf_event_attr::sample_id_all
> + */
> +
> + /*
> * The MMAP events record the PROT_EXEC mappings so that we can
> * correlate userspace IPs to code. They have the following structure:
> *
> @@ -498,6 +508,7 @@ enum perf_event_type {
> * struct perf_event_header header;
> * u64 id;
> * u64 lost;
> + * struct sample_id sample_id;
> * };
> */
> PERF_RECORD_LOST = 2,
So in Adrian Hunter's posted patches the PERF_RECORD_MMAP documentation
is also patched like all the others, with
struct sample_id sample_id;
But in this tip message (and in the current linus-git kernel) somehow the
PERF_RECORD_MMAP line of the patch was dropped.
Was that intentional?
I'm trying to document this mess in the manpage, not fun.
Vince
On Fri, Sep 13, 2013 at 05:31:26PM -0400, Vince Weaver wrote:
> So in Adrian Hunter's posted patches the PERF_RECORD_MMAP documentation
> is also patched like all the others, with
>
> struct sample_id sample_id;
>
> But in this tip message (and in the current linus-git kernel) somehow the
> PERF_RECORD_MMAP line of the patch was dropped.
>
> Was that intentional?
No, no idea what happened..
---
Subject: perf: Update ABI comment
From: Peter Zijlstra <[email protected]>
Date: Fri Sep 13 23:39:17 CEST 2013
For some mysterious reason the sample_id field of PERF_RECORD_MMAP
went AWOL.
Reported-by: Vince Weaver <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
---
include/uapi/linux/perf_event.h | 1 +
1 file changed, 1 insertion(+)
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -528,6 +528,7 @@ enum perf_event_type {
* u64 len;
* u64 pgoff;
* char filename[];
+ * struct sample_id sample_id;
* };
*/
PERF_RECORD_MMAP = 1,