Hi
Here are some fixes and tweaks (version 2) for perf tools.
Changes in V2:
  perf tools: Fix non-debug build
    New patch
  perf evsel: Add a debug print if perf_event_open fails
    Unchanged
  perf script: Make perf_script a local variable
    Split from "perf script: Set up output options for in-stream attributes"
  perf script: Set up output options for in-stream attributes
    Split out "perf script: Make perf_script a local variable"
  perf inject: Do not repipe attributes to a perf.data file
    Unchanged
  perf tools: Fix 32-bit cross build
    Pass only EXTRA_CFLAGS
  perf tools: Fix libunwind build and feature detection for 32-bit build
    Add Jiri's Ack
  perf evlist: Add a debug print if event buffer mmap fails
    Add errno
  perf tools: Allow non-matching sample types
    Suppress compatible sample types for trace tool
  perf sched: Make struct perf_sched sched a local variable
    New patch
  perf sched: Fix optimized build time
    New patch
  perf tools: Do not accept parse_tag_value() overflow
    New patch
  perf tools: Validate that mmap_pages is not too big
    New patch
Patches dropped because they have been applied:
  perf evsel: Add missing 'mmap2' from debug print
  perf record: Improve write_output error message
  perf evsel: Add missing decrement in id sample parsing
  perf session: Add missing sample flush for piped events
  perf session: Add missing members to perf_event__attr_swap()
  perf evlist: Fix 32-bit build error
  perf tools: Fix test_on_exit for 32-bit build
  perf tools: Fix bench/numa.c for 32-bit build
  perf tools: fix perf_evlist__mmap comments
  perf tools: factor out duplicated evlist mmap code
  perf script: print addr by default for BTS
Adrian Hunter (14):
perf tools: Fix non-debug build
perf evsel: Add a debug print if perf_event_open fails
perf script: Make perf_script a local variable
perf script: Set up output options for in-stream attributes
perf inject: Do not repipe attributes to a perf.data file
perf tools: Fix 32-bit cross build
perf tools: Fix libunwind build and feature detection for 32-bit build
perf evlist: Add a debug print if event buffer mmap fails
perf record: Add an option to force per-cpu mmaps
perf tools: Allow non-matching sample types
perf sched: Make struct perf_sched sched a local variable
perf sched: Fix optimized build time
perf tools: Do not accept parse_tag_value() overflow
perf tools: Validate that mmap_pages is not too big
tools/perf/Documentation/perf-record.txt | 6 ++
tools/perf/Makefile.perf | 2 +-
tools/perf/builtin-inject.c | 5 ++
tools/perf/builtin-record.c | 2 +
tools/perf/builtin-sched.c | 44 +++++++------
tools/perf/builtin-script.c | 102 +++++++++++++++++++++---------
tools/perf/builtin-trace.c | 1 +
tools/perf/config/Makefile | 12 +++-
tools/perf/config/feature-checks/Makefile | 6 +-
tools/perf/perf.h | 1 +
tools/perf/util/event.h | 16 +++++
tools/perf/util/evlist.c | 45 +++++++++++--
tools/perf/util/evlist.h | 1 +
tools/perf/util/evsel.c | 6 +-
tools/perf/util/record.c | 5 +-
tools/perf/util/target.h | 1 +
tools/perf/util/util.c | 2 +
17 files changed, 190 insertions(+), 67 deletions(-)
Regards
Adrian
In the absence of a DEBUG variable definition
on the command line, perf tools was building
without optimization. Fix by assigning
DEBUG if it is not defined.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/config/Makefile | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index c516d6b..543aa95 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -66,6 +66,10 @@ ifneq ($(WERROR),0)
CFLAGS += -Werror
endif
+ifndef DEBUG
+ DEBUG := 0
+endif
+
ifeq ($(DEBUG),0)
CFLAGS += -O6
endif
--
1.7.11.7
There is a debug print (at verbose level 2) for each
call to perf_event_open. Add another debug print if
the call fails, and print the error number.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/evsel.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ec0cc1e..e09c7e6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1052,6 +1052,8 @@ retry_open:
group_fd, flags);
if (FD(evsel, cpu, thread) < 0) {
err = -errno;
+ pr_debug2("perf_event_open failed, error %d\n",
+ err);
goto try_fallback;
}
set_rlimit = NO_CHANGE;
--
1.7.11.7
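The hunk above saves errno as a negative error code right after the failing call, before anything else can overwrite it. A minimal sketch of that pattern in isolation follows. It is an illustration only: try_open() is a hypothetical helper, and open() stands in for the perf_event_open call.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of perf: open() stands in for
 * perf_event_open(). Returns the fd on success or -errno on failure.
 */
static int try_open(const char *path)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0) {
		/* Save errno first: fprintf() itself may change it. */
		int err = -errno;

		fprintf(stderr, "open failed, error %d\n", err);
		return err;
	}
	return fd;
}
```

Returning the negative errno keeps the failure reason available to the caller, which is what the fallback path in the patch relies on.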
Change perf_script from being global to being local.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-script.c | 40 ++++++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 16 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 27de606..0c50e4ae 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -542,18 +542,9 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
return 0;
}
-static struct perf_tool perf_script = {
- .sample = process_sample_event,
- .mmap = perf_event__process_mmap,
- .mmap2 = perf_event__process_mmap2,
- .comm = perf_event__process_comm,
- .exit = perf_event__process_exit,
- .fork = perf_event__process_fork,
- .attr = perf_event__process_attr,
- .tracing_data = perf_event__process_tracing_data,
- .build_id = perf_event__process_build_id,
- .ordered_samples = true,
- .ordering_requires_timestamps = true,
+struct perf_script {
+ struct perf_tool tool;
+ struct perf_session *session;
};
static void sig_handler(int sig __maybe_unused)
@@ -561,13 +552,13 @@ static void sig_handler(int sig __maybe_unused)
session_done = 1;
}
-static int __cmd_script(struct perf_session *session)
+static int __cmd_script(struct perf_script *scr)
{
int ret;
signal(SIGINT, sig_handler);
- ret = perf_session__process_events(session, &perf_script);
+ ret = perf_session__process_events(scr->session, &scr->tool);
if (debug_mode)
pr_err("Misordered timestamps: %" PRIu64 "\n", nr_unordered);
@@ -1273,6 +1264,21 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
char *script_path = NULL;
const char **__argv;
int i, j, err;
+ struct perf_script perf_script = {
+ .tool = {
+ .sample = process_sample_event,
+ .mmap = perf_event__process_mmap,
+ .mmap2 = perf_event__process_mmap2,
+ .comm = perf_event__process_comm,
+ .exit = perf_event__process_exit,
+ .fork = perf_event__process_fork,
+ .attr = perf_event__process_attr,
+ .tracing_data = perf_event__process_tracing_data,
+ .build_id = perf_event__process_build_id,
+ .ordered_samples = true,
+ .ordering_requires_timestamps = true,
+ },
+ };
const struct option options[] = {
OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace,
"dump raw trace in ASCII"),
@@ -1498,10 +1504,12 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
if (!script_name)
setup_pager();
- session = perf_session__new(&file, false, &perf_script);
+ session = perf_session__new(&file, false, &perf_script.tool);
if (session == NULL)
return -ENOMEM;
+ perf_script.session = session;
+
if (cpu_list) {
if (perf_session__cpu_bitmap(session, cpu_list, cpu_bitmap))
return -1;
@@ -1565,7 +1573,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
if (err < 0)
goto out;
- err = __cmd_script(session);
+ err = __cmd_script(&perf_script);
perf_session__delete(session);
cleanup_scripting();
--
1.7.11.7
perf.data files contain the attributes separately, so do not
put them in the event stream as well.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-inject.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 4aa6d78..e7ac679 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -72,12 +72,17 @@ static int perf_event__repipe_attr(struct perf_tool *tool,
union perf_event *event,
struct perf_evlist **pevlist)
{
+ struct perf_inject *inject = container_of(tool, struct perf_inject,
+ tool);
int ret;
ret = perf_event__process_attr(tool, event, pevlist);
if (ret)
return ret;
+ if (!inject->pipe_output)
+ return 0;
+
return perf_event__repipe_synth(tool, event);
}
--
1.7.11.7
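The patch above relies on the tool structure being embedded in a larger structure, so that a callback handed only the tool pointer can get back to its owner. A minimal sketch of that pattern, assuming simplified stand-in structures (struct tool, struct inject and the repiped counter are illustrative, not the perf definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the perf structures; fields are illustrative. */
struct tool {
	int (*attr)(struct tool *tool);
};

struct inject {
	struct tool tool;	/* embedded, so a tool pointer maps back to inject */
	bool pipe_output;
	int repiped;		/* counts events forwarded to the output */
};

/* Same idea as the kernel's container_of(), written with offsetof(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int repipe_attr(struct tool *tool)
{
	struct inject *inject;

	assert(tool != NULL);
	inject = container_of(tool, struct inject, tool);

	/* A perf.data file stores attributes separately: skip the repipe. */
	if (!inject->pipe_output)
		return 0;

	inject->repiped++;
	return 0;
}
```

Because the owner is recovered from the pointer, no global variable is needed to reach the pipe_output flag.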
Add a debug print if mmap of the perf event
ring buffer fails.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/evlist.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 85c4c80..4bc2a3a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -600,6 +600,8 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist,
evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, prot,
MAP_SHARED, fd, 0);
if (evlist->mmap[idx].base == MAP_FAILED) {
+ pr_debug2("failed to mmap perf event ring buffer, error %d\n",
+ errno);
evlist->mmap[idx].base = NULL;
return -1;
}
--
1.7.11.7
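Note that mmap() reports failure with MAP_FAILED rather than NULL, which is why the code above compares against MAP_FAILED and then normalizes the pointer to NULL. A minimal sketch of the same check, assuming a hypothetical helper map_ring() that also hands the error number back to the caller:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of fd read-only. Returns NULL on
 * failure with the error number stored in *err, which is 0 on success.
 */
static void *map_ring(int fd, size_t len, int *err)
{
	void *base;

	assert(err != NULL);
	*err = 0;

	base = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (base == MAP_FAILED) {
		/* mmap() signals failure with MAP_FAILED, not NULL. */
		*err = errno;
		fprintf(stderr, "failed to mmap, error %d\n", *err);
		return NULL;
	}
	return base;
}
```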
By default, when tasks are specified (i.e. -p, -t
or -u options) per-thread mmaps are created. Add
an option to override that and force per-cpu mmaps.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 6 ++++++
tools/perf/builtin-record.c | 2 ++
tools/perf/util/evlist.c | 4 +++-
tools/perf/util/evsel.c | 4 ++--
tools/perf/util/target.h | 1 +
5 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index f10ab63..2ea6685 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -189,6 +189,12 @@ abort events and some memory events in precise mode on modern Intel CPUs.
--transaction::
Record transaction flags for transaction related events.
+--force-per-cpu::
+Force the use of per-cpu mmaps. By default, when tasks are specified (i.e. -p,
+-t or -u options) per-thread mmaps are created. This option overrides that and
+forces per-cpu mmaps. A side-effect of that is that inheritance is
+automatically enabled. Add the -i option also to disable inheritance.
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index ab8d15e..4c0657f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -860,6 +860,8 @@ const struct option record_options[] = {
"sample by weight (on special events only)"),
OPT_BOOLEAN(0, "transaction", &record.opts.sample_transaction,
"sample transaction flags (special events only)"),
+ OPT_BOOLEAN(0, "force-per-cpu", &record.opts.target.force_per_cpu,
+ "force the use of per-cpu mmaps"),
OPT_END()
};
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 4bc2a3a..9e2b98c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -792,7 +792,9 @@ int perf_evlist__create_maps(struct perf_evlist *evlist,
if (evlist->threads == NULL)
return -1;
- if (perf_target__has_task(target))
+ if (target->force_per_cpu)
+ evlist->cpus = cpu_map__new(target->cpu_list);
+ else if (perf_target__has_task(target))
evlist->cpus = cpu_map__dummy_new();
else if (!perf_target__has_cpu(target) && !target->uses_mmap)
evlist->cpus = cpu_map__dummy_new();
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index e09c7e6..f1839b0 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -645,7 +645,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
}
}
- if (perf_target__has_cpu(&opts->target))
+ if (perf_target__has_cpu(&opts->target) || opts->target.force_per_cpu)
perf_evsel__set_sample_bit(evsel, CPU);
if (opts->period)
@@ -653,7 +653,7 @@ void perf_evsel__config(struct perf_evsel *evsel,
if (!perf_missing_features.sample_id_all &&
(opts->sample_time || !opts->no_inherit ||
- perf_target__has_cpu(&opts->target)))
+ perf_target__has_cpu(&opts->target) || opts->target.force_per_cpu))
perf_evsel__set_sample_bit(evsel, TIME);
if (opts->raw_samples) {
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index a4be857..6d7efbd 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -12,6 +12,7 @@ struct perf_target {
uid_t uid;
bool system_wide;
bool uses_mmap;
+ bool force_per_cpu;
};
enum perf_target_errno {
--
1.7.11.7
parse_tag_value() accepts an "unsigned long" and
multiplies it according to a tag character. Do
not accept the value if the multiplication
overflows.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/util.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index c25e57b..28a0a89 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -386,6 +386,8 @@ unsigned long parse_tag_value(const char *str, struct parse_tag *tags)
if (s != endptr)
break;
+ if (value > ULONG_MAX / i->mult)
+ break;
value *= i->mult;
return value;
}
--
1.7.11.7
For kernels that do not support PERF_SAMPLE_IDENTIFIER,
sample types need not be identical to determine
the sample id from the event. Only the position
of the sample id needs to be the same.
Compatible sample types are ones in which the bits
defined by PERF_COMPAT_MASK are the same.
'perf_evlist__config()' forces sample types to be
compatible on that basis.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-trace.c | 1 +
tools/perf/perf.h | 1 +
tools/perf/util/event.h | 16 ++++++++++++++++
tools/perf/util/evlist.c | 25 +++++++++++++++++++++++++
tools/perf/util/evlist.h | 1 +
tools/perf/util/record.c | 5 ++++-
6 files changed, 48 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index fa620bc..5da3920 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2060,6 +2060,7 @@ int cmd_trace(int argc, const char **argv, const char *prefix __maybe_unused)
.user_interval = ULLONG_MAX,
.no_delay = true,
.mmap_pages = 1024,
+ .incompatible_sample_types = true,
},
.output = stdout,
.show_comm = true,
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index f61c230..aeecdf7 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -233,6 +233,7 @@ struct perf_record_opts {
u64 user_interval;
u16 stack_dump_size;
bool sample_transaction;
+ bool incompatible_sample_types;
};
#endif
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 752709c..ca0689c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -78,6 +78,22 @@ struct throttle_event {
/* perf sample has 16 bits size limit */
#define PERF_SAMPLE_MAX_SIZE (1 << 16)
+/*
+ * Events have compatible sample types if the following bits all have the same
+ * value. This is because the order of sample members is fixed. For sample
+ * events the order is: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME,
+ * PERF_SAMPLE_ADDR, PERF_SAMPLE_ID. For non-sample events the sample members
+ * are accessed in reverse order. The order is: PERF_SAMPLE_ID,
+ * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU. PERF_SAMPLE_IDENTIFIER is added for
+ * completeness but it should not be used with PERF_SAMPLE_ID. Sample types
+ * that include PERF_SAMPLE_IDENTIFIER are always compatible.
+ */
+#define PERF_COMPAT_MASK \
+ (PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
+ PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
+ PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
+ PERF_SAMPLE_CPU | PERF_SAMPLE_IDENTIFIER)
+
struct sample_event {
struct perf_event_header header;
u64 array[];
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9e2b98c..9d17998 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -87,6 +87,31 @@ static void perf_evlist__update_id_pos(struct perf_evlist *evlist)
perf_evlist__set_id_pos(evlist);
}
+/**
+ * perf_evlist__make_sample_types_compatible - make sample types compatible.
+ * @evlist: selected event list
+ *
+ * Events with compatible sample types all have the same id_pos and is_pos.
+ * This can be achieved by matching the bits of PERF_COMPAT_MASK.
+ */
+void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel;
+ u64 compat = 0;
+
+ list_for_each_entry(evsel, &evlist->entries, node)
+ compat |= evsel->attr.sample_type & PERF_COMPAT_MASK;
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ evsel->attr.sample_type |= compat;
+ evsel->sample_size =
+ __perf_evsel__sample_size(evsel->attr.sample_type);
+ perf_evsel__calc_id_pos(evsel);
+ }
+
+ perf_evlist__set_id_pos(evlist);
+}
+
static void perf_evlist__purge(struct perf_evlist *evlist)
{
struct perf_evsel *pos, *n;
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7f8f1ae..8c5cdb9 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -94,6 +94,7 @@ int perf_evlist__open(struct perf_evlist *evlist);
void perf_evlist__close(struct perf_evlist *evlist);
void perf_evlist__set_id_pos(struct perf_evlist *evlist);
+void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist);
bool perf_can_sample_identifier(void);
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts);
diff --git a/tools/perf/util/record.c b/tools/perf/util/record.c
index 18d73aa..1eb1290 100644
--- a/tools/perf/util/record.c
+++ b/tools/perf/util/record.c
@@ -104,5 +104,8 @@ void perf_evlist__config(struct perf_evlist *evlist,
perf_evsel__set_sample_id(evsel, use_sample_identifier);
}
- perf_evlist__set_id_pos(evlist);
+ if (use_sample_identifier || opts->incompatible_sample_types)
+ perf_evlist__set_id_pos(evlist);
+ else
+ perf_evlist__make_sample_types_compatible(evlist);
}
--
1.7.11.7
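The two loops in perf_evlist__make_sample_types_compatible() can be sketched on a plain array of sample types. This is a simplified model: the S_* bit values and COMPAT_MASK below are illustrative stand-ins, not the values from the perf ABI, and the recalculation of sample size and id position is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values only; the real ones come from the perf ABI header. */
#define S_IP	(1u << 0)
#define S_TID	(1u << 1)
#define S_TIME	(1u << 2)
#define S_ID	(1u << 3)
#define S_RAW	(1u << 4)	/* deliberately not part of the mask */

#define COMPAT_MASK	(S_IP | S_TID | S_TIME | S_ID)

static void make_compatible(uint64_t *sample_types, size_t n)
{
	uint64_t compat = 0;
	size_t i;

	assert(sample_types != NULL || n == 0);

	/* Pass 1: union of the mask bits used by any event. */
	for (i = 0; i < n; i++)
		compat |= sample_types[i] & COMPAT_MASK;

	/* Pass 2: give every event that union; other bits are left alone. */
	for (i = 0; i < n; i++)
		sample_types[i] |= compat;
}
```

After the second pass every event agrees on the masked bits, so the sample id sits at the same position in each event, while bits outside the mask can still differ.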
Amend perf_evlist__parse_mmap_pages() to check that
the mmap_pages entered via the --mmap_pages/-m
option is not too big.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/evlist.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 9d17998..9d6d01c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -725,7 +725,8 @@ static size_t perf_evlist__mmap_size(unsigned long pages)
int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
int unset __maybe_unused)
{
- unsigned int pages, val, *mmap_pages = opt->value;
+ unsigned int *mmap_pages = opt->value;
+ unsigned long pages, val;
size_t size;
static struct parse_tag tags[] = {
{ .tag = 'B', .mult = 1 },
@@ -736,12 +737,12 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
};
val = parse_tag_value(str, tags);
- if (val != (unsigned int) -1) {
+ if (val != (unsigned long) -1) {
/* we got file size value */
pages = PERF_ALIGN(val, page_size) / page_size;
- if (!is_power_of_2(pages)) {
+ if (pages < (1UL << 31) && !is_power_of_2(pages)) {
pages = next_pow2(pages);
- pr_info("rounding mmap pages size to %u (%u pages)\n",
+ pr_info("rounding mmap pages size to %lu (%lu pages)\n",
pages * page_size, pages);
}
} else {
@@ -754,6 +755,11 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
}
}
+ if (pages > UINT_MAX || pages > SIZE_MAX / page_size) {
+ pr_err("--mmap_pages/-m value too big\n");
+ return -1;
+ }
+
size = perf_evlist__mmap_size(pages);
if (!size) {
pr_err("--mmap_pages/-m value must be a power of two.");
--
1.7.11.7
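The new range check has two parts: the page count must fit the unsigned int it is stored in, and the byte size (pages times page size) must fit in a size_t. A minimal sketch of the check as a predicate, assuming a hypothetical function name and a page size passed in as a parameter:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical predicate: true if the page count fits in an unsigned int
 * and pages * page_size can be computed in a size_t without overflow.
 */
static bool mmap_pages_ok(unsigned long pages, size_t page_size)
{
	assert(page_size != 0);

	if (pages > UINT_MAX)
		return false;
	if (pages > SIZE_MAX / page_size)
		return false;
	return true;
}
```

On a 64-bit build the first test is the one that usually triggers. On a 32-bit build, where unsigned long and unsigned int have the same width, the second test does the work.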
Change "struct perf_sched sched" from being global
to being local.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-sched.c | 41 ++++++++++++++++++++---------------------
1 file changed, 20 insertions(+), 21 deletions(-)
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 5a46b10..5a33856 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1655,29 +1655,28 @@ static int __cmd_record(int argc, const char **argv)
return cmd_record(i, rec_argv, NULL);
}
-static const char default_sort_order[] = "avg, max, switch, runtime";
-static struct perf_sched sched = {
- .tool = {
- .sample = perf_sched__process_tracepoint_sample,
- .comm = perf_event__process_comm,
- .lost = perf_event__process_lost,
- .fork = perf_sched__process_fork_event,
- .ordered_samples = true,
- },
- .cmp_pid = LIST_HEAD_INIT(sched.cmp_pid),
- .sort_list = LIST_HEAD_INIT(sched.sort_list),
- .start_work_mutex = PTHREAD_MUTEX_INITIALIZER,
- .work_done_wait_mutex = PTHREAD_MUTEX_INITIALIZER,
- .curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
- .sort_order = default_sort_order,
- .replay_repeat = 10,
- .profile_cpu = -1,
- .next_shortname1 = 'A',
- .next_shortname2 = '0',
-};
-
int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
{
+ const char default_sort_order[] = "avg, max, switch, runtime";
+ struct perf_sched sched = {
+ .tool = {
+ .sample = perf_sched__process_tracepoint_sample,
+ .comm = perf_event__process_comm,
+ .lost = perf_event__process_lost,
+ .fork = perf_sched__process_fork_event,
+ .ordered_samples = true,
+ },
+ .cmp_pid = LIST_HEAD_INIT(sched.cmp_pid),
+ .sort_list = LIST_HEAD_INIT(sched.sort_list),
+ .start_work_mutex = PTHREAD_MUTEX_INITIALIZER,
+ .work_done_wait_mutex = PTHREAD_MUTEX_INITIALIZER,
+ .curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
+ .sort_order = default_sort_order,
+ .replay_repeat = 10,
+ .profile_cpu = -1,
+ .next_shortname1 = 'A',
+ .next_shortname2 = '0',
+ };
const struct option latency_options[] = {
OPT_STRING('s', "sort", &sched.sort_order, "key[,key2...]",
"sort by key(s): runtime, switch, avg, max"),
--
1.7.11.7
builtin-sched.c took a long time to build with
-O6 optimization. This turned out to be caused
by:
.curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
Fix by initializing curr_pid programmatically.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-sched.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 5a33856..ddb5dc1 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1670,7 +1670,6 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
.sort_list = LIST_HEAD_INIT(sched.sort_list),
.start_work_mutex = PTHREAD_MUTEX_INITIALIZER,
.work_done_wait_mutex = PTHREAD_MUTEX_INITIALIZER,
- .curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
.sort_order = default_sort_order,
.replay_repeat = 10,
.profile_cpu = -1,
@@ -1732,6 +1731,10 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
.switch_event = replay_switch_event,
.fork_event = replay_fork_event,
};
+ unsigned int i;
+
+ for (i = 0; i < ARRAY_SIZE(sched.curr_pid); i++)
+ sched.curr_pid[i] = -1;
argc = parse_options(argc, argv, sched_options, sched_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
--
1.7.11.7
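The range designator [0 ... MAX_CPUS - 1] is a GNU C extension, and the fix replaces it with a loop that runs at start-up. A minimal sketch of that replacement, assuming a cut-down structure (struct sched_state, its fields and the MAX_CPUS value are illustrative):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS	4096	/* illustrative size */
#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Cut-down stand-in for struct perf_sched. */
struct sched_state {
	int curr_pid[MAX_CPUS];
	int replay_repeat;
};

static void sched_state_init(struct sched_state *s)
{
	size_t i;

	assert(s != NULL);

	s->replay_repeat = 10;

	/* Runtime loop instead of .curr_pid = { [0 ... MAX_CPUS - 1] = -1 } */
	for (i = 0; i < ARRAY_SIZE(s->curr_pid); i++)
		s->curr_pid[i] = -1;
}
```

Using ARRAY_SIZE() on the member keeps the loop bound tied to the array definition, so the two cannot drift apart if the array size changes.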
Use -lunwind-x86 instead of -lunwind-x86_64 for
32-bit build.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
---
tools/perf/config/Makefile | 6 ++++--
tools/perf/config/feature-checks/Makefile | 4 ++--
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 2f1d7d7..ffb5f55 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -25,9 +25,11 @@ ifeq ($(ARCH),x86_64)
RAW_ARCH := x86_64
CFLAGS += -DHAVE_ARCH_X86_64_SUPPORT
ARCH_INCLUDE = ../../arch/x86/lib/memcpy_64.S ../../arch/x86/lib/memset_64.S
+ LIBUNWIND_LIBS = -lunwind -lunwind-x86_64
+ else
+ LIBUNWIND_LIBS = -lunwind -lunwind-x86
endif
NO_PERF_REGS := 0
- LIBUNWIND_LIBS = -lunwind -lunwind-x86_64
endif
ifeq ($(NO_PERF_REGS),0)
@@ -96,7 +98,7 @@ endif
feature_check = $(eval $(feature_check_code))
define feature_check_code
- feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CFLAGS="$(EXTRA_CFLAGS)" LDFLAGS=$(LDFLAGS) -C config/feature-checks test-$1 >/dev/null 2>/dev/null && echo 1 || echo 0)
+ feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CFLAGS="$(EXTRA_CFLAGS)" LDFLAGS=$(LDFLAGS) LIBUNWIND_LIBS="$(LIBUNWIND_LIBS)" -C config/feature-checks test-$1 >/dev/null 2>/dev/null && echo 1 || echo 0)
endef
feature_set = $(eval $(feature_set_code))
diff --git a/tools/perf/config/feature-checks/Makefile b/tools/perf/config/feature-checks/Makefile
index 353c00c..d37d58d 100644
--- a/tools/perf/config/feature-checks/Makefile
+++ b/tools/perf/config/feature-checks/Makefile
@@ -36,7 +36,7 @@ BUILD = $(CC) $(CFLAGS) $(LDFLAGS) -o $(OUTPUT)$@ $@.c
###############################
test-all:
- $(BUILD) -Werror -fstack-protector -fstack-protector-all -O2 -Werror -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lunwind -lunwind-x86_64 -lelf -laudit -I/usr/include/slang -lslang $(shell pkg-config --libs --cflags gtk+-2.0 2>/dev/null) $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl
+ $(BUILD) -Werror -fstack-protector -fstack-protector-all -O2 -Werror -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma $(LIBUNWIND_LIBS) -lelf -laudit -I/usr/include/slang -lslang $(shell pkg-config --libs --cflags gtk+-2.0 2>/dev/null) $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl
test-hello:
$(BUILD)
@@ -72,7 +72,7 @@ test-libnuma:
$(BUILD) -lnuma
test-libunwind:
- $(BUILD) -lunwind -lunwind-x86_64 -lelf
+ $(BUILD) $(LIBUNWIND_LIBS) -lelf
test-libaudit:
$(BUILD) -laudit
--
1.7.11.7
Setting EXTRA_CFLAGS=-m32 did not work because it
was not passed on to the libtraceevent build or to
the feature checks.
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/Makefile.perf | 2 +-
tools/perf/config/Makefile | 4 ++--
tools/perf/config/feature-checks/Makefile | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 326a26e..e462598 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -707,7 +707,7 @@ $(LIB_FILE): $(LIB_OBJS)
TE_SOURCES = $(wildcard $(TRACE_EVENT_DIR)*.[ch])
$(LIBTRACEEVENT): $(TE_SOURCES)
- $(QUIET_SUBDIR0)$(TRACE_EVENT_DIR) $(QUIET_SUBDIR1) O=$(OUTPUT) libtraceevent.a
+ $(QUIET_SUBDIR0)$(TRACE_EVENT_DIR) $(QUIET_SUBDIR1) O=$(OUTPUT) CFLAGS="-g -Wall $(EXTRA_CFLAGS)" libtraceevent.a
$(LIBTRACEEVENT)-clean:
$(call QUIET_CLEAN, libtraceevent)
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 543aa95..2f1d7d7 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -96,7 +96,7 @@ endif
feature_check = $(eval $(feature_check_code))
define feature_check_code
- feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) LDFLAGS=$(LDFLAGS) -C config/feature-checks test-$1 >/dev/null 2>/dev/null && echo 1 || echo 0)
+ feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CFLAGS="$(EXTRA_CFLAGS)" LDFLAGS=$(LDFLAGS) -C config/feature-checks test-$1 >/dev/null 2>/dev/null && echo 1 || echo 0)
endef
feature_set = $(eval $(feature_set_code))
@@ -173,7 +173,7 @@ ifeq ($(feature-all), 1)
#
$(foreach feat,$(CORE_FEATURE_TESTS),$(call feature_set,$(feat)))
else
- $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) LDFLAGS=$(LDFLAGS) -i -j -C config/feature-checks $(CORE_FEATURE_TESTS) >/dev/null 2>&1)
+ $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CFLAGS="$(EXTRA_CFLAGS)" LDFLAGS=$(LDFLAGS) -i -j -C config/feature-checks $(CORE_FEATURE_TESTS) >/dev/null 2>&1)
$(foreach feat,$(CORE_FEATURE_TESTS),$(call feature_check,$(feat)))
endif
diff --git a/tools/perf/config/feature-checks/Makefile b/tools/perf/config/feature-checks/Makefile
index 452b67c..353c00c 100644
--- a/tools/perf/config/feature-checks/Makefile
+++ b/tools/perf/config/feature-checks/Makefile
@@ -31,7 +31,7 @@ CC := $(CC) -MD
all: $(FILES)
-BUILD = $(CC) $(LDFLAGS) -o $(OUTPUT)$@ $@.c
+BUILD = $(CC) $(CFLAGS) $(LDFLAGS) -o $(OUTPUT)$@ $@.c
###############################
--
1.7.11.7
Attributes (struct perf_event_attr) are recorded
separately in the perf.data file. perf script uses
them to set up output options. However, attributes
can also be in the event stream, for example when
the input is a pipe (i.e. live mode). This patch
makes perf script process in-stream attributes in
the same way as on-file attributes.
Here is an example:
Before this patch:
$ perf record uname | perf script
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB (null) (~655 samples) ]
:4220 4220 [-01] 2933367.838906: cycles:
:4220 4220 [-01] 2933367.838910: cycles:
:4220 4220 [-01] 2933367.838912: cycles:
:4220 4220 [-01] 2933367.838914: cycles:
:4220 4220 [-01] 2933367.838916: cycles:
:4220 4220 [-01] 2933367.838918: cycles:
uname 4220 [-01] 2933367.838938: cycles:
uname 4220 [-01] 2933367.839207: cycles:
After this patch:
$ perf record uname | perf script
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB (null) (~655 samples) ]
:4582 4582 2933425.707724: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707728: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707730: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707732: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707734: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707736: cycles: ffffffff81309a24 memcpy ([kernel.kallsyms])
uname 4582 2933425.707760: cycles: ffffffff8109c1c7 enqueue_task_fair ([kernel.kallsyms])
uname 4582 2933425.707978: cycles: ffffffff81308457 clear_page_c ([kernel.kallsyms])
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-script.c | 64 +++++++++++++++++++++++++++++++++------------
1 file changed, 48 insertions(+), 16 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 0c50e4ae..b2270b5 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -229,6 +229,24 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
return 0;
}
+static void set_print_ip_opts(struct perf_event_attr *attr)
+{
+ unsigned int type = attr->type;
+
+ output[type].print_ip_opts = 0;
+ if (PRINT_FIELD(IP))
+ output[type].print_ip_opts |= PRINT_IP_OPT_IP;
+
+ if (PRINT_FIELD(SYM))
+ output[type].print_ip_opts |= PRINT_IP_OPT_SYM;
+
+ if (PRINT_FIELD(DSO))
+ output[type].print_ip_opts |= PRINT_IP_OPT_DSO;
+
+ if (PRINT_FIELD(SYMOFFSET))
+ output[type].print_ip_opts |= PRINT_IP_OPT_SYMOFFSET;
+}
+
/*
* verify all user requested events exist and the samples
* have the expected data
@@ -237,7 +255,6 @@ static int perf_session__check_output_opt(struct perf_session *session)
{
int j;
struct perf_evsel *evsel;
- struct perf_event_attr *attr;
for (j = 0; j < PERF_TYPE_MAX; ++j) {
evsel = perf_session__find_first_evtype(session, j);
@@ -260,20 +277,7 @@ static int perf_session__check_output_opt(struct perf_session *session)
if (evsel == NULL)
continue;
- attr = &evsel->attr;
-
- output[j].print_ip_opts = 0;
- if (PRINT_FIELD(IP))
- output[j].print_ip_opts |= PRINT_IP_OPT_IP;
-
- if (PRINT_FIELD(SYM))
- output[j].print_ip_opts |= PRINT_IP_OPT_SYM;
-
- if (PRINT_FIELD(DSO))
- output[j].print_ip_opts |= PRINT_IP_OPT_DSO;
-
- if (PRINT_FIELD(SYMOFFSET))
- output[j].print_ip_opts |= PRINT_IP_OPT_SYMOFFSET;
+ set_print_ip_opts(&evsel->attr);
}
return 0;
@@ -547,6 +551,34 @@ struct perf_script {
struct perf_session *session;
};
+static int process_attr(struct perf_tool *tool, union perf_event *event,
+ struct perf_evlist **pevlist)
+{
+ struct perf_script *scr = container_of(tool, struct perf_script, tool);
+ struct perf_evlist *evlist;
+ struct perf_evsel *evsel, *pos;
+ int err;
+
+ err = perf_event__process_attr(tool, event, pevlist);
+ if (err)
+ return err;
+
+ evlist = *pevlist;
+ evsel = perf_evlist__last(*pevlist);
+
+ if (evsel->attr.type >= PERF_TYPE_MAX)
+ return 0;
+
+ list_for_each_entry(pos, &evlist->entries, node) {
+ if (pos->attr.type == evsel->attr.type && pos != evsel)
+ return 0;
+ }
+
+ set_print_ip_opts(&evsel->attr);
+
+ return perf_evsel__check_attr(evsel, scr->session);
+}
+
static void sig_handler(int sig __maybe_unused)
{
session_done = 1;
@@ -1272,7 +1304,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
.comm = perf_event__process_comm,
.exit = perf_event__process_exit,
.fork = perf_event__process_fork,
- .attr = perf_event__process_attr,
+ .attr = process_attr,
.tracing_data = perf_event__process_tracing_data,
.build_id = perf_event__process_build_id,
.ordered_samples = true,
--
1.7.11.7
On 10/22/13 8:34 AM, Adrian Hunter wrote:
> -static int __cmd_script(struct perf_session *session)
> +static int __cmd_script(struct perf_script *scr)
for naming consistency that should be *script.
> {
> int ret;
>
> signal(SIGINT, sig_handler);
>
> - ret = perf_session__process_events(session, &perf_script);
> + ret = perf_session__process_events(scr->session, &scr->tool);
>
> if (debug_mode)
> pr_err("Misordered timestamps: %" PRIu64 "\n", nr_unordered);
> @@ -1273,6 +1264,21 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
> char *script_path = NULL;
> const char **__argv;
> int i, j, err;
> + struct perf_script perf_script = {
Ditto: struct perf_script script;
Otherwise the change looks fine to me.
Acked-by: David Ahern <[email protected]>
On 10/22/13 8:34 AM, Adrian Hunter wrote:
> +static int process_attr(struct perf_tool *tool, union perf_event *event,
> + struct perf_evlist **pevlist)
> +{
> + struct perf_script *scr = container_of(tool, struct perf_script, tool);
> + struct perf_evlist *evlist;
> + struct perf_evsel *evsel, *pos;
> + int err;
> +
> + err = perf_event__process_attr(tool, event, pevlist);
> + if (err)
> + return err;
> +
> + evlist = *pevlist;
> + evsel = perf_evlist__last(*pevlist);
This assumes new entries are added to the end of evlist in
perf_event__process_attr. Would be better to change it to return the
newly created evsel so you don't need to look it up after adding it.
> +
> + if (evsel->attr.type >= PERF_TYPE_MAX)
> + return 0;
> +
> + list_for_each_entry(pos, &evlist->entries, node) {
> + if (pos->attr.type == evsel->attr.type && pos != evsel)
> + return 0;
> + }
What's the point of this loop?
David
On 23/10/13 09:15, David Ahern wrote:
> On 10/22/13 8:34 AM, Adrian Hunter wrote:
>> +static int process_attr(struct perf_tool *tool, union perf_event *event,
>> + struct perf_evlist **pevlist)
>> +{
>> + struct perf_script *scr = container_of(tool, struct perf_script, tool);
>> + struct perf_evlist *evlist;
>> + struct perf_evsel *evsel, *pos;
>> + int err;
>> +
>> + err = perf_event__process_attr(tool, event, pevlist);
>> + if (err)
>> + return err;
>> +
>> + evlist = *pevlist;
>> + evsel = perf_evlist__last(*pevlist);
>
> This assumes new entries are added to the end of evlist in
> perf_event__process_attr. Would be better to change it to return the newly
> created evsel so you don't need to look it up after adding it.
perf_event__process_attr() must not reorder the attributes; doing so
would misrepresent the way they were recorded.
>
>> +
>> + if (evsel->attr.type >= PERF_TYPE_MAX)
>> + return 0;
>> +
>> + list_for_each_entry(pos, &evlist->entries, node) {
>> + if (pos->attr.type == evsel->attr.type && pos != evsel)
>> + return 0;
>> + }
>
> What's the point of this loop?
Each type is checked once - see perf_session__check_output_opt()
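
To illustrate the "each type is checked once" point, here is a minimal sketch of the same first-of-its-type test on a plain array instead of an evlist. The names last_is_first_of_type and SKETCH_TYPE_MAX are hypothetical stand-ins, not perf code, and the array stands in for the ordered list of attribute types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PERF_TYPE_MAX; the real value comes from the perf headers. */
#define SKETCH_TYPE_MAX 6

/*
 * Return true if the last entry of types[] needs its per-type checks run:
 * its type is in range and no earlier entry has the same type.
 * This mirrors the early returns in the quoted process_attr() loop.
 */
bool last_is_first_of_type(const unsigned int *types, size_t n)
{
	size_t i;

	if (n == 0 || types[n - 1] >= SKETCH_TYPE_MAX)
		return false;

	/* Compare against every earlier entry, skipping the last one itself. */
	for (i = 0; i + 1 < n; i++) {
		if (types[i] == types[n - 1])
			return false;
	}

	return true;
}
```

With the types {0, 1, 0}, the third entry is skipped because type 0 was already seen when the first entry arrived.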
On 10/22/13 8:34 AM, Adrian Hunter wrote:
> Change "struct perf_sched sched" from being global
> to being local.
>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/builtin-sched.c | 41 ++++++++++++++++++++---------------------
> 1 file changed, 20 insertions(+), 21 deletions(-)
>
Acked-by: David Ahern <[email protected]>
On 10/22/13 8:34 AM, Adrian Hunter wrote:
> builtin-sched.c took a long time to build with
> -O6 optimization. This turned out to be caused
> by:
>
> .curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
>
> Fix by initializing curr_pid programmatically.
>
> Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: David Ahern <[email protected]>
Em Wed, Oct 23, 2013 at 07:43:55AM +0100, David Ahern escreveu:
> On 10/22/13 8:34 AM, Adrian Hunter wrote:
> >Change "struct perf_sched sched" from being global to being local.
> Acked-by: David Ahern <[email protected]>
Hey guys, this is essentially a revert of:
commit f36f83f947ede547833e462696893f866df77324
Author: Namhyung Kim <[email protected]>
Date: Tue Jun 4 14:46:19 2013 +0900
perf sched: Move struct perf_sched definition out of cmd_sched()
For some reason it consumed quite a lot of compile time when declared
as a local variable, and the slowdown disappeared when moved out of the function.
Moving other variables/tables didn't help.
On my system this single-file-change build time reduced from 11s to 3s.
-----
- Arnaldo
Em Tue, Oct 22, 2013 at 10:34:16AM +0300, Adrian Hunter escreveu:
> builtin-sched.c took a long time to build with
> -O6 optimization. This turned out to be caused
> by:
>
> .curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
>
> Fix by initializing curr_pid programmatically.
Ok, understood, so it's just this bit that was causing the delay,
applying both patches, thanks!
- Arnaldo
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/builtin-sched.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 5a33856..ddb5dc1 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -1670,7 +1670,6 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
> .sort_list = LIST_HEAD_INIT(sched.sort_list),
> .start_work_mutex = PTHREAD_MUTEX_INITIALIZER,
> .work_done_wait_mutex = PTHREAD_MUTEX_INITIALIZER,
> - .curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
> .sort_order = default_sort_order,
> .replay_repeat = 10,
> .profile_cpu = -1,
> @@ -1732,6 +1731,10 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
> .switch_event = replay_switch_event,
> .fork_event = replay_fork_event,
> };
> + unsigned int i;
> +
> + for (i = 0; i < ARRAY_SIZE(sched.curr_pid); i++)
> + sched.curr_pid[i] = -1;
>
> argc = parse_options(argc, argv, sched_options, sched_usage,
> PARSE_OPT_STOP_AT_NON_OPTION);
> --
> 1.7.11.7
Commit-ID: 8a39df8faa1cb130f136d5e404332c16fbb936c0
Gitweb: http://git.kernel.org/tip/8a39df8faa1cb130f136d5e404332c16fbb936c0
Author: Adrian Hunter <[email protected]>
AuthorDate: Tue, 22 Oct 2013 10:34:15 +0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 23 Oct 2013 10:24:19 -0300
perf sched: Make struct perf_sched sched a local variable
Change "struct perf_sched sched" from being global to being local.
The build slowdown cured by f36f83f947ed is dealt with in the following
patch, by programmatically setting perf_sched.curr_pid.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: David Ahern <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-sched.c | 41 ++++++++++++++++++++---------------------
1 file changed, 20 insertions(+), 21 deletions(-)
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 5a46b10..5a33856 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1655,29 +1655,28 @@ static int __cmd_record(int argc, const char **argv)
return cmd_record(i, rec_argv, NULL);
}
-static const char default_sort_order[] = "avg, max, switch, runtime";
-static struct perf_sched sched = {
- .tool = {
- .sample = perf_sched__process_tracepoint_sample,
- .comm = perf_event__process_comm,
- .lost = perf_event__process_lost,
- .fork = perf_sched__process_fork_event,
- .ordered_samples = true,
- },
- .cmp_pid = LIST_HEAD_INIT(sched.cmp_pid),
- .sort_list = LIST_HEAD_INIT(sched.sort_list),
- .start_work_mutex = PTHREAD_MUTEX_INITIALIZER,
- .work_done_wait_mutex = PTHREAD_MUTEX_INITIALIZER,
- .curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
- .sort_order = default_sort_order,
- .replay_repeat = 10,
- .profile_cpu = -1,
- .next_shortname1 = 'A',
- .next_shortname2 = '0',
-};
-
int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
{
+ const char default_sort_order[] = "avg, max, switch, runtime";
+ struct perf_sched sched = {
+ .tool = {
+ .sample = perf_sched__process_tracepoint_sample,
+ .comm = perf_event__process_comm,
+ .lost = perf_event__process_lost,
+ .fork = perf_sched__process_fork_event,
+ .ordered_samples = true,
+ },
+ .cmp_pid = LIST_HEAD_INIT(sched.cmp_pid),
+ .sort_list = LIST_HEAD_INIT(sched.sort_list),
+ .start_work_mutex = PTHREAD_MUTEX_INITIALIZER,
+ .work_done_wait_mutex = PTHREAD_MUTEX_INITIALIZER,
+ .curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
+ .sort_order = default_sort_order,
+ .replay_repeat = 10,
+ .profile_cpu = -1,
+ .next_shortname1 = 'A',
+ .next_shortname2 = '0',
+ };
const struct option latency_options[] = {
OPT_STRING('s', "sort", &sched.sort_order, "key[,key2...]",
"sort by key(s): runtime, switch, avg, max"),
Commit-ID: 89c97d936e76b064a52ee056602b2a62b3f1ef70
Gitweb: http://git.kernel.org/tip/89c97d936e76b064a52ee056602b2a62b3f1ef70
Author: Adrian Hunter <[email protected]>
AuthorDate: Tue, 22 Oct 2013 10:34:09 +0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 23 Oct 2013 10:58:03 -0300
perf inject: Do not repipe attributes to a perf.data file
perf.data files contain the attributes separately, do not put them in
the event stream as well.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-inject.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index eb1a594..409ceaf 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -72,12 +72,17 @@ static int perf_event__repipe_attr(struct perf_tool *tool,
union perf_event *event,
struct perf_evlist **pevlist)
{
+ struct perf_inject *inject = container_of(tool, struct perf_inject,
+ tool);
int ret;
ret = perf_event__process_attr(tool, event, pevlist);
if (ret)
return ret;
+ if (!inject->pipe_output)
+ return 0;
+
return perf_event__repipe_synth(tool, event);
}
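
The patch above uses container_of() to get from the struct perf_tool pointer passed to the callback back to the enclosing struct perf_inject, so it can read pipe_output. A minimal standalone sketch of that pattern follows. All sketch_* names are hypothetical, and the macro is a simplified version without the type checking of the real one.

```c
#include <stddef.h>

/* Simplified container_of: step back from a member pointer to its enclosing struct. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct perf_tool. */
struct sketch_tool {
	int dummy;
};

/* Hypothetical stand-in for struct perf_inject: the tool is embedded, not pointed to. */
struct sketch_inject {
	struct sketch_tool tool;
	int pipe_output;
};

/* Return 1 if an attribute event should be repiped to the output, 0 otherwise. */
int sketch_should_repipe(struct sketch_tool *tool)
{
	struct sketch_inject *inject =
		sketch_container_of(tool, struct sketch_inject, tool);

	return inject->pipe_output ? 1 : 0;
}
```

The callback only receives the embedded tool pointer, so this is how per-command state reaches it without a global variable.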
Commit-ID: 6f3e5eda9d6cc74538430d8f9e8e4baa01249160
Gitweb: http://git.kernel.org/tip/6f3e5eda9d6cc74538430d8f9e8e4baa01249160
Author: Adrian Hunter <[email protected]>
AuthorDate: Tue, 22 Oct 2013 10:34:07 +0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 23 Oct 2013 10:27:03 -0300
perf script: Make perf_script a local variable
Change perf_script from being global to being local.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: David Ahern <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Made the minor consistency changes suggested by David Ahern ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-script.c | 40 ++++++++++++++++++++++++----------------
1 file changed, 24 insertions(+), 16 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 27de606..0ae88c2 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -542,18 +542,9 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
return 0;
}
-static struct perf_tool perf_script = {
- .sample = process_sample_event,
- .mmap = perf_event__process_mmap,
- .mmap2 = perf_event__process_mmap2,
- .comm = perf_event__process_comm,
- .exit = perf_event__process_exit,
- .fork = perf_event__process_fork,
- .attr = perf_event__process_attr,
- .tracing_data = perf_event__process_tracing_data,
- .build_id = perf_event__process_build_id,
- .ordered_samples = true,
- .ordering_requires_timestamps = true,
+struct perf_script {
+ struct perf_tool tool;
+ struct perf_session *session;
};
static void sig_handler(int sig __maybe_unused)
@@ -561,13 +552,13 @@ static void sig_handler(int sig __maybe_unused)
session_done = 1;
}
-static int __cmd_script(struct perf_session *session)
+static int __cmd_script(struct perf_script *script)
{
int ret;
signal(SIGINT, sig_handler);
- ret = perf_session__process_events(session, &perf_script);
+ ret = perf_session__process_events(script->session, &script->tool);
if (debug_mode)
pr_err("Misordered timestamps: %" PRIu64 "\n", nr_unordered);
@@ -1273,6 +1264,21 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
char *script_path = NULL;
const char **__argv;
int i, j, err;
+ struct perf_script script = {
+ .tool = {
+ .sample = process_sample_event,
+ .mmap = perf_event__process_mmap,
+ .mmap2 = perf_event__process_mmap2,
+ .comm = perf_event__process_comm,
+ .exit = perf_event__process_exit,
+ .fork = perf_event__process_fork,
+ .attr = perf_event__process_attr,
+ .tracing_data = perf_event__process_tracing_data,
+ .build_id = perf_event__process_build_id,
+ .ordered_samples = true,
+ .ordering_requires_timestamps = true,
+ },
+ };
const struct option options[] = {
OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace,
"dump raw trace in ASCII"),
@@ -1498,10 +1504,12 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
if (!script_name)
setup_pager();
- session = perf_session__new(&file, false, &perf_script);
+ session = perf_session__new(&file, false, &script.tool);
if (session == NULL)
return -ENOMEM;
+ script.session = session;
+
if (cpu_list) {
if (perf_session__cpu_bitmap(session, cpu_list, cpu_bitmap))
return -1;
@@ -1565,7 +1573,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
if (err < 0)
goto out;
- err = __cmd_script(session);
+ err = __cmd_script(&script);
perf_session__delete(session);
cleanup_scripting();
Commit-ID: 56921becdd1eb0720603fc2e6e4c7f518196d917
Gitweb: http://git.kernel.org/tip/56921becdd1eb0720603fc2e6e4c7f518196d917
Author: Adrian Hunter <[email protected]>
AuthorDate: Tue, 22 Oct 2013 10:34:17 +0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 23 Oct 2013 10:59:09 -0300
perf tools: Do not accept parse_tag_value() overflow
parse_tag_value() accepts an "unsigned long" and multiplies it according
to a tag character. Do not accept the value if the multiplication
overflows.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/util.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index c25e57b..28a0a89 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -386,6 +386,8 @@ unsigned long parse_tag_value(const char *str, struct parse_tag *tags)
if (s != endptr)
break;
+ if (value > ULONG_MAX / i->mult)
+ break;
value *= i->mult;
return value;
}
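
The check added above tests for overflow before multiplying, by comparing the value against ULONG_MAX divided by the multiplier. A minimal standalone sketch of the same test is below. The helper name sketch_mul_checked is hypothetical, and the guard against a zero multiplier is an addition for the sketch, not something shown in the patch.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Multiply *value by mult only if the product fits in an unsigned long.
 * Returns false and leaves *value untouched if it would overflow
 * (or if mult is zero, which this sketch rejects to avoid dividing by zero).
 */
bool sketch_mul_checked(unsigned long *value, unsigned long mult)
{
	if (mult == 0 || *value > ULONG_MAX / mult)
		return false;

	*value *= mult;
	return true;
}
```

The division form is used because the product itself cannot be inspected for overflow after the fact: unsigned arithmetic wraps silently.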
Commit-ID: 2fbe4abe944868aafdde233557ac85379b60ce46
Gitweb: http://git.kernel.org/tip/2fbe4abe944868aafdde233557ac85379b60ce46
Author: Adrian Hunter <[email protected]>
AuthorDate: Tue, 22 Oct 2013 10:34:18 +0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 23 Oct 2013 11:06:03 -0300
perf evlist: Validate that mmap_pages is not too big
Amend perf_evlist__parse_mmap_pages() to check that the mmap_pages
entered via the --mmap_pages/-m option is not too big.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evlist.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 85c4c80..2ce92ec 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -698,7 +698,8 @@ static size_t perf_evlist__mmap_size(unsigned long pages)
int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
int unset __maybe_unused)
{
- unsigned int pages, val, *mmap_pages = opt->value;
+ unsigned int *mmap_pages = opt->value;
+ unsigned long pages, val;
size_t size;
static struct parse_tag tags[] = {
{ .tag = 'B', .mult = 1 },
@@ -709,12 +710,12 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
};
val = parse_tag_value(str, tags);
- if (val != (unsigned int) -1) {
+ if (val != (unsigned long) -1) {
/* we got file size value */
pages = PERF_ALIGN(val, page_size) / page_size;
- if (!is_power_of_2(pages)) {
+ if (pages < (1UL << 31) && !is_power_of_2(pages)) {
pages = next_pow2(pages);
- pr_info("rounding mmap pages size to %u (%u pages)\n",
+ pr_info("rounding mmap pages size to %lu (%lu pages)\n",
pages * page_size, pages);
}
} else {
@@ -727,6 +728,11 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
}
}
+ if (pages > UINT_MAX || pages > SIZE_MAX / page_size) {
+ pr_err("--mmap_pages/-m value too big\n");
+ return -1;
+ }
+
size = perf_evlist__mmap_size(pages);
if (!size) {
pr_err("--mmap_pages/-m value must be a power of two.");
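
The new bounds check above has two parts: the page count must fit in the unsigned int that stores the option, and pages * page_size must fit in a size_t. A minimal standalone sketch of that validation is below. The function name sketch_mmap_pages_ok is hypothetical, and rejecting a zero page_size is an assumption made for the sketch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if 'pages' is small enough to be stored in an unsigned int
 * and to be multiplied by page_size without overflowing size_t.
 */
bool sketch_mmap_pages_ok(unsigned long pages, size_t page_size)
{
	if (page_size == 0)
		return false;

	/* The option value is stored in an unsigned int. */
	if (pages > UINT_MAX)
		return false;

	/* The byte size pages * page_size must be representable. */
	if (pages > SIZE_MAX / page_size)
		return false;

	return true;
}
```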
Commit-ID: 74af377bc25dd9ebcb0be12836abb6b401b5dd08
Gitweb: http://git.kernel.org/tip/74af377bc25dd9ebcb0be12836abb6b401b5dd08
Author: Adrian Hunter <[email protected]>
AuthorDate: Tue, 22 Oct 2013 10:34:05 +0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 23 Oct 2013 11:07:14 -0300
perf tools: Fix non-debug build
In the absence of a DEBUG variable definition on the command line perf
tools was building without optimization. Fix by assigning DEBUG if it
is not defined.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/config/Makefile | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index c516d6b..543aa95 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -66,6 +66,10 @@ ifneq ($(WERROR),0)
CFLAGS += -Werror
endif
+ifndef DEBUG
+ DEBUG := 0
+endif
+
ifeq ($(DEBUG),0)
CFLAGS += -O6
endif
Commit-ID: 156a2b022907687f28c72d1ba601015f295cd99e
Gitweb: http://git.kernel.org/tip/156a2b022907687f28c72d1ba601015f295cd99e
Author: Adrian Hunter <[email protected]>
AuthorDate: Tue, 22 Oct 2013 10:34:16 +0300
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 23 Oct 2013 10:24:29 -0300
perf sched: Optimize build time
builtin-sched.c took a long time to build with -O6 optimization. This
turned out to be caused by:
.curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
Fix by initializing curr_pid programmatically.
This addresses the problem cured in f36f83f947ed using a smaller hammer.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: David Ahern <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-sched.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 5a33856..ddb5dc1 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1670,7 +1670,6 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
.sort_list = LIST_HEAD_INIT(sched.sort_list),
.start_work_mutex = PTHREAD_MUTEX_INITIALIZER,
.work_done_wait_mutex = PTHREAD_MUTEX_INITIALIZER,
- .curr_pid = { [0 ... MAX_CPUS - 1] = -1 },
.sort_order = default_sort_order,
.replay_repeat = 10,
.profile_cpu = -1,
@@ -1732,6 +1731,10 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
.switch_event = replay_switch_event,
.fork_event = replay_fork_event,
};
+ unsigned int i;
+
+ for (i = 0; i < ARRAY_SIZE(sched.curr_pid); i++)
+ sched.curr_pid[i] = -1;
argc = parse_options(argc, argv, sched_options, sched_usage,
PARSE_OPT_STOP_AT_NON_OPTION);
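
A minimal standalone sketch of the approach in this patch: drop the range designated initializer (a GNU C extension) for the large array and fill it with a loop after the struct is set up. The sketch_* names and the array size of 4096 are assumptions for illustration, not taken from builtin-sched.c.

```c
#include <stddef.h>

/* Assumed array size for the sketch; the real MAX_CPUS is defined in builtin-sched.c. */
#define SKETCH_MAX_CPUS 4096
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical cut-down stand-in for struct perf_sched. */
struct sketch_sched {
	int curr_pid[SKETCH_MAX_CPUS];
	int replay_repeat;
};

/* Set the scalar defaults, then mark every CPU slot as having no current pid. */
void sketch_sched_init(struct sketch_sched *sched)
{
	size_t i;

	sched->replay_repeat = 10;

	for (i = 0; i < SKETCH_ARRAY_SIZE(sched->curr_pid); i++)
		sched->curr_pid[i] = -1;
}
```

The loop produces the same array contents as the initializer, so only build time should change, not behaviour.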