2016-03-29 23:43:46

by Arnaldo Carvalho de Melo

Subject: [GIT PULL 00/11] perf/core improvements and fixes

Hi Ingo,

Please consider pulling, this is on top of my previously submitted
acme/perf/urgent, so that we can test Andi's udis86 work on 'perf script'.

This is now test-built in several more docker images, including
minimal-feature cross-compiler ones:

# dm
minimal-debian-experimental-x-mips64: Ok
minimal-debian-experimental-x-mips64el: Ok
minimal-debian-experimental-x-mipsel: Ok
minimal-ubuntu-x-arm: Ok
minimal-ubuntu-x-arm64: Ok
minimal-ubuntu-x-ppc64: Ok
minimal-ubuntu-x-ppc64el: Ok
alldeps-debian: Ok
alldeps-mageia: Ok
alldeps-rhel7: Ok
alldeps-centos: Ok
alldeps-opensuse: Ok
alldeps-ubuntu: Ok
#

Those x-arch cross docker images already allow me to avoid introducing
bugs like the powerpc one Sukadev spotted.

I need to figure out how to install more devel packages for things like
libelf-devel:arch in debian/ubuntu. I almost got there with 'dpkg
--add-architecture arch', but I still need to figure out how to find the list
of multilib-enabled devel packages, so that I can have devel packages for
arches other than the native one...

- Arnaldo

The following changes since commit 3ea223adcb0c5893a6dc8ed3a84dce264cbb61d6:

perf tools: Add missing initialization of perf_sample.cpumode in synthesized samples (2016-03-29 20:03:56 -0300)

are available in the git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160329

for you to fetch changes up to 7c2927ccf0daf630cf66570f061c860c73df23c7:

perf script: Add support for printing assembler (2016-03-29 20:15:16 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

User visible:

- Add support for printing assembler using the udis86 library (Andi Kleen)

E.g.:

# perf record -e intel_pt// true
# perf script -F ip,sym,asm
<SNIP>
ffffffff8106399d native_write_msr_safe
ret
ffffffff81013728 pt_config
ret $0x5b81
ffffffff810139e0 pt_event_start
ret
ffffffff810144c3 pt_event_add
jnz 0x81014489
ffffffff81014491 pt_event_add
ret
ffffffff8119df62 event_sched_in.isra.93
jz 0x8119df69
ffffffff8119df78 event_sched_in.isra.93
jz event_sched_in.isra.93+506
ffffffff8119e069 event_sched_in.isra.93
call 0x81c29600
<SNIP>

- Add support for skipping itrace instructions, useful to fast forward
processor trace (Intel PT, BTS) to right after initialization code at the start
of a workload (Andi Kleen)

- Add support for backtraces in perl 'perf script's (Dima Kogan)

- Add -U/-K (--all-user/--all-kernel) options to 'perf mem' (Jiri Olsa)

- Make -f/--force option documentation consistent across tools (Jiri Olsa)

Infrastructure:

- Add 'perf test' to check for event times (Jiri Olsa)

- 'perf config' cleanups (Taeung Song)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

----------------------------------------------------------------
Andi Kleen (3):
perf tools: Add support for skipping itrace instructions
perf tools: Add probing for udev86 library
perf script: Add support for printing assembler

Dima Kogan (1):
perf script perl: Perl scripts now get a backtrace, like the python ones

Jiri Olsa (4):
perf mem: Add -U/-K (--all-user/--all-kernel) options
perf tools: Make hists__collapse_insert_entry static
perf tools: Make -f/--force option documentation consistent across tools
perf tests: Add test to check for event times

Taeung Song (3):
perf config: Remove duplicated set_buildid_dir calls
perf config: Rework buildid_dir_command_config to perf_buildid_config
perf config: Rename 'v' to 'home' in set_buildid_dir()

tools/build/Makefile.feature | 6 +-
tools/build/feature/Makefile | 8 +-
tools/build/feature/test-all.c | 5 +
tools/build/feature/test-udis86.c | 8 +
tools/perf/Documentation/intel-pt.txt | 7 +
tools/perf/Documentation/itrace.txt | 8 +
tools/perf/Documentation/perf-annotate.txt | 2 +-
tools/perf/Documentation/perf-diff.txt | 2 +-
tools/perf/Documentation/perf-mem.txt | 8 +
tools/perf/Documentation/perf-report.txt | 2 +-
tools/perf/Documentation/perf-script.txt | 8 +-
tools/perf/builtin-mem.c | 11 +-
tools/perf/builtin-script.c | 107 +++++++++-
tools/perf/config/Makefile | 5 +
tools/perf/perf.c | 3 +-
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/event-times.c | 236 +++++++++++++++++++++
tools/perf/tests/tests.h | 1 +
tools/perf/util/auxtrace.c | 7 +
tools/perf/util/auxtrace.h | 2 +
tools/perf/util/config.c | 57 ++---
tools/perf/util/hist.c | 5 +-
tools/perf/util/hist.h | 2 -
tools/perf/util/intel-bts.c | 5 +
tools/perf/util/intel-pt.c | 22 +-
.../perf/util/scripting-engines/trace-event-perl.c | 114 +++++++++-
27 files changed, 581 insertions(+), 65 deletions(-)
create mode 100644 tools/build/feature/test-udis86.c
create mode 100644 tools/perf/tests/event-times.c


2016-03-29 23:41:48

by Arnaldo Carvalho de Melo

Subject: [PATCH 08/11] perf script perl: Perl scripts now get a backtrace, like the python ones

From: Dima Kogan <[email protected]>

We have some infrastructure to use perl or python to analyze logs
generated by perf. Prior to this patch, only the python tools had
access to backtrace information. This patch makes this information
available to perl scripts as well. Example:

Let's look at malloc() calls made by the seq utility. First we
create a probe point:

$ perf probe -x /lib/x86_64-linux-gnu/libc.so.6 malloc
Added new events:
...

Now we run seq while monitoring malloc() calls with perf:

$ perf record --call-graph=dwarf -e probe_libc:malloc seq 5
1
2
3
4
5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.064 MB perf.data (6 samples) ]

We can use perf to look at its log to see the malloc calls and the backtrace:

$ perf script
seq 14195 [000] 1927993.748254: probe_libc:malloc: (7f9ff8edd320) bytes=0x22
7f9ff8edd320 malloc (/lib/x86_64-linux-gnu/libc-2.22.so)
7f9ff8e8eab0 set_binding_values.part.0 (/lib/x86_64-linux-gnu/libc-2.22.so)
7f9ff8e8eda1 __bindtextdomain (/lib/x86_64-linux-gnu/libc-2.22.so)
401b22 main (/usr/bin/seq)
7f9ff8e82610 __libc_start_main (/lib/x86_64-linux-gnu/libc-2.22.so)
402799 _start (/usr/bin/seq)
...

We can also use the scripting facilities. We create a skeleton perl
script that simply prints out the events:

$ perf script -g perl
generated Perl script: perf-script.pl

We can then use this script to see the malloc() calls with a
backtrace. Prior to this patch, the backtrace was not available to
the perl scripts.

$ perf script -s perf-script.pl
probe_libc::malloc 0 1927993.748254260 14195 seq __probe_ip=140325052863264, bytes=34
[7f9ff8edd320] malloc
[7f9ff8e8eab0] set_binding_values.part.0
[7f9ff8e8eda1] __bindtextdomain
[401b22] main
[7f9ff8e82610] __libc_start_main
[402799] _start
...
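
For readers more at home in C than Perl, the per-frame format printed by
the generated print_backtrace helper can be sketched as below. This is an
illustration only: 'struct bt_node' and format_bt_node() are hypothetical
names that do not exist in perf, and the struct is a simplified stand-in
for the hash that each callchain entry becomes on the Perl side.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified view of one callchain entry. */
struct bt_node {
	unsigned long long ip;	/* instruction pointer */
	const char *sym;	/* symbol name, NULL when unresolved */
};

/*
 * Format one entry as "\t[ip] name\n", or "\t[ip]\n" without a symbol.
 * Returns the number of characters snprintf() would have written.
 */
static int format_bt_node(char *buf, size_t len, const struct bt_node *node)
{
	assert(buf != NULL && node != NULL);
	if (node->sym)
		return snprintf(buf, len, "\t[%llx] %s\n", node->ip, node->sym);
	return snprintf(buf, len, "\t[%llx]\n", node->ip);
}
```

Calling format_bt_node() once per entry, innermost frame first, gives the
same shape as the listing above.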

Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dima Kogan <[email protected]>
---
.../perf/util/scripting-engines/trace-event-perl.c | 114 +++++++++++++++++++--
1 file changed, 106 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index b3aabc0d4eb0..1d160855cda9 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -31,6 +31,8 @@
#include <perl.h>

#include "../../perf.h"
+#include "../callchain.h"
+#include "../machine.h"
#include "../thread.h"
#include "../event.h"
#include "../trace-event.h"
@@ -248,10 +250,78 @@ static void define_event_symbols(struct event_format *event,
define_event_symbols(event, ev_name, args->next);
}

+static SV *perl_process_callchain(struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct addr_location *al)
+{
+ AV *list;
+
+ list = newAV();
+ if (!list)
+ goto exit;
+
+ if (!symbol_conf.use_callchain || !sample->callchain)
+ goto exit;
+
+ if (thread__resolve_callchain(al->thread, evsel,
+ sample, NULL, NULL,
+ PERF_MAX_STACK_DEPTH) != 0) {
+ pr_err("Failed to resolve callchain. Skipping\n");
+ goto exit;
+ }
+ callchain_cursor_commit(&callchain_cursor);
+
+
+ while (1) {
+ HV *elem;
+ struct callchain_cursor_node *node;
+ node = callchain_cursor_current(&callchain_cursor);
+ if (!node)
+ break;
+
+ elem = newHV();
+ if (!elem)
+ goto exit;
+
+ hv_stores(elem, "ip", newSVuv(node->ip));
+
+ if (node->sym) {
+ HV *sym = newHV();
+ if (!sym)
+ goto exit;
+ hv_stores(sym, "start", newSVuv(node->sym->start));
+ hv_stores(sym, "end", newSVuv(node->sym->end));
+ hv_stores(sym, "binding", newSVuv(node->sym->binding));
+ hv_stores(sym, "name", newSVpvn(node->sym->name,
+ node->sym->namelen));
+ hv_stores(elem, "sym", newRV_noinc((SV*)sym));
+ }
+
+ if (node->map) {
+ struct map *map = node->map;
+ const char *dsoname = "[unknown]";
+ if (map && map->dso && (map->dso->name || map->dso->long_name)) {
+ if (symbol_conf.show_kernel_path && map->dso->long_name)
+ dsoname = map->dso->long_name;
+ else if (map->dso->name)
+ dsoname = map->dso->name;
+ }
+ hv_stores(elem, "dso", newSVpv(dsoname,0));
+ }
+
+ callchain_cursor_advance(&callchain_cursor);
+ av_push(list, newRV_noinc((SV*)elem));
+ }
+
+exit:
+ return newRV_noinc((SV*)list);
+}
+
static void perl_process_tracepoint(struct perf_sample *sample,
struct perf_evsel *evsel,
- struct thread *thread)
+ struct addr_location *al)
{
+ struct thread *thread = al->thread;
struct event_format *event = evsel->tp_format;
struct format_field *field;
static char handler[256];
@@ -295,6 +365,7 @@ static void perl_process_tracepoint(struct perf_sample *sample,
XPUSHs(sv_2mortal(newSVuv(ns)));
XPUSHs(sv_2mortal(newSViv(pid)));
XPUSHs(sv_2mortal(newSVpv(comm, 0)));
+ XPUSHs(sv_2mortal(perl_process_callchain(sample, evsel, al)));

/* common fields other than pid can be accessed via xsub fns */

@@ -329,6 +400,7 @@ static void perl_process_tracepoint(struct perf_sample *sample,
XPUSHs(sv_2mortal(newSVuv(nsecs)));
XPUSHs(sv_2mortal(newSViv(pid)));
XPUSHs(sv_2mortal(newSVpv(comm, 0)));
+ XPUSHs(sv_2mortal(perl_process_callchain(sample, evsel, al)));
call_pv("main::trace_unhandled", G_SCALAR);
}
SPAGAIN;
@@ -366,7 +438,7 @@ static void perl_process_event(union perf_event *event,
struct perf_evsel *evsel,
struct addr_location *al)
{
- perl_process_tracepoint(sample, evsel, al->thread);
+ perl_process_tracepoint(sample, evsel, al);
perl_process_event_generic(event, sample, evsel);
}

@@ -490,7 +562,27 @@ static int perl_generate_script(struct pevent *pevent, const char *outfile)
fprintf(ofp, "use Perf::Trace::Util;\n\n");

fprintf(ofp, "sub trace_begin\n{\n\t# optional\n}\n\n");
- fprintf(ofp, "sub trace_end\n{\n\t# optional\n}\n\n");
+ fprintf(ofp, "sub trace_end\n{\n\t# optional\n}\n");
+
+
+ fprintf(ofp, "\n\
+sub print_backtrace\n\
+{\n\
+ my $callchain = shift;\n\
+ for my $node (@$callchain)\n\
+ {\n\
+ if(exists $node->{sym})\n\
+ {\n\
+ printf( \"\\t[\\%%x] \\%%s\\n\", $node->{ip}, $node->{sym}{name});\n\
+ }\n\
+ else\n\
+ {\n\
+ printf( \"\\t[\\%%x]\\n\", $node->{ip});\n\
+ }\n\
+ }\n\
+}\n\n\
+");
+

while ((event = trace_find_next_event(pevent, event))) {
fprintf(ofp, "sub %s::%s\n{\n", event->system, event->name);
@@ -502,7 +594,8 @@ static int perl_generate_script(struct pevent *pevent, const char *outfile)
fprintf(ofp, "$common_secs, ");
fprintf(ofp, "$common_nsecs,\n");
fprintf(ofp, "\t $common_pid, ");
- fprintf(ofp, "$common_comm,\n\t ");
+ fprintf(ofp, "$common_comm, ");
+ fprintf(ofp, "$common_callchain,\n\t ");

not_first = 0;
count = 0;
@@ -519,7 +612,7 @@ static int perl_generate_script(struct pevent *pevent, const char *outfile)

fprintf(ofp, "\tprint_header($event_name, $common_cpu, "
"$common_secs, $common_nsecs,\n\t "
- "$common_pid, $common_comm);\n\n");
+ "$common_pid, $common_comm, $common_callchain);\n\n");

fprintf(ofp, "\tprintf(\"");

@@ -581,17 +674,22 @@ static int perl_generate_script(struct pevent *pevent, const char *outfile)
fprintf(ofp, "$%s", f->name);
}

- fprintf(ofp, ");\n");
+ fprintf(ofp, ");\n\n");
+
+ fprintf(ofp, "\tprint_backtrace($common_callchain);\n");
+
fprintf(ofp, "}\n\n");
}

fprintf(ofp, "sub trace_unhandled\n{\n\tmy ($event_name, $context, "
"$common_cpu, $common_secs, $common_nsecs,\n\t "
- "$common_pid, $common_comm) = @_;\n\n");
+ "$common_pid, $common_comm, $common_callchain) = @_;\n\n");

fprintf(ofp, "\tprint_header($event_name, $common_cpu, "
"$common_secs, $common_nsecs,\n\t $common_pid, "
- "$common_comm);\n}\n\n");
+ "$common_comm, $common_callchain);\n");
+ fprintf(ofp, "\tprint_backtrace($common_callchain);\n");
+ fprintf(ofp, "}\n\n");

fprintf(ofp, "sub print_header\n{\n"
"\tmy ($event_name, $cpu, $secs, $nsecs, $pid, $comm) = @_;\n\n"
--
2.5.5

2016-03-29 23:42:00

by Arnaldo Carvalho de Melo

Subject: [PATCH 09/11] perf tools: Add support for skipping itrace instructions

From: Andi Kleen <[email protected]>

When using 'perf script' to look at PT traces it is often useful to
ignore the initialization code at the beginning.

On larger traces, which may have many millions of instructions in
initialization code, doing that in a pipeline can be very slow, with perf
script spending a lot of CPU time calling printf and writing data.

This patch adds an extension to the --itrace argument that skips 'n'
events (instructions, branches or transactions) at the beginning. This
is much more efficient.
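
A minimal sketch of the idea, not the patch itself: the count that follows
the 's' letter is parsed with strtoul(), and a counter drops events until
that many have been seen. 'struct skip_state', parse_skip() and
should_skip() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the decoder state; not a perf structure. */
struct skip_state {
	unsigned long initial_skip;	/* events to drop, 0 keeps everything */
	unsigned long num_events;	/* events seen so far */
};

/*
 * Parse the decimal count that follows the 's' letter, e.g. "1000000".
 * Returns 0 on success and -1 if no digits were found.
 */
static int parse_skip(const char *p, struct skip_state *st)
{
	char *endptr;

	assert(p != NULL && st != NULL);
	st->initial_skip = strtoul(p, &endptr, 10);
	st->num_events = 0;
	return p == endptr ? -1 : 0;
}

/* Return true while the event falls inside the initial window to drop. */
static bool should_skip(struct skip_state *st)
{
	return st->initial_skip && st->num_events++ < st->initial_skip;
}
```

Because the check runs before a sample is synthesized, skipped events never
reach the output stage, which is where the pipeline approach spends its
time.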

v2:
Add support for BTS (Adrian Hunter)
Document in itrace.txt
Fix branch check
Check transactions and instructions too

Committer note:

To test intel_pt one needs to make sure VT-x isn't active, i.e.
stopping KVM guests on the test machine, as described by Andi Kleen
at http://lkml.kernel.org/r/[email protected]

Signed-off-by: Andi Kleen <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/intel-pt.txt | 7 +++++++
tools/perf/Documentation/itrace.txt | 8 ++++++++
tools/perf/util/auxtrace.c | 7 +++++++
tools/perf/util/auxtrace.h | 2 ++
tools/perf/util/intel-bts.c | 5 +++++
tools/perf/util/intel-pt.c | 22 ++++++++++++++++++++--
6 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index be764f9ec769..c6c8318e38a2 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -672,6 +672,7 @@ The letters are:
d create a debug log
g synthesize a call chain (use with i or x)
l synthesize last branch entries (use with i or x)
+ s skip initial number of events

"Instructions" events look like they were recorded by "perf record -e
instructions".
@@ -730,6 +731,12 @@ from one sample to the next.

To disable trace decoding entirely, use the option --no-itrace.

+It is also possible to skip events generated (instructions, branches, transactions)
+at the beginning. This is useful to ignore initialization code.
+
+ --itrace=i0nss1000000
+
+skips the first million instructions.

dump option
-----------
diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 65453f4c7006..e2a4c5e0dbe5 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -7,6 +7,7 @@
d create a debug log
g synthesize a call chain (use with i or x)
l synthesize last branch entries (use with i or x)
+ s skip initial number of events

The default is all events i.e. the same as --itrace=ibxe

@@ -24,3 +25,10 @@

Also the number of last branch entries (default 64, max. 1024) for
instructions or transactions events can be specified.
+
+ It is also possible to skip events generated (instructions, branches, transactions)
+ at the beginning. This is useful to ignore initialization code.
+
+ --itrace=i0nss1000000
+
+ skips the first million instructions.
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index ec164fe70718..c9169011e55e 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -940,6 +940,7 @@ void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts)
synth_opts->period = PERF_ITRACE_DEFAULT_PERIOD;
synth_opts->callchain_sz = PERF_ITRACE_DEFAULT_CALLCHAIN_SZ;
synth_opts->last_branch_sz = PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ;
+ synth_opts->initial_skip = 0;
}

/*
@@ -1064,6 +1065,12 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
synth_opts->last_branch_sz = val;
}
break;
+ case 's':
+ synth_opts->initial_skip = strtoul(p, &endptr, 10);
+ if (p == endptr)
+ goto out_err;
+ p = endptr;
+ break;
case ' ':
case ',':
break;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 57ff31ecb8e4..767989e0e312 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -68,6 +68,7 @@ enum itrace_period_type {
* @last_branch_sz: branch context size
* @period: 'instructions' events period
* @period_type: 'instructions' events period type
+ * @initial_skip: skip N events at the beginning.
*/
struct itrace_synth_opts {
bool set;
@@ -86,6 +87,7 @@ struct itrace_synth_opts {
unsigned int last_branch_sz;
unsigned long long period;
enum itrace_period_type period_type;
+ unsigned long initial_skip;
};

/**
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index abf1366e2a24..9df996085563 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -66,6 +66,7 @@ struct intel_bts {
u64 branches_id;
size_t branches_event_size;
bool synth_needs_swap;
+ unsigned long num_events;
};

struct intel_bts_queue {
@@ -275,6 +276,10 @@ static int intel_bts_synth_branch_sample(struct intel_bts_queue *btsq,
union perf_event event;
struct perf_sample sample = { .ip = 0, };

+ if (bts->synth_opts.initial_skip &&
+ bts->num_events++ <= bts->synth_opts.initial_skip)
+ return 0;
+
event.sample.header.type = PERF_RECORD_SAMPLE;
event.sample.header.misc = PERF_RECORD_MISC_USER;
event.sample.header.size = sizeof(struct perf_event_header);
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 407f11b97c8d..ddec87f6e616 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -100,6 +100,8 @@ struct intel_pt {
u64 cyc_bit;
u64 noretcomp_bit;
unsigned max_non_turbo_ratio;
+
+ unsigned long num_events;
};

enum switch_state {
@@ -972,6 +974,10 @@ static int intel_pt_synth_branch_sample(struct intel_pt_queue *ptq)
if (pt->branches_filter && !(pt->branches_filter & ptq->flags))
return 0;

+ if (pt->synth_opts.initial_skip &&
+ pt->num_events++ < pt->synth_opts.initial_skip)
+ return 0;
+
event->sample.header.type = PERF_RECORD_SAMPLE;
event->sample.header.misc = PERF_RECORD_MISC_USER;
event->sample.header.size = sizeof(struct perf_event_header);
@@ -1029,6 +1035,10 @@ static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
union perf_event *event = ptq->event_buf;
struct perf_sample sample = { .ip = 0, };

+ if (pt->synth_opts.initial_skip &&
+ pt->num_events++ < pt->synth_opts.initial_skip)
+ return 0;
+
event->sample.header.type = PERF_RECORD_SAMPLE;
event->sample.header.misc = PERF_RECORD_MISC_USER;
event->sample.header.size = sizeof(struct perf_event_header);
@@ -1087,6 +1097,10 @@ static int intel_pt_synth_transaction_sample(struct intel_pt_queue *ptq)
union perf_event *event = ptq->event_buf;
struct perf_sample sample = { .ip = 0, };

+ if (pt->synth_opts.initial_skip &&
+ pt->num_events++ < pt->synth_opts.initial_skip)
+ return 0;
+
event->sample.header.type = PERF_RECORD_SAMPLE;
event->sample.header.misc = PERF_RECORD_MISC_USER;
event->sample.header.size = sizeof(struct perf_event_header);
@@ -1199,14 +1213,18 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
ptq->have_sample = false;

if (pt->sample_instructions &&
- (state->type & INTEL_PT_INSTRUCTION)) {
+ (state->type & INTEL_PT_INSTRUCTION) &&
+ (!pt->synth_opts.initial_skip ||
+ pt->num_events++ >= pt->synth_opts.initial_skip)) {
err = intel_pt_synth_instruction_sample(ptq);
if (err)
return err;
}

if (pt->sample_transactions &&
- (state->type & INTEL_PT_TRANSACTION)) {
+ (state->type & INTEL_PT_TRANSACTION) &&
+ (!pt->synth_opts.initial_skip ||
+ pt->num_events++ >= pt->synth_opts.initial_skip)) {
err = intel_pt_synth_transaction_sample(ptq);
if (err)
return err;
--
2.5.5

2016-03-29 23:41:52

by Arnaldo Carvalho de Melo

Subject: [PATCH 04/11] perf tests: Add test to check for event times

From: Jiri Olsa <[email protected]>

This test creates the software event 'cpu-clock', attaches it in several
ways, and checks that the enabled and running times match.
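
Outside of the perf tool's own helpers, the same check can be sketched
directly on top of the perf_event_open() system call. This is a rough
illustration under stated assumptions, not the test added by this patch:
'struct read_times' and check_event_times() are hypothetical names, only
the "current thread, enabled" case is covered, and any failure to open or
read the event is treated as a skip.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Layout of read() with TOTAL_TIME_ENABLED | TOTAL_TIME_RUNNING set. */
struct read_times {
	uint64_t value;
	uint64_t ena;
	uint64_t run;
};

/*
 * Open a user-space cpu-clock counter on the calling thread, burn some
 * cycles, then compare enabled and running times.
 * Returns 0 if they match, 1 if they differ, 2 if the event could not
 * be opened or read (treated as "skip", e.g. not enough rights).
 */
static int check_event_times(struct read_times *out)
{
	struct perf_event_attr attr;
	volatile unsigned long i;
	ssize_t n;
	int fd;

	assert(out != NULL);
	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_CPU_CLOCK;
	attr.exclude_kernel = 1;	/* same as the ":u" modifier */
	attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
			   PERF_FORMAT_TOTAL_TIME_RUNNING;

	/* pid 0, cpu -1: this thread on any CPU; no group, no flags */
	fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL);
	if (fd < 0)
		return 2;

	for (i = 0; i < 10000000; i++)
		;

	n = read(fd, out, sizeof(*out));
	close(fd);
	if (n != (ssize_t)sizeof(*out))
		return 2;

	return out->ena == out->run ? 0 : 1;
}
```

The per-CPU variants in the test differ mainly in passing a CPU number
instead of a thread, which is why they need more rights.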

Committer notes:

Testing it:

[acme@jouet linux]$ perf test -v times
44: Test events times :
--- start ---
test child forked, pid 27170
attaching to spawned child, enable on exec
OK : ena 307328, run 307328
attaching to current thread as enabled
OK : ena 7826, run 7826
attaching to current thread as disabled
OK : ena 738, run 738
attaching to CPU 0 as enabled
SKIP : not enough rights
attaching to CPU 0 as enabled
SKIP : not enough rights
test child finished with -2
---- end ----
Test events times: Skip
[acme@jouet linux]$

[root@jouet ~]# perf test times
44: Test events times : Ok
[root@jouet ~]# perf test -v times
44: Test events times :
--- start ---
test child forked, pid 27306
attaching to spawned child, enable on exec
OK : ena 479290, run 479290
attaching to current thread as enabled
OK : ena 11356, run 11356
attaching to current thread as disabled
OK : ena 987, run 987
attaching to CPU 0 as enabled
OK : ena 3717, run 3717
attaching to CPU 0 as enabled
OK : ena 2323, run 2323
test child finished with 0
---- end ----
Test events times: Ok
[root@jouet ~]#

Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/event-times.c | 236 ++++++++++++++++++++++++++++++++++++++++
tools/perf/tests/tests.h | 1 +
4 files changed, 242 insertions(+)
create mode 100644 tools/perf/tests/event-times.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 1ba628ed049a..449fe97a555f 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -37,6 +37,7 @@ perf-y += topology.o
perf-y += cpumap.o
perf-y += stat.o
perf-y += event_update.o
+perf-y += event-times.o

$(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c tests/Build
$(call rule_mkdir)
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index f2b1dcac45d3..93c467015e71 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -204,6 +204,10 @@ static struct test generic_tests[] = {
.func = test__event_update,
},
{
+ .desc = "Test events times",
+ .func = test__event_times,
+ },
+ {
.func = NULL,
},
};
diff --git a/tools/perf/tests/event-times.c b/tools/perf/tests/event-times.c
new file mode 100644
index 000000000000..95fb744f6628
--- /dev/null
+++ b/tools/perf/tests/event-times.c
@@ -0,0 +1,236 @@
+#include <linux/compiler.h>
+#include <string.h>
+#include "tests.h"
+#include "evlist.h"
+#include "evsel.h"
+#include "util.h"
+#include "debug.h"
+#include "thread_map.h"
+#include "target.h"
+
+static int attach__enable_on_exec(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel = perf_evlist__last(evlist);
+ struct target target = {
+ .uid = UINT_MAX,
+ };
+ const char *argv[] = { "true", NULL, };
+ char sbuf[STRERR_BUFSIZE];
+ int err;
+
+ pr_debug("attaching to spawned child, enable on exec\n");
+
+ err = perf_evlist__create_maps(evlist, &target);
+ if (err < 0) {
+ pr_debug("Not enough memory to create thread/cpu maps\n");
+ return err;
+ }
+
+ err = perf_evlist__prepare_workload(evlist, &target, argv, false, NULL);
+ if (err < 0) {
+ pr_debug("Couldn't run the workload!\n");
+ return err;
+ }
+
+ evsel->attr.enable_on_exec = 1;
+
+ err = perf_evlist__open(evlist);
+ if (err < 0) {
+ pr_debug("perf_evlist__open: %s\n",
+ strerror_r(errno, sbuf, sizeof(sbuf)));
+ return err;
+ }
+
+ return perf_evlist__start_workload(evlist) == 1 ? TEST_OK : TEST_FAIL;
+}
+
+static int detach__enable_on_exec(struct perf_evlist *evlist)
+{
+ waitpid(evlist->workload.pid, NULL, 0);
+ return 0;
+}
+
+static int attach__current_disabled(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel = perf_evlist__last(evlist);
+ struct thread_map *threads;
+ int err;
+
+ pr_debug("attaching to current thread as disabled\n");
+
+ threads = thread_map__new(-1, getpid(), UINT_MAX);
+ if (threads == NULL) {
+ pr_debug("thread_map__new\n");
+ return -1;
+ }
+
+ evsel->attr.disabled = 1;
+
+ err = perf_evsel__open_per_thread(evsel, threads);
+ if (err) {
+ pr_debug("Failed to open event cpu-clock:u\n");
+ return err;
+ }
+
+ thread_map__put(threads);
+ return perf_evsel__enable(evsel) == 0 ? TEST_OK : TEST_FAIL;
+}
+
+static int attach__current_enabled(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel = perf_evlist__last(evlist);
+ struct thread_map *threads;
+ int err;
+
+ pr_debug("attaching to current thread as enabled\n");
+
+ threads = thread_map__new(-1, getpid(), UINT_MAX);
+ if (threads == NULL) {
+ pr_debug("failed to call thread_map__new\n");
+ return -1;
+ }
+
+ err = perf_evsel__open_per_thread(evsel, threads);
+
+ thread_map__put(threads);
+ return err == 0 ? TEST_OK : TEST_FAIL;
+}
+
+static int detach__disable(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel = perf_evlist__last(evlist);
+
+ return perf_evsel__enable(evsel);
+}
+
+static int attach__cpu_disabled(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel = perf_evlist__last(evlist);
+ struct cpu_map *cpus;
+ int err;
+
+ pr_debug("attaching to CPU 0 as enabled\n");
+
+ cpus = cpu_map__new("0");
+ if (cpus == NULL) {
+ pr_debug("failed to call cpu_map__new\n");
+ return -1;
+ }
+
+ evsel->attr.disabled = 1;
+
+ err = perf_evsel__open_per_cpu(evsel, cpus);
+ if (err) {
+ if (err == -EACCES)
+ return TEST_SKIP;
+
+ pr_debug("Failed to open event cpu-clock:u\n");
+ return err;
+ }
+
+ cpu_map__put(cpus);
+ return perf_evsel__enable(evsel);
+}
+
+static int attach__cpu_enabled(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel = perf_evlist__last(evlist);
+ struct cpu_map *cpus;
+ int err;
+
+ pr_debug("attaching to CPU 0 as enabled\n");
+
+ cpus = cpu_map__new("0");
+ if (cpus == NULL) {
+ pr_debug("failed to call cpu_map__new\n");
+ return -1;
+ }
+
+ err = perf_evsel__open_per_cpu(evsel, cpus);
+ if (err == -EACCES)
+ return TEST_SKIP;
+
+ cpu_map__put(cpus);
+ return err ? TEST_FAIL : TEST_OK;
+}
+
+static int test_times(int (attach)(struct perf_evlist *),
+ int (detach)(struct perf_evlist *))
+{
+ struct perf_counts_values count;
+ struct perf_evlist *evlist = NULL;
+ struct perf_evsel *evsel;
+ int err = -1, i;
+
+ evlist = perf_evlist__new();
+ if (!evlist) {
+ pr_debug("failed to create event list\n");
+ goto out_err;
+ }
+
+ err = parse_events(evlist, "cpu-clock:u", NULL);
+ if (err) {
+ pr_debug("failed to parse event cpu-clock:u\n");
+ goto out_err;
+ }
+
+ evsel = perf_evlist__last(evlist);
+ evsel->attr.read_format |=
+ PERF_FORMAT_TOTAL_TIME_ENABLED |
+ PERF_FORMAT_TOTAL_TIME_RUNNING;
+
+ err = attach(evlist);
+ if (err == TEST_SKIP) {
+ pr_debug(" SKIP : not enough rights\n");
+ return err;
+ }
+
+ TEST_ASSERT_VAL("failed to attach", !err);
+
+ for (i = 0; i < 100000000; i++) { }
+
+ TEST_ASSERT_VAL("failed to detach", !detach(evlist));
+
+ perf_evsel__read(evsel, 0, 0, &count);
+
+ err = !(count.ena == count.run);
+
+ pr_debug(" %s: ena %" PRIu64", run %" PRIu64"\n",
+ !err ? "OK " : "FAILED",
+ count.ena, count.run);
+
+out_err:
+ if (evlist)
+ perf_evlist__delete(evlist);
+ return !err ? TEST_OK : TEST_FAIL;
+}
+
+/*
+ * This test creates software event 'cpu-clock'
+ * attaches it in several ways (explained below)
+ * and checks that enabled and running times
+ * match.
+ */
+int test__event_times(int subtest __maybe_unused)
+{
+ int err, ret = 0;
+
+#define _T(attach, detach) \
+ err = test_times(attach, detach); \
+ if (err && (ret == TEST_OK || ret == TEST_SKIP)) \
+ ret = err;
+
+ /* attach on newly spawned process after exec */
+ _T(attach__enable_on_exec, detach__enable_on_exec)
+ /* attach on current process as enabled */
+ _T(attach__current_enabled, detach__disable)
+ /* attach on current process as disabled */
+ _T(attach__current_disabled, detach__disable)
+ /* attach on cpu as disabled */
+ _T(attach__cpu_disabled, detach__disable)
+ /* attach on cpu as enabled */
+ _T(attach__cpu_enabled, detach__disable)
+
+#undef _T
+ return ret;
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 82b2b5e6ba7c..0fc946989cf0 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -85,6 +85,7 @@ int test__synthesize_stat_config(int subtest);
int test__synthesize_stat(int subtest);
int test__synthesize_stat_round(int subtest);
int test__event_update(int subtest);
+int test__event_times(int subtest);

#if defined(__arm__) || defined(__aarch64__)
#ifdef HAVE_DWARF_UNWIND_SUPPORT
--
2.5.5

2016-03-29 23:42:06

by Arnaldo Carvalho de Melo

Subject: [PATCH 10/11] perf tools: Add probing for udev86 library

From: Andi Kleen <[email protected]>

Add autoprobing for the udis86 disassembler library.

Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/build/Makefile.feature | 6 ++++--
tools/build/feature/Makefile | 8 ++++++--
tools/build/feature/test-all.c | 5 +++++
tools/build/feature/test-udis86.c | 8 ++++++++
tools/perf/config/Makefile | 5 +++++
5 files changed, 28 insertions(+), 4 deletions(-)
create mode 100644 tools/build/feature/test-udis86.c

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 6b7707270aa3..db4f426cae09 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -55,7 +55,8 @@ FEATURE_TESTS_BASIC := \
zlib \
lzma \
get_cpuid \
- bpf
+ bpf \
+ udis86

# FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
# of all feature tests
@@ -94,7 +95,8 @@ FEATURE_DISPLAY ?= \
zlib \
lzma \
get_cpuid \
- bpf
+ bpf \
+ udis86

# Set FEATURE_CHECK_(C|LD)FLAGS-all for all FEATURE_TESTS features.
# If in the future we need per-feature checks/flags for features not
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index c5f4c417428d..d05c312f25c0 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -36,7 +36,8 @@ FILES= \
test-zlib.bin \
test-lzma.bin \
test-bpf.bin \
- test-get_cpuid.bin
+ test-get_cpuid.bin \
+ test-udis86.bin

FILES := $(addprefix $(OUTPUT),$(FILES))

@@ -51,7 +52,7 @@ __BUILD = $(CC) $(CFLAGS) -Wall -Werror -o $@ $(patsubst %.bin,%.c,$(@F)) $(LDFL
###############################

$(OUTPUT)test-all.bin:
- $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -laudit -I/usr/include/slang -lslang $(shell $(PKG_CONFIG) --libs --cflags gtk+-2.0 2>/dev/null) $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl -lz -llzma
+ $(BUILD) -fstack-protector-all -O2 -D_FORTIFY_SOURCE=2 -ldw -lelf -lnuma -lelf -laudit -I/usr/include/slang -lslang $(shell $(PKG_CONFIG) --libs --cflags gtk+-2.0 2>/dev/null) $(FLAGS_PERL_EMBED) $(FLAGS_PYTHON_EMBED) -DPACKAGE='"perf"' -lbfd -ldl -lz -llzma -ludis86

$(OUTPUT)test-hello.bin:
$(BUILD)
@@ -97,6 +98,9 @@ $(OUTPUT)test-numa_num_possible_cpus.bin:
$(OUTPUT)test-libunwind.bin:
$(BUILD) -lelf

+$(OUTPUT)test-udis86.bin:
+ $(BUILD) -ludis86
+
$(OUTPUT)test-libunwind-debug-frame.bin:
$(BUILD) -lelf

diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index e499a36c1e4a..76b0de3d145a 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -133,6 +133,10 @@
# include "test-libcrypto.c"
#undef main

+#define main main_test_udis86
+# include "test-udis86.c"
+#undef main
+
int main(int argc, char *argv[])
{
main_test_libpython();
@@ -163,6 +167,7 @@ int main(int argc, char *argv[])
main_test_get_cpuid();
main_test_bpf();
main_test_libcrypto();
+ main_test_udis86();

return 0;
}
diff --git a/tools/build/feature/test-udis86.c b/tools/build/feature/test-udis86.c
new file mode 100644
index 000000000000..623c545f4bad
--- /dev/null
+++ b/tools/build/feature/test-udis86.c
@@ -0,0 +1,8 @@
+#include <udis86.h>
+
+int main(void)
+{
+ ud_t ud;
+ ud_init(&ud);
+ return 0;
+}
diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index f7d7f5a1cad5..399ada8e7a47 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -587,6 +587,11 @@ ifneq ($(filter -lbfd,$(EXTLIBS)),)
CFLAGS += -DHAVE_LIBBFD_SUPPORT
endif

+ifeq ($(feature-udis86), 1)
+ CFLAGS += -DHAVE_UDIS86
+ EXTLIBS += -ludis86
+endif
+
ifndef NO_ZLIB
ifeq ($(feature-zlib), 1)
CFLAGS += -DHAVE_ZLIB_SUPPORT
--
2.5.5

2016-03-29 23:41:55

by Arnaldo Carvalho de Melo

Subject: [PATCH 07/11] perf config: Rename 'v' to 'home' in set_buildid_dir()

From: Taeung Song <[email protected]>

Change the variable name 'v' to 'home' to make it more readable.

Signed-off-by: Taeung Song <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/config.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 2dd78f4c97a0..5c20d783423b 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -540,10 +540,11 @@ void set_buildid_dir(const char *dir)

/* default to $HOME/.debug */
if (buildid_dir[0] == '\0') {
- char *v = getenv("HOME");
- if (v) {
+ char *home = getenv("HOME");
+
+ if (home) {
snprintf(buildid_dir, MAXPATHLEN-1, "%s/%s",
- v, DEBUG_CACHE_DIR);
+ home, DEBUG_CACHE_DIR);
} else {
strncpy(buildid_dir, DEBUG_CACHE_DIR, MAXPATHLEN-1);
}
--
2.5.5

2016-03-29 23:42:17

by Arnaldo Carvalho de Melo

Subject: [PATCH 11/11] perf script: Add support for printing assembler

From: Andi Kleen <[email protected]>

When dumping PT traces with perf script it is very useful to see the
assembler for each sample, so that it is easily possible to follow the
control flow.

As using objdump from perf script is difficult and inefficient, this
patch uses the udis86 library to implement assembler output. The
library can be downloaded from http://udis86.sourceforge.net/

The library is probed as an external dependency in the usual way. Then
'perf script' calls into it when needed, and handles callbacks to
resolve symbols.

% perf record -e intel_pt//u true
% perf script -F sym,symoff,ip,asm --itrace=i0ns | head
7fc7188b4190 _start+0x0 mov %rsp, %rdi
7fc7188b4193 _start+0x3 call _dl_start
7fc7188b7710 _dl_start+0x0 push %rbp
7fc7188b7711 _dl_start+0x1 mov %rsp, %rbp
7fc7188b7714 _dl_start+0x4 push %r15
7fc7188b7716 _dl_start+0x6 push %r14
7fc7188b7718 _dl_start+0x8 push %r13
7fc7188b771a _dl_start+0xa push %r12
7fc7188b771c _dl_start+0xc mov %rdi, %r12
7fc7188b771f _dl_start+0xf push %rbx
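
The symbol+offset form in the output above comes from resolving an address
to the symbol that covers it. A minimal, self-contained sketch of that lookup
idea follows. It is not the perf or udis86 code: struct toy_sym, toy_resolve()
and toy_table are hypothetical names, the table is a plain sorted array rather
than perf's maps, and the 'end' addresses are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record; perf's struct symbol carries more state. */
struct toy_sym {
	uint64_t start;		/* first address covered */
	uint64_t end;		/* one past the last address covered */
	const char *name;
};

/*
 * Resolve addr against a table sorted by start address.
 * Returns the symbol name and stores the offset into *off,
 * or returns NULL when no symbol covers addr.
 */
static const char *toy_resolve(const struct toy_sym *syms, size_t nr,
			       uint64_t addr, int64_t *off)
{
	size_t lo = 0, hi = nr;

	assert(syms != NULL && off != NULL);

	/* Binary search for the first symbol whose start is above addr. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (syms[mid].start <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == 0)
		return NULL;	/* addr is below the first symbol */
	if (addr >= syms[lo - 1].end)
		return NULL;	/* addr falls in a gap between symbols */

	*off = (int64_t)(addr - syms[lo - 1].start);
	return syms[lo - 1].name;
}

/* Start addresses taken from the sample output; end values are made up. */
static const struct toy_sym toy_table[] = {
	{ 0x7fc7188b4190ULL, 0x7fc7188b41a0ULL, "_start" },
	{ 0x7fc7188b7710ULL, 0x7fc7188b7800ULL, "_dl_start" },
};
```

With this table, resolving 0x7fc7188b771c yields "_dl_start" with offset 0xc,
matching the '_dl_start+0xc' line in the trace. A NULL return corresponds to
the "jump references do not get resolved" case, where only the raw address
can be printed.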

Current issues:

- Some jump references do not get resolved to symbols.
- udis86 release does not support STAC/CLAC, which are used in the kernel,
but there is a pending patch for it.

v2: Fix address resolution. Port to latest acme/perf/core

Committer note:

To test intel_pt one needs to make sure VT-x isn't active, i.e.
stopping KVM guests on the test machine, as described by Andi Kleen at
http://lkml.kernel.org/r/[email protected]

Signed-off-by: Andi Kleen <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-script.txt | 4 +-
tools/perf/builtin-script.c | 107 +++++++++++++++++++++++++++++--
2 files changed, 105 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 22ef3933342a..f2b81d837799 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -116,7 +116,7 @@ OPTIONS
--fields::
Comma separated list of fields to print. Options are:
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
- srcline, period, iregs, brstack, brstacksym, flags.
+ srcline, period, iregs, brstack, brstacksym, flags, asm.
Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -f sw:comm,tid,time,ip,sym and -f trace:time,cpu,trace
@@ -185,6 +185,8 @@ OPTIONS

The brstacksym is identical to brstack, except that the FROM and TO addresses are printed in a symbolic form if possible.

+ When asm is specified the assembler instruction of each sample is printed in disassembled form.
+
-k::
--vmlinux=<file>::
vmlinux pathname
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 3770c3dffe5e..323572e72706 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -25,6 +25,10 @@
#include "asm/bug.h"
#include "util/mem-events.h"

+#ifdef HAVE_UDIS86
+#include <udis86.h>
+#endif
+
static char const *script_name;
static char const *generate_script_lang;
static bool debug_mode;
@@ -62,6 +66,7 @@ enum perf_output_field {
PERF_OUTPUT_DATA_SRC = 1U << 17,
PERF_OUTPUT_WEIGHT = 1U << 18,
PERF_OUTPUT_BPF_OUTPUT = 1U << 19,
+ PERF_OUTPUT_ASM = 1U << 20,
};

struct output_option {
@@ -88,6 +93,7 @@ struct output_option {
{.str = "data_src", .field = PERF_OUTPUT_DATA_SRC},
{.str = "weight", .field = PERF_OUTPUT_WEIGHT},
{.str = "bpf-output", .field = PERF_OUTPUT_BPF_OUTPUT},
+ {.str = "asm", .field = PERF_OUTPUT_ASM},
};

/* default set to maintain compatibility with current format */
@@ -282,7 +288,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
"selected. Hence, no address to lookup the source line number.\n");
return -EINVAL;
}
-
+ if (PRINT_FIELD(ASM) && !PRINT_FIELD(IP)) {
+ pr_err("Display of assembler requested but sample IP is not\n"
+ "selected.\n");
+ return -EINVAL;
+ }
if ((PRINT_FIELD(PID) || PRINT_FIELD(TID)) &&
perf_evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID",
PERF_OUTPUT_TID|PERF_OUTPUT_PID))
@@ -421,6 +431,88 @@ static void print_sample_iregs(struct perf_sample *sample,
}
}

+#ifdef HAVE_UDIS86
+
+struct perf_ud {
+ ud_t ud_obj;
+ struct thread *thread;
+ u8 cpumode;
+ int cpu;
+};
+
+static const char *dis_resolve(struct ud *u, uint64_t addr, int64_t *off)
+{
+ struct perf_ud *ud = container_of(u, struct perf_ud, ud_obj);
+ struct addr_location al;
+
+ memset(&al, 0, sizeof(struct addr_location));
+
+ thread__find_addr_map(ud->thread, ud->cpumode, MAP__FUNCTION, addr, &al);
+ if (!al.map)
+ thread__find_addr_map(ud->thread, ud->cpumode, MAP__VARIABLE,
+ addr, &al);
+ al.cpu = ud->cpu;
+ al.sym = NULL;
+
+ if (al.map)
+ al.sym = map__find_symbol(al.map, al.addr, NULL);
+
+ if (!al.sym)
+ return NULL;
+
+ if (al.addr < al.sym->end)
+ *off = al.addr - al.sym->start;
+ else
+ *off = al.addr - al.map->start - al.sym->start;
+ return al.sym->name;
+}
+#endif
+
+static void print_sample_asm(struct perf_sample *sample __maybe_unused,
+ struct thread *thread __maybe_unused,
+ struct perf_event_attr *attr __maybe_unused,
+ struct addr_location *al __maybe_unused,
+ struct machine *machine __maybe_unused)
+{
+#ifdef HAVE_UDIS86
+ static bool ud_initialized = false;
+ static struct perf_ud ud;
+ u8 buffer[32];
+ int len;
+ u64 offset;
+
+ if (!ud_initialized) {
+ ud_initialized = true;
+ ud_init(&ud.ud_obj);
+ ud_set_syntax(&ud.ud_obj, UD_SYN_ATT);
+ ud_set_sym_resolver(&ud.ud_obj, dis_resolve);
+ }
+ ud.thread = thread;
+ ud.cpumode = sample->cpumode;
+ ud.cpu = sample->cpu;
+
+ if (!al->map || !al->map->dso)
+ return;
+ if (al->map->dso->data.status == DSO_DATA_STATUS_ERROR)
+ return;
+
+ /* Load maps to ensure dso->is_64_bit has been updated */
+ map__load(al->map, machine->symbol_filter);
+
+ offset = al->map->map_ip(al->map, sample->ip);
+ len = dso__data_read_offset(al->map->dso, machine,
+ offset, buffer, 32);
+ if (len <= 0)
+ return;
+
+ ud_set_mode(&ud.ud_obj, al->map->dso->is_64_bit ? 64 : 32);
+ ud_set_pc(&ud.ud_obj, sample->ip);
+ ud_set_input_buffer(&ud.ud_obj, buffer, len);
+ ud_disassemble(&ud.ud_obj);
+ printf("\t%s", ud_insn_asm(&ud.ud_obj));
+#endif
+}
+
static void print_sample_start(struct perf_sample *sample,
struct thread *thread,
struct perf_evsel *evsel)
@@ -739,7 +831,8 @@ static size_t data_src__printf(u64 data_src)

static void process_event(struct perf_script *script,
struct perf_sample *sample, struct perf_evsel *evsel,
- struct addr_location *al)
+ struct addr_location *al,
+ struct machine *machine)
{
struct thread *thread = al->thread;
struct perf_event_attr *attr = &evsel->attr;
@@ -767,7 +860,7 @@ static void process_event(struct perf_script *script,

if (is_bts_event(attr)) {
print_sample_bts(sample, evsel, thread, al);
- return;
+ goto print_rest;
}

if (PRINT_FIELD(TRACE))
@@ -796,6 +889,7 @@ static void process_event(struct perf_script *script,
if (PRINT_FIELD(IREGS))
print_sample_iregs(sample, attr);

+print_rest:
if (PRINT_FIELD(BRSTACK))
print_sample_brstack(sample);
else if (PRINT_FIELD(BRSTACKSYM))
@@ -804,6 +898,9 @@ static void process_event(struct perf_script *script,
if (perf_evsel__is_bpf_output(evsel) && PRINT_FIELD(BPF_OUTPUT))
print_sample_bpf_output(sample);

+ if (PRINT_FIELD(ASM))
+ print_sample_asm(sample, thread, attr, al, machine);
+
printf("\n");
}

@@ -910,7 +1007,7 @@ static int process_sample_event(struct perf_tool *tool,
if (scripting_ops)
scripting_ops->process_event(event, sample, evsel, &al);
else
- process_event(scr, sample, evsel, &al);
+ process_event(scr, sample, evsel, &al, machine);

out_put:
addr_location__put(&al);
@@ -2010,7 +2107,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
"comma separated output fields prepend with 'type:'. "
"Valid types: hw,sw,trace,raw. "
"Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
- "addr,symoff,period,iregs,brstack,brstacksym,flags", parse_output_fields),
+ "addr,symoff,period,iregs,brstack,brstacksym,flags,asm", parse_output_fields),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
"system-wide collection from all CPUs"),
OPT_STRING('S', "symbols", &symbol_conf.sym_list_str, "symbol[,symbol...]",
--
2.5.5

2016-03-29 23:43:08

by Arnaldo Carvalho de Melo

Subject: [PATCH 06/11] perf config: Rework buildid_dir_command_config to perf_buildid_config

From: Taeung Song <[email protected]>

To avoid calling perf_config() repeatedly, remove
buildid_dir_command_config() and add a new perf_buildid_config() that is
reached from perf_default_config().

This works because perf_config() is already called with
perf_default_config in main().
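
The idea of handling "buildid.*" inside the one existing config pass can be
sketched in isolation as below. This is only an illustration under stated
assumptions, not the perf code: all toy_* names are hypothetical,
TOY_MAXPATHLEN stands in for MAXPATHLEN, and the directory value is copied
as-is instead of going through perf_config_dirname().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAXPATHLEN 4096	/* stand-in for MAXPATHLEN */

static char toy_buildid_dir[TOY_MAXPATHLEN];

/* Returns 0 when str starts with prefix, mirroring how the patch uses prefixcmp(). */
static int toy_prefixcmp(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) != 0;
}

/* Handler for the "buildid." section; only "buildid.dir" is understood. */
static int toy_buildid_config(const char *var, const char *value)
{
	if (!strcmp(var, "buildid.dir")) {
		if (!value)
			return -1;	/* a directory name is required */
		/* snprintf() always NUL-terminates within the given size. */
		snprintf(toy_buildid_dir, sizeof(toy_buildid_dir), "%s", value);
	}
	return 0;
}

/* Single callback for the one config pass: dispatch on the section prefix. */
static int toy_default_config(const char *var, const char *value)
{
	assert(var != NULL);

	if (!toy_prefixcmp(var, "buildid."))
		return toy_buildid_config(var, value);

	/* Other sections would be dispatched here. */
	return 0;
}
```

Feeding ("buildid.dir", "/tmp/bid") through toy_default_config() fills
toy_buildid_dir, while variables from other sections leave it untouched, so
no second walk over the config file is needed just for the buildid directory.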

Signed-off-by: Taeung Song <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/config.c | 50 +++++++++++++++++-------------------------------
1 file changed, 18 insertions(+), 32 deletions(-)

diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 4e727635476e..2dd78f4c97a0 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -377,6 +377,21 @@ const char *perf_config_dirname(const char *name, const char *value)
return value;
}

+static int perf_buildid_config(const char *var, const char *value)
+{
+ /* same dir for all commands */
+ if (!strcmp(var, "buildid.dir")) {
+ const char *dirname = perf_config_dirname(var, value);
+
+ if (!dirname)
+ return -1;
+ strncpy(buildid_dir, dirname, MAXPATHLEN-1);
+ buildid_dir[MAXPATHLEN-1] = '\0';
+ }
+
+ return 0;
+}
+
static int perf_default_core_config(const char *var __maybe_unused,
const char *value __maybe_unused)
{
@@ -412,6 +427,9 @@ int perf_default_config(const char *var, const char *value,
if (!prefixcmp(var, "llvm."))
return perf_llvm_config(var, value);

+ if (!prefixcmp(var, "buildid."))
+ return perf_buildid_config(var, value);
+
/* Add other config variables here. */
return 0;
}
@@ -515,43 +533,11 @@ int config_error_nonbool(const char *var)
return error("Missing value for '%s'", var);
}

-struct buildid_dir_config {
- char *dir;
-};
-
-static int buildid_dir_command_config(const char *var, const char *value,
- void *data)
-{
- struct buildid_dir_config *c = data;
- const char *v;
-
- /* same dir for all commands */
- if (!strcmp(var, "buildid.dir")) {
- v = perf_config_dirname(var, value);
- if (!v)
- return -1;
- strncpy(c->dir, v, MAXPATHLEN-1);
- c->dir[MAXPATHLEN-1] = '\0';
- }
- return 0;
-}
-
-static void check_buildid_dir_config(void)
-{
- struct buildid_dir_config c;
- c.dir = buildid_dir;
- perf_config(buildid_dir_command_config, &c);
-}
-
void set_buildid_dir(const char *dir)
{
if (dir)
scnprintf(buildid_dir, MAXPATHLEN-1, "%s", dir);

- /* try config file */
- if (buildid_dir[0] == '\0')
- check_buildid_dir_config();
-
/* default to $HOME/.debug */
if (buildid_dir[0] == '\0') {
char *v = getenv("HOME");
--
2.5.5

2016-03-29 23:43:10

by Arnaldo Carvalho de Melo

Subject: [PATCH 05/11] perf config: Remove duplicated set_buildid_dir calls

From: Taeung Song <[email protected]>

Signed-off-by: Taeung Song <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/perf.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index aaee0a782747..7b2df2b46525 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -549,6 +549,7 @@ int main(int argc, const char **argv)
srandom(time(NULL));

perf_config(perf_default_config, NULL);
+ set_buildid_dir(NULL);

/* get debugfs/tracefs mount point from /proc/mounts */
tracing_path_mount();
@@ -572,7 +573,6 @@ int main(int argc, const char **argv)
}
if (!prefixcmp(cmd, "trace")) {
#ifdef HAVE_LIBAUDIT_SUPPORT
- set_buildid_dir(NULL);
setup_path();
argv[0] = "trace";
return cmd_trace(argc, argv, NULL);
@@ -587,7 +587,6 @@ int main(int argc, const char **argv)
argc--;
handle_options(&argv, &argc, NULL);
commit_pager_choice();
- set_buildid_dir(NULL);

if (argc > 0) {
if (!prefixcmp(argv[0], "--"))
--
2.5.5

2016-03-29 23:41:47

by Arnaldo Carvalho de Melo

Subject: [PATCH 03/11] perf tools: Make -f/--force option documentation consistent across tools

From: Jiri Olsa <[email protected]>

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-annotate.txt | 2 +-
tools/perf/Documentation/perf-diff.txt | 2 +-
tools/perf/Documentation/perf-report.txt | 2 +-
tools/perf/Documentation/perf-script.txt | 4 ++++
4 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-annotate.txt b/tools/perf/Documentation/perf-annotate.txt
index e9cd39a92dc2..778f54d4d0bd 100644
--- a/tools/perf/Documentation/perf-annotate.txt
+++ b/tools/perf/Documentation/perf-annotate.txt
@@ -33,7 +33,7 @@ OPTIONS

-f::
--force::
- Don't complain, do it.
+ Don't do ownership validation.

-v::
--verbose::
diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index d1deb573877f..3e9490b9c533 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -75,7 +75,7 @@ OPTIONS

-f::
--force::
- Don't complain, do it.
+ Don't do ownership validation.

--symfs=<directory>::
Look for files with symbols relative to this directory.
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 12113992ac9d..496d42cdf02b 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -285,7 +285,7 @@ OPTIONS

-f::
--force::
- Don't complain, do it.
+ Don't do ownership validation.

--symfs=<directory>::
Look for files with symbols relative to this directory.
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 382ddfb45d1d..22ef3933342a 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -262,6 +262,10 @@ include::itrace.txt[]
--ns::
Use 9 decimal places when displaying time (i.e. show the nanoseconds)

+-f::
+--force::
+ Don't do ownership validation.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
--
2.5.5

2016-03-29 23:43:45

by Arnaldo Carvalho de Melo

Subject: [PATCH 02/11] perf tools: Make hists__collapse_insert_entry static

From: Jiri Olsa <[email protected]>

No need to export hists__collapse_insert_entry function.

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/hist.c | 5 +++--
tools/perf/util/hist.h | 2 --
2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 31c4641fe5ff..3d34c57dfbe2 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1295,8 +1295,9 @@ static int hists__hierarchy_insert_entry(struct hists *hists,
return ret;
}

-int hists__collapse_insert_entry(struct hists *hists, struct rb_root *root,
- struct hist_entry *he)
+static int hists__collapse_insert_entry(struct hists *hists,
+ struct rb_root *root,
+ struct hist_entry *he)
{
struct rb_node **p = &root->rb_node;
struct rb_node *parent = NULL;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index bec0cd660fbd..588596561cb3 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -199,8 +199,6 @@ int hists__init(void);
int __hists__init(struct hists *hists, struct perf_hpp_list *hpp_list);

struct rb_root *hists__get_rotate_entries_in(struct hists *hists);
-int hists__collapse_insert_entry(struct hists *hists,
- struct rb_root *root, struct hist_entry *he);

struct perf_hpp {
char *buf;
--
2.5.5

2016-03-29 23:44:32

by Arnaldo Carvalho de Melo

Subject: [PATCH 01/11] perf mem: Add -U/-K (--all-user/--all-kernel) options

From: Jiri Olsa <[email protected]>

Add -U/-K (--all-user/--all-kernel) options to 'perf mem record'. They are
passed through as the perf record --all-user/--all-kernel options.
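
A rough sketch of the pass-through, assuming a much simplified argument
vector: toy_build_record_argv() is a hypothetical helper, not the
__cmd_record() code, and it reserves two extra slots for the optional flags
in the same spirit as the rec_argc bump from 7 to 9 in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Build a NULL-terminated argv for a hypothetical "record" invocation:
 * fixed head, optional --all-user/--all-kernel, then the caller's args.
 * Returns NULL on allocation failure; the caller frees the array.
 */
static const char **toy_build_record_argv(int argc, const char **argv,
					  bool all_user, bool all_kernel)
{
	const char **rec_argv;
	int i = 0, j;

	assert(argc >= 0);

	/* 1 fixed slot + 2 optional flags + caller args + NULL terminator */
	rec_argv = calloc((size_t)argc + 4, sizeof(*rec_argv));
	if (!rec_argv)
		return NULL;

	rec_argv[i++] = "record";
	if (all_user)
		rec_argv[i++] = "--all-user";
	if (all_kernel)
		rec_argv[i++] = "--all-kernel";
	for (j = 0; j < argc; j++)
		rec_argv[i++] = argv[j];

	return rec_argv;	/* calloc() left the final slot NULL */
}
```

Sizing the array for the worst case up front keeps the flag handling to two
plain appends, at the cost of a couple of unused slots when neither flag is
given.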

Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-mem.txt | 8 ++++++++
tools/perf/builtin-mem.c | 11 ++++++++++-
2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-mem.txt b/tools/perf/Documentation/perf-mem.txt
index 43310d8661fe..1d6092c460dd 100644
--- a/tools/perf/Documentation/perf-mem.txt
+++ b/tools/perf/Documentation/perf-mem.txt
@@ -48,6 +48,14 @@ OPTIONS
option can be passed in record mode. It will be interpreted the same way as perf
record.

+-K::
+--all-kernel::
+ Configure all used events to run in kernel space.
+
+-U::
+--all-user::
+ Configure all used events to run in user space.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1]
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 85db3be4b3cb..1dc140c5481d 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -62,19 +62,22 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
int rec_argc, i = 0, j;
const char **rec_argv;
int ret;
+ bool all_user = false, all_kernel = false;
struct option options[] = {
OPT_CALLBACK('e', "event", &mem, "event",
"event selector. use 'perf mem record -e list' to list available events",
parse_record_events),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show counter open errors, etc)"),
+ OPT_BOOLEAN('U', "all-user", &all_user, "collect only user level data"),
+ OPT_BOOLEAN('K', "all-kernel", &all_kernel, "collect only kernel level data"),
OPT_END()
};

argc = parse_options(argc, argv, options, record_mem_usage,
PARSE_OPT_STOP_AT_NON_OPTION);

- rec_argc = argc + 7; /* max number of arguments */
+ rec_argc = argc + 9; /* max number of arguments */
rec_argv = calloc(rec_argc + 1, sizeof(char *));
if (!rec_argv)
return -1;
@@ -103,6 +106,12 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
rec_argv[i++] = perf_mem_events__name(j);
};

+ if (all_user)
+ rec_argv[i++] = "--all-user";
+
+ if (all_kernel)
+ rec_argv[i++] = "--all-kernel";
+
for (j = 0; j < argc; j++, i++)
rec_argv[i] = argv[j];

--
2.5.5

2016-03-30 10:43:32

by Ingo Molnar

Subject: Re: [PATCH 10/11] perf tools: Add probing for udev86 library


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> From: Andi Kleen <[email protected]>
>
> Add autoprobing for the udev86 disassembler library.

So the typo in the title is confusing, what is 'udev86'?

Also, this library does not seem to be available on stock Ubuntu. We should not be
adding library dependencies that cannot be resolved on major distros:

... get_cpuid: [ on ]
... bpf: [ on ]
... udis86: [ OFF ]

Thanks,

Ingo

2016-03-30 13:36:16

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 10/11] perf tools: Add probing for udev86 library

Em Wed, Mar 30, 2016 at 12:43:27PM +0200, Ingo Molnar escreveu:
> > From: Andi Kleen <[email protected]>

> > Add autoprobing for the udev86 disassembler library.

> So the typo in the title is confusing, what is 'udev86'?

> Also, this library does not seem to be available on stock Ubuntu. We should not be
> adding library dependencies that cannot be resolved on major distros:

Ok, I'll remove, I thought it would be ok because I fired up:

# dnf install udis86-devel

On fedora and it installed straight away, but after I started trying to
update my docker images I couldn't find it on debian
experimental/unstable:

[root@jouet ~]# docker run -t -i debian:experimental /bin/bash
root@b97e620820b4:/# apt-get update
Get:1 http://debian.pop-sc.rnp.br/debian unstable InRelease [196 kB]
Get:2 http://debian.pop-sc.rnp.br/debian experimental InRelease [111 kB]
Get:3 http://debian.pop-sc.rnp.br/debian unstable/main amd64 Packages
[9477 kB]
Get:4 http://debian.pop-sc.rnp.br/debian experimental/main amd64
Packages [579 kB]
Fetched 10.4 MB in 15s (664 kB/s)
Reading package lists... Done
root@b97e620820b4:/# apt-cache search udis86
root@b97e620820b4:/#

Nor even in OpenSuSE:

[root@jouet ~]# docker run -t -i opensuse /bin/bash
bash-4.2# zypper search udis86
Retrieving repository 'NON-OSS' metadata ..........................[done]
Building repository 'NON-OSS' cache ...............................[done]
Retrieving repository 'OSS' metadata ..............................[done]
Building repository 'OSS' cache ...................................[done]
Retrieving repository 'OSS Update' metadata .......................[done]
Building repository 'OSS Update' cache ............................[done]
Retrieving repository 'Update Non-Oss' metadata ...................[done]
Building repository 'Update Non-Oss' cache ........................[done]
Loading repository data...
Reading installed packages...
No packages found.
bash-4.2#

Or even Mageia:

[root@jouet ~]# docker run -t -i mageia /bin/bash
[root@cb6ee54f2faa /]# urpmf udis86
http://mirrors.ustc.edu.cn/mageia/distrib/5/x86_64/media/core/release/media_info/20150615-211931-files.xml.lzma
http://mirrors.ustc.edu.cn/mageia/distrib/5/x86_64/media/core/updates/media_info/20160326-150702-files.xml.lzma
http://mirrors.ustc.edu.cn/mageia/distrib/5/i586/media/core/release/media_info/20150615-211537-files.xml.lzma
http://mirrors.ustc.edu.cn/mageia/distrib/5/i586/media/core/updates/media_info/20160326-150428-files.xml.lzma
[root@cb6ee54f2faa /]#

But then, use in perf could be a driver for that package to get
included, perhaps we can do the same we did for libbabeltrace? I.e.
leave it disabled by default?

Thanks,

- Arnaldo

2016-03-30 13:52:53

by Ingo Molnar

Subject: Re: [PATCH 10/11] perf tools: Add probing for udev86 library


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> Em Wed, Mar 30, 2016 at 12:43:27PM +0200, Ingo Molnar escreveu:
> > > From: Andi Kleen <[email protected]>
>
> > > Add autoprobing for the udev86 disassembler library.
>
> > So the typo in the title is confusing, what is 'udev86'?
>
> > Also, this library does not seem to be available on stock Ubuntu. We should not be
> > adding library dependencies that cannot be resolved on major distros:
>
> Ok, I'll remove, I thought it would be ok because I fired up:
>
> # dnf install udis86-devel
>
> On fedora and it installed straight away, but after I started trying to
> update my docker images I couldn't find it on debian
> experimental/unstable:

> Nor even in OpenSuSE:

> Or even Mageia:

Yeah, so udis86 also seems to be a pretty old, relatively stale library with no
support for new instructions AFAICS.

So I'd rather encourage librarizing one of the x86 instruction decoders in
arch/x86/, and adding pretty-printing functionality to it. The code can already
see instruction boundaries, which is the hardest part.

That would also be better supported on non-x86 architectures in the long run:

triton:~/tip> find arch/ -name insn.c | xargs ls -l
-rw-rw-r-- 1 mingo mingo 30244 Mar 29 11:24 arch/arm64/kernel/insn.c
-rw-rw-r-- 1 mingo mingo 1347 Dec 8 06:27 arch/arm/kernel/insn.c
-rw-rw-r-- 1 mingo mingo 15123 Mar 30 12:31 arch/x86/lib/insn.c

Such an in-kernel-repo library could also be used by live kernel debuggers such as
kgdb/kdb, oops/crash-time disassembly printout, etc.

... so how about that direction instead?

Thanks,

Ingo

2016-03-30 14:11:00

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 10/11] perf tools: Add probing for udev86 library

Em Wed, Mar 30, 2016 at 03:52:46PM +0200, Ingo Molnar escreveu:
> * Arnaldo Carvalho de Melo <[email protected]> wrote:
> > Em Wed, Mar 30, 2016 at 12:43:27PM +0200, Ingo Molnar escreveu:
> > > > From: Andi Kleen <[email protected]>
> > > > Add autoprobing for the udev86 disassembler library.

> > > So the typo in the title is confusing, what is 'udev86'?

> > > Also, this library does not seem to be available on stock Ubuntu. We should not be
> > > adding library dependencies that cannot be resolved on major distros:

> > Ok, I'll remove, I thought it would be ok because I fired up:

> > # dnf install udis86-devel

> > On fedora and it installed straight away, but after I started trying to
> > update my docker images I couldn't find it on debian
> > experimental/unstable:

> > Nor even in OpenSuSE:

> > Or even Mageia:

> Yeah, so udis86 also seems to be a pretty old, relatively stale library with no
> support for new instructions AFAICS.

> So I'd rather encourage librarizing one of the x86 instruction decoders in
> arch/x86/, and adding pretty-printing functionality to it. The code can already

That was my first reaction too, but then, having some interesting
results right now, I thought, would either push development on the
udis86 code or show that it is not going anywhere and thus we better try
doing what you're suggesting, which was my initial suggestion to Andi.

> see instruction boundaries, which is the hardest part.

> That would also be better supported on non-x86 architectures in the long run:

> triton:~/tip> find arch/ -name insn.c | xargs ls -l
> -rw-rw-r-- 1 mingo mingo 30244 Mar 29 11:24 arch/arm64/kernel/insn.c
> -rw-rw-r-- 1 mingo mingo 1347 Dec 8 06:27 arch/arm/kernel/insn.c
> -rw-rw-r-- 1 mingo mingo 15123 Mar 30 12:31 arch/x86/lib/insn.c

> Such an in-kernel-repo library could also be used by live kernel debuggers such as
> kgdb/kdb, oops/crash-time disassembly printout, etc.

> ... so how about that direction instead?

I'd much rather see it in that direction too, for the time being, I'm
removing the udis86 patches from acme/perf/core,

- Arnaldo

2016-03-30 14:42:23

by Andi Kleen

Subject: Re: [PATCH 10/11] perf tools: Add probing for udev86 library

On Wed, Mar 30, 2016 at 12:43:27PM +0200, Ingo Molnar wrote:
>
> * Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> > From: Andi Kleen <[email protected]>
> >
> > Add autoprobing for the udev86 disassembler library.
>
> So the typo in the title is confusing, what is 'udev86'?

Agreed, it's confusing. It was meant to be udis86.

http://udis86.sourceforge.net

The next patch that uses it had the URL too, but that didn't make it
when this patch was split off.

>
> Also, this library does not seem to be available on stock Ubuntu. We should not be
> adding library dependencies that cannot be resolved on major distros:
>
> ... get_cpuid: [ on ]
> ... bpf: [ on ]
> ... udis86: [ OFF ]

It's a chicken-and-egg problem. Once perf needs it, distros will likely add it soon.
For now it's fairly easy to install manually.

Using a real disassembler in perf has a lot of advantages over objdump. It allows
full instruction traces with PT, which is useful for debugging. It also
allows hot path analysis of branch mispredictions and automatic micro benchmarks
using LBRs (implemented in https://lkml.org/lkml/2016/3/28/331). Possibly
more in the future.

-Andi

2016-03-30 14:47:30

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 10/11] perf tools: Add probing for udev86 library

Em Wed, Mar 30, 2016 at 11:10:53AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Such an in-kernel-repo library could also be used by live kernel debuggers such as
> > kgdb/kdb, oops/crash-time disassembly printout, etc.
>
> > ... so how about that direction instead?
>
> I'd much rather see it in that direction too, for the time being, I'm
> removing the udis86 patches from acme/perf/core,

So, there it is, please consider pulling, same thing as
perf-core-for-mingo-20160329, modulo the udis86 stuff.

tag perf-core-for-mingo-20160330
Tagger: Arnaldo Carvalho de Melo <[email protected]>
Date: Wed Mar 30 11:28:40 2016 -0300

perf/core improvements and fixes:

User visible:

- Add support for skipping itrace instructions, useful to fast forward
processor trace (Intel PT, BTS) to right after initialization code at the start
of a workload (Andi Kleen)

- Add support for backtraces in perl 'perf script's (Dima Kogan)

- Add -U/-K (--all-user/--all-kernel) options to 'perf mem' (Jiri Olsa)

- Make -f/--force option documentation consistent across tools (Jiri Olsa)

Infrastructure:

- Add 'perf test' to check for event times (Jiri Olsa)

- 'perf config' cleanups (Taeung Song)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

2016-03-30 15:08:48

by Andi Kleen

Subject: Re: [PATCH 10/11] perf tools: Add probing for udev86 library

>
> Yeah, so udis86 also seems to be a pretty old, relatively stale library with no
> support for new instructions AFAICS.

There are lots of new instructions in pull requests on github.

But yes the author seems to be a bit slow in pulling.

>
> So I'd rather encourage librarizing one of the x86 instruction decoders in
> arch/x86/, and adding pretty-printing functionality to it. The code can already
> see instruction boundaries, which is the hardest part.
>
> That would also be better supported on non-x86 architectures in the long run:
>
> triton:~/tip> find arch/ -name insn.c | xargs ls -l
> -rw-rw-r-- 1 mingo mingo 30244 Mar 29 11:24 arch/arm64/kernel/insn.c
> -rw-rw-r-- 1 mingo mingo 1347 Dec 8 06:27 arch/arm/kernel/insn.c
> -rw-rw-r-- 1 mingo mingo 15123 Mar 30 12:31 arch/x86/lib/insn.c
>
> Such an in-kernel-repo library could also be used by live kernel debuggers such as
> kgdb/kdb, oops/crash-time disassembly printout, etc.
>
> ... so how about that direction instead?

It's a major project. Who is gonna work on it? Are you volunteering?

Longer term I agree it would be reasonable (if someone can be found to work on it),
but short term udis86 is there and works today.

-Andi

2016-03-31 06:34:47

by Ingo Molnar

Subject: Re: [PATCH 10/11] perf tools: Add probing for udev86 library


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> Em Wed, Mar 30, 2016 at 11:10:53AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Such an in-kernel-repo library could also be used by live kernel debuggers such as
> > > kgdb/kdb, oops/crash-time disassembly printout, etc.
> >
> > > ... so how about that direction instead?
> >
> > I'd much rather see it in that direction too, for the time being, I'm
> > removing the udis86 patches from acme/perf/core,
>
> So, there it is, please consider pulling, same thing as
> perf-core-for-mingo-20160329, modulo the udis86 stuff.
>
> tag perf-core-for-mingo-20160330
> Tagger: Arnaldo Carvalho de Melo <[email protected]>
> Date: Wed Mar 30 11:28:40 2016 -0300
>
> perf/core improvements and fixes:
>
> User visible:
>
> - Add support for skipping itrace instructions, useful to fast forward
> processor trace (Intel PT, BTS) to right after initialization code at the start
> of a workload (Andi Kleen)
>
> - Add support for backtraces in perl 'perf script's (Dima Kogan)
>
> - Add -U/-K (--all-user/--all-kernel) options to 'perf mem' (Jiri Olsa)
>
> - Make -f/--force option documentation consistent across tools (Jiri Olsa)
>
> Infrastructure:
>
> - Add 'perf test' to check for event times (Jiri Olsa)
>
> - 'perf config' cleanups (Taeung Song)
>
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

Pulled into tip:perf/core, thanks a lot Arnaldo!

Ingo

2016-03-31 06:49:46

by Ingo Molnar

Subject: Re: [PATCH 10/11] perf tools: Add probing for udev86 library


* Andi Kleen <[email protected]> wrote:

> >
> > Yeah, so udis86 also seems to be a pretty old, relatively stale library with no
> > support for new instructions AFAICS.
>
> There are lots of new instructions in pull requests on github.
>
> But yes the author seems to be a bit slow in pulling.
>
> >
> > So I'd rather encourage librarizing one of the x86 instruction decoders in
> > arch/x86/, and adding pretty-printing functionality to it. The code can already
> > see instruction boundaries, which is the hardest part.
> >
> > That would also be better supported on non-x86 architectures in the long run:
> >
> > triton:~/tip> find arch/ -name insn.c | xargs ls -l
> > -rw-rw-r-- 1 mingo mingo 30244 Mar 29 11:24 arch/arm64/kernel/insn.c
> > -rw-rw-r-- 1 mingo mingo 1347 Dec 8 06:27 arch/arm/kernel/insn.c
> > -rw-rw-r-- 1 mingo mingo 15123 Mar 30 12:31 arch/x86/lib/insn.c
> >
> > Such an in-kernel-repo library could also be used by live kernel debuggers such as
> > kgdb/kdb, oops/crash-time disassembly printout, etc.
> >
> > ... so how about that direction instead?
>
> It's a major project. Who is gonna work on it? Are you volunteering?

As a maintainer I don't have much free time and me writing code for every
technical disagreement doesn't scale in any case, but I'm responsible for creating
the environment and incentives for people to eventually work on such enhancements.

My main tool to create such an environment is to convince people, if that tool
doesn't work then my secondary tool is to say 'no'. If we merge udis86 support we
remove one big incentive for people to factor out and enhance the already existing
in-kernel disassemblers - not good.

Also, you are exaggerating the technical difficulties a lot (which is sadly a well
known modus operandi of yours when you don't get your way in technical
discussions, and which conflicts with the Linux project's 'can do' attitude
frequently), I've seen in-kernel disassembler pretty-printing patches on lkml as
well, so clearly it's possible and desirable.

Thanks,

Ingo