2015-02-10 23:43:08

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH 0/4] perf: add support for profiling jitted code

This patch series extends perf record/report/annotate to enable
profiling of jitted (just-in-time compiled) code. The current
perf tool provides very limited support for profiling jitted
code for some runtime environments. But the support is experimental
and cannot be used in complex environments. It relies on files
in /tmp, for instance. It does not support annotate mode or
rejitted code.

This patch series adds a better way of profiling jitted code
with the following advantages:
- support any jitted code environment (some with modifications)
- support Java runtime with JVMTI interface with no modifications
- provides a portable JVMTI agent library
- known to support V8 runtime
- known to support DART runtime
- supports code rejitting and movements
- no files in /tmp
- meta-data file is unique to each run
- no changes to perf report/annotate

The support is based on cooperation with the runtime. For Java runtimes,
supporting the JVMTI interface, there is no change necessary. For other
runtimes, modifications are necessary to emit the meta-data necessary
to support symbolization and annotation of the samples. Those modifications
are fairly straighforward and already tested on V8 and DART.

The jit environment emits a binary dump file which contains the jitted
code (in raw format) and meta-data describing the mapping of functions.
The binary format is documented in the jitdump.h header file. It is
adapted from the OProfile jitdump format.

To enable synchronization of the runtime MMAPs with those recorded by
the kernel on behalf of the perf tool, the runtime needs to timestamp
any record in the dump file using the same time source. The current
patch series is relying on Sonny Rao's posix-clock patch series
posted on LKML in 2013. The link to the patches is:
https://lkml.org/lkml/2013/12/6/1044

Without this driver installed, records emitted by the runtime cannot
be properly synchronized (inserted in the flow on MMAPS) and symbolization
may be incorrect, especially for runtimes moving code around.

The current support only works when the runtime is monitored from
start to finish: perf record java --agenpath:libpfmjvmti.so my_class.

Once the run is completed, the jitdump file needs to be injected into
the perf.data file. This is accomplished by using the perf inject command.
This will also generate an ELF image for each jitted function. The
inject MMAP records will point to those ELF images. The reasoning
behind using ELF images is that it makes processing for perf report
and annotate automatic and transparent. It also makes it easier to
package and analyze on a remote machine.

The reporting is unchanged, simply invoke perf report or perf annotate
on the modified perf.data file. The jitted code will appear symbolized
and the assembly view will display the instruction level profile!

As an added bonus, the series includes support for demangling function
signature from OpenJDK.

The current patch series does not include support for source line
information. Therefore, the source code cannot yet be displayed
in perf annotate. This will come in a later update.

Furthermore, we believe there is a way to skip the perf inject phase
and have perf report/annotate directly inject the MMAP records
on the fly during processing of the perf.data file. Perf report would
also generate the ELF files if necessary. Such optimization, would
make using this extension seamless in system-wide mode and larger
environments. This will be added in a later update as well.

To use the new feature:
- install the posix clock driver
- make sure you chmod 644 /dev/trace_clock
- compile perf
- cd tools/perf/jvmti; make; install wherever is appropriate

Example using openJDK:
$ perf record java -agentpath:libjvmti.so my_class
java: jvmti: jitdump in $HOME/.debug/jit/java-jit-20150207.XXL9649H/jit-6320.dump
$ perf inject -i perf.data -j $HOME/.debug/jit/java-jit-20150207.XXL9649H/jit-6320.dump -o perf.data.jitted
$ perf report -i perf.data.jitted

Thanks to all the contributors and testers.

Enjoy,

Stephane Eranian (4):
perf tools: add Java demangling support
perf tools: pass timestamp to map_init
perf inject: add jitdump mmap injection support
perf tools: add JVMTI agent library

tools/perf/Documentation/perf-inject.txt | 11 +
tools/perf/Makefile.perf | 6 +-
tools/perf/builtin-inject.c | 205 ++++++++++++++
tools/perf/jvmti/Makefile | 70 +++++
tools/perf/jvmti/jvmti_agent.c | 349 +++++++++++++++++++++++
tools/perf/jvmti/jvmti_agent.h | 23 ++
tools/perf/jvmti/libjvmti.c | 149 ++++++++++
tools/perf/util/genelf.c | 463 +++++++++++++++++++++++++++++++
tools/perf/util/genelf.h | 6 +
tools/perf/util/jit.h | 27 ++
tools/perf/util/jitdump.c | 233 ++++++++++++++++
tools/perf/util/jitdump.h | 92 ++++++
tools/perf/util/machine.c | 8 +-
tools/perf/util/map.c | 9 +-
tools/perf/util/map.h | 5 +-
tools/perf/util/symbol-elf.c | 2 +
tools/perf/util/symbol.c | 195 ++++++++++++-
tools/perf/util/symbol.h | 1 +
18 files changed, 1844 insertions(+), 10 deletions(-)
create mode 100644 tools/perf/jvmti/Makefile
create mode 100644 tools/perf/jvmti/jvmti_agent.c
create mode 100644 tools/perf/jvmti/jvmti_agent.h
create mode 100644 tools/perf/jvmti/libjvmti.c
create mode 100644 tools/perf/util/genelf.c
create mode 100644 tools/perf/util/genelf.h
create mode 100644 tools/perf/util/jit.h
create mode 100644 tools/perf/util/jitdump.c
create mode 100644 tools/perf/util/jitdump.h

--
1.9.1


2015-02-10 23:44:05

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH 1/4] perf tools: add Java demangling support

Add Java function descriptor demangling support.
Something bfd cannot do.

Signed-off-by: Stephane Eranian <[email protected]>
---
tools/perf/util/symbol-elf.c | 2 +
tools/perf/util/symbol.c | 195 ++++++++++++++++++++++++++++++++++++++++++-
tools/perf/util/symbol.h | 1 +
3 files changed, 197 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index b24f9d8..71ee49b 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1019,6 +1019,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
demangle_flags = DMGL_PARAMS | DMGL_ANSI;

demangled = bfd_demangle(NULL, elf_name, demangle_flags);
+ if (demangled == NULL)
+ demangled = java_demangle_sym(elf_name);
if (demangled != NULL)
elf_name = demangled;
}
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index a690668..ef50343 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -9,6 +9,7 @@
#include <fcntl.h>
#include <unistd.h>
#include <inttypes.h>
+#include <bfd.h>
#include "build-id.h"
#include "util.h"
#include "debug.h"
@@ -1241,6 +1242,195 @@ int dso__load_kallsyms(struct dso *dso, const char *filename,
return dso__split_kallsyms(dso, map, delta, filter);
}

+enum {
+ MODE_PREFIX=0,
+ MODE_CLASS=1,
+ MODE_FUNC=2,
+ MODE_TYPE=3,
+ MODE_CTYPE=3, /* class arg */
+};
+
+#define BASE_ENT(c, n) [c-'A']=n
+static const char *base_types['Z'-'A' + 1]={
+ BASE_ENT('B', "byte" ),
+ BASE_ENT('C', "char" ),
+ BASE_ENT('D', "double" ),
+ BASE_ENT('F', "float" ),
+ BASE_ENT('I', "int" ),
+ BASE_ENT('J', "long" ),
+ BASE_ENT('S', "short" ),
+ BASE_ENT('Z', "bool" ),
+};
+
+/*
+ * demangle Java symbol between str and end positions and stores
+ * up to maxlen characters into buf. The parser starts in mode.
+ *
+ * Use MODE_PREFIX to process entire prototype till end position
+ * Use MODE_TYPE to process return type if str starts on return type char
+ *
+ * Return:
+ * success: buf
+ * error : NULL
+ */
+static char *
+__demangle_java_sym(const char *str, const char *end, char *buf, int maxlen, int mode)
+{
+ int rlen = 0;
+ int array = 0;
+ int narg = 0;
+ const char *q;
+
+ if (!end)
+ end = str + strlen(str);
+
+ for (q = str; q != end; q++) {
+
+ if (rlen == (maxlen - 1))
+ break;
+
+ switch (*q) {
+ case 'L':
+ if (mode == MODE_PREFIX || mode == MODE_CTYPE) {
+ if (mode == MODE_CTYPE) {
+ if (narg)
+ rlen += snprintf(buf+rlen, maxlen - rlen, ", ");
+ narg++;
+ }
+ rlen += snprintf(buf+rlen, maxlen - rlen, "class ");
+ if (mode == MODE_PREFIX)
+ mode = MODE_CLASS;
+ } else
+ buf[rlen++] = *q;
+ break;
+ case 'B':
+ case 'C':
+ case 'D':
+ case 'F':
+ case 'I':
+ case 'J':
+ case 'S':
+ case 'Z':
+ if (mode == MODE_TYPE) {
+ if (narg)
+ rlen += snprintf(buf+rlen, maxlen - rlen, ", ");
+ rlen += snprintf(buf+rlen, maxlen - rlen, "%s", base_types[*q - 'A']);
+ while(array--)
+ rlen += snprintf(buf+rlen, maxlen - rlen, "[]");
+ array = 0;
+ narg++;
+ } else
+ buf[rlen++] = *q;
+ break;
+ case 'V':
+ if (mode == MODE_TYPE) {
+ rlen += snprintf(buf+rlen, maxlen - rlen, "void");
+ while(array--)
+ rlen += snprintf(buf+rlen, maxlen - rlen, "[]");
+ array = 0;
+ } else
+ buf[rlen++] = *q;
+ break;
+ case '[':
+ if (mode != MODE_TYPE)
+ goto error;
+ array++;
+ break;
+ case '(':
+ if (mode != MODE_FUNC)
+ goto error;
+ buf[rlen++] = *q;
+ mode = MODE_TYPE;
+ break;
+ case ')':
+ if (mode != MODE_TYPE)
+ goto error;
+ buf[rlen++] = *q;
+ narg = 0;
+ break;
+ case ';':
+ if (mode != MODE_CLASS && mode != MODE_CTYPE)
+ goto error;
+ /* safe because at least one other char to process */
+ if (isalpha(*(q+1)))
+ rlen += snprintf(buf+rlen, maxlen - rlen, ".");
+ if (mode == MODE_CLASS)
+ mode = MODE_FUNC;
+ else if (mode == MODE_CTYPE)
+ mode = MODE_TYPE;
+ break;
+ case '/':
+ if (mode != MODE_CLASS && mode != MODE_CTYPE)
+ goto error;
+ rlen += snprintf(buf+rlen, maxlen - rlen, ".");
+ break;
+ default :
+ buf[rlen++] = *q;
+ }
+ }
+ buf[rlen] = '\0';
+ return buf;
+error:
+ return NULL;
+}
+
+/*
+ * Demangle Java function signature (Hotspot, not GCJ)
+ * input:
+ * str: string to parse. String is not modified
+ * return:
+ * if can demangle then a a newly allocate string is returned.
+ * if cannot demangle, then NULL is returned
+ *
+ * Note that caller is responsible for freeing demangled string
+ */
+char *
+java_demangle_sym(const char *str)
+{
+ char *buf, *ptr;
+ char *p;
+ size_t len, l1;
+
+ if (!str)
+ return NULL;
+
+ /* find start of retunr type */
+ p = strrchr(str, ')');
+ if (!p)
+ return NULL;
+
+ /*
+ * expansion factor estimated to 3x
+ */
+ len = strlen(str) * 3 + 1;
+ buf = malloc(len);
+ if (!buf)
+ return NULL;
+
+ buf[0] = '\0';
+ /*
+ * get return type first
+ */
+ ptr = __demangle_java_sym(p+1, NULL, buf, len, MODE_TYPE);
+ if (!ptr)
+ goto error;
+
+ /* add space between return type and function prototype */
+ l1 = strlen(buf);
+ buf[l1++] = ' ';
+
+ /* process function up to return type */
+ ptr = __demangle_java_sym(str, p + 1, buf + l1, len - l1, MODE_PREFIX);
+ if (!ptr)
+ goto error;
+
+ return buf;
+error:
+ free(buf);
+ return NULL;
+}
+
+
static int dso__load_perf_map(struct dso *dso, struct map *map,
symbol_filter_t filter)
{
@@ -1257,6 +1447,7 @@ static int dso__load_perf_map(struct dso *dso, struct map *map,
u64 start, size;
struct symbol *sym;
int line_len, len;
+ char *name;

line_len = getline(&line, &n, file);
if (line_len < 0)
@@ -1279,7 +1470,9 @@ static int dso__load_perf_map(struct dso *dso, struct map *map,
if (len + 2 >= line_len)
continue;

- sym = symbol__new(start, size, STB_GLOBAL, line + len);
+ name = line + len;
+
+ sym = symbol__new(start, size, STB_GLOBAL, name);

if (sym == NULL)
goto out_delete_line;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 1650dcb..2a6e23e 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -295,4 +295,5 @@ int compare_proc_modules(const char *from, const char *to);
int setup_list(struct strlist **list, const char *list_str,
const char *list_name);

+char * java_demangle_sym(const char *str);
#endif /* __PERF_SYMBOL */
--
1.9.1

2015-02-10 23:43:17

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH 2/4] perf tools: pass timestamp to map_init

This patch passes the sample timestamp down to the
map_init function. This is used to sort mmap records.

Signed-off-by: Stephane Eranian <[email protected]>
---
tools/perf/util/machine.c | 8 ++++++--
tools/perf/util/map.c | 9 +++++----
tools/perf/util/map.h | 5 +++--
3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 1bca3a9..3288a71 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1176,7 +1176,9 @@ int machine__process_mmap2_event(struct machine *machine,
event->mmap2.ino_generation,
event->mmap2.prot,
event->mmap2.flags,
- event->mmap2.filename, type, thread);
+ event->mmap2.filename, type,
+ thread,
+ sample->time);

if (map == NULL)
goto out_problem;
@@ -1223,7 +1225,9 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
event->mmap.len, event->mmap.pgoff,
event->mmap.pid, 0, 0, 0, 0, 0, 0,
event->mmap.filename,
- type, thread);
+ type,
+ thread,
+ sample->time);

if (map == NULL)
goto out_problem;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 62ca9f2..fac9d34 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -123,7 +123,7 @@ static inline bool replace_android_lib(const char *filename, char *newfilename)
}

void map__init(struct map *map, enum map_type type,
- u64 start, u64 end, u64 pgoff, struct dso *dso)
+ u64 start, u64 end, u64 pgoff, struct dso *dso, u64 ts)
{
map->type = type;
map->start = start;
@@ -137,12 +137,13 @@ void map__init(struct map *map, enum map_type type,
map->groups = NULL;
map->referenced = false;
map->erange_warned = false;
+ map->time = ts;
}

struct map *map__new(struct machine *machine, u64 start, u64 len,
u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
u64 ino_gen, u32 prot, u32 flags, char *filename,
- enum map_type type, struct thread *thread)
+ enum map_type type, struct thread *thread, u64 ts)
{
struct map *map = malloc(sizeof(*map));

@@ -182,7 +183,7 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
if (dso == NULL)
goto out_delete;

- map__init(map, type, start, start + len, pgoff, dso);
+ map__init(map, type, start, start + len, pgoff, dso, ts);

if (anon || no_dso) {
map->map_ip = map->unmap_ip = identity__map_ip;
@@ -215,7 +216,7 @@ struct map *map__new2(u64 start, struct dso *dso, enum map_type type)
/*
* ->end will be filled after we load all the symbols
*/
- map__init(map, type, start, 0, 0, dso);
+ map__init(map, type, start, 0, 0, dso, 0);
}

return map;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 0e42438..9320bf8 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -42,6 +42,7 @@ struct map {
u32 maj, min; /* only valid for MMAP2 record */
u64 ino; /* only valid for MMAP2 record */
u64 ino_generation;/* only valid for MMAP2 record */
+ u64 time;

/* ip -> dso rip */
u64 (*map_ip)(struct map *, u64);
@@ -135,11 +136,11 @@ struct thread;
typedef int (*symbol_filter_t)(struct map *map, struct symbol *sym);

void map__init(struct map *map, enum map_type type,
- u64 start, u64 end, u64 pgoff, struct dso *dso);
+ u64 start, u64 end, u64 pgoff, struct dso *dso, u64 ts);
struct map *map__new(struct machine *machine, u64 start, u64 len,
u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
u64 ino_gen, u32 prot, u32 flags,
- char *filename, enum map_type type, struct thread *thread);
+ char *filename, enum map_type type, struct thread *thread, u64 time);
struct map *map__new2(u64 start, struct dso *dso, enum map_type type);
void map__delete(struct map *map);
struct map *map__clone(struct map *map);
--
1.9.1

2015-02-10 23:43:32

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH 3/4] perf inject: add jitdump mmap injection support

This patch adds a -j jitdump option to perf inject.

This options injects MMAP records into the perf.data
file to cover the jitted code mmaps. It also emits
ELF images for each function in the jidump file.
Those images are created where the jitdump file is.
The MMAP records point to that location as well.

Typical flow:
$ java -agentpath:libpjvmti.so java_class
$ perf inject -j ~/.debug/jit/java-jit-20140514.XXAb0e5C/jit-7640.dump \
-i perf.data \
-o perf.data.jitted

$ perf report -i perf.data.jitted

Note that jitdump.h support is not limited to Java, it works with
any jitted environment modified to emit the jitdump file format,
include those where code can be jitted multiple times and moved
around.

The jitdump.h format is adapted from the Oprofile project.

Signed-off-by: Stephane Eranian <[email protected]>
---
tools/perf/Documentation/perf-inject.txt | 11 +
tools/perf/Makefile.perf | 6 +-
tools/perf/builtin-inject.c | 205 ++++++++++++++
tools/perf/util/genelf.c | 463 +++++++++++++++++++++++++++++++
tools/perf/util/genelf.h | 6 +
tools/perf/util/jit.h | 27 ++
tools/perf/util/jitdump.c | 233 ++++++++++++++++
tools/perf/util/jitdump.h | 92 ++++++
8 files changed, 1042 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/util/genelf.c
create mode 100644 tools/perf/util/genelf.h
create mode 100644 tools/perf/util/jit.h
create mode 100644 tools/perf/util/jitdump.c
create mode 100644 tools/perf/util/jitdump.h

diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
index dc7442c..237f195 100644
--- a/tools/perf/Documentation/perf-inject.txt
+++ b/tools/perf/Documentation/perf-inject.txt
@@ -40,6 +40,17 @@ OPTIONS
Merge sched_stat and sched_switch for getting events where and how long
tasks slept. sched_switch contains a callchain where a task slept and
sched_stat contains a timeslice how long a task slept.
+-j::
+--jit::
+ Merge a jitdump file into the perf.data file by adding mmap records to
+ cover jitted code and emit ELF images for each jitted function. The ELF
+ images are saved in the same directory as the jidump. Use -E to suppress
+ ELF images generation.
+-E::
+--jit-disable-elf::
+ When used with -, it prevents creating the ELF images for each jitted
+ function. Only the jitted code mmap records are injected into the perf.data
+ file. Option as no effect when -j is not used.

--kallsyms=<file>::
kallsyms pathname
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index aa6a504..c86c412 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -324,6 +324,7 @@ LIB_H += util/perf_regs.h
LIB_H += util/unwind.h
LIB_H += util/vdso.h
LIB_H += util/tsc.h
+LIB_H += util/jitdump.h
LIB_H += ui/helpline.h
LIB_H += ui/progress.h
LIB_H += ui/util.h
@@ -410,6 +411,8 @@ LIB_OBJS += $(OUTPUT)util/data.o
LIB_OBJS += $(OUTPUT)util/tsc.o
LIB_OBJS += $(OUTPUT)util/cloexec.o
LIB_OBJS += $(OUTPUT)util/thread-stack.o
+LIB_OBJS += $(OUTPUT)util/jitdump.o
+LIB_OBJS += $(OUTPUT)util/genelf.o

LIB_OBJS += $(OUTPUT)ui/setup.o
LIB_OBJS += $(OUTPUT)ui/helpline.o
@@ -496,7 +499,8 @@ BUILTIN_OBJS += $(OUTPUT)builtin-inject.o
BUILTIN_OBJS += $(OUTPUT)tests/builtin-test.o
BUILTIN_OBJS += $(OUTPUT)builtin-mem.o

-PERFLIBS = $(LIB_FILE) $(LIBAPIKFS) $(LIBTRACEEVENT)
+PERFLIBS = $(LIB_FILE) $(LIBAPIKFS) $(LIBTRACEEVENT) -lcrypto
+

# We choose to avoid "if .. else if .. else .. endif endif"
# because maintaining the nesting to match is a pain. If
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index a13641e..436203c 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -16,6 +16,8 @@
#include "util/debug.h"
#include "util/build-id.h"
#include "util/data.h"
+#include "util/jit.h"
+#include "util/genelf.h"

#include "util/parse-options.h"

@@ -30,6 +32,12 @@ struct perf_inject {
struct perf_data_file output;
u64 bytes_written;
struct list_head samples;
+
+ const char *jit_filename;
+ FILE *jit_file;
+ struct jit_buf_desc jit_desc;
+ char jit_dir[PATH_MAX];
+ bool jit_disable_elf;
};

struct event_entry {
@@ -334,7 +342,170 @@ static int perf_evsel__check_stype(struct perf_evsel *evsel,
name, sample_msg);
return -EINVAL;
}
+ return 0;
+}
+
+static int jit_emit_elf(char *filename,
+ const char *sym,
+ unsigned long code,
+ int csize)
+{
+ int ret, fd;
+ unsigned long addr = (unsigned long)code;
+
+ fd = open(filename, O_CREAT|O_TRUNC|O_WRONLY, 0644);
+ if (fd == -1) {
+ pr_warning("cannot create jit ELF %s: %s\n", filename, strerror(errno));
+ return -1;
+ }
+
+ ret = jit_write_elf(fd, addr, sym, (const void *)code, csize);
+
+ close(fd);
+
+ if (ret)
+ unlink(filename);
+
+ return ret;
+}
+
+static int jit_repipe_code_load(struct perf_inject *inject, union jr_entry *jr)
+{
+ struct perf_sample sample;
+ union perf_event *event;
+ unsigned long code, addr;
+ size_t size;
+ const char *sym;
+ uint32_t count;
+ int ret, csize;
+ pid_t pid;
+ struct {
+ u32 pid, tid;
+ u64 time;
+ } *id;
+
+ pid = jr->load.pid;
+ csize = jr->load.code_size;
+ addr = jr->load.code_addr;
+ sym = (void *)((unsigned long)jr + sizeof(jr->load));
+ code = (unsigned long)jr + jr->load.p.total_size - csize;
+ count = jr->load.code_index;
+
+ /*
+ * +16 to account for sample_id_all (hack)
+ */
+ event = malloc(sizeof(*event) + 16);
+ if (!event)
+ return -1;
+
+ memset(event, 0, sizeof(*event));
+
+ size = snprintf(event->mmap.filename, PATH_MAX, "%s/jitted-%d-%u",
+ inject->jit_dir,
+ pid,
+ count) + 1;
+ size = PERF_ALIGN(size, sizeof(u64));
+ if (!inject->jit_disable_elf) {
+ ret = jit_emit_elf(event->mmap.filename, sym, code, csize);
+ if (ret) {
+ free(event);
+ return -1;
+ }
+ }
+
+ event->mmap.header.type = PERF_RECORD_MMAP;
+ event->mmap.header.misc = PERF_RECORD_MISC_USER;
+ //event->mmap.header.size = sizeof(event->mmap) + 16;
+ event->mmap.header.size = (sizeof(event->mmap) -
+ (sizeof(event->mmap.filename) - size) + 16); //machine->id_hdr_size);
+ event->mmap.pgoff = 0;
+ event->mmap.start = addr;
+ event->mmap.len = csize;
+ event->mmap.pid = pid;
+ event->mmap.tid = jr->load.tid;
+
+ id = (void *)((unsigned long)event + event->mmap.header.size - 16);
+ id->pid = pid;
+ id->tid = jr->load.tid;
+ id->time = jr->load.p.timestamp;
+
+ memset(&sample, 0, sizeof(sample));
+ sample.time = id->time;
+
+ return perf_event__repipe_synth(&inject->tool, event);
+}
+
+static int jit_repipe_code_move(struct perf_inject *inject, union jr_entry *jr)
+{
+ struct perf_sample sample;
+ union perf_event *event;
+ pid_t pid;
+ struct {
+ u32 pid, tid;
+ u64 time;
+ } *id;
+
+ pid = jr->move.pid;
+
+ /*
+ * +16 to account for sample_id_all (hack)
+ */
+ event = malloc(sizeof(*event) + 16);
+ if (!event)
+ return -1;
+
+ memset(event, 0, sizeof(*event));
+
+ snprintf(event->mmap.filename, PATH_MAX, "%s/jitted-%d-%"PRIu64,
+ inject->jit_dir,
+ pid,
+ jr->move.code_index);
+
+ event->mmap.header.type = PERF_RECORD_MMAP;
+ event->mmap.header.misc = PERF_RECORD_MISC_USER;
+ event->mmap.header.size = sizeof(event->mmap) + 16;
+ event->mmap.pgoff = 0;
+ event->mmap.start = jr->move.new_code_addr;
+ event->mmap.len = jr->move.code_size;
+ event->mmap.pid = pid;
+ event->mmap.tid = jr->move.tid;
+
+ id = (void *)((unsigned long)event + sizeof(event->mmap));
+ id->pid = pid;
+ id->tid = jr->move.tid;
+ id->time = jr->move.p.timestamp;
+
+ memset(&sample, 0, sizeof(sample));
+ sample.time = id->time;
+
+ return perf_event__repipe_synth(&inject->tool, event);
+}
+
+static int perf_jit_inject(struct perf_inject *inject)
+{
+ union jr_entry *jr;
+
+ strncpy(inject->jit_dir, inject->jit_filename, PATH_MAX);
+ dirname(inject->jit_dir);
+
+ if (!strcmp(inject->jit_filename, inject->jit_dir)) {
+ inject->jit_dir[0] = '.';
+ inject->jit_dir[1] = '\0';
+ }

+ while ((jr = jit_get_next_entry(&inject->jit_desc))) {
+ switch(jr->prefix.id) {
+ case JIT_CODE_LOAD:
+ jit_repipe_code_load(inject, jr);
+ break;
+ case JIT_CODE_MOVE:
+ pr_warning("CODE_MOVE\n");
+ jit_repipe_code_move(inject, jr);
+ break;
+ default:
+ continue;
+ }
+ }
return 0;
}

@@ -381,6 +552,9 @@ static int __cmd_inject(struct perf_inject *inject)

ret = perf_session__process_events(session, &inject->tool);

+ if (ret == 0 && inject->jit_filename)
+ ret = perf_jit_inject(inject);
+
if (!file_out->is_pipe) {
if (inject->build_ids)
perf_header__set_feat(&session->header,
@@ -418,6 +592,8 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
.path = "-",
.mode = PERF_DATA_MODE_WRITE,
},
+ .jit_disable_elf = false,
+
};
struct perf_data_file file = {
.mode = PERF_DATA_MODE_READ,
@@ -434,6 +610,10 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('s', "sched-stat", &inject.sched_stat,
"Merge sched-stat and sched-switch for getting events "
"where and how long tasks slept"),
+ OPT_STRING('j', "jit", &inject.jit_filename, "merge jitdump file",
+ "input file name"),
+ OPT_BOOLEAN('E', "jit-disable-elf", &inject.jit_disable_elf,
+ "Do not emit ELF images from jitdump file"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show build ids, etc)"),
OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name, "file",
@@ -463,6 +643,28 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
if (inject.session == NULL)
return -1;

+ if (inject.build_ids) {
+ /*
+ * to make sure the mmap records are ordered correctly
+ * and so that the correct especially due to jitted code
+ * mmaps. We cannot generate the buildid hit list and
+ * inject the jit mmaps at the same time for now.
+ */
+ inject.tool.ordered_events = true;
+ inject.tool.ordering_requires_timestamps = true;
+ }
+
+ if (inject.jit_filename) {
+ inject.tool.ordered_events = true;
+ inject.tool.ordering_requires_timestamps = true;
+ ret = jit_open_dump(inject.jit_filename, &inject.jit_desc);
+ if (ret) {
+ fprintf(stderr, "cannot open jitdump file %s\n", inject.jit_filename);
+ return -1;
+ }
+ }
+
+
if (symbol__init(&inject.session->header.env) < 0)
return -1;

@@ -470,5 +672,8 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)

perf_session__delete(inject.session);

+ if (inject.jit_filename)
+ jit_close_dump(&inject.jit_desc);
+
return ret;
}
diff --git a/tools/perf/util/genelf.c b/tools/perf/util/genelf.c
new file mode 100644
index 0000000..a6f9e43
--- /dev/null
+++ b/tools/perf/util/genelf.c
@@ -0,0 +1,463 @@
+/*
+ * genelf.c
+ * Copyright (C) 2014, Google, Inc
+ *
+ * Contributed by:
+ * Stephane Eranian <[email protected]>
+ *
+ * Released under the GPL v2. (and only v2, not any later version)
+ */
+
+#include <sys/types.h>
+#include <stdio.h>
+#include <getopt.h>
+#include <stddef.h>
+#include <libelf.h>
+#include <string.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <err.h>
+
+#include "perf.h"
+#include "genelf.h"
+
+#define JVMTI
+#define BUILD_ID_MD5
+#undef BUILD_ID_SHA /* does not seem to work well when linked with Java */
+#undef BUILD_ID_URANDOM /* different uuid for each run */
+
+#ifdef BUILD_ID_SHA
+#include <openssl/sha.h>
+#endif
+
+#ifdef BUILD_ID_MD5
+#include <openssl/md5.h>
+#endif
+
+#if defined(__arm__)
+#define GEN_ELF_ARCH EM_ARM
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS32
+#elif defined(__x86_64__)
+#define GEN_ELF_ARCH EM_X86_64
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS64
+#elif defined(__i386__)
+#define GEN_ELF_ARCH EM_386
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS32
+#elif defined(__ppcle__)
+#define GEN_ELF_ARCH EM_PPC
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS64
+#elif defined(__powerpc__)
+#define GEN_ELF_ARCH EM_PPC64
+#define GEN_ELF_ENDIAN ELFDATA2MSB
+#define GEN_ELF_CLASS ELFCLASS64
+#elif defined(__powerpcle__)
+#define GEN_ELF_ARCH EM_PPC64
+#define GEN_ELF_ENDIAN ELFDATA2LSB
+#define GEN_ELF_CLASS ELFCLASS64
+#else
+#error "unsupported architecture"
+#endif
+
+#if GEN_ELF_CLASS == ELFCLASS64
+#define elf_newehdr elf64_newehdr
+#define elf_getshdr elf64_getshdr
+#define Elf_Ehdr Elf64_Ehdr
+#define Elf_Shdr Elf64_Shdr
+#define Elf_Sym Elf64_Sym
+#define ELF_ST_TYPE(a) ELF64_ST_TYPE(a)
+#define ELF_ST_BIND(a) ELF64_ST_BIND(a)
+#define ELF_ST_VIS(a) ELF64_ST_VISIBILITY(a)
+#else
+#define elf_newehdr elf32_newehdr
+#define elf_getshdr elf32_getshdr
+#define Elf_Ehdr Elf32_Ehdr
+#define Elf_Shdr Elf32_Shdr
+#define Elf_Sym Elf32_Sym
+#define ELF_ST_TYPE(a) ELF32_ST_TYPE(a)
+#define ELF_ST_BIND(a) ELF32_ST_BIND(a)
+#define ELF_ST_VIS(a) ELF32_ST_VISIBILITY(a)
+#endif
+
+typedef struct {
+ unsigned int namesz; /* Size of entry's owner string */
+ unsigned int descsz; /* Size of the note descriptor */
+ unsigned int type; /* Interpretation of the descriptor */
+ char name[0]; /* Start of the name+desc data */
+} Elf_Note;
+
+struct options {
+ char *output;
+ int fd;
+};
+
+static char shd_string_table[] = {
+ 0,
+ '.', 't', 'e', 'x', 't', 0, /* 1 */
+ '.', 's', 'h', 's', 't', 'r', 't', 'a', 'b', 0, /* 7 */
+ '.', 's', 'y', 'm', 't', 'a', 'b', 0, /* 17 */
+ '.', 's', 't', 'r', 't', 'a', 'b', 0, /* 25 */
+ '.', 'n', 'o', 't', 'e', '.', 'g', 'n', 'u', '.', 'b', 'u', 'i', 'l', 'd', '-', 'i', 'd', 0, /* 33 */
+};
+
+
+static struct buildid_note {
+ Elf_Note desc; /* descsz: size of build-id, must be multiple of 4 */
+ char name[4]; /* GNU\0 */
+ char build_id[20];
+} bnote;
+
+static Elf_Sym symtab[]={
+ /* symbol 0 MUST be the undefined symbol */
+ { .st_name = 0, /* index in sym_string table */
+ .st_info = ELF_ST_TYPE(STT_NOTYPE),
+ .st_shndx = 0, /* for now */
+ .st_value = 0x0,
+ .st_other = ELF_ST_VIS(STV_DEFAULT),
+ .st_size = 0,
+ },
+ { .st_name = 1, /* index in sym_string table */
+ .st_info = ELF_ST_BIND(STB_LOCAL) | ELF_ST_TYPE(STT_FUNC),
+ .st_shndx = 1,
+ .st_value = 0, /* for now */
+ .st_other = ELF_ST_VIS(STV_DEFAULT),
+ .st_size = 0, /* for now */
+ }
+};
+
+#ifdef BUILD_ID_URANDOM
+static void
+gen_build_id(struct buildid_note *note, unsigned long load_addr, const void *code, size_t csize)
+{
+ int fd;
+
+ fd = open("/dev/urandom", O_RDONLY);
+ if (fd == -1)
+ err(1, "cannot access /dev/urandom for builid");
+ read(fd, note->build_id, sizeof(note->build_id));
+ close(fd);
+}
+#endif
+
+#ifdef BUILD_ID_SHA
+static void
+gen_build_id(struct buildid_note *note,
+ unsigned long load_addr __maybe_unused,
+ const void *code,
+ size_t csize)
+{
+ if (sizeof(note->build_id) < SHA_DIGEST_LENGTH)
+ errx(1, "build_id too small for SHA1");
+
+ SHA1(code, csize, (unsigned char *)note->build_id);
+}
+#endif
+
+#ifdef BUILD_ID_MD5
+static void
+gen_build_id(struct buildid_note *note, unsigned long load_addr, const void *code, size_t csize)
+{
+ MD5_CTX context;
+
+ if (sizeof(note->build_id) < 16)
+ errx(1, "build_id too small for MD5");
+
+ MD5_Init(&context);
+ MD5_Update(&context, &load_addr, sizeof(load_addr));
+ MD5_Update(&context, code, csize);
+ MD5_Final((unsigned char *)note->build_id, &context);
+}
+#endif
+
+/*
+ * fd: file descriptor open for writing for the output file
+ * load_addr: code load address (could be zero, just used for buildid)
+ * sym: function name (for native code - used as the symbol)
+ * code: the native code
+ * csize: the code size in bytes
+ */
+int
+jit_write_elf(int fd, unsigned long load_addr, const char *sym, const void *code, int csize)
+{
+ Elf *e;
+ Elf_Data *d;
+ Elf_Scn *scn;
+ Elf_Ehdr *ehdr;
+ Elf_Shdr *shdr;
+ char *strsym = NULL;
+ int symlen;
+ int retval = -1;
+
+ if (elf_version(EV_CURRENT) == EV_NONE) {
+ warnx("ELF initialization failed");
+ return -1;
+ }
+
+ e = elf_begin(fd, ELF_C_WRITE, NULL);
+ if (!e) {
+ warnx("elf_begin failed");
+ goto error;
+ }
+
+ /*
+ * setup ELF header
+ */
+ ehdr = elf_newehdr(e);
+ if (!ehdr) {
+ warnx("cannot get ehdr");
+ goto error;
+ }
+
+ ehdr->e_ident[EI_DATA] = GEN_ELF_ENDIAN;
+ ehdr->e_ident[EI_CLASS] = GEN_ELF_CLASS;
+ ehdr->e_machine = GEN_ELF_ARCH;
+ ehdr->e_type = ET_DYN;
+ ehdr->e_entry = 0x0;
+ ehdr->e_version = EV_CURRENT;
+ ehdr->e_shstrndx= 2; /* shdr index for section name */
+
+ /*
+ * setup text section
+ */
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ d->d_align = 16;
+ d->d_off = 0LL;
+ d->d_buf = (void *)code;
+ d->d_type = ELF_T_BYTE;
+ d->d_size = csize;
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 1;
+ shdr->sh_type = SHT_PROGBITS;
+ shdr->sh_addr = 0; /* must be zero or == sh_offset -> dynamic object */
+ shdr->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
+ shdr->sh_entsize = 0;
+
+ /*
+ * setup section headers string table
+ */
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ d->d_align = 1;
+ d->d_off = 0LL;
+ d->d_buf = shd_string_table;
+ d->d_type = ELF_T_BYTE;
+ d->d_size = sizeof(shd_string_table);
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 7; /* offset of '.shstrtab' in shd_string_table */
+ shdr->sh_type = SHT_STRTAB;
+ shdr->sh_flags = 0;
+ shdr->sh_entsize = 0;
+
+ /*
+ * setup symtab section
+ */
+ symtab[1].st_size = csize;
+
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ d->d_align = 8;
+ d->d_off = 0LL;
+ d->d_buf = symtab;
+ d->d_type = ELF_T_SYM;
+ d->d_size = sizeof(symtab);
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 17; /* offset of '.symtab' in shd_string_table */
+ shdr->sh_type = SHT_SYMTAB;
+ shdr->sh_flags = 0;
+ shdr->sh_entsize = sizeof(Elf_Sym);
+ shdr->sh_link = 4; /* index of .strtab section */
+
+ /*
+ * setup symbols string table
+ * 2 = 1 for 0 in 1st entry, 1 for the 0 at end of symbol for 2nd entry
+ */
+ symlen = 2 + strlen(sym);
+ strsym = calloc(1, symlen);
+ if (!strsym) {
+ warnx("cannot allocate strsym");
+ goto error;
+ }
+ strcpy(strsym + 1, sym);
+
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ d->d_align = 1;
+ d->d_off = 0LL;
+ d->d_buf = strsym;
+ d->d_type = ELF_T_BYTE;
+ d->d_size = symlen;
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 25; /* offset in shd_string_table */
+ shdr->sh_type = SHT_STRTAB;
+ shdr->sh_flags = 0;
+ shdr->sh_entsize = 0;
+
+ /*
+ * setup build-id section
+ */
+ scn = elf_newscn(e);
+ if (!scn) {
+ warnx("cannot create section");
+ goto error;
+ }
+
+ d = elf_newdata(scn);
+ if (!d) {
+ warnx("cannot get new data");
+ goto error;
+ }
+
+ /*
+ * build-id generation
+ */
+ gen_build_id(&bnote, load_addr, code, csize);
+ bnote.desc.namesz = sizeof(bnote.name); /* must include 0 termination */
+ bnote.desc.descsz = sizeof(bnote.build_id);
+ bnote.desc.type = NT_GNU_BUILD_ID;
+ strcpy(bnote.name, "GNU");
+
+ d->d_align = 4;
+ d->d_off = 0LL;
+ d->d_buf = &bnote;
+ d->d_type = ELF_T_BYTE;
+ d->d_size = sizeof(bnote);
+ d->d_version = EV_CURRENT;
+
+ shdr = elf_getshdr(scn);
+ if (!shdr) {
+ warnx("cannot get section header");
+ goto error;
+ }
+
+ shdr->sh_name = 33; /* offset in shd_string_table */
+ shdr->sh_type = SHT_NOTE;
+ shdr->sh_addr = 0x0;
+ shdr->sh_flags = SHF_ALLOC;
+ shdr->sh_size = sizeof(bnote);
+ shdr->sh_entsize = 0;
+
+ if (elf_update(e, ELF_C_WRITE) < 0) {
+ warnx("elf_update 4 failed");
+ goto error;
+ }
+ (void)elf_end(e);
+
+ retval = 0;
+error:
+ free(strsym);
+
+ return retval;
+}
+
+#ifndef JVMTI
+
+static unsigned char x86_code[] = {
+ 0xBB, 0x2A, 0x00, 0x00, 0x00, /* movl $42, %ebx */
+ 0xB8, 0x01, 0x00, 0x00, 0x00, /* movl $1, %eax */
+ 0xCD, 0x80 /* int $0x80 */
+};
+
+static struct options options;
+
+int main(int argc, char **argv)
+{
+ int c, fd, ret;
+
+ while ((c = getopt(argc, argv, "o:h")) != -1) {
+ switch (c) {
+ case 'o':
+ options.output = optarg;
+ break;
+ case 'h':
+ printf("Usage: genelf -o output_file [-h]\n");
+ return 0;
+ default:
+ errx(1, "unknown option");
+ }
+ }
+
+ fd = open(options.output, O_CREAT|O_TRUNC|O_RDWR, 0666);
+ if (fd == -1)
+ err(1, "cannot create file %s", options.output);
+
+ ret = jit_write_elf(fd, "main", x86_code, sizeof(x86_code));
+ close(fd);
+
+ if (ret != 0)
+ unlink(options.output);
+
+ return ret;
+}
+#endif
diff --git a/tools/perf/util/genelf.h b/tools/perf/util/genelf.h
new file mode 100644
index 0000000..11307a2
--- /dev/null
+++ b/tools/perf/util/genelf.h
@@ -0,0 +1,6 @@
+#ifndef __GENELF_H__
+#define __GENELF_H__
+
+extern int jit_write_elf(int fd, unsigned long code_addr, const char *sym, const void *code, int csize);
+
+#endif
diff --git a/tools/perf/util/jit.h b/tools/perf/util/jit.h
new file mode 100644
index 0000000..e7a54cb
--- /dev/null
+++ b/tools/perf/util/jit.h
@@ -0,0 +1,27 @@
+#ifndef __JIT_H__
+#define __JIT_H__
+#include <sys/time.h>
+#include <stdint.h>
+
+#include "jitdump.h"
+
+struct jit_buf_desc {
+ union jr_entry *entry;
+ void *buf;
+ size_t bufsize;
+ FILE *in;
+ int needs_bswap; /* handles cross-endianess */
+ uint32_t code_load_count;
+ struct rb_root code_root;
+};
+
+extern int jit_open_dump(const char *name, struct jit_buf_desc *jd);
+extern void jit_close_dump(struct jit_buf_desc *jd);
+extern union jr_entry *jit_get_next_entry(struct jit_buf_desc *jd);
+
+/*
+ * use type = -1 to match any record type
+ */
+extern union jr_entry *jit_get_next_entry_type(struct jit_buf_desc *jd, int type);
+
+#endif /* __JIT_H__ */
diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
new file mode 100644
index 0000000..28a9429
--- /dev/null
+++ b/tools/perf/util/jitdump.c
@@ -0,0 +1,233 @@
+#include <sys/types.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <inttypes.h>
+#include <byteswap.h>
+
+#include "util.h"
+#include "debug.h"
+#include "symbol.h"
+#include "strlist.h"
+#include <elf.h>
+
+#include "jit.h"
+
+struct debug_line_info {
+ unsigned long vma;
+ unsigned int lineno;
+ /* The filename format is unspecified, absolute path, relative etc. */
+ char const filename[0];
+};
+
+#define hmax(a, b) ((a) > (b) ? (a) : (b))
+
+void
+jit_close_dump(struct jit_buf_desc *jd)
+{
+ if (!(jd && jd->in))
+ return;
+ fclose(jd->in);
+ jd->in = NULL;
+}
+
+int
+jit_open_dump(const char *name, struct jit_buf_desc *jd)
+{
+ struct jitheader header;
+ struct jr_prefix *prefix;
+ ssize_t bs, bsz = 0;
+ void *n, *buf = NULL;
+ int ret, retval = -1;
+
+ if (!jd || jd->in)
+ return -1;
+
+ memset(jd, 0, sizeof(*jd));
+
+ jd->in = fopen(name, "r");
+ if (!jd->in)
+ return -1;
+
+ bsz = hmax(sizeof(header), sizeof(*prefix));
+
+ buf = malloc(bsz);
+ if (!buf)
+ goto error;
+
+ ret = fread(buf, sizeof(header), 1, jd->in);
+ if (ret != 1)
+ goto error;
+
+ memcpy(&header, buf, sizeof(header));
+
+ if (header.magic != JITHEADER_MAGIC) {
+ if (header.magic != JITHEADER_MAGIC_SW)
+ goto error;
+ jd->needs_bswap = 1;
+ }
+
+ if (jd->needs_bswap) {
+ header.version = bswap_32(header.version);
+ header.total_size = bswap_32(header.total_size);
+ header.pid = bswap_32(header.pid);
+ header.elf_mach = bswap_32(header.elf_mach);
+ header.timestamp = bswap_32(header.timestamp);
+ }
+
+ pr_debug("version=%u\nsize=%u(%zu)\nts=0x%llx\npid=%d\nelf_mach=%d\n",
+ header.version,
+ header.total_size,
+ header.total_size - sizeof(header),
+ (unsigned long long)header.timestamp,
+ header.pid,
+ header.elf_mach);
+
+ bs = header.total_size - sizeof(header);
+
+ if (bs > bsz) {
+ n = realloc(buf, bs);
+ if (!n)
+ goto error;
+ bsz = bs;
+ buf = n;
+ /* read extra we do not know about */
+ ret = fread(buf, bs - bsz, 1, jd->in);
+ if (ret != 1)
+ goto error;
+ }
+ retval = 0;
+error:
+ return retval;
+}
+
+union jr_entry *
+jit_get_next_entry(struct jit_buf_desc *jd)
+{
+ struct jr_prefix *prefix;
+ union jr_entry *jr;
+ void *addr;
+ size_t bs, size;
+ int id, ret;
+
+ if (!(jd && jd->in))
+ return NULL;
+
+ if (jd->buf == NULL) {
+ size_t sz = getpagesize();
+ if (sz < sizeof(*prefix))
+ sz = sizeof(*prefix);
+
+ jd->buf = malloc(sz);
+ if (jd->buf == NULL)
+ return NULL;
+
+ jd->bufsize = sz;
+ }
+
+ prefix = jd->buf;
+
+ ret = fread(prefix, sizeof(*prefix), 1, jd->in);
+ if (ret != 1)
+ return NULL;
+
+ if (jd->needs_bswap) {
+ prefix->id = bswap_32(prefix->id);
+ prefix->total_size = bswap_32(prefix->total_size);
+ prefix->timestamp = bswap_64(prefix->timestamp);
+ }
+ id = prefix->id;
+ size = prefix->total_size;
+
+ bs = (size_t)size;
+ if (bs < sizeof(*prefix))
+ return NULL;
+
+ if (id >= JIT_CODE_MAX) {
+ pr_warning("next_entry: unknown prefix %d, skipping\n", id);
+ return NULL;
+ }
+ if (bs > jd->bufsize) {
+ void *n;
+ n = realloc(jd->buf, bs);
+ if (!n)
+ return NULL;
+ jd->buf = n;
+ jd->bufsize = bs;
+ }
+
+ addr = ((void *)jd->buf) + sizeof(*prefix);
+
+ ret = fread(addr, bs - sizeof(*prefix), 1, jd->in);
+ if (ret != 1)
+ return NULL;
+
+ jr = (union jr_entry *)jd->buf;
+
+ switch(id) {
+ case JIT_CODE_DEBUG_INFO:
+ case JIT_CODE_CLOSE:
+ break;
+ case JIT_CODE_LOAD:
+ if (jd->needs_bswap) {
+ jr->load.pid = bswap_32(jr->load.pid);
+ jr->load.tid = bswap_32(jr->load.tid);
+ jr->load.vma = bswap_64(jr->load.vma);
+ jr->load.code_addr = bswap_64(jr->load.code_addr);
+ jr->load.code_size = bswap_64(jr->load.code_size);
+ jr->load.code_index= bswap_64(jr->load.code_index);
+ }
+ jd->code_load_count++;
+ break;
+ case JIT_CODE_MOVE:
+ if (jd->needs_bswap) {
+ jr->move.pid = bswap_32(jr->move.pid);
+ jr->move.tid = bswap_32(jr->move.tid);
+ jr->move.vma = bswap_64(jr->move.vma);
+ jr->move.old_code_addr = bswap_64(jr->move.old_code_addr);
+ jr->move.new_code_addr = bswap_64(jr->move.new_code_addr);
+ jr->move.code_size = bswap_64(jr->move.code_size);
+ jr->move.code_index = bswap_64(jr->move.code_index);
+ }
+ break;
+ case JIT_CODE_MAX:
+ default:
+ return NULL;
+ }
+ return jr;
+}
+
+union jr_entry *
+jit_get_next_entry_type(struct jit_buf_desc *jd, int type)
+{
+ union jr_entry *jr;
+
+ while ((jr = jit_get_next_entry(jd))) {
+ if (type == -1 || (uint32_t)type == jr->prefix.id)
+ goto found;
+ }
+ return NULL;
+found:
+ switch(jr->prefix.id) {
+ case JIT_CODE_DEBUG_INFO:
+ case JIT_CODE_CLOSE:
+ break;
+ case JIT_CODE_LOAD:
+ if (jd->needs_bswap) {
+ jr->load.pid = bswap_32(jr->load.pid);
+ jr->load.tid = bswap_32(jr->load.tid);
+ jr->load.vma = bswap_64(jr->load.vma);
+ jr->load.code_addr = bswap_64(jr->load.code_addr);
+ jr->load.code_size = bswap_32(jr->load.code_size);
+ jr->load.code_index= bswap_64(jr->load.code_index);
+ }
+ jd->code_load_count++;
+ break;
+ case JIT_CODE_MAX:
+ default:
+ return NULL;
+ }
+ return jr;
+}
diff --git a/tools/perf/util/jitdump.h b/tools/perf/util/jitdump.h
new file mode 100644
index 0000000..120bdcf
--- /dev/null
+++ b/tools/perf/util/jitdump.h
@@ -0,0 +1,92 @@
+/*
+ * jitdump.h: jitted code info encapsulation file format
+ *
+ * Adapted from OProfile GPLv2 support jidump.h:
+ * Copyright 2007 OProfile authors
+ * Jens Wilke
+ * Daniel Hansel
+ * Copyright IBM Corporation 2007
+ */
+#ifndef JITDUMP_H
+#define JITDUMP_H
+
+#include <sys/time.h>
+#include <time.h>
+#include <stdint.h>
+
+/* JiTD */
+#define JITHEADER_MAGIC 0x4A695444
+#define JITHEADER_MAGIC_SW 0x4454694A
+
+#define PADDING_8ALIGNED(x) ((((x) + 7) & 7) ^ 7)
+
+#define JITHEADER_VERSION 1
+
+struct jitheader {
+ uint32_t magic; /* characters "jItD" */
+ uint32_t version; /* header version */
+ uint32_t total_size; /* total size of header */
+ uint32_t elf_mach; /* elf mach target */
+ uint32_t pad1; /* reserved */
+ uint32_t pid; /* JIT process id */
+ uint64_t timestamp; /* timestamp */
+};
+
+enum jit_record_type {
+ JIT_CODE_LOAD = 0,
+ JIT_CODE_MOVE = 1,
+ JIT_CODE_DEBUG_INFO = 2,
+ JIT_CODE_CLOSE = 3,
+
+ JIT_CODE_MAX,
+};
+
+/* record prefix (mandatory in each record) */
+struct jr_prefix {
+ uint32_t id;
+ uint32_t total_size;
+ uint64_t timestamp;
+};
+
+struct jr_code_load {
+ struct jr_prefix p;
+
+ uint32_t pid;
+ uint32_t tid;
+ uint64_t vma;
+ uint64_t code_addr;
+ uint64_t code_size;
+ uint64_t code_index;
+};
+
+struct jr_code_close {
+ struct jr_prefix p;
+};
+
+struct jr_code_move {
+ struct jr_prefix p;
+
+ uint32_t pid;
+ uint32_t tid;
+ uint64_t vma;
+ uint64_t old_code_addr;
+ uint64_t new_code_addr;
+ uint64_t code_size;
+ uint64_t code_index;
+};
+
+struct jr_code_debug_info {
+ struct jr_prefix p;
+
+ uint64_t code_addr;
+ uint64_t nr_entry;
+};
+
+union jr_entry {
+ struct jr_code_debug_info info;
+ struct jr_code_close close;
+ struct jr_code_load load;
+ struct jr_code_move move;
+ struct jr_prefix prefix;
+};
+#endif /* !JITDUMP_H */
--
1.9.1

2015-02-10 23:43:33

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH 4/4] perf tools: add JVMTI agent library

This is a standalone JVMTI library to help profile Java jitted
code with perf record/perf report. The library is not installed
or compiled automatically by perf Makefile. It is not used
directly by perf. It is arch agnostic and has been tested on
X86 and ARM. It needs to be used with a Java runtime, such
as OpenJDK, as follows:

$ java -agentpath:libjvmti.so .......

When used this way, java will generate a jitdump binary file in
$HOME/.debug/java/jit/java-jit-*

This binary dump file contains information to help symbolize and
annotate jitted code.

The next step is to inject the jitdump information into the
perf.data file:
$ perf inject -j $HOME/.debug/java/jit/java-jit-XXXX/jit-ZZZ.dump \
-i perf.data -o perf.data.jitted

This injects the MMAP records to cover the jitted code and also generates
one ELF image for each jitted function. The ELF images are created in the
same subdir as the jitdump file. The MMAP records point there too.

Then to visualize the function or asm profile, simply use the regular
perf commands:
$ perf report -i perf.data.jitted
or
$ perf annotate -i perf.data.jitted

JVMTI agent code adapted from OProfile's opagent code.

Signed-off-by: Stephane Eranian <[email protected]>
---
tools/perf/jvmti/Makefile | 70 +++++++++
tools/perf/jvmti/jvmti_agent.c | 349 +++++++++++++++++++++++++++++++++++++++++
tools/perf/jvmti/jvmti_agent.h | 23 +++
tools/perf/jvmti/libjvmti.c | 149 ++++++++++++++++++
4 files changed, 591 insertions(+)
create mode 100644 tools/perf/jvmti/Makefile
create mode 100644 tools/perf/jvmti/jvmti_agent.c
create mode 100644 tools/perf/jvmti/jvmti_agent.h
create mode 100644 tools/perf/jvmti/libjvmti.c

diff --git a/tools/perf/jvmti/Makefile b/tools/perf/jvmti/Makefile
new file mode 100644
index 0000000..9eda64b
--- /dev/null
+++ b/tools/perf/jvmti/Makefile
@@ -0,0 +1,70 @@
+ARCH=$(shell uname -m)
+
+ifeq ($(ARCH), x86_64)
+JARCH=amd64
+endif
+ifeq ($(ARCH), armv7l)
+JARCH=armhf
+endif
+ifeq ($(ARCH), armv6l)
+JARCH=armhf
+endif
+ifeq ($(ARCH), ppc64)
+JARCH=powerpc
+endif
+ifeq ($(ARCH), ppc64le)
+JARCH=powerpc
+endif
+
+DESTDIR=/usr/local
+
+VERSION=1
+REVISION=0
+AGE=0
+
+LN=ln -sf
+RM=rm
+
+SJVMTI=libjvmti.so.$(VERSION).$(REVISION).$(AGE)
+VJVMTI=libjvmti.so.$(VERSION)
+SLDFLAGS=-shared -Wl,-soname -Wl,$(VLIBPFM)
+SOLIBEXT=so
+
+JDIR=$(shell /usr/sbin/update-java-alternatives -l | head -1 | cut -d ' ' -f 3)
+# -lrt required in 32-bit mode for clock_gettime()
+LIBS=-lelf -lrt
+INCDIR=-I $(JDIR)/include -I $(JDIR)/include/linux
+
+TARGETS=$(SJVMTI)
+
+SRCS=libjvmti.c jvmti_agent.c
+OBJS=$(SRCS:.c=.o)
+SOBJS=$(OBJS:.o=.lo)
+OPT=-O2 -g -Werror -Wall
+
+CFLAGS=$(INCDIR) $(OPT)
+
+all: $(TARGETS)
+
+.c.o:
+ $(CC) $(CFLAGS) -c $*.c
+.c.lo:
+ $(CC) -fPIC -DPIC $(CFLAGS) -c $*.c -o $*.lo
+
+$(OBJS) $(SOBJS): Makefile jvmti_agent.h ../util/jitdump.h
+
+$(SJVMTI): $(SOBJS)
+ $(CC) $(CFLAGS) $(SLDFLAGS) -o $@ $(SOBJS) $(LIBS)
+ $(LN) $@ libjvmti.$(SOLIBEXT)
+
+clean:
+ $(RM) -f *.o *.so.* *.so *.lo
+
+install:
+ -mkdir -p $(DESTDIR)/lib
+ install -m 755 $(SJVMTI) $(DESTDIR)/lib/
+ (cd $(DESTDIR)/lib; $(LN) $(SJVMTI) $(VJVMTI))
+ (cd $(DESTDIR)/lib; $(LN) $(SJVMTI) libjvmti.$(SOLIBEXT))
+ ldconfig
+
+.SUFFIXES: .c .S .o .lo
diff --git a/tools/perf/jvmti/jvmti_agent.c b/tools/perf/jvmti/jvmti_agent.c
new file mode 100644
index 0000000..d2d5215
--- /dev/null
+++ b/tools/perf/jvmti/jvmti_agent.c
@@ -0,0 +1,349 @@
+/*
+ * jvmti_agent.c: JVMTI agent interface
+ *
+ * Adapted from the Oprofile code in opagent.c:
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ * Copyright 2007 OProfile authors
+ * Jens Wilke
+ * Daniel Hansel
+ * Copyright IBM Corporation 2007
+ */
+#include <sys/types.h>
+#include <sys/stat.h> /* for mkdir() */
+#include <stdio.h>
+#include <errno.h>
+#include <string.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <time.h>
+#include <syscall.h> /* for gettid() */
+#include <err.h>
+
+#include "jvmti_agent.h"
+#include "../util/jitdump.h"
+
+#define JIT_LANG "java"
+
+static char jit_path[PATH_MAX];
+
+/*
+ * padding buffer
+ */
+static const char pad_bytes[7];
+
+/*
+ * perf_events event fd
+ */
+static int perf_fd;
+
+static inline pid_t gettid(void)
+{
+ return (pid_t)syscall(__NR_gettid);
+}
+
+static int get_e_machine(struct jitheader *hdr)
+{
+ ssize_t sret;
+ char id[16];
+ int fd, ret = -1;
+ int m = -1;
+ struct {
+ uint16_t e_type;
+ uint16_t e_machine;
+ } info;
+
+ fd = open("/proc/self/exe", O_RDONLY);
+ if (fd == -1)
+ return -1;
+
+ sret = read(fd, id, sizeof(id));
+ if (sret != sizeof(id))
+ goto error;
+
+ /* check ELF signature */
+ if (id[0] != 0x7f || id[1] != 'E' || id[2] != 'L' || id[3] != 'F')
+ goto error;
+
+ sret = read(fd, &info, sizeof(info));
+ if (sret != sizeof(info))
+ goto error;
+
+ m = info.e_machine;
+ if (m < 0)
+ m = 0; /* ELF EM_NONE */
+
+ hdr->elf_mach = m;
+ ret = 0;
+error:
+ close(fd);
+ return ret;
+}
+
+#define CLOCK_DEVICE "/dev/trace_clock"
+#define CLOCKFD 3
+#define FD_TO_CLOCKID(fd) ((~(clockid_t) (fd) << 3) | CLOCKFD)
+#define CLOCKID_TO_FD(id) ((~(int) (id) >> 3) & ~CLOCKFD)
+
+#define NSEC_PER_SEC 1000000000
+
+#ifndef CLOCK_INVALID
+#define CLOCK_INVALID -1
+#endif
+
+static inline clockid_t get_clockid(int fd)
+{
+ return FD_TO_CLOCKID(fd);
+}
+
+static int
+perf_open_timestamp(void)
+{
+ int fd, id;
+
+ fd = open(CLOCK_DEVICE, O_RDONLY);
+ if (fd == -1) {
+ if (errno == ENOENT)
+ warnx("jvmti: %s not present, check your kernel for trace_clock module", CLOCK_DEVICE);
+ if (errno == EPERM)
+ warnx("jvmti: %s has wrong permissions, suggesting chmod 644 %s", CLOCK_DEVICE, CLOCK_DEVICE);
+ }
+
+ id = get_clockid(fd);
+ if (CLOCK_INVALID == id)
+ return CLOCK_INVALID;
+
+ return get_clockid(fd);
+}
+
+static inline void
+perf_close_timestamp(int id)
+{
+ close(CLOCKID_TO_FD(id));
+}
+
+
+static inline uint64_t
+timespec_to_ns(const struct timespec *ts)
+{
+ return ((uint64_t) ts->tv_sec * NSEC_PER_SEC) + ts->tv_nsec;
+}
+
+static inline uint64_t
+perf_get_timestamp(int id)
+{
+ struct timespec ts;
+
+ clock_gettime(id, &ts);
+ return timespec_to_ns(&ts);
+}
+
+static int
+debug_cache_init(void)
+{
+ char str[32];
+ char *base, *p;
+ struct tm tm;
+ time_t t;
+ int ret;
+
+ time(&t);
+ localtime_r(&t, &tm);
+
+ base = getenv("JITDUMPDIR");
+ if (!base)
+ base = getenv("HOME");
+ if (!base)
+ base = ".";
+
+ strftime(str, sizeof(str), JIT_LANG"-jit-%Y%m%d", &tm);
+
+ snprintf(jit_path, PATH_MAX - 1, "%s/.debug/", base);
+
+ ret = mkdir(jit_path, 0755);
+ if (ret == -1) {
+ if (errno != EEXIST) {
+ warn("jvmti: cannot create jit cache dir %s", jit_path);
+ return -1;
+ }
+ }
+
+ snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit", base);
+ ret = mkdir(jit_path, 0755);
+ if (ret == -1) {
+ if (errno != EEXIST) {
+ warn("cannot create jit cache dir %s", jit_path);
+ return -1;
+ }
+ }
+
+ snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit/%s.XXXXXXXX", base, str);
+
+ p = mkdtemp(jit_path);
+ if (p != jit_path) {
+ warn("cannot create jit cache dir %s", jit_path);
+ return -1;
+ }
+
+ return 0;
+}
+
+void *jvmti_open(void)
+{
+ int pad_cnt;
+ char dump_path[PATH_MAX];
+ struct jitheader header;
+ FILE *fp;
+
+ perf_fd = perf_open_timestamp();
+ if (perf_fd == -1)
+ warnx("jvmti: kernel does not support /dev/trace_clock or permissions are wrong on that device");
+
+ memset(&header, 0, sizeof(header));
+
+ debug_cache_init();
+
+ snprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
+
+ fp = fopen(dump_path, "w");
+ if (!fp) {
+ warn("jvmti: cannot create %s", dump_path);
+ goto error;
+ }
+
+ warnx("jvmti: jitdump in %s", dump_path);
+
+ if (get_e_machine(&header)) {
+ warn("get_e_machine failed\n");
+ goto error;
+ }
+
+ header.magic = JITHEADER_MAGIC;
+ header.version = JITHEADER_VERSION;
+ header.total_size = sizeof(header);
+ header.pid = getpid();
+
+ /* calculate amount of padding '\0' */
+ pad_cnt = PADDING_8ALIGNED(header.total_size);
+ header.total_size += pad_cnt;
+
+ header.timestamp = perf_get_timestamp(perf_fd);
+
+ if (!fwrite(&header, sizeof(header), 1, fp)) {
+ warn("jvmti: cannot write dumpfile header");
+ goto error;
+ }
+
+ /* write padding '\0' if necessary */
+ if (pad_cnt && !fwrite(pad_bytes, pad_cnt, 1, fp)) {
+ warn("jvmti: cannot write dumpfile header padding");
+ goto error;
+ }
+
+ return fp;
+error:
+ fclose(fp);
+ perf_close_timestamp(perf_fd);
+ return NULL;
+}
+
+int
+jvmti_close(void *agent)
+{
+ struct jr_code_close rec;
+ FILE *fp = agent;
+
+ if (!fp) {
+ warnx("jvmti: incalid fd in close_agent");
+ return -1;
+ }
+
+ rec.p.id = JIT_CODE_CLOSE;
+ rec.p.total_size = sizeof(rec);
+
+ rec.p.timestamp = perf_get_timestamp(perf_fd);
+
+ if (!fwrite(&rec, sizeof(rec), 1, fp))
+ return -1;
+
+ fclose(fp);
+
+ perf_close_timestamp(perf_fd);
+
+ fp = NULL;
+
+ return 0;
+}
+
+int jvmti_write_code(void *agent, char const *sym,
+ uint64_t vma, void const *code, unsigned int const size)
+{
+ static int code_generation = 1;
+ struct jr_code_load rec;
+ size_t sym_len;
+ size_t padding_count;
+ FILE *fp = agent;
+ int ret = -1;
+
+ /* don't care about 0 length function, no samples */
+ if (size == 0)
+ return 0;
+
+ if (!fp) {
+ warnx("jvmti: invalid fd in write_native_code");
+ return -1;
+ }
+
+ sym_len = strlen(sym) + 1;
+
+ rec.p.id = JIT_CODE_LOAD;
+ rec.p.total_size = sizeof(rec) + sym_len;
+ padding_count = PADDING_8ALIGNED(rec.p.total_size);
+ rec.p. total_size += padding_count;
+ rec.p.timestamp = perf_get_timestamp(perf_fd);
+
+ rec.code_size = size;
+ rec.vma = vma;
+ rec.code_addr = vma;
+ rec.pid = getpid();
+ rec.tid = gettid();
+ rec.code_index = code_generation++;
+
+ if (code)
+ rec.p.total_size += size;
+
+ /*
+ * If JVM is multi-threaded, nultiple concurrent calls to agent
+ * may be possible, so protect file writes
+ */
+ flockfile(fp);
+
+ ret = fwrite_unlocked(&rec, sizeof(rec), 1, fp);
+ fwrite_unlocked(sym, sym_len, 1, fp);
+ if (code)
+ fwrite_unlocked(code, size, 1, fp);
+
+ if (padding_count)
+ fwrite_unlocked(pad_bytes, padding_count, 1, fp);
+
+ funlockfile(fp);
+
+ ret = 0;
+
+ return ret;
+}
diff --git a/tools/perf/jvmti/jvmti_agent.h b/tools/perf/jvmti/jvmti_agent.h
new file mode 100644
index 0000000..54e5c5e
--- /dev/null
+++ b/tools/perf/jvmti/jvmti_agent.h
@@ -0,0 +1,23 @@
+#ifndef __JVMTI_AGENT_H__
+#define __JVMTI_AGENT_H__
+
+#include <sys/types.h>
+#include <stdint.h>
+
+#define __unused __attribute__((unused))
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+void *jvmti_open(void);
+int jvmti_close(void *agent);
+int jvmti_write_code(void *agent, char const *symbol_name,
+ uint64_t vma, void const *code,
+ const unsigned int code_size);
+
+#if defined(__cplusplus)
+}
+
+#endif
+#endif /* __JVMTI_H__ */
diff --git a/tools/perf/jvmti/libjvmti.c b/tools/perf/jvmti/libjvmti.c
new file mode 100644
index 0000000..8b8d782
--- /dev/null
+++ b/tools/perf/jvmti/libjvmti.c
@@ -0,0 +1,149 @@
+#include <sys/types.h>
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <err.h>
+#include <jvmti.h>
+
+#include "jvmti_agent.h"
+
+void *jvmti_agent;
+
+static void JNICALL
+compiled_method_load_cb(jvmtiEnv *jvmti,
+ jmethodID method,
+ jint code_size,
+ void const *code_addr,
+ jint map_length,
+ jvmtiAddrLocationMap const *map,
+ void const *compile_info __unused)
+{
+ jclass decl_class;
+ char *class_sign = NULL;
+ char *func_name = NULL;
+ char *func_sign = NULL;
+ jvmtiError ret;
+ size_t len;
+
+ ret = (*jvmti)->GetMethodDeclaringClass(jvmti, method,
+ &decl_class);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: getmethoddeclaringclass failed");
+ return;
+ }
+
+ ret = (*jvmti)->GetClassSignature(jvmti, decl_class,
+ &class_sign, NULL);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: getclassignature failed");
+ goto error;
+ }
+
+ ret = (*jvmti)->GetMethodName(jvmti, method, &func_name,
+ &func_sign, NULL);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: failed getmethodname");
+ goto error;
+ }
+
+ len = strlen(func_name) + strlen(class_sign) + strlen(func_sign) + 2;
+
+ {
+ char str[len];
+ uint64_t addr = (uint64_t)(unsigned long)code_addr;
+ snprintf(str, len, "%s%s%s", class_sign, func_name, func_sign);
+ ret = jvmti_write_code(jvmti_agent, str, addr, code_addr, code_size);
+ if (ret)
+ warnx("jvmti: write_code() failed");
+ }
+error:
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)func_name);
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)func_sign);
+ (*jvmti)->Deallocate(jvmti, (unsigned char *)class_sign);
+}
+
+static void JNICALL
+code_generated_cb(jvmtiEnv *jvmti,
+ char const *name,
+ void const *code_addr,
+ jint code_size)
+{
+ uint64_t addr = (uint64_t)(unsigned long)code_addr;
+ int ret;
+
+ ret = jvmti_write_code(jvmti_agent, name, addr, code_addr, code_size);
+ if (ret)
+ warnx("jvmti: write_code() failed for code_generated");
+}
+
+JNIEXPORT jint JNICALL
+Agent_OnLoad(JavaVM *jvm, char *options, void *reserved __unused)
+{
+ jvmtiEventCallbacks cb;
+ jvmtiCapabilities caps1;
+ jvmtiEnv *jvmti = NULL;
+ jint ret;
+
+ jvmti_agent = jvmti_open();
+ if (!jvmti_agent) {
+ warnx("jvmti: open_agent failed");
+ return -1;
+ }
+
+ /*
+ * Request a JVMTI interface version 1 environment
+ */
+ ret = (*jvm)->GetEnv(jvm, (void *)&jvmti, JVMTI_VERSION_1);
+ if (ret != JNI_OK) {
+ warnx("jvmti: jvmti version 1 not supported");
+ return -1;
+ }
+
+ /*
+ * acquire method_load capability, we require it
+ */
+ memset(&caps1, 0, sizeof(caps1));
+ caps1.can_generate_compiled_method_load_events = 1;
+
+ ret = (*jvmti)->AddCapabilities(jvmti, &caps1);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: acquire compiled_method capability failed");
+ return -1;
+ }
+
+ memset(&cb, 0, sizeof(cb));
+
+ cb.CompiledMethodLoad = compiled_method_load_cb;
+ cb.DynamicCodeGenerated = code_generated_cb;
+
+ ret = (*jvmti)->SetEventCallbacks(jvmti, &cb, sizeof(cb));
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: cannot set event callbacks");
+ return -1;
+ }
+
+ ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
+ JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: setnotification failed for method_load");
+ return -1;
+ }
+
+ ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
+ JVMTI_EVENT_DYNAMIC_CODE_GENERATED, NULL);
+ if (ret != JVMTI_ERROR_NONE) {
+ warnx("jvmti: setnotification failed on code_generated");
+ return -1;
+ }
+ return 0;
+}
+
+JNIEXPORT void JNICALL
+Agent_OnUnload(JavaVM *jvm __unused)
+{
+ int ret;
+
+ ret = jvmti_close(jvmti_agent);
+ if (ret)
+ errx(1, "Error: op_close_agent()");
+}
--
1.9.1

2015-02-11 11:40:15

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/4] perf: add support for profiling jitted code

On Wed, Feb 11, 2015 at 12:42:41AM +0100, Stephane Eranian wrote:
> To enable synchronization of the runtime MMAPs with those recorded by
> the kernel on behalf of the perf tool, the runtime needs to timestamp
> any record in the dump file using the same time source. The current
> patch series is relying on Sonny Rao's posix-clock patch series
> posted on LKML in 2013. The link to the patches is:
> https://lkml.org/lkml/2013/12/6/1044
>

At least for x86 you could use something like this:

lkml.kernel.org/r/aa2dd2e4f1e9f2225758be5ba00f14d6909a8ce1.1423180257.git.shli@fb.com

We can re-create the perf_clock() from the tsc with the mult, shift and
offset provided in the perf userpage.

2015-02-11 11:59:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 3/4] perf inject: add jitdump mmap injection support

On Wed, Feb 11, 2015 at 12:42:44AM +0100, Stephane Eranian wrote:
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index aa6a504..c86c412 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf

> @@ -496,7 +499,8 @@ BUILTIN_OBJS += $(OUTPUT)builtin-inject.o
> BUILTIN_OBJS += $(OUTPUT)tests/builtin-test.o
> BUILTIN_OBJS += $(OUTPUT)builtin-mem.o
>
> -PERFLIBS = $(LIB_FILE) $(LIBAPIKFS) $(LIBTRACEEVENT)
> +PERFLIBS = $(LIB_FILE) $(LIBAPIKFS) $(LIBTRACEEVENT) -lcrypto
> +

Should we not add auto-detect magic for that? Otherwise perf will stop
building on machines without libssl.

2015-02-11 13:13:26

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH 3/4] perf inject: add jitdump mmap injection support

On Wed, Feb 11, 2015 at 6:59 AM, Peter Zijlstra <[email protected]> wrote:
>
> On Wed, Feb 11, 2015 at 12:42:44AM +0100, Stephane Eranian wrote:
> > diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> > index aa6a504..c86c412 100644
> > --- a/tools/perf/Makefile.perf
> > +++ b/tools/perf/Makefile.perf
>
> > @@ -496,7 +499,8 @@ BUILTIN_OBJS += $(OUTPUT)builtin-inject.o
> > BUILTIN_OBJS += $(OUTPUT)tests/builtin-test.o
> > BUILTIN_OBJS += $(OUTPUT)builtin-mem.o
> >
> > -PERFLIBS = $(LIB_FILE) $(LIBAPIKFS) $(LIBTRACEEVENT)
> > +PERFLIBS = $(LIB_FILE) $(LIBAPIKFS) $(LIBTRACEEVENT) -lcrypto
> > +
>
> Should we not add auto-detect magic for that? Otherwise perf will stop
> building on machines without libssl.


Good point.
The ssl support is needed to create the buildid for each generated ELF
image.

2015-02-11 13:18:39

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH 0/4] perf: add support for profiling jitted code

Peter,

On Wed, Feb 11, 2015 at 6:39 AM, Peter Zijlstra <[email protected]> wrote:
> On Wed, Feb 11, 2015 at 12:42:41AM +0100, Stephane Eranian wrote:
>> To enable synchronization of the runtime MMAPs with those recorded by
>> the kernel on behalf of the perf tool, the runtime needs to timestamp
>> any record in the dump file using the same time source. The current
>> patch series is relying on Sonny Rao's posix-clock patch series
>> posted on LKML in 2013. The link to the patches is:
>> https://lkml.org/lkml/2013/12/6/1044
>>
>
> At least for x86 you could use something like this:
>
> lkml.kernel.org/r/aa2dd2e4f1e9f2225758be5ba00f14d6909a8ce1.1423180257.git.shli@fb.com
>
> We can re-create the perf_clock() from the tsc with the mult, shift and
> offset provided in the perf userpage.

I had forgotten that I had modified Sonny's patch to use sched_clock(). I will
post V2 using David Ahern's driver instead.

But, we need a portable solution, there are jitted environment on other
architectures.

2015-02-11 15:27:21

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH 3/4] perf inject: add jitdump mmap injection support

On 2/10/15 4:42 PM, Stephane Eranian wrote:
> diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
> index dc7442c..237f195 100644
> --- a/tools/perf/Documentation/perf-inject.txt
> +++ b/tools/perf/Documentation/perf-inject.txt
> @@ -40,6 +40,17 @@ OPTIONS
> Merge sched_stat and sched_switch for getting events where and how long
> tasks slept. sched_switch contains a callchain where a task slept and
> sched_stat contains a timeslice how long a task slept.
> +-j::
> +--jit::
> + Merge a jitdump file into the perf.data file by adding mmap records to
> + cover jitted code and emit ELF images for each jitted function. The ELF
> + images are saved in the same directory as the jidump. Use -E to suppress
> + ELF images generation.
> +-E::
> +--jit-disable-elf::
> + When used with -, it prevents creating the ELF images for each jitted
> + function. Only the jitted code mmap records are injected into the perf.data
> + file. Option as no effect when -j is not used.

s/as/has/. But it would better to avoid the double negative. Maye
something like this instead:

This option requires -j|--jit.

David

2015-02-11 16:15:24

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/4] perf: add support for profiling jitted code

On Wed, Feb 11, 2015 at 08:18:37AM -0500, Stephane Eranian wrote:

> But, we need a portable solution, there are jitted environment on other
> architectures.

Yeah, I'm aware of that. But the time people are not liking any of those
patches.

The thing that seems to have most traction is something like

lkml.kernel.org/r/[email protected]

Which would basically make the trace clock be CLOCK_MONOTONIC.

2015-02-11 16:25:10

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH 0/4] perf: add support for profiling jitted code

On 2/11/15 9:15 AM, Peter Zijlstra wrote:
> On Wed, Feb 11, 2015 at 08:18:37AM -0500, Stephane Eranian wrote:
>
>> But, we need a portable solution, there are jitted environment on other
>> architectures.
>
> Yeah, I'm aware of that. But the time people are not liking any of those
> patches.
>
> The thing that seems to have most traction is something like
>
> lkml.kernel.org/r/[email protected]
>
> Which would basically make the trace clock be CLOCK_MONOTONIC.
>

Sure, but IF that solution is ever adopted it will apply to top of tree
kernels. In the meantime users of long term kernels (enterprise, LTS,
other) need to be able to use this feature.

I understand that Ingo (and probably you as well) have a disdain for the
kernel module approach mentioned above and it is not the preferred
approach. But given the resistance for over *4 years* to solutions to
this problem the module does provide a solution for the existing user base.

David

2015-02-12 08:14:11

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 1/4] perf tools: add Java demangling support

On Wed, Feb 11, 2015 at 12:42:42AM +0100, Stephane Eranian wrote:
> Add Java function descriptor demangling support.
> Something bfd cannot do.
>
> Signed-off-by: Stephane Eranian <[email protected]>
> ---

[SNIP]
> +/*
> + * Demangle Java function signature (Hotspot, not GCJ)
> + * input:
> + * str: string to parse. String is not modified
> + * return:
> + * if can demangle then a a newly allocate string is returned.
> + * if cannot demangle, then NULL is returned
> + *
> + * Note that caller is responsible for freeing demangled string
> + */
> +char *
> +java_demangle_sym(const char *str)
> +{
> + char *buf, *ptr;
> + char *p;
> + size_t len, l1;
> +
> + if (!str)
> + return NULL;
> +
> + /* find start of retunr type */
> + p = strrchr(str, ')');
> + if (!p)
> + return NULL;
> +
> + /*
> + * expansion factor estimated to 3x
> + */
> + len = strlen(str) * 3 + 1;
> + buf = malloc(len);
> + if (!buf)
> + return NULL;
> +
> + buf[0] = '\0';
> + /*
> + * get return type first
> + */
> + ptr = __demangle_java_sym(p+1, NULL, buf, len, MODE_TYPE);
> + if (!ptr)
> + goto error;
> +
> + /* add space between return type and function prototype */
> + l1 = strlen(buf);
> + buf[l1++] = ' ';
> +
> + /* process function up to return type */
> + ptr = __demangle_java_sym(str, p + 1, buf + l1, len - l1, MODE_PREFIX);
> + if (!ptr)
> + goto error;

Do we really need to use return type for Java symbol name? Note that
for C++ demangling, we show function name only by default and
parameters will be shown when user gave -v option.

Thanks,
Namhyung


> +
> + return buf;
> +error:
> + free(buf);
> + return NULL;
> +}
> +
> +

2015-02-12 08:21:58

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 3/4] perf inject: add jitdump mmap injection support

On Wed, Feb 11, 2015 at 08:27:17AM -0700, David Ahern wrote:
> On 2/10/15 4:42 PM, Stephane Eranian wrote:
> >diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
> >index dc7442c..237f195 100644
> >--- a/tools/perf/Documentation/perf-inject.txt
> >+++ b/tools/perf/Documentation/perf-inject.txt
> >@@ -40,6 +40,17 @@ OPTIONS
> > Merge sched_stat and sched_switch for getting events where and how long
> > tasks slept. sched_switch contains a callchain where a task slept and
> > sched_stat contains a timeslice how long a task slept.
> >+-j::
> >+--jit::
> >+ Merge a jitdump file into the perf.data file by adding mmap records to
> >+ cover jitted code and emit ELF images for each jitted function. The ELF
> >+ images are saved in the same directory as the jidump. Use -E to suppress
> >+ ELF images generation.
> >+-E::
> >+--jit-disable-elf::
> >+ When used with -, it prevents creating the ELF images for each jitted
> >+ function. Only the jitted code mmap records are injected into the perf.data
> >+ file. Option as no effect when -j is not used.
>
> s/as/has/. But it would better to avoid the double negative. Maye something
> like this instead:
>
> This option requires -j|--jit.

I think same thing may go to the option name itself. It can be, say,
--jit-create-elf and turned on by default. When one wants to disable
it, [s]he can use --no-jit-create-elf. Of course this way we cannot
use the short name (-E) for disabling tho.

Thanks,
Namhyung

2015-02-12 08:22:40

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: add JVMTI agent library

On Wed, Feb 11, 2015 at 12:42:45AM +0100, Stephane Eranian wrote:
> This is a standalone JVMTI library to help profile Java jitted
> code with perf record/perf report. The library is not installed
> or compiled automatically by perf Makefile. It is not used
> directly by perf. It is arch agnostic and has been tested on
> X86 and ARM. It needs to be used with a Java runtime, such
> as OpenJDK, as follows:
>
> $ java -agentpath:libjvmti.so .......
>
> When used this way, java will generate a jitdump binary file in
> $HOME/.debug/java/jit/java-jit-*
>
> This binary dump file contains information to help symbolize and
> annotate jitted code.
>
> The next step is to inject the jitdump information into the
> perf.data file:
> $ perf inject -j $HOME/.debug/java/jit/java-jit-XXXX/jit-ZZZ.dump \
> -i perf.data -o perf.data.jitted
>
> This injects the MMAP records to cover the jitted code and also generates
> one ELF image for each jitted function. The ELF images are created in the
> same subdir as the jitdump file. The MMAP records point there too.
>
> Then to visualize the function or asm profile, simply use the regular
> perf commands:
> $ perf report -i perf.data.jitted
> or
> $ perf annotate -i perf.data.jitted
>
> JVMTI agent code adapted from OProfile's opagent code.
>
> Signed-off-by: Stephane Eranian <[email protected]>
> ---
> tools/perf/jvmti/Makefile | 70 +++++++++
> tools/perf/jvmti/jvmti_agent.c | 349 +++++++++++++++++++++++++++++++++++++++++
> tools/perf/jvmti/jvmti_agent.h | 23 +++
> tools/perf/jvmti/libjvmti.c | 149 ++++++++++++++++++
> 4 files changed, 591 insertions(+)
> create mode 100644 tools/perf/jvmti/Makefile
> create mode 100644 tools/perf/jvmti/jvmti_agent.c
> create mode 100644 tools/perf/jvmti/jvmti_agent.h
> create mode 100644 tools/perf/jvmti/libjvmti.c
>
> diff --git a/tools/perf/jvmti/Makefile b/tools/perf/jvmti/Makefile
> new file mode 100644
> index 0000000..9eda64b
> --- /dev/null
> +++ b/tools/perf/jvmti/Makefile
> @@ -0,0 +1,70 @@
> +ARCH=$(shell uname -m)
> +
> +ifeq ($(ARCH), x86_64)
> +JARCH=amd64
> +endif
> +ifeq ($(ARCH), armv7l)
> +JARCH=armhf
> +endif
> +ifeq ($(ARCH), armv6l)
> +JARCH=armhf
> +endif
> +ifeq ($(ARCH), ppc64)
> +JARCH=powerpc
> +endif
> +ifeq ($(ARCH), ppc64le)
> +JARCH=powerpc
> +endif
> +
> +DESTDIR=/usr/local
> +
> +VERSION=1
> +REVISION=0
> +AGE=0
> +
> +LN=ln -sf
> +RM=rm
> +
> +SJVMTI=libjvmti.so.$(VERSION).$(REVISION).$(AGE)
> +VJVMTI=libjvmti.so.$(VERSION)
> +SLDFLAGS=-shared -Wl,-soname -Wl,$(VLIBPFM)

s/VLIBPFM/VLIBJVMTI/ ?

Thanks,
Namhyung


> +SOLIBEXT=so
> +
> +JDIR=$(shell /usr/sbin/update-java-alternatives -l | head -1 | cut -d ' ' -f 3)
> +# -lrt required in 32-bit mode for clock_gettime()
> +LIBS=-lelf -lrt
> +INCDIR=-I $(JDIR)/include -I $(JDIR)/include/linux
> +
> +TARGETS=$(SJVMTI)
> +
> +SRCS=libjvmti.c jvmti_agent.c
> +OBJS=$(SRCS:.c=.o)
> +SOBJS=$(OBJS:.o=.lo)
> +OPT=-O2 -g -Werror -Wall
> +
> +CFLAGS=$(INCDIR) $(OPT)
> +
> +all: $(TARGETS)
> +
> +.c.o:
> + $(CC) $(CFLAGS) -c $*.c
> +.c.lo:
> + $(CC) -fPIC -DPIC $(CFLAGS) -c $*.c -o $*.lo
> +
> +$(OBJS) $(SOBJS): Makefile jvmti_agent.h ../util/jitdump.h
> +
> +$(SJVMTI): $(SOBJS)
> + $(CC) $(CFLAGS) $(SLDFLAGS) -o $@ $(SOBJS) $(LIBS)
> + $(LN) $@ libjvmti.$(SOLIBEXT)
> +
> +clean:
> + $(RM) -f *.o *.so.* *.so *.lo
> +
> +install:
> + -mkdir -p $(DESTDIR)/lib
> + install -m 755 $(SJVMTI) $(DESTDIR)/lib/
> + (cd $(DESTDIR)/lib; $(LN) $(SJVMTI) $(VJVMTI))
> + (cd $(DESTDIR)/lib; $(LN) $(SJVMTI) libjvmti.$(SOLIBEXT))
> + ldconfig
> +
> +.SUFFIXES: .c .S .o .lo

2015-02-12 11:27:46

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH 1/4] perf tools: add Java demangling support

On Wed, Feb 11, 2015 at 12:42:42AM +0100, Stephane Eranian wrote:
> Add Java function descriptor demangling support.
> Something bfd cannot do.
>

SNIP

> + }
> + buf[rlen] = '\0';
> + return buf;
> +error:
> + return NULL;
> +}
> +
> +/*
> + * Demangle Java function signature (Hotspot, not GCJ)
> + * input:
> + * str: string to parse. String is not modified
> + * return:
> + * if can demangle then a a newly allocate string is returned.
> + * if cannot demangle, then NULL is returned
> + *
> + * Note that caller is responsible for freeing demangled string
> + */
> +char *
> +java_demangle_sym(const char *str)
> +{

This seem fairly separated functionality, could you please
move it into new object?

Also could you please document in more details requested
input and produced output strings of this function?

It's nice function to have automated test that translates
strings and verify expected output. IMO it'd help documenting
the function as well.

thanks,
jirka

2015-02-12 13:26:19

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: add JVMTI agent library

On Wed, Feb 11, 2015 at 12:42:45AM +0100, Stephane Eranian wrote:

SNIP

> +LN=ln -sf
> +RM=rm
> +
> +SJVMTI=libjvmti.so.$(VERSION).$(REVISION).$(AGE)
> +VJVMTI=libjvmti.so.$(VERSION)
> +SLDFLAGS=-shared -Wl,-soname -Wl,$(VLIBPFM)
> +SOLIBEXT=so
> +
> +JDIR=$(shell /usr/sbin/update-java-alternatives -l | head -1 | cut -d ' ' -f 3)

I had to se JDIR manualy..

is there some generic way to get above dir? It does not
work on Fedora, 'yum provides' did not find any package
carrying update-java-alternatives binary..

jirka

2015-02-12 13:44:24

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH 0/4] perf: add support for profiling jitted code

On Wed, Feb 11, 2015 at 12:42:41AM +0100, Stephane Eranian wrote:

SNIP

> To use the new feature:
> - install the posix clock driver
> - make sure you chmod 644 /dev/trace_clock
> - compile perf
> - cd tools/perf/jvmti; make; install wherever is appropriate
>
> Example using openJDK:
> $ perf record java -agentpath:libjvmti.so my_class
> java: jvmti: jitdump in $HOME/.debug/jit/java-jit-20150207.XXL9649H/jit-6320.dump
> $ perf inject -i perf.data -j $HOME/.debug/jit/java-jit-20150207.XXL9649H/jit-6320.dump -o perf.data.jitted
> $ perf report -i perf.data.jitted

So report should display Java symbols now right? Is there something
wrong with my test below?

thanks,
jirka


[jolsa@ibm-x3650m4-01 perf]$ ./perf record java -agentpath:./jvmti/libjvmti.so Puppy
java: jvmti: jitdump in /home/jolsa/.debug/jit/java-jit-20150212.XXru95af/jit-3777.dump
Passed Name is :tommy
Puppy's age is :2
Variable Value :2
^C[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.666 MB perf.data (17258 samples) ]


[jolsa@ibm-x3650m4-01 perf]$ ./perf inject -v -i perf.data -j /home/jolsa/.debug/jit/java-jit-20150212.XXru95af/jit-3777.dump -o perf.data.jitted
build id event received for /lib/modules/3.19.0-rc7jit+/build/vmlinux: 4bb9dab963421f041b3eb3ec588fc07a58de4fa8
build id event received for /usr/bin/bash: c9f090657c35c10d6edeca09f62de9d22060a706
build id event received for /usr/lib64/ld-2.18.so: dddaf5704dbe9ace5c31f12ab90d8bfeb8fc5eb3
build id event received for [vdso]: b9e5cb1dcb2ab0d9d969fdc0816c4ad9af37e26b
build id event received for /usr/lib64/libc-2.18.so: f1b808c85f949e82cf864c1ef5a13c8acc703e3b
build id event received for /usr/lib64/libpthread-2.18.so: 5b1e8bb0e7ae081585497d797e06614815b768ef
build id event received for /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.31.x86_64/jre/lib/amd64/server/libjvm.so: 7b87f33ae1b3150058839496dbe5b5521f8977b3
build id event received for /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.31.x86_64/jre/lib/amd64/libjava.so: 32594e2184996cd7ff8257c8ef70656d91d2b03c
build id event received for /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.31.x86_64/jre/lib/amd64/libzip.so: 5cbb2075b9b316fc39dd0d5f9b0dcc8825256a4b
version=1
size=32(0)
ts=0x26feca
pid=3777
elf_mach=62
failed to write feature 2


[jolsa@ibm-x3650m4-01 perf]$ ./perf report -i perf.data.jitted --stdio | head -13
Failed to open /tmp/perf-3777.map, continuing without symbols
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 17K of event 'cycles'
# Event count (approx.): 9728779054
#
# Overhead Command Shared Object Symbol
# ........ ......... .................. .....................................................
#
50.90% java-abrt perf-3777.map [.] 0x00007f93a501e550
15.00% java-abrt perf-3777.map [.] 0x00007f93a501e586
14.16% java-abrt perf-3777.map [.] 0x00007f93a501e579
12.63% java-abrt perf-3777.map [.] 0x00007f93a501e67b
3.04% java-abrt perf-3777.map [.] 0x00007f93a501e685

2015-02-12 16:09:55

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: add JVMTI agent library

On Thu, Feb 12, 2015 at 8:25 AM, Jiri Olsa <[email protected]> wrote:
> On Wed, Feb 11, 2015 at 12:42:45AM +0100, Stephane Eranian wrote:
>
> SNIP
>
>> +LN=ln -sf
>> +RM=rm
>> +
>> +SJVMTI=libjvmti.so.$(VERSION).$(REVISION).$(AGE)
>> +VJVMTI=libjvmti.so.$(VERSION)
>> +SLDFLAGS=-shared -Wl,-soname -Wl,$(VLIBPFM)
>> +SOLIBEXT=so
>> +
>> +JDIR=$(shell /usr/sbin/update-java-alternatives -l | head -1 | cut -d ' ' -f 3)
>
> I had to se JDIR manualy..
>
> is there some generic way to get above dir? It does not
> work on Fedora, 'yum provides' did not find any package
> carrying update-java-alternatives binary..
>
This one comes from the java-common package on Ubuntu.

2015-02-12 16:59:01

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: add JVMTI agent library

> This injects the MMAP records to cover the jitted code and also generates
> one ELF image for each jitted function. The ELF images are created in the
> same subdir as the jitdump file. The MMAP records point there too.

How about line numbers? That would need a lot of code to generate dwarf, right?

-Andi

2015-02-12 17:11:26

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: add JVMTI agent library

Andi,

On Thu, Feb 12, 2015 at 11:58 AM, Andi Kleen <[email protected]> wrote:
>> This injects the MMAP records to cover the jitted code and also generates
>> one ELF image for each jitted function. The ELF images are created in the
>> same subdir as the jitdump file. The MMAP records point there too.
>
> How about line numbers? That would need a lot of code to generate dwarf, right?
>
The JVMTI agent does not generate the ELF files. It can extract line numbers
form the JVMTI interface and put them into the jitdump file.

Then that information would need to be encoded into dwarf in the ELF file.

I have not looked at how much is needed to do that.

2015-02-12 17:27:05

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH 0/4] perf: add support for profiling jitted code

Jiri,

On Thu, Feb 12, 2015 at 8:42 AM, Jiri Olsa <[email protected]> wrote:
> On Wed, Feb 11, 2015 at 12:42:41AM +0100, Stephane Eranian wrote:
>
> SNIP
>
>> To use the new feature:
>> - install the posix clock driver
>> - make sure you chmod 644 /dev/trace_clock
>> - compile perf
>> - cd tools/perf/jvmti; make; install wherever is appropriate
>>
>> Example using openJDK:
>> $ perf record java -agentpath:libjvmti.so my_class
>> java: jvmti: jitdump in $HOME/.debug/jit/java-jit-20150207.XXL9649H/jit-6320.dump
>> $ perf inject -i perf.data -j $HOME/.debug/jit/java-jit-20150207.XXL9649H/jit-6320.dump -o perf.data.jitted
>> $ perf report -i perf.data.jitted
>
> So report should display Java symbols now right? Is there something
> wrong with my test below?
>
That is because the timestamps are incorrect. Sorry about that.
Sonny's module as is is not going to generate the correct timestamps.
I will V2 today. It uses another module from David and it works because
it is using the same time source as perf_events.

> thanks,
> jirka
>
>
> [jolsa@ibm-x3650m4-01 perf]$ ./perf record java -agentpath:./jvmti/libjvmti.so Puppy
> java: jvmti: jitdump in /home/jolsa/.debug/jit/java-jit-20150212.XXru95af/jit-3777.dump
> Passed Name is :tommy
> Puppy's age is :2
> Variable Value :2
> ^C[ perf record: Woken up 3 times to write data ]
> [ perf record: Captured and wrote 0.666 MB perf.data (17258 samples) ]
>
>
> [jolsa@ibm-x3650m4-01 perf]$ ./perf inject -v -i perf.data -j /home/jolsa/.debug/jit/java-jit-20150212.XXru95af/jit-3777.dump -o perf.data.jitted
> build id event received for /lib/modules/3.19.0-rc7jit+/build/vmlinux: 4bb9dab963421f041b3eb3ec588fc07a58de4fa8
> build id event received for /usr/bin/bash: c9f090657c35c10d6edeca09f62de9d22060a706
> build id event received for /usr/lib64/ld-2.18.so: dddaf5704dbe9ace5c31f12ab90d8bfeb8fc5eb3
> build id event received for [vdso]: b9e5cb1dcb2ab0d9d969fdc0816c4ad9af37e26b
> build id event received for /usr/lib64/libc-2.18.so: f1b808c85f949e82cf864c1ef5a13c8acc703e3b
> build id event received for /usr/lib64/libpthread-2.18.so: 5b1e8bb0e7ae081585497d797e06614815b768ef
> build id event received for /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.31.x86_64/jre/lib/amd64/server/libjvm.so: 7b87f33ae1b3150058839496dbe5b5521f8977b3
> build id event received for /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.31.x86_64/jre/lib/amd64/libjava.so: 32594e2184996cd7ff8257c8ef70656d91d2b03c
> build id event received for /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.31.x86_64/jre/lib/amd64/libzip.so: 5cbb2075b9b316fc39dd0d5f9b0dcc8825256a4b
> version=1
> size=32(0)
> ts=0x26feca
> pid=3777
> elf_mach=62
> failed to write feature 2
>
>
> [jolsa@ibm-x3650m4-01 perf]$ ./perf report -i perf.data.jitted --stdio | head -13
> Failed to open /tmp/perf-3777.map, continuing without symbols
> # To display the perf.data header info, please use --header/--header-only options.
> #
> # Samples: 17K of event 'cycles'
> # Event count (approx.): 9728779054
> #
> # Overhead Command Shared Object Symbol
> # ........ ......... .................. .....................................................
> #
> 50.90% java-abrt perf-3777.map [.] 0x00007f93a501e550
> 15.00% java-abrt perf-3777.map [.] 0x00007f93a501e586
> 14.16% java-abrt perf-3777.map [.] 0x00007f93a501e579
> 12.63% java-abrt perf-3777.map [.] 0x00007f93a501e67b
> 3.04% java-abrt perf-3777.map [.] 0x00007f93a501e685

2015-02-16 06:57:21

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH 2/4] perf tools: pass timestamp to map_init

On 11/02/15 01:42, Stephane Eranian wrote:
> This patch passes the sample timestamp down to the
> map_init function. This is used to sort mmap records.
>
> Signed-off-by: Stephane Eranian <[email protected]>
> ---

Are the changes of this patch used by your patch set? I couldn't see anywhere.

2015-02-16 07:03:29

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: add JVMTI agent library

On 11/02/15 01:42, Stephane Eranian wrote:
> This is a standalone JVMTI library to help profile Java jitted
> code with perf record/perf report. The library is not installed
> or compiled automatically by perf Makefile. It is not used
> directly by perf. It is arch agnostic and has been tested on
> X86 and ARM. It needs to be used with a Java runtime, such
> as OpenJDK, as follows:
>
> $ java -agentpath:libjvmti.so .......
>
> When used this way, java will generate a jitdump binary file in
> $HOME/.debug/java/jit/java-jit-*
>
> This binary dump file contains information to help symbolize and
> annotate jitted code.
>
> The next step is to inject the jitdump information into the
> perf.data file:
> $ perf inject -j $HOME/.debug/java/jit/java-jit-XXXX/jit-ZZZ.dump \
> -i perf.data -o perf.data.jitted
>
> This injects the MMAP records to cover the jitted code and also generates
> one ELF image for each jitted function. The ELF images are created in the
> same subdir as the jitdump file. The MMAP records point there too.
>
> Then to visualize the function or asm profile, simply use the regular
> perf commands:
> $ perf report -i perf.data.jitted
> or
> $ perf annotate -i perf.data.jitted
>
> JVMTI agent code adapted from OProfile's opagent code.
>
> Signed-off-by: Stephane Eranian <[email protected]>
> ---
> tools/perf/jvmti/Makefile | 70 +++++++++
> tools/perf/jvmti/jvmti_agent.c | 349 +++++++++++++++++++++++++++++++++++++++++
> tools/perf/jvmti/jvmti_agent.h | 23 +++
> tools/perf/jvmti/libjvmti.c | 149 ++++++++++++++++++
> 4 files changed, 591 insertions(+)
> create mode 100644 tools/perf/jvmti/Makefile
> create mode 100644 tools/perf/jvmti/jvmti_agent.c
> create mode 100644 tools/perf/jvmti/jvmti_agent.h
> create mode 100644 tools/perf/jvmti/libjvmti.c
>
> diff --git a/tools/perf/jvmti/Makefile b/tools/perf/jvmti/Makefile
> new file mode 100644
> index 0000000..9eda64b
> --- /dev/null
> +++ b/tools/perf/jvmti/Makefile
> @@ -0,0 +1,70 @@
> +ARCH=$(shell uname -m)
> +
> +ifeq ($(ARCH), x86_64)
> +JARCH=amd64
> +endif
> +ifeq ($(ARCH), armv7l)
> +JARCH=armhf
> +endif
> +ifeq ($(ARCH), armv6l)
> +JARCH=armhf
> +endif
> +ifeq ($(ARCH), ppc64)
> +JARCH=powerpc
> +endif
> +ifeq ($(ARCH), ppc64le)
> +JARCH=powerpc
> +endif
> +
> +DESTDIR=/usr/local
> +
> +VERSION=1
> +REVISION=0
> +AGE=0
> +
> +LN=ln -sf
> +RM=rm
> +
> +SJVMTI=libjvmti.so.$(VERSION).$(REVISION).$(AGE)
> +VJVMTI=libjvmti.so.$(VERSION)
> +SLDFLAGS=-shared -Wl,-soname -Wl,$(VLIBPFM)
> +SOLIBEXT=so
> +
> +JDIR=$(shell /usr/sbin/update-java-alternatives -l | head -1 | cut -d ' ' -f 3)
> +# -lrt required in 32-bit mode for clock_gettime()
> +LIBS=-lelf -lrt
> +INCDIR=-I $(JDIR)/include -I $(JDIR)/include/linux
> +
> +TARGETS=$(SJVMTI)
> +
> +SRCS=libjvmti.c jvmti_agent.c
> +OBJS=$(SRCS:.c=.o)
> +SOBJS=$(OBJS:.o=.lo)
> +OPT=-O2 -g -Werror -Wall
> +
> +CFLAGS=$(INCDIR) $(OPT)
> +
> +all: $(TARGETS)
> +
> +.c.o:
> + $(CC) $(CFLAGS) -c $*.c
> +.c.lo:
> + $(CC) -fPIC -DPIC $(CFLAGS) -c $*.c -o $*.lo
> +
> +$(OBJS) $(SOBJS): Makefile jvmti_agent.h ../util/jitdump.h
> +
> +$(SJVMTI): $(SOBJS)
> + $(CC) $(CFLAGS) $(SLDFLAGS) -o $@ $(SOBJS) $(LIBS)
> + $(LN) $@ libjvmti.$(SOLIBEXT)
> +
> +clean:
> + $(RM) -f *.o *.so.* *.so *.lo
> +
> +install:
> + -mkdir -p $(DESTDIR)/lib
> + install -m 755 $(SJVMTI) $(DESTDIR)/lib/
> + (cd $(DESTDIR)/lib; $(LN) $(SJVMTI) $(VJVMTI))
> + (cd $(DESTDIR)/lib; $(LN) $(SJVMTI) libjvmti.$(SOLIBEXT))
> + ldconfig
> +
> +.SUFFIXES: .c .S .o .lo
> diff --git a/tools/perf/jvmti/jvmti_agent.c b/tools/perf/jvmti/jvmti_agent.c
> new file mode 100644
> index 0000000..d2d5215
> --- /dev/null
> +++ b/tools/perf/jvmti/jvmti_agent.c
> @@ -0,0 +1,349 @@
> +/*
> + * jvmti_agent.c: JVMTI agent interface
> + *
> + * Adapted from the Oprofile code in opagent.c:
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> + *
> + * Copyright 2007 OProfile authors
> + * Jens Wilke
> + * Daniel Hansel
> + * Copyright IBM Corporation 2007
> + */
> +#include <sys/types.h>
> +#include <sys/stat.h> /* for mkdir() */
> +#include <stdio.h>
> +#include <errno.h>
> +#include <string.h>
> +#include <stdlib.h>
> +#include <stdint.h>
> +#include <limits.h>
> +#include <fcntl.h>
> +#include <unistd.h>
> +#include <time.h>
> +#include <syscall.h> /* for gettid() */
> +#include <err.h>
> +
> +#include "jvmti_agent.h"
> +#include "../util/jitdump.h"
> +
> +#define JIT_LANG "java"
> +
> +static char jit_path[PATH_MAX];
> +
> +/*
> + * padding buffer
> + */
> +static const char pad_bytes[7];
> +
> +/*
> + * perf_events event fd
> + */
> +static int perf_fd;
> +
> +static inline pid_t gettid(void)
> +{
> + return (pid_t)syscall(__NR_gettid);
> +}
> +
> +static int get_e_machine(struct jitheader *hdr)
> +{
> + ssize_t sret;
> + char id[16];
> + int fd, ret = -1;
> + int m = -1;
> + struct {
> + uint16_t e_type;
> + uint16_t e_machine;
> + } info;
> +
> + fd = open("/proc/self/exe", O_RDONLY);
> + if (fd == -1)
> + return -1;
> +
> + sret = read(fd, id, sizeof(id));
> + if (sret != sizeof(id))
> + goto error;
> +
> + /* check ELF signature */
> + if (id[0] != 0x7f || id[1] != 'E' || id[2] != 'L' || id[3] != 'F')
> + goto error;
> +
> + sret = read(fd, &info, sizeof(info));
> + if (sret != sizeof(info))
> + goto error;
> +
> + m = info.e_machine;
> + if (m < 0)
> + m = 0; /* ELF EM_NONE */
> +
> + hdr->elf_mach = m;
> + ret = 0;
> +error:
> + close(fd);
> + return ret;
> +}
> +
> +#define CLOCK_DEVICE "/dev/trace_clock"
> +#define CLOCKFD 3
> +#define FD_TO_CLOCKID(fd) ((~(clockid_t) (fd) << 3) | CLOCKFD)
> +#define CLOCKID_TO_FD(id) ((~(int) (id) >> 3) & ~CLOCKFD)
> +
> +#define NSEC_PER_SEC 1000000000
> +
> +#ifndef CLOCK_INVALID
> +#define CLOCK_INVALID -1
> +#endif
> +
> +static inline clockid_t get_clockid(int fd)
> +{
> + return FD_TO_CLOCKID(fd);
> +}
> +
> +static int
> +perf_open_timestamp(void)
> +{
> + int fd, id;
> +
> + fd = open(CLOCK_DEVICE, O_RDONLY);
> + if (fd == -1) {
> + if (errno == ENOENT)
> + warnx("jvmti: %s not present, check your kernel for trace_clock module", CLOCK_DEVICE);
> + if (errno == EPERM)
> + warnx("jvmti: %s has wrong permissions, suggesting chmod 644 %s", CLOCK_DEVICE, CLOCK_DEVICE);
> + }
> +
> + id = get_clockid(fd);
> + if (CLOCK_INVALID == id)
> + return CLOCK_INVALID;
> +
> + return get_clockid(fd);
> +}
> +
> +static inline void
> +perf_close_timestamp(int id)
> +{
> + close(CLOCKID_TO_FD(id));
> +}
> +
> +
> +static inline uint64_t
> +timespec_to_ns(const struct timespec *ts)
> +{
> + return ((uint64_t) ts->tv_sec * NSEC_PER_SEC) + ts->tv_nsec;
> +}
> +
> +static inline uint64_t
> +perf_get_timestamp(int id)
> +{
> + struct timespec ts;
> +
> + clock_gettime(id, &ts);
> + return timespec_to_ns(&ts);
> +}
> +
> +static int
> +debug_cache_init(void)
> +{
> + char str[32];
> + char *base, *p;
> + struct tm tm;
> + time_t t;
> + int ret;
> +
> + time(&t);
> + localtime_r(&t, &tm);
> +
> + base = getenv("JITDUMPDIR");
> + if (!base)
> + base = getenv("HOME");
> + if (!base)
> + base = ".";
> +
> + strftime(str, sizeof(str), JIT_LANG"-jit-%Y%m%d", &tm);
> +
> + snprintf(jit_path, PATH_MAX - 1, "%s/.debug/", base);
> +
> + ret = mkdir(jit_path, 0755);
> + if (ret == -1) {
> + if (errno != EEXIST) {
> + warn("jvmti: cannot create jit cache dir %s", jit_path);
> + return -1;
> + }
> + }
> +
> + snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit", base);
> + ret = mkdir(jit_path, 0755);
> + if (ret == -1) {
> + if (errno != EEXIST) {
> + warn("cannot create jit cache dir %s", jit_path);
> + return -1;
> + }
> + }
> +
> + snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit/%s.XXXXXXXX", base, str);
> +
> + p = mkdtemp(jit_path);
> + if (p != jit_path) {
> + warn("cannot create jit cache dir %s", jit_path);
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +void *jvmti_open(void)
> +{
> + int pad_cnt;
> + char dump_path[PATH_MAX];
> + struct jitheader header;
> + FILE *fp;
> +
> + perf_fd = perf_open_timestamp();
> + if (perf_fd == -1)
> + warnx("jvmti: kernel does not support /dev/trace_clock or permissions are wrong on that device");
> +
> + memset(&header, 0, sizeof(header));
> +
> + debug_cache_init();
> +
> + snprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
> +
> + fp = fopen(dump_path, "w");
> + if (!fp) {
> + warn("jvmti: cannot create %s", dump_path);
> + goto error;
> + }
> +
> + warnx("jvmti: jitdump in %s", dump_path);
> +
> + if (get_e_machine(&header)) {
> + warn("get_e_machine failed\n");
> + goto error;
> + }
> +
> + header.magic = JITHEADER_MAGIC;
> + header.version = JITHEADER_VERSION;
> + header.total_size = sizeof(header);
> + header.pid = getpid();
> +
> + /* calculate amount of padding '\0' */
> + pad_cnt = PADDING_8ALIGNED(header.total_size);
> + header.total_size += pad_cnt;
> +
> + header.timestamp = perf_get_timestamp(perf_fd);
> +
> + if (!fwrite(&header, sizeof(header), 1, fp)) {
> + warn("jvmti: cannot write dumpfile header");
> + goto error;
> + }
> +
> + /* write padding '\0' if necessary */
> + if (pad_cnt && !fwrite(pad_bytes, pad_cnt, 1, fp)) {
> + warn("jvmti: cannot write dumpfile header padding");
> + goto error;
> + }
> +
> + return fp;
> +error:
> + fclose(fp);
> + perf_close_timestamp(perf_fd);
> + return NULL;
> +}
> +
> +int
> +jvmti_close(void *agent)
> +{
> + struct jr_code_close rec;
> + FILE *fp = agent;
> +
> + if (!fp) {
> + warnx("jvmti: incalid fd in close_agent");
> + return -1;
> + }
> +
> + rec.p.id = JIT_CODE_CLOSE;
> + rec.p.total_size = sizeof(rec);
> +
> + rec.p.timestamp = perf_get_timestamp(perf_fd);
> +
> + if (!fwrite(&rec, sizeof(rec), 1, fp))
> + return -1;
> +
> + fclose(fp);
> +
> + perf_close_timestamp(perf_fd);
> +
> + fp = NULL;
> +
> + return 0;
> +}
> +
> +int jvmti_write_code(void *agent, char const *sym,
> + uint64_t vma, void const *code, unsigned int const size)
> +{
> + static int code_generation = 1;
> + struct jr_code_load rec;
> + size_t sym_len;
> + size_t padding_count;
> + FILE *fp = agent;
> + int ret = -1;
> +
> + /* don't care about 0 length function, no samples */
> + if (size == 0)
> + return 0;
> +
> + if (!fp) {
> + warnx("jvmti: invalid fd in write_native_code");
> + return -1;
> + }
> +
> + sym_len = strlen(sym) + 1;
> +
> + rec.p.id = JIT_CODE_LOAD;
> + rec.p.total_size = sizeof(rec) + sym_len;
> + padding_count = PADDING_8ALIGNED(rec.p.total_size);
> + rec.p. total_size += padding_count;
> + rec.p.timestamp = perf_get_timestamp(perf_fd);

Do you know whether the JVM is guaranteed not to start executing the
generated code before the return of compiled_method_load_cb(), otherwise the
timestamp will be too late?

> +
> + rec.code_size = size;
> + rec.vma = vma;
> + rec.code_addr = vma;
> + rec.pid = getpid();
> + rec.tid = gettid();
> + rec.code_index = code_generation++;
> +
> + if (code)
> + rec.p.total_size += size;
> +
> + /*
> + * If JVM is multi-threaded, nultiple concurrent calls to agent
> + * may be possible, so protect file writes
> + */
> + flockfile(fp);
> +
> + ret = fwrite_unlocked(&rec, sizeof(rec), 1, fp);
> + fwrite_unlocked(sym, sym_len, 1, fp);
> + if (code)
> + fwrite_unlocked(code, size, 1, fp);
> +
> + if (padding_count)
> + fwrite_unlocked(pad_bytes, padding_count, 1, fp);
> +
> + funlockfile(fp);
> +
> + ret = 0;
> +
> + return ret;
> +}
> diff --git a/tools/perf/jvmti/jvmti_agent.h b/tools/perf/jvmti/jvmti_agent.h
> new file mode 100644
> index 0000000..54e5c5e
> --- /dev/null
> +++ b/tools/perf/jvmti/jvmti_agent.h
> @@ -0,0 +1,23 @@
> +#ifndef __JVMTI_AGENT_H__
> +#define __JVMTI_AGENT_H__
> +
> +#include <sys/types.h>
> +#include <stdint.h>
> +
> +#define __unused __attribute__((unused))
> +
> +#if defined(__cplusplus)
> +extern "C" {
> +#endif
> +
> +void *jvmti_open(void);
> +int jvmti_close(void *agent);
> +int jvmti_write_code(void *agent, char const *symbol_name,
> + uint64_t vma, void const *code,
> + const unsigned int code_size);
> +
> +#if defined(__cplusplus)
> +}
> +
> +#endif
> +#endif /* __JVMTI_H__ */
> diff --git a/tools/perf/jvmti/libjvmti.c b/tools/perf/jvmti/libjvmti.c
> new file mode 100644
> index 0000000..8b8d782
> --- /dev/null
> +++ b/tools/perf/jvmti/libjvmti.c
> @@ -0,0 +1,149 @@
> +#include <sys/types.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <stdlib.h>
> +#include <err.h>
> +#include <jvmti.h>
> +
> +#include "jvmti_agent.h"
> +
> +void *jvmti_agent;
> +
> +static void JNICALL
> +compiled_method_load_cb(jvmtiEnv *jvmti,
> + jmethodID method,
> + jint code_size,
> + void const *code_addr,
> + jint map_length,
> + jvmtiAddrLocationMap const *map,
> + void const *compile_info __unused)
> +{
> + jclass decl_class;
> + char *class_sign = NULL;
> + char *func_name = NULL;
> + char *func_sign = NULL;
> + jvmtiError ret;
> + size_t len;
> +
> + ret = (*jvmti)->GetMethodDeclaringClass(jvmti, method,
> + &decl_class);
> + if (ret != JVMTI_ERROR_NONE) {
> + warnx("jvmti: getmethoddeclaringclass failed");
> + return;
> + }
> +
> + ret = (*jvmti)->GetClassSignature(jvmti, decl_class,
> + &class_sign, NULL);
> + if (ret != JVMTI_ERROR_NONE) {
> + warnx("jvmti: getclassignature failed");
> + goto error;
> + }
> +
> + ret = (*jvmti)->GetMethodName(jvmti, method, &func_name,
> + &func_sign, NULL);
> + if (ret != JVMTI_ERROR_NONE) {
> + warnx("jvmti: failed getmethodname");
> + goto error;
> + }
> +
> + len = strlen(func_name) + strlen(class_sign) + strlen(func_sign) + 2;
> +
> + {
> + char str[len];
> + uint64_t addr = (uint64_t)(unsigned long)code_addr;
> + snprintf(str, len, "%s%s%s", class_sign, func_name, func_sign);
> + ret = jvmti_write_code(jvmti_agent, str, addr, code_addr, code_size);
> + if (ret)
> + warnx("jvmti: write_code() failed");
> + }
> +error:
> + (*jvmti)->Deallocate(jvmti, (unsigned char *)func_name);
> + (*jvmti)->Deallocate(jvmti, (unsigned char *)func_sign);
> + (*jvmti)->Deallocate(jvmti, (unsigned char *)class_sign);
> +}
> +
> +static void JNICALL
> +code_generated_cb(jvmtiEnv *jvmti,
> + char const *name,
> + void const *code_addr,
> + jint code_size)
> +{
> + uint64_t addr = (uint64_t)(unsigned long)code_addr;
> + int ret;
> +
> + ret = jvmti_write_code(jvmti_agent, name, addr, code_addr, code_size);
> + if (ret)
> + warnx("jvmti: write_code() failed for code_generated");
> +}
> +
> +JNIEXPORT jint JNICALL
> +Agent_OnLoad(JavaVM *jvm, char *options, void *reserved __unused)
> +{
> + jvmtiEventCallbacks cb;
> + jvmtiCapabilities caps1;
> + jvmtiEnv *jvmti = NULL;
> + jint ret;
> +
> + jvmti_agent = jvmti_open();
> + if (!jvmti_agent) {
> + warnx("jvmti: open_agent failed");
> + return -1;
> + }
> +
> + /*
> + * Request a JVMTI interface version 1 environment
> + */
> + ret = (*jvm)->GetEnv(jvm, (void *)&jvmti, JVMTI_VERSION_1);
> + if (ret != JNI_OK) {
> + warnx("jvmti: jvmti version 1 not supported");
> + return -1;
> + }
> +
> + /*
> + * acquire method_load capability, we require it
> + */
> + memset(&caps1, 0, sizeof(caps1));
> + caps1.can_generate_compiled_method_load_events = 1;
> +
> + ret = (*jvmti)->AddCapabilities(jvmti, &caps1);
> + if (ret != JVMTI_ERROR_NONE) {
> + warnx("jvmti: acquire compiled_method capability failed");
> + return -1;
> + }
> +
> + memset(&cb, 0, sizeof(cb));
> +
> + cb.CompiledMethodLoad = compiled_method_load_cb;
> + cb.DynamicCodeGenerated = code_generated_cb;
> +
> + ret = (*jvmti)->SetEventCallbacks(jvmti, &cb, sizeof(cb));
> + if (ret != JVMTI_ERROR_NONE) {
> + warnx("jvmti: cannot set event callbacks");
> + return -1;
> + }
> +
> + ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
> + JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
> + if (ret != JVMTI_ERROR_NONE) {
> + warnx("jvmti: setnotification failed for method_load");
> + return -1;
> + }
> +
> + ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
> + JVMTI_EVENT_DYNAMIC_CODE_GENERATED, NULL);
> + if (ret != JVMTI_ERROR_NONE) {
> + warnx("jvmti: setnotification failed on code_generated");
> + return -1;
> + }
> + return 0;
> +}
> +
> +JNIEXPORT void JNICALL
> +Agent_OnUnload(JavaVM *jvm __unused)
> +{
> + int ret;
> +
> + ret = jvmti_close(jvmti_agent);
> + if (ret)
> + errx(1, "Error: op_close_agent()");
> +}
>

2015-02-16 20:22:21

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: add JVMTI agent library

On Mon, Feb 16, 2015 at 2:01 AM, Adrian Hunter <[email protected]> wrote:
> On 11/02/15 01:42, Stephane Eranian wrote:
>> This is a standalone JVMTI library to help profile Java jitted
>> code with perf record/perf report. The library is not installed
>> or compiled automatically by perf Makefile. It is not used
>> directly by perf. It is arch agnostic and has been tested on
>> X86 and ARM. It needs to be used with a Java runtime, such
>> as OpenJDK, as follows:
>>
>> $ java -agentpath:libjvmti.so .......
>>
>> When used this way, java will generate a jitdump binary file in
>> $HOME/.debug/java/jit/java-jit-*
>>
>> This binary dump file contains information to help symbolize and
>> annotate jitted code.
>>
>> The next step is to inject the jitdump information into the
>> perf.data file:
>> $ perf inject -j $HOME/.debug/java/jit/java-jit-XXXX/jit-ZZZ.dump \
>> -i perf.data -o perf.data.jitted
>>
>> This injects the MMAP records to cover the jitted code and also generates
>> one ELF image for each jitted function. The ELF images are created in the
>> same subdir as the jitdump file. The MMAP records point there too.
>>
>> Then to visualize the function or asm profile, simply use the regular
>> perf commands:
>> $ perf report -i perf.data.jitted
>> or
>> $ perf annotate -i perf.data.jitted
>>
>> JVMTI agent code adapted from OProfile's opagent code.
>>
>> Signed-off-by: Stephane Eranian <[email protected]>
>> ---
>> tools/perf/jvmti/Makefile | 70 +++++++++
>> tools/perf/jvmti/jvmti_agent.c | 349 +++++++++++++++++++++++++++++++++++++++++
>> tools/perf/jvmti/jvmti_agent.h | 23 +++
>> tools/perf/jvmti/libjvmti.c | 149 ++++++++++++++++++
>> 4 files changed, 591 insertions(+)
>> create mode 100644 tools/perf/jvmti/Makefile
>> create mode 100644 tools/perf/jvmti/jvmti_agent.c
>> create mode 100644 tools/perf/jvmti/jvmti_agent.h
>> create mode 100644 tools/perf/jvmti/libjvmti.c
>>
>> diff --git a/tools/perf/jvmti/Makefile b/tools/perf/jvmti/Makefile
>> new file mode 100644
>> index 0000000..9eda64b
>> --- /dev/null
>> +++ b/tools/perf/jvmti/Makefile
>> @@ -0,0 +1,70 @@
>> +ARCH=$(shell uname -m)
>> +
>> +ifeq ($(ARCH), x86_64)
>> +JARCH=amd64
>> +endif
>> +ifeq ($(ARCH), armv7l)
>> +JARCH=armhf
>> +endif
>> +ifeq ($(ARCH), armv6l)
>> +JARCH=armhf
>> +endif
>> +ifeq ($(ARCH), ppc64)
>> +JARCH=powerpc
>> +endif
>> +ifeq ($(ARCH), ppc64le)
>> +JARCH=powerpc
>> +endif
>> +
>> +DESTDIR=/usr/local
>> +
>> +VERSION=1
>> +REVISION=0
>> +AGE=0
>> +
>> +LN=ln -sf
>> +RM=rm
>> +
>> +SJVMTI=libjvmti.so.$(VERSION).$(REVISION).$(AGE)
>> +VJVMTI=libjvmti.so.$(VERSION)
>> +SLDFLAGS=-shared -Wl,-soname -Wl,$(VLIBPFM)
>> +SOLIBEXT=so
>> +
>> +JDIR=$(shell /usr/sbin/update-java-alternatives -l | head -1 | cut -d ' ' -f 3)
>> +# -lrt required in 32-bit mode for clock_gettime()
>> +LIBS=-lelf -lrt
>> +INCDIR=-I $(JDIR)/include -I $(JDIR)/include/linux
>> +
>> +TARGETS=$(SJVMTI)
>> +
>> +SRCS=libjvmti.c jvmti_agent.c
>> +OBJS=$(SRCS:.c=.o)
>> +SOBJS=$(OBJS:.o=.lo)
>> +OPT=-O2 -g -Werror -Wall
>> +
>> +CFLAGS=$(INCDIR) $(OPT)
>> +
>> +all: $(TARGETS)
>> +
>> +.c.o:
>> + $(CC) $(CFLAGS) -c $*.c
>> +.c.lo:
>> + $(CC) -fPIC -DPIC $(CFLAGS) -c $*.c -o $*.lo
>> +
>> +$(OBJS) $(SOBJS): Makefile jvmti_agent.h ../util/jitdump.h
>> +
>> +$(SJVMTI): $(SOBJS)
>> + $(CC) $(CFLAGS) $(SLDFLAGS) -o $@ $(SOBJS) $(LIBS)
>> + $(LN) $@ libjvmti.$(SOLIBEXT)
>> +
>> +clean:
>> + $(RM) -f *.o *.so.* *.so *.lo
>> +
>> +install:
>> + -mkdir -p $(DESTDIR)/lib
>> + install -m 755 $(SJVMTI) $(DESTDIR)/lib/
>> + (cd $(DESTDIR)/lib; $(LN) $(SJVMTI) $(VJVMTI))
>> + (cd $(DESTDIR)/lib; $(LN) $(SJVMTI) libjvmti.$(SOLIBEXT))
>> + ldconfig
>> +
>> +.SUFFIXES: .c .S .o .lo
>> diff --git a/tools/perf/jvmti/jvmti_agent.c b/tools/perf/jvmti/jvmti_agent.c
>> new file mode 100644
>> index 0000000..d2d5215
>> --- /dev/null
>> +++ b/tools/perf/jvmti/jvmti_agent.c
>> @@ -0,0 +1,349 @@
>> +/*
>> + * jvmti_agent.c: JVMTI agent interface
>> + *
>> + * Adapted from the Oprofile code in opagent.c:
>> + * This library is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2.1 of the License, or (at your option) any later version.
>> + *
>> + * This library is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with this library; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>> + *
>> + * Copyright 2007 OProfile authors
>> + * Jens Wilke
>> + * Daniel Hansel
>> + * Copyright IBM Corporation 2007
>> + */
>> +#include <sys/types.h>
>> +#include <sys/stat.h> /* for mkdir() */
>> +#include <stdio.h>
>> +#include <errno.h>
>> +#include <string.h>
>> +#include <stdlib.h>
>> +#include <stdint.h>
>> +#include <limits.h>
>> +#include <fcntl.h>
>> +#include <unistd.h>
>> +#include <time.h>
>> +#include <syscall.h> /* for gettid() */
>> +#include <err.h>
>> +
>> +#include "jvmti_agent.h"
>> +#include "../util/jitdump.h"
>> +
>> +#define JIT_LANG "java"
>> +
>> +static char jit_path[PATH_MAX];
>> +
>> +/*
>> + * padding buffer
>> + */
>> +static const char pad_bytes[7];
>> +
>> +/*
>> + * perf_events event fd
>> + */
>> +static int perf_fd;
>> +
>> +static inline pid_t gettid(void)
>> +{
>> + return (pid_t)syscall(__NR_gettid);
>> +}
>> +
>> +static int get_e_machine(struct jitheader *hdr)
>> +{
>> + ssize_t sret;
>> + char id[16];
>> + int fd, ret = -1;
>> + int m = -1;
>> + struct {
>> + uint16_t e_type;
>> + uint16_t e_machine;
>> + } info;
>> +
>> + fd = open("/proc/self/exe", O_RDONLY);
>> + if (fd == -1)
>> + return -1;
>> +
>> + sret = read(fd, id, sizeof(id));
>> + if (sret != sizeof(id))
>> + goto error;
>> +
>> + /* check ELF signature */
>> + if (id[0] != 0x7f || id[1] != 'E' || id[2] != 'L' || id[3] != 'F')
>> + goto error;
>> +
>> + sret = read(fd, &info, sizeof(info));
>> + if (sret != sizeof(info))
>> + goto error;
>> +
>> + m = info.e_machine;
>> + if (m < 0)
>> + m = 0; /* ELF EM_NONE */
>> +
>> + hdr->elf_mach = m;
>> + ret = 0;
>> +error:
>> + close(fd);
>> + return ret;
>> +}
>> +
>> +#define CLOCK_DEVICE "/dev/trace_clock"
>> +#define CLOCKFD 3
>> +#define FD_TO_CLOCKID(fd) ((~(clockid_t) (fd) << 3) | CLOCKFD)
>> +#define CLOCKID_TO_FD(id) ((~(int) (id) >> 3) & ~CLOCKFD)
>> +
>> +#define NSEC_PER_SEC 1000000000
>> +
>> +#ifndef CLOCK_INVALID
>> +#define CLOCK_INVALID -1
>> +#endif
>> +
>> +static inline clockid_t get_clockid(int fd)
>> +{
>> + return FD_TO_CLOCKID(fd);
>> +}
>> +
>> +static int
>> +perf_open_timestamp(void)
>> +{
>> + int fd, id;
>> +
>> + fd = open(CLOCK_DEVICE, O_RDONLY);
>> + if (fd == -1) {
>> + if (errno == ENOENT)
>> + warnx("jvmti: %s not present, check your kernel for trace_clock module", CLOCK_DEVICE);
>> + if (errno == EPERM)
>> + warnx("jvmti: %s has wrong permissions, suggesting chmod 644 %s", CLOCK_DEVICE, CLOCK_DEVICE);
>> + }
>> +
>> + id = get_clockid(fd);
>> + if (CLOCK_INVALID == id)
>> + return CLOCK_INVALID;
>> +
>> + return get_clockid(fd);
>> +}
>> +
>> +static inline void
>> +perf_close_timestamp(int id)
>> +{
>> + close(CLOCKID_TO_FD(id));
>> +}
>> +
>> +
>> +static inline uint64_t
>> +timespec_to_ns(const struct timespec *ts)
>> +{
>> + return ((uint64_t) ts->tv_sec * NSEC_PER_SEC) + ts->tv_nsec;
>> +}
>> +
>> +static inline uint64_t
>> +perf_get_timestamp(int id)
>> +{
>> + struct timespec ts;
>> +
>> + clock_gettime(id, &ts);
>> + return timespec_to_ns(&ts);
>> +}
>> +
>> +static int
>> +debug_cache_init(void)
>> +{
>> + char str[32];
>> + char *base, *p;
>> + struct tm tm;
>> + time_t t;
>> + int ret;
>> +
>> + time(&t);
>> + localtime_r(&t, &tm);
>> +
>> + base = getenv("JITDUMPDIR");
>> + if (!base)
>> + base = getenv("HOME");
>> + if (!base)
>> + base = ".";
>> +
>> + strftime(str, sizeof(str), JIT_LANG"-jit-%Y%m%d", &tm);
>> +
>> + snprintf(jit_path, PATH_MAX - 1, "%s/.debug/", base);
>> +
>> + ret = mkdir(jit_path, 0755);
>> + if (ret == -1) {
>> + if (errno != EEXIST) {
>> + warn("jvmti: cannot create jit cache dir %s", jit_path);
>> + return -1;
>> + }
>> + }
>> +
>> + snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit", base);
>> + ret = mkdir(jit_path, 0755);
>> + if (ret == -1) {
>> + if (errno != EEXIST) {
>> + warn("cannot create jit cache dir %s", jit_path);
>> + return -1;
>> + }
>> + }
>> +
>> + snprintf(jit_path, PATH_MAX - 1, "%s/.debug/jit/%s.XXXXXXXX", base, str);
>> +
>> + p = mkdtemp(jit_path);
>> + if (p != jit_path) {
>> + warn("cannot create jit cache dir %s", jit_path);
>> + return -1;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +void *jvmti_open(void)
>> +{
>> + int pad_cnt;
>> + char dump_path[PATH_MAX];
>> + struct jitheader header;
>> + FILE *fp;
>> +
>> + perf_fd = perf_open_timestamp();
>> + if (perf_fd == -1)
>> + warnx("jvmti: kernel does not support /dev/trace_clock or permissions are wrong on that device");
>> +
>> + memset(&header, 0, sizeof(header));
>> +
>> + debug_cache_init();
>> +
>> + snprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
>> +
>> + fp = fopen(dump_path, "w");
>> + if (!fp) {
>> + warn("jvmti: cannot create %s", dump_path);
>> + goto error;
>> + }
>> +
>> + warnx("jvmti: jitdump in %s", dump_path);
>> +
>> + if (get_e_machine(&header)) {
>> + warn("get_e_machine failed\n");
>> + goto error;
>> + }
>> +
>> + header.magic = JITHEADER_MAGIC;
>> + header.version = JITHEADER_VERSION;
>> + header.total_size = sizeof(header);
>> + header.pid = getpid();
>> +
>> + /* calculate amount of padding '\0' */
>> + pad_cnt = PADDING_8ALIGNED(header.total_size);
>> + header.total_size += pad_cnt;
>> +
>> + header.timestamp = perf_get_timestamp(perf_fd);
>> +
>> + if (!fwrite(&header, sizeof(header), 1, fp)) {
>> + warn("jvmti: cannot write dumpfile header");
>> + goto error;
>> + }
>> +
>> + /* write padding '\0' if necessary */
>> + if (pad_cnt && !fwrite(pad_bytes, pad_cnt, 1, fp)) {
>> + warn("jvmti: cannot write dumpfile header padding");
>> + goto error;
>> + }
>> +
>> + return fp;
>> +error:
>> + fclose(fp);
>> + perf_close_timestamp(perf_fd);
>> + return NULL;
>> +}
>> +
>> +int
>> +jvmti_close(void *agent)
>> +{
>> + struct jr_code_close rec;
>> + FILE *fp = agent;
>> +
>> + if (!fp) {
>> + warnx("jvmti: incalid fd in close_agent");
>> + return -1;
>> + }
>> +
>> + rec.p.id = JIT_CODE_CLOSE;
>> + rec.p.total_size = sizeof(rec);
>> +
>> + rec.p.timestamp = perf_get_timestamp(perf_fd);
>> +
>> + if (!fwrite(&rec, sizeof(rec), 1, fp))
>> + return -1;
>> +
>> + fclose(fp);
>> +
>> + perf_close_timestamp(perf_fd);
>> +
>> + fp = NULL;
>> +
>> + return 0;
>> +}
>> +
>> +int jvmti_write_code(void *agent, char const *sym,
>> + uint64_t vma, void const *code, unsigned int const size)
>> +{
>> + static int code_generation = 1;
>> + struct jr_code_load rec;
>> + size_t sym_len;
>> + size_t padding_count;
>> + FILE *fp = agent;
>> + int ret = -1;
>> +
>> + /* don't care about 0 length function, no samples */
>> + if (size == 0)
>> + return 0;
>> +
>> + if (!fp) {
>> + warnx("jvmti: invalid fd in write_native_code");
>> + return -1;
>> + }
>> +
>> + sym_len = strlen(sym) + 1;
>> +
>> + rec.p.id = JIT_CODE_LOAD;
>> + rec.p.total_size = sizeof(rec) + sym_len;
>> + padding_count = PADDING_8ALIGNED(rec.p.total_size);
>> + rec.p. total_size += padding_count;
>> + rec.p.timestamp = perf_get_timestamp(perf_fd);
>
> Do you know whether the JVM is guaranteed not to start executing the
> generated code before the return of compiled_method_load_cb(), otherwise the
> timestamp will be too late?
>
I don't know that. I did not check.
But are you saying the callback may be asynchronous with the JIT compiler?
The callback need to happen only after the code is jitted for obvious reasons.

>> +
>> + rec.code_size = size;
>> + rec.vma = vma;
>> + rec.code_addr = vma;
>> + rec.pid = getpid();
>> + rec.tid = gettid();
>> + rec.code_index = code_generation++;
>> +
>> + if (code)
>> + rec.p.total_size += size;
>> +
>> + /*
>> + * If JVM is multi-threaded, nultiple concurrent calls to agent
>> + * may be possible, so protect file writes
>> + */
>> + flockfile(fp);
>> +
>> + ret = fwrite_unlocked(&rec, sizeof(rec), 1, fp);
>> + fwrite_unlocked(sym, sym_len, 1, fp);
>> + if (code)
>> + fwrite_unlocked(code, size, 1, fp);
>> +
>> + if (padding_count)
>> + fwrite_unlocked(pad_bytes, padding_count, 1, fp);
>> +
>> + funlockfile(fp);
>> +
>> + ret = 0;
>> +
>> + return ret;
>> +}
>> diff --git a/tools/perf/jvmti/jvmti_agent.h b/tools/perf/jvmti/jvmti_agent.h
>> new file mode 100644
>> index 0000000..54e5c5e
>> --- /dev/null
>> +++ b/tools/perf/jvmti/jvmti_agent.h
>> @@ -0,0 +1,23 @@
>> +#ifndef __JVMTI_AGENT_H__
>> +#define __JVMTI_AGENT_H__
>> +
>> +#include <sys/types.h>
>> +#include <stdint.h>
>> +
>> +#define __unused __attribute__((unused))
>> +
>> +#if defined(__cplusplus)
>> +extern "C" {
>> +#endif
>> +
>> +void *jvmti_open(void);
>> +int jvmti_close(void *agent);
>> +int jvmti_write_code(void *agent, char const *symbol_name,
>> + uint64_t vma, void const *code,
>> + const unsigned int code_size);
>> +
>> +#if defined(__cplusplus)
>> +}
>> +
>> +#endif
>> +#endif /* __JVMTI_H__ */
>> diff --git a/tools/perf/jvmti/libjvmti.c b/tools/perf/jvmti/libjvmti.c
>> new file mode 100644
>> index 0000000..8b8d782
>> --- /dev/null
>> +++ b/tools/perf/jvmti/libjvmti.c
>> @@ -0,0 +1,149 @@
>> +#include <sys/types.h>
>> +#include <stdio.h>
>> +#include <string.h>
>> +#include <stdlib.h>
>> +#include <err.h>
>> +#include <jvmti.h>
>> +
>> +#include "jvmti_agent.h"
>> +
>> +void *jvmti_agent;
>> +
>> +static void JNICALL
>> +compiled_method_load_cb(jvmtiEnv *jvmti,
>> + jmethodID method,
>> + jint code_size,
>> + void const *code_addr,
>> + jint map_length,
>> + jvmtiAddrLocationMap const *map,
>> + void const *compile_info __unused)
>> +{
>> + jclass decl_class;
>> + char *class_sign = NULL;
>> + char *func_name = NULL;
>> + char *func_sign = NULL;
>> + jvmtiError ret;
>> + size_t len;
>> +
>> + ret = (*jvmti)->GetMethodDeclaringClass(jvmti, method,
>> + &decl_class);
>> + if (ret != JVMTI_ERROR_NONE) {
>> + warnx("jvmti: getmethoddeclaringclass failed");
>> + return;
>> + }
>> +
>> + ret = (*jvmti)->GetClassSignature(jvmti, decl_class,
>> + &class_sign, NULL);
>> + if (ret != JVMTI_ERROR_NONE) {
>> + warnx("jvmti: getclassignature failed");
>> + goto error;
>> + }
>> +
>> + ret = (*jvmti)->GetMethodName(jvmti, method, &func_name,
>> + &func_sign, NULL);
>> + if (ret != JVMTI_ERROR_NONE) {
>> + warnx("jvmti: failed getmethodname");
>> + goto error;
>> + }
>> +
>> + len = strlen(func_name) + strlen(class_sign) + strlen(func_sign) + 2;
>> +
>> + {
>> + char str[len];
>> + uint64_t addr = (uint64_t)(unsigned long)code_addr;
>> + snprintf(str, len, "%s%s%s", class_sign, func_name, func_sign);
>> + ret = jvmti_write_code(jvmti_agent, str, addr, code_addr, code_size);
>> + if (ret)
>> + warnx("jvmti: write_code() failed");
>> + }
>> +error:
>> + (*jvmti)->Deallocate(jvmti, (unsigned char *)func_name);
>> + (*jvmti)->Deallocate(jvmti, (unsigned char *)func_sign);
>> + (*jvmti)->Deallocate(jvmti, (unsigned char *)class_sign);
>> +}
>> +
>> +static void JNICALL
>> +code_generated_cb(jvmtiEnv *jvmti,
>> + char const *name,
>> + void const *code_addr,
>> + jint code_size)
>> +{
>> + uint64_t addr = (uint64_t)(unsigned long)code_addr;
>> + int ret;
>> +
>> + ret = jvmti_write_code(jvmti_agent, name, addr, code_addr, code_size);
>> + if (ret)
>> + warnx("jvmti: write_code() failed for code_generated");
>> +}
>> +
>> +JNIEXPORT jint JNICALL
>> +Agent_OnLoad(JavaVM *jvm, char *options, void *reserved __unused)
>> +{
>> + jvmtiEventCallbacks cb;
>> + jvmtiCapabilities caps1;
>> + jvmtiEnv *jvmti = NULL;
>> + jint ret;
>> +
>> + jvmti_agent = jvmti_open();
>> + if (!jvmti_agent) {
>> + warnx("jvmti: open_agent failed");
>> + return -1;
>> + }
>> +
>> + /*
>> + * Request a JVMTI interface version 1 environment
>> + */
>> + ret = (*jvm)->GetEnv(jvm, (void *)&jvmti, JVMTI_VERSION_1);
>> + if (ret != JNI_OK) {
>> + warnx("jvmti: jvmti version 1 not supported");
>> + return -1;
>> + }
>> +
>> + /*
>> + * acquire method_load capability, we require it
>> + */
>> + memset(&caps1, 0, sizeof(caps1));
>> + caps1.can_generate_compiled_method_load_events = 1;
>> +
>> + ret = (*jvmti)->AddCapabilities(jvmti, &caps1);
>> + if (ret != JVMTI_ERROR_NONE) {
>> + warnx("jvmti: acquire compiled_method capability failed");
>> + return -1;
>> + }
>> +
>> + memset(&cb, 0, sizeof(cb));
>> +
>> + cb.CompiledMethodLoad = compiled_method_load_cb;
>> + cb.DynamicCodeGenerated = code_generated_cb;
>> +
>> + ret = (*jvmti)->SetEventCallbacks(jvmti, &cb, sizeof(cb));
>> + if (ret != JVMTI_ERROR_NONE) {
>> + warnx("jvmti: cannot set event callbacks");
>> + return -1;
>> + }
>> +
>> + ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
>> + JVMTI_EVENT_COMPILED_METHOD_LOAD, NULL);
>> + if (ret != JVMTI_ERROR_NONE) {
>> + warnx("jvmti: setnotification failed for method_load");
>> + return -1;
>> + }
>> +
>> + ret = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE,
>> + JVMTI_EVENT_DYNAMIC_CODE_GENERATED, NULL);
>> + if (ret != JVMTI_ERROR_NONE) {
>> + warnx("jvmti: setnotification failed on code_generated");
>> + return -1;
>> + }
>> + return 0;
>> +}
>> +
>> +JNIEXPORT void JNICALL
>> +Agent_OnUnload(JavaVM *jvm __unused)
>> +{
>> + int ret;
>> +
>> + ret = jvmti_close(jvmti_agent);
>> + if (ret)
>> + errx(1, "Error: op_close_agent()");
>> +}
>>
>

2015-02-18 08:45:29

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH 4/4] perf tools: add JVMTI agent library

On 16/02/15 22:22, Stephane Eranian wrote:
> On Mon, Feb 16, 2015 at 2:01 AM, Adrian Hunter <[email protected]> wrote:
>> On 11/02/15 01:42, Stephane Eranian wrote:
>>> This is a standalone JVMTI library to help profile Java jitted
>>> code with perf record/perf report. The library is not installed
>>> or compiled automatically by perf Makefile. It is not used
>>> directly by perf. It is arch agnostic and has been tested on
>>> X86 and ARM. It needs to be used with a Java runtime, such
>>> as OpenJDK, as follows:
>>>
>>> $ java -agentpath:libjvmti.so .......
>>>
>>> When used this way, java will generate a jitdump binary file in
>>> $HOME/.debug/java/jit/java-jit-*
>>>
>>> This binary dump file contains information to help symbolize and
>>> annotate jitted code.
>>>
>>> The next step is to inject the jitdump information into the
>>> perf.data file:
>>> $ perf inject -j $HOME/.debug/java/jit/java-jit-XXXX/jit-ZZZ.dump \
>>> -i perf.data -o perf.data.jitted
>>>
>>> This injects the MMAP records to cover the jitted code and also generates
>>> one ELF image for each jitted function. The ELF images are created in the
>>> same subdir as the jitdump file. The MMAP records point there too.
>>>
>>> Then to visualize the function or asm profile, simply use the regular
>>> perf commands:
>>> $ perf report -i perf.data.jitted
>>> or
>>> $ perf annotate -i perf.data.jitted
>>>
>>> JVMTI agent code adapted from OProfile's opagent code.
>>>
>>> Signed-off-by: Stephane Eranian <[email protected]>
>>> ---

[snip]

>>> +
>>> +int jvmti_write_code(void *agent, char const *sym,
>>> + uint64_t vma, void const *code, unsigned int const size)
>>> +{
>>> + static int code_generation = 1;
>>> + struct jr_code_load rec;
>>> + size_t sym_len;
>>> + size_t padding_count;
>>> + FILE *fp = agent;
>>> + int ret = -1;
>>> +
>>> + /* don't care about 0 length function, no samples */
>>> + if (size == 0)
>>> + return 0;
>>> +
>>> + if (!fp) {
>>> + warnx("jvmti: invalid fd in write_native_code");
>>> + return -1;
>>> + }
>>> +
>>> + sym_len = strlen(sym) + 1;
>>> +
>>> + rec.p.id = JIT_CODE_LOAD;
>>> + rec.p.total_size = sizeof(rec) + sym_len;
>>> + padding_count = PADDING_8ALIGNED(rec.p.total_size);
>>> + rec.p. total_size += padding_count;
>>> + rec.p.timestamp = perf_get_timestamp(perf_fd);
>>
>> Do you know whether the JVM is guaranteed not to start executing the
>> generated code before the return of compiled_method_load_cb(), otherwise the
>> timestamp will be too late?
>>
> I don't know that. I did not check.
> But are you saying the callback may be asynchronous with the JIT compiler?

Possibly, although it seems unlikely. But perhaps there are other threads
waiting for the code and it lets them go at the same time. I guess we can
assume that doesn't happen. When we use it with Intel PT it will show up if
the timestamp isn't before the first execution, because there will be
decoder errors.