2013-10-07 06:47:12

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH v2 0/3] Perf support to SDT markers

This patchset helps in probing dtrace style markers(SDT) present in user space
applications through perf. Notes/markes are placed at important places by the
developers. They have a negligible overhead when not enabled. We can enable
them and probe at these places and find some important information like the
arguments' values, etc.

How to add SDT markers into user applications:
We need to have this header sys/sdt.h present.
sys/sdt.h used is version 3.
If not present, install systemtap-sdt-devel package (for fedora-18).

A simple example to show this follows.
- Create a file with .d extension and mention the probe names in it with
provider name and marker name.

$ cat probes.d
provider user_app {
probe foo_start();
probe fun_start();
};

- Now create the probes.h and probes.o file :
$ dtrace -C -h -s probes.d -o probes.h
$ dtrace -C -G -s probes.d -o probes.o

- A program using the markers:

$ cat user_app.c

#include <stdio.h>
#include "probes.h"

void foo(void)
{
USER_APP_FOO_START();
printf("This is foo\n");
}

void fun(void)
{
USER_APP_FUN_START();
printf("Inside fun\n");
}
int main(void)
{
printf("In main\n");
foo();
fun();
return 0;
}

- Compile it and also provide probes.o file to linker:
$ gcc user_app.c probes.o -o user_app

- Now use perf to list the markers in the app:
# perf probe --markers -x ./user_app

%user_app:foo_start
%user_app:fun_start

- And then use perf probe to add a probe point :

# perf probe -x ./user_app -a '%user_app:foo_start'

Added new event :
event = foo_start (on 0x530)

You can now use it on all perf tools such as :

perf record -e probe_user:foo_start -aR sleep 1

# perf record -e probe_user:foo_start -aR ./user_app
In main
This is foo
Inside fun
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.235 MB perf.data (~10279 samples) ]

- Then use perf tools to analyze it.
# perf report --stdio

# ========
# captured on: Tue Sep 3 16:19:55 2013
# hostname : hemant-fedora
# os release : 3.11.0-rc3+
# perf version : 3.9.4-200.fc18.x86_64
# arch : x86_64
# nrcpus online : 2
# nrcpus avail : 2
# cpudesc : QEMU Virtual CPU version 1.2.2
# cpuid : GenuineIntel,6,2,3
# total memory : 2051912 kBIf these are not enabled, they are present in the \
ELF as nop.

# cmdline : /usr/bin/perf record -e probe_user:foo_start -aR ./user_app
# event : name = probe_user:foo_start, type = 2, config = 0x38e, config1
= 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0,
excl_guest = 1, precise_ip = 0
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: software = 1, tracepoint = 2, breakpoint = 5
# ========
#
# Samples: 1 of event 'probe_user:foo_start'
# Event count (approx.): 1
#
# Overhead Command Shared Object Symbol
# ........ ........ ............. .......
#
100.00% user_app user_app [.] foo


#
# (For a higher level overview, try: perf report --sort comm,dso)
#

This link shows an example of marker probing with Systemtap:
https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps

Also, this link provides important info regarding SDT notes:
http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation

- Markers in binaries :
These SDT markers are present in the ELF in the section named
".note.stapsdt".
Here, the name of the marker, its provider, type, location, base
address, semaphore address are stored.
We can retrieve these values using the members name_off and desc_off in
Nhdr structure. If these are not enabled, they are present in the ELF as nop.

Changes since v1:
- Made some structural changes.
- Changed the option required to list/probe into SDT notes.
- Unified function names.
- Added some necessary checks.
- Ignored semaphore enabled SDT notes.
- Added documentation.
- Removed some redundancies.

TODO:
- Recognizing SDT notes' arguments and support to probe on them.
---

Hemant Kumar (3):
SDT markers listing by perf:
Support for perf to probe into SDT markers:
Documentation regarding perf/sdt


tools/perf/Documentation/perf-probe.txt | 15 ++
tools/perf/Documentation/sdt-probes.txt | 163 ++++++++++++++++++++
tools/perf/builtin-probe.c | 35 ++++
tools/perf/util/probe-event.c | 128 +++++++++++++++-
tools/perf/util/probe-event.h | 4
tools/perf/util/symbol-elf.c | 256 +++++++++++++++++++++++++++++++
tools/perf/util/symbol.h | 21 +++
7 files changed, 610 insertions(+), 12 deletions(-)
create mode 100644 tools/perf/Documentation/sdt-probes.txt

--


2013-10-07 06:48:09

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH v2 1/3] SDT markers listing by perf:

This patch will enable perf to list all the sdt markers present
in an elf file. The markers are present in the .note.stapsdt section
of the elf. We can traverse through this section and collect the
required info about the markers.
We can use '-S/--markers' with perf to view the SDT notes.

Currently, the sdt notes which have their semaphores enabled, are being
ignored silently. But, they will be supported soon.

I think wrapping this inside #ifdef LIBELF_SUPPORT pair is not required,
because, if NO_LIBELF = 1, then 'probe' command of perf is itself disabled.

Usage:
perf probe --markers -x /lib64/libc.so.6

Output :
%libc:setjmp
%libc:longjmp
%libc:longjmp_target
%libc:lll_futex_wake
%libc:lll_lock_wait_private
%libc:longjmp
%libc:longjmp_target
%libc:lll_futex_wake

Signed-off-by: Hemant Kumar Shaw <[email protected]>
---
tools/perf/builtin-probe.c | 24 +++++-
tools/perf/util/probe-event.c | 39 +++++++++
tools/perf/util/probe-event.h | 2
tools/perf/util/symbol-elf.c | 176 +++++++++++++++++++++++++++++++++++++++++
tools/perf/util/symbol.h | 18 ++++
5 files changed, 257 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index e8a66f9..cbd2383 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -55,6 +55,7 @@ static struct {
bool show_funcs;
bool mod_events;
bool uprobes;
+ bool sdt;
int nevents;
struct perf_probe_event events[MAX_PROBES];
struct strlist *dellist;
@@ -325,6 +326,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
opt_set_filter),
OPT_CALLBACK('x', "exec", NULL, "executable|path",
"target executable name or path", opt_set_target),
+ OPT_BOOLEAN('S', "markers", &params.sdt, "Show probe-able sdt notes"),
OPT_END()
};
int ret;
@@ -347,7 +349,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
params.max_probe_points = MAX_PROBES;

if ((!params.nevents && !params.dellist && !params.list_events &&
- !params.show_lines && !params.show_funcs))
+ !params.show_lines && !params.show_funcs && !params.sdt))
usage_with_options(probe_usage, options);

/*
@@ -355,6 +357,26 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
*/
symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);

+ if (params.sdt) {
+ if (params.show_lines) {
+ pr_err("Error: Don't use --markers with --lines.\n");
+ usage_with_options(probe_usage, options);
+ }
+ if (params.show_vars) {
+ pr_err("Error: Don't use --markers with --vars.\n");
+ usage_with_options(probe_usage, options);
+ }
+ if (params.show_funcs) {
+ pr_err("Error: Don't use --markers with --funcs.\n");
+ usage_with_options(probe_usage, options);
+ }
+ ret = show_sdt_notes(params.target);
+ if (ret < 0) {
+ pr_err(" Error : Failed to find SDT markers!"
+ "(%d)\n", ret);
+ }
+ return ret;
+ }
if (params.list_events) {
if (params.mod_events) {
pr_err(" Error: Don't use --list with --add/--del.\n");
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index aa04bf9..4e94092 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1909,6 +1909,20 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
return ret;
}

+static void cleanup_sdt_note_list(struct list_head *sdt_notes)
+{
+ struct sdt_note *tmp;
+ struct list_head *pos, *s;
+
+ list_for_each_safe(pos, s, sdt_notes) {
+ tmp = list_entry(pos, struct sdt_note, note_list);
+ list_del(pos);
+ free(tmp->name);
+ free(tmp->provider);
+ free(tmp);
+ }
+}
+
static int convert_to_probe_trace_events(struct perf_probe_event *pev,
struct probe_trace_event **tevs,
int max_tevs, const char *target)
@@ -2372,3 +2386,28 @@ out:
free(name);
return ret;
}
+
+static void display_sdt_note_info(struct list_head *start)
+{
+ struct list_head *pos;
+ struct sdt_note *tmp;
+
+ list_for_each(pos, start) {
+ tmp = list_entry(pos, struct sdt_note, note_list);
+ printf("%%%s:%s\n", tmp->provider, tmp->name);
+ }
+}
+
+int show_sdt_notes(const char *target)
+{
+ struct list_head sdt_notes;
+ int ret = -1;
+
+ INIT_LIST_HEAD(&sdt_notes);
+ ret = get_sdt_note_list(&sdt_notes, target);
+ if (!ret) {
+ display_sdt_note_info(&sdt_notes);
+ cleanup_sdt_note_list(&sdt_notes);
+ }
+ return ret;
+}
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index f9f3de8..b8a9347 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -133,7 +133,7 @@ extern int show_available_vars(struct perf_probe_event *pevs, int npevs,
struct strfilter *filter, bool externs);
extern int show_available_funcs(const char *module, struct strfilter *filter,
bool user);
-
+int show_sdt_notes(const char *target);
/* Maximum index number of event-name postfix */
#define MAX_EVENT_INDEX 1024

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4b12bf8..6b17260 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -846,6 +846,182 @@ out_elf_end:
return err;
}

+/*
+ * Populate the name, type, offset in the SDT note structure and
+ * ignore the argument fields (for now)
+ */
+static struct sdt_note *populate_sdt_note(Elf **elf, const char *data,
+ size_t len, int type)
+{
+ const char *provider, *name;
+ struct sdt_note *note;
+
+ /*
+ * Three addresses need to be obtained :
+ * Marker location, address of base section and semaphore location
+ */
+ union {
+ Elf64_Addr a64[3];
+ Elf32_Addr a32[3];
+ } buf;
+
+ /*
+ * dst and src are required for translation from file to memory
+ * representation
+ */
+ Elf_Data dst = {
+ .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
+ .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
+ .d_off = 0, .d_align = 0
+ };
+
+ Elf_Data src = {
+ .d_buf = (void *) data, .d_type = ELF_T_ADDR,
+ .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
+ .d_align = 0
+ };
+
+ /* Check the type of each of the notes */
+ if (type != SDT_NOTE_TYPE)
+ goto out_ret;
+
+ note = (struct sdt_note *)zalloc(sizeof(struct sdt_note));
+ if (note == NULL) {
+ pr_err("Memory allocation error\n");
+ goto out_ret;
+ }
+ INIT_LIST_HEAD(&note->note_list);
+
+ if (len < dst.d_size + 3)
+ goto out_free;
+
+ /* Translation from file representation to memory representation */
+ if (gelf_xlatetom(*elf, &dst, &src,
+ elf_getident(*elf, NULL)[EI_DATA]) == NULL)
+ pr_debug("gelf_xlatetom : %s", elf_errmsg(-1));
+
+ /* Populate the fields of sdt_note */
+ provider = data + dst.d_size;
+
+ name = (const char *)memchr(provider, '\0', data + len - provider);
+ if (name++ == NULL)
+ goto out_free;
+ note->provider = strdup(provider);
+ note->name = strdup(name);
+
+ /* Obtain the addresses and ignore notes with semaphores set*/
+ if (gelf_getclass(*elf) == ELFCLASS32) {
+ note->addr.a32[0] = buf.a32[0];
+ note->addr.a32[1] = buf.a32[1];
+ note->addr.a32[2] = buf.a32[2];
+ note->bit32 = true;
+ if (buf.a32[2] != 0)
+ goto out_free;
+ } else {
+ note->addr.a64[0] = buf.a64[0];
+ note->addr.a64[1] = buf.a64[1];
+ note->addr.a64[2] = buf.a64[2];
+ note->bit32 = false;
+ if (buf.a64[2] != 0)
+ goto out_free;
+ }
+ return note;
+
+out_free:
+ free(note);
+out_ret:
+ return NULL;
+}
+
+static struct list_head *construct_sdt_notes_list(Elf *elf)
+{
+ GElf_Ehdr ehdr;
+ Elf_Scn *scn = NULL;
+ Elf_Data *data;
+ GElf_Shdr shdr;
+ size_t shstrndx;
+ size_t next;
+ GElf_Nhdr nhdr;
+ size_t name_off, desc_off, offset;
+ struct sdt_note *tmp = NULL, *note = NULL;
+
+ if (gelf_getehdr(elf, &ehdr) == NULL) {
+ pr_debug("%s: Can't get elf header.\n", __func__);
+ goto out_err;
+ }
+ if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
+ pr_debug("getshdrstrndx failed\n");
+ goto out_err;
+ }
+
+ /*
+ * Look for section type = SHT_NOTE, flags = no SHF_ALLOC
+ * and name = .note.stapsdt
+ */
+ scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
+ if (!scn) {
+ printf("SDT markers not present!\n");
+ goto out_err;
+ }
+ if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
+ goto out_err;
+
+ data = elf_getdata(scn, NULL);
+
+ /* Get the SDT notes */
+ for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
+ &desc_off)) > 0; offset = next) {
+ if (nhdr.n_namesz == sizeof(SDT_NOTE_NAME) &&
+ !memcmp(data->d_buf + name_off, SDT_NOTE_NAME,
+ sizeof(SDT_NOTE_NAME))) {
+ tmp = populate_sdt_note(&elf, (const char *)
+ ((long)(data->d_buf) +
+ (long)desc_off),
+ nhdr.n_descsz, nhdr.n_type);
+ if (!note && tmp)
+ note = tmp;
+ else if (tmp)
+ list_add_tail(&tmp->note_list,
+ &(note->note_list));
+ }
+ }
+ if (tmp)
+ return &tmp->note_list;
+out_err:
+ return NULL;
+}
+
+int get_sdt_note_list(struct list_head *head, const char *target)
+{
+ Elf *elf;
+ int fd, ret = -1;
+ struct list_head *tmp = NULL;
+
+ fd = open(target, O_RDONLY);
+ if (fd < 0) {
+ pr_err("Failed to open %s\n", target);
+ goto out_ret;
+ }
+
+ symbol__elf_init();
+ elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
+ if (!elf) {
+ pr_err("%s: cannot read %s ELF file.\n", __func__, target);
+ goto out_close;
+ }
+ tmp = construct_sdt_notes_list(elf);
+ if (tmp) {
+ list_add(head, tmp);
+ ret = 0;
+ }
+ elf_end(elf);
+
+out_close:
+ close(fd);
+out_ret:
+ return ret;
+}
+
void symbol__elf_init(void)
{
elf_version(EV_CURRENT);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 5f720dc..037185c 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -197,6 +197,17 @@ struct symsrc {
#endif
};

+struct sdt_note {
+ char *name;
+ char *provider;
+ bool bit32; /* 32 or 64 bit flag */
+ union {
+ Elf64_Addr a64[3];
+ Elf32_Addr a32[3];
+ } addr;
+ struct list_head note_list;
+};
+
void symsrc__destroy(struct symsrc *ss);
int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
enum dso_binary_type type);
@@ -247,4 +258,11 @@ void symbols__fixup_duplicate(struct rb_root *symbols);
void symbols__fixup_end(struct rb_root *symbols);
void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);

+/* Specific to SDT notes */
+int get_sdt_note_list(struct list_head *head, const char *target);
+
+#define SDT_NOTE_TYPE 3
+#define NOTE_SCN ".note.stapsdt"
+#define SDT_NOTE_NAME "stapsdt"
+
#endif /* __PERF_SYMBOL */

2013-10-07 06:48:37

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH v2 2/3] Support for perf to probe into SDT markers:

This allows perf to probe into the sdt markers/notes present in
the libraries and executables. We try to find the associated location
and handle prelinking (since, stapsdt notes section is not allocated
during runtime). Prelinking is handled with the help of base
section which is allocated during runtime. This address can be compared
with the address retrieved from the notes' description. If its different,
we can take this difference and then add to the note's location.

We can use existing '-a/--add' option to add events for sdt markers.
Also, we can add multiple events at once using the same '-a' option.

Usage:
perf probe -x /lib64/libc.so.6 -a 'my_event=%libc:setjmp'

or

perf probe -x /lib64/libc.so.6 %libc:setjmp

Output (corresponding to the first usage):
Added new event:
libc:my_event (on 0x35981)

You can now use it in all perf tools, such as:

perf record -e libc:my_event -aR sleep 1


Signed-off-by: Hemant Kumar Shaw <[email protected]>
---
tools/perf/builtin-probe.c | 11 +++++
tools/perf/util/probe-event.c | 89 +++++++++++++++++++++++++++++++++++++----
tools/perf/util/probe-event.h | 2 +
tools/perf/util/symbol-elf.c | 80 +++++++++++++++++++++++++++++++++++++
tools/perf/util/symbol.h | 3 +
5 files changed, 177 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index cbd2383..6f09723 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -370,6 +370,17 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
pr_err("Error: Don't use --markers with --funcs.\n");
usage_with_options(probe_usage, options);
}
+ if (params.mod_events) {
+ ret = add_perf_probe_events(params.events,
+ params.nevents,
+ params.max_probe_points,
+ params.target,
+ params.force_add);
+ if (ret < 0) {
+ pr_err(" Error: Failed to add events. "
+ " (%d)\n", ret);
+ }
+ }
ret = show_sdt_notes(params.target);
if (ret < 0) {
pr_err(" Error : Failed to find SDT markers!"
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 4e94092..43f8a69 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -817,6 +817,35 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
arg = tmp;
}

+ /* Check for SDT marker */
+ if (*arg == '%') {
+ ptr = strchr(++arg, ':');
+ if (!ptr) {
+ semantic_error("Provider name must follow an event "
+ "name\n");
+ return -EINVAL;
+ }
+ *ptr++ = '\0';
+ tmp = strdup(arg);
+ if (!tmp)
+ return -ENOMEM;
+
+ pev->point.note = (struct sdt_note *)
+ zalloc(sizeof(struct sdt_note));
+ if (!pev->point.note)
+ return -ENOMEM;
+ pev->point.note->provider = tmp;
+
+ tmp = strdup(ptr);
+ if (!tmp)
+ return -ENOMEM;
+ pev->point.note->name = tmp;
+ pev->group = pev->point.note->provider;
+ if (!pev->event)
+ pev->event = pev->point.note->name;
+ pev->sdt = true;
+ return 0;
+ }
ptr = strpbrk(arg, ";:+@%");
if (ptr) {
nc = *ptr;
@@ -1270,6 +1299,13 @@ static char *synthesize_perf_probe_point(struct perf_probe_point *pp)
ret = e_snprintf(buf, MAX_CMDLEN, "%s%s%s%s%s", pp->function,
offs, pp->retprobe ? "%return" : "", line,
file);
+ else if (pp->note)
+ if (pp->note->bit32)
+ ret = e_snprintf(buf, MAX_CMDLEN, "0x%x",
+ pp->note->addr.a32[0]);
+ else
+ ret = e_snprintf(buf, MAX_CMDLEN, "0x%lx",
+ pp->note->addr.a64[0]);
else
ret = e_snprintf(buf, MAX_CMDLEN, "%s%s", file, line);
if (ret <= 0)
@@ -1923,6 +1959,19 @@ static void cleanup_sdt_note_list(struct list_head *sdt_notes)
}
}

+static int try_to_find_sdt_notes(struct perf_probe_event *pev,
+ const char *target)
+{
+ struct list_head sdt_notes;
+ int ret = -1;
+
+ INIT_LIST_HEAD(&sdt_notes);
+ ret = search_sdt_note(pev->point.note, &sdt_notes, target);
+ if (!list_empty(&sdt_notes))
+ cleanup_sdt_note_list(&sdt_notes);
+ return ret;
+}
+
static int convert_to_probe_trace_events(struct perf_probe_event *pev,
struct probe_trace_event **tevs,
int max_tevs, const char *target)
@@ -1930,11 +1979,20 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
struct symbol *sym;
int ret = 0, i;
struct probe_trace_event *tev;
+ char *buf;

- /* Convert perf_probe_event with debuginfo */
- ret = try_to_find_probe_trace_events(pev, tevs, max_tevs, target);
- if (ret != 0)
- return ret; /* Found in debuginfo or got an error */
+ if (pev->sdt) {
+ ret = try_to_find_sdt_notes(pev, target);
+ if (ret)
+ return ret;
+ } else {
+ /* Convert perf_probe_event with debuginfo */
+ ret = try_to_find_probe_trace_events(pev, tevs, max_tevs,
+ target);
+ /* Found in debuginfo or got an error */
+ if (ret != 0)
+ return ret;
+ }

/* Allocate trace event buffer */
tev = *tevs = zalloc(sizeof(struct probe_trace_event));
@@ -1942,10 +2000,25 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
return -ENOMEM;

/* Copy parameters */
- tev->point.symbol = strdup(pev->point.function);
- if (tev->point.symbol == NULL) {
- ret = -ENOMEM;
- goto error;
+ if (pev->sdt) {
+ buf = (char *)zalloc(sizeof(char) * MAX_CMDLEN);
+ if (!buf) {
+ ret = -ENOMEM;
+ goto error;
+ }
+ if (pev->point.note->bit32)
+ sprintf(buf, "0x%x",
+ (unsigned)pev->point.note->addr.a32[0]);
+ else
+ sprintf(buf, "0x%lx",
+ (unsigned long)pev->point.note->addr.a64[0]);
+ tev->point.symbol = buf;
+ } else {
+ tev->point.symbol = strdup(pev->point.function);
+ if (tev->point.symbol == NULL) {
+ ret = -ENOMEM;
+ goto error;
+ }
}

if (target) {
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index b8a9347..4bd50cc 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -47,6 +47,7 @@ struct perf_probe_point {
bool retprobe; /* Return probe flag */
char *lazy_line; /* Lazy matching pattern */
unsigned long offset; /* Offset from function entry */
+ struct sdt_note *note;
};

/* Perf probe probing argument field chain */
@@ -72,6 +73,7 @@ struct perf_probe_event {
struct perf_probe_point point; /* Probe point */
int nargs; /* Number of arguments */
bool uprobes;
+ bool sdt;
struct perf_probe_arg *args; /* Arguments */
};

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 6b17260..d0e7a66 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1022,6 +1022,86 @@ out_ret:
return ret;
}

+static void adjust_note_addr(struct sdt_note *tmp, struct sdt_note *key,
+ Elf *elf)
+{
+ GElf_Ehdr ehdr;
+ GElf_Addr base_off = 0;
+ GElf_Shdr shdr;
+
+ if (!gelf_getehdr(elf, &ehdr)) {
+ pr_debug("%s : cannot get elf header.\n", __func__);
+ return;
+ }
+
+ /*
+ * Find out the .stapsdt.base section.
+ * This scn will help us to handle prelinking (if present).
+ * Compare the retrieved file offset of the base section with the
+ * base address in the description of the SDT note. If its different,
+ * then accordingly, adjust the note location.
+ */
+ if (elf_section_by_name(elf, &ehdr, &shdr, SDT_BASE, NULL))
+ base_off = shdr.sh_offset;
+ if (base_off) {
+ if (tmp->bit32)
+ key->addr.a32[0] = tmp->addr.a32[0] + base_off -
+ tmp->addr.a32[1];
+ else
+ key->addr.a64[0] = tmp->addr.a64[0] + base_off -
+ tmp->addr.a64[1];
+ }
+ key->bit32 = tmp->bit32;
+}
+
+int search_sdt_note(struct sdt_note *key, struct list_head *sdt_notes,
+ const char *target)
+{
+ Elf *elf;
+ int fd, ret = -1;
+ struct list_head *pos = NULL, *head = NULL;
+ struct sdt_note *tmp = NULL;
+
+ fd = open(target, O_RDONLY);
+ if (fd < 0) {
+ pr_err("Failed to open %s\n", target);
+ goto out_ret;
+ }
+
+ symbol__elf_init();
+ elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
+ if (!elf) {
+ pr_err("%s : can't read %s ELF file.\n", __func__, target);
+ goto out_close;
+ }
+
+ head = construct_sdt_notes_list(elf);
+ if (!head)
+ goto out_end;
+
+ list_add(sdt_notes, head);
+
+ /* Iterate through the notes and retrieve the required note */
+ list_for_each(pos, sdt_notes) {
+ tmp = list_entry(pos, struct sdt_note, note_list);
+ if (!strcmp(key->name, tmp->name) &&
+ !strcmp(key->provider, tmp->provider)) {
+ adjust_note_addr(tmp, key, elf);
+ ret = 0;
+ break;
+ }
+ }
+ if (ret)
+ printf("%%%s:%s not found!\n", key->provider, key->name);
+
+out_end:
+ elf_end(elf);
+out_close:
+ close(fd);
+out_ret:
+ return ret;
+}
+
void symbol__elf_init(void)
{
elf_version(EV_CURRENT);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 037185c..0b0545b 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -260,9 +260,12 @@ void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);

/* Specific to SDT notes */
int get_sdt_note_list(struct list_head *head, const char *target);
+int search_sdt_note(struct sdt_note *key, struct list_head *head,
+ const char *target);

#define SDT_NOTE_TYPE 3
#define NOTE_SCN ".note.stapsdt"
#define SDT_NOTE_NAME "stapsdt"
+#define SDT_BASE ".stapsdt.base"

#endif /* __PERF_SYMBOL */

2013-10-07 06:48:55

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH v2 3/3] Documentation regarding perf/sdt

This patch adds documentation regarding perf support
to SDT notes/markers.

Signed-off-by: Hemant Kumar Shaw <[email protected]>
---
tools/perf/Documentation/perf-probe.txt | 15 ++-
tools/perf/Documentation/sdt-probes.txt | 163 +++++++++++++++++++++++++++++++
2 files changed, 176 insertions(+), 2 deletions(-)
create mode 100644 tools/perf/Documentation/sdt-probes.txt

diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt
index b715cb7..8a3aa2a 100644
--- a/tools/perf/Documentation/perf-probe.txt
+++ b/tools/perf/Documentation/perf-probe.txt
@@ -99,10 +99,14 @@ OPTIONS
--max-probes::
Set the maximum number of probe points for an event. Default is 128.

+-S::
+--markers::
+ View the SDT markers present in a user space application/library.
+
-x::
--exec=PATH::
Specify path to the executable or shared library file for user
- space tracing. Can also be used with --funcs option.
+ space tracing. Can also be used with --funcs option or --markers option.

In absence of -m/-x options, perf probe checks if the first argument after
the options is an absolute path name. If its an absolute path, perf probe
@@ -121,11 +125,14 @@ Probe points are defined by following syntax.
3) Define event based on source file with lazy pattern
[EVENT=]SRC;PTN [ARG ...]

+ 4) Define event based on SDT marker
+ [[EVENT=]%PROVIDER:MARKER

-'EVENT' specifies the name of new event, if omitted, it will be set the name of the probed function. Currently, event group name is set as 'probe'.
+'EVENT' specifies the name of new event, if omitted, it will be set to the name of the probed function. Currently, event group name is set as 'probe' except in case of SDT markers where it is set to provider name.
'FUNC' specifies a probed function name, and it may have one of the following options; '+OFFS' is the offset from function entry address in bytes, ':RLN' is the relative-line number from function entry line, and '%return' means that it probes function return. And ';PTN' means lazy matching pattern (see LAZY MATCHING). Note that ';PTN' must be the end of the probe point definition. In addition, '@SRC' specifies a source file which has that function.
It is also possible to specify a probe point by the source line number or lazy matching by using 'SRC:ALN' or 'SRC;PTN' syntax, where 'SRC' is the source file path, ':ALN' is the line number and ';PTN' is the lazy matching pattern.
'ARG' specifies the arguments of this probe point, (see PROBE ARGUMENT).
+'%PROVIDER:MARKER' is the syntax of SDT markers present in an ELF.

PROBE ARGUMENT
--------------
@@ -200,6 +207,10 @@ Add probes at malloc() function on libc

./perf probe -x /lib/libc.so.6 malloc or ./perf probe /lib/libc.so.6 malloc

+Add probes at longjmp SDT marker on libc
+
+ ./perf probe -x /lib64/libc.so.6 %libc:longjmp
+
SEE ALSO
--------
linkperf:perf-trace[1], linkperf:perf-record[1]
diff --git a/tools/perf/Documentation/sdt-probes.txt b/tools/perf/Documentation/sdt-probes.txt
new file mode 100644
index 0000000..b298675
--- /dev/null
+++ b/tools/perf/Documentation/sdt-probes.txt
@@ -0,0 +1,163 @@
+Perf probing on SDT markers:
+
+Goal:
+Probe dtrace style markers(SDT) present in user space applications.
+
+Scope:
+Put probe points at SDT markers in user space applications and libraries
+and also probe them using perf.
+
+Why supprt SDT markers? :
+We have lots of applications which use SDT markers today like:
+Postgresql, MySql, Mozilla, Perl, Python, Java, Ruby, libvirt, QEMU, glib
+
+These markers are placed at important places by the developers. Now, these markers
+have a negligible overhead when not enabled. We can enable them and probe at
+these places and find some important information like the arguments' values, etc.
+
+How to add SDT markers into user applications:
+We need to have this header sys/sdt.h present.
+sys/sdt.h used is version 3.
+If not present, install systemtap-sdt-devel package.
+
+A simple example to show this:
+- Create a file with .d extension and mention the probe names in it with
+provider name and marker name.
+
+$ cat probes.d
+provider user_app {
+ probe foo_start();
+ probe fun_start();
+};
+
+- Now create the probes.h and probes.o file :
+$ dtrace -C -h -s probes.d -o probes.h
+$ dtrace -C -G -s probes.d -o probes.o
+
+- A program using the markers:
+
+$ cat user_app.c
+
+#include <stdio.h>
+#include "probes.h"
+
+void foo(void)
+{
+ USER_APP_FOO_START();
+ printf("This is foo\n");
+}
+
+void fun(void)
+{
+ USER_APP_FUN_START();
+ printf("Inside fun\n");
+}
+int main(void)
+{
+ printf("In main\n");
+ foo();
+ fun();
+ return 0;
+}
+
+- Compile it and also provide probes.o file to linker:
+$ gcc user_app.c probes.o -o user_app
+
+- Now use perf to list the markers in the app:
+# perf probe -x ./user_app --markers
+
+%user_app:foo_start
+%user_app:fun_start
+
+- And then use perf probe to add a probe point :
+
+# perf probe -x ./user_app 'my_event=%user_app:foo_start'
+
+Added new event :
+user_app:my_event (on 0x530)
+
+You can now use it on all perf tools such as :
+
+ perf record -e user_app:my_event -aR sleep 1
+
+# perf record -e probe_user:my_event -aR ./user_app
+In main
+This is foo
+Inside fun
+[ perf record: Woken up 1 times to write data ]
+[ perf record: Captured and wrote 0.235 MB perf.data (~10279 samples) ]
+
+- Then use perf tools to analyze it.
+# perf report --stdio
+
+# ========
+# captured on: Tue Sep 3 16:19:55 2013
+# hostname : hemant-fedora
+# os release : 3.11.0-rc3+
+# perf version : 3.9.4-200.fc18.x86_64
+# arch : x86_64
+# nrcpus online : 2
+# nrcpus avail : 2
+# cpudesc : QEMU Virtual CPU version 1.2.2
+# cpuid : GenuineIntel,6,2,3
+# total memory : 2051912 kBIf these are not enabled, they are present in the ELF as nop\
+.
+
+# cmdline : /usr/bin/perf record -e probe_user:foo_start -aR ./user_app
+# event : name = probe_user:foo_start, type = 2, config = 0x38e, config1
+= 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0,
+excl_guest = 1, precise_ip = 0
+# HEADER_CPU_TOPOLOGY info available, use -I to display
+# HEADER_NUMA_TOPOLOGY info available, use -I to display
+# pmu mappings: software = 1, tracepoint = 2, breakpoint = 5
+# ========
+#
+# Samples: 1 of event 'probe_user:foo_start'
+# Event count (approx.): 1
+#
+# Overhead Command Shared Object Symbol
+# ........ ........ ............. .......
+#
+ 100.00% user_app user_app [.] foo
+
+
+#
+# (For a higher level overview, try: perf report --sort comm,dso)
+#
+
+
+We can see the markers in libvirt (if it is compiled with --with-dtrace option) :
+perf probe -x /lib64/libc.so.6 --markers
+%libc:setjmp
+%libc:longjmp
+%libc:longjmp_target
+%libc:lll_futex_wake
+%libc:lll_lock_wait_private
+%libc:longjmp
+%libc:longjmp_target
+%libc:lll_futex_wake
+
+- And then use perf to probe into any marker:
+
+# perf probe -x /lib64/libc.so.6 %libc:setjmp
+Added new event:
+ libc:setjmp (on 0x35981)
+
+You can now use it in all perf tools, such as:
+
+ perf record -e libc:setjmp -aR sleep 1
+
+
+This link shows an example of marker probing with Systemtap:
+https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
+
+And, this link shows more info on SDT markers:
+http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
+
+- Markers in binaries :
+These SDT markers are present in the ELF in the section named
+".note.stapsdt".
+Here, the name of the marker, its provider, type, location, base
+address, semaphore address, arguments are present.
+We can retrieve these values using the members name_off and desc_off in
+Nhdr structure. If these are not enabled, they are present in the ELF as nop.

2013-10-07 15:48:40

by Frank Ch. Eigler

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Perf support to SDT markers

Hemant Kumar <[email protected]> writes:

> [...]
> A simple example to show this follows.
> - Create a file with .d extension and mention the probe names in it with
> provider name and marker name.
> [...]
> - Now create the probes.h and probes.o file :
> $ dtrace -C -h -s probes.d -o probes.h
> $ dtrace -C -G -s probes.d -o probes.o
> [...]

It may be worthwhile to document an even-simpler case:

- no .d file
- no invocation of the dtrace python script
- no generated .h or .o file
- in the C file, just add:

#include <sys/sdt.h>

void main () {
/* ... */
STAP_PROBE(provider_name,probe_name);
/* ... */
}

- gcc file.c
- stap -l 'process("./a.out").mark("*")' to list


- FChE

Subject: Re: [PATCH v2 0/3] Perf support to SDT markers

Hi,

(2013/10/07 15:46), Hemant Kumar wrote:
> This patchset helps in probing dtrace style markers(SDT) present in user space
> applications through perf. Notes/markes are placed at important places by the
> developers. They have a negligible overhead when not enabled. We can enable
> them and probe at these places and find some important information like the
> arguments' values, etc.

Thank you for updating!
By the way, could you rebase this series to the latest -tip kernel next time?
It is more convenient for us to test & merge it.

[...]
> - Now use perf to list the markers in the app:
> # perf probe --markers -x ./user_app
>
> %user_app:foo_start
> %user_app:fun_start

Here, what I got.

$ ./perf probe --markers -x /lib64/librt.so
%librt:lll_futex_wake

OK, it lists up markers.

$ ./perf probe --markers -x ./perf
SDT markers not present!
Error : Failed to find SDT markers!(-1)

OK, this shows an error, INHO, it'd better show which binary doesn't have markers.

$ ./perf probe --markers
Failed to open (null)
Error : Failed to find SDT markers!(-1)

Hmm, it's a bit odd error. It should check if there is an elf binary is specified.

$ ./perf probe --markers /lib64/librt.so
%librt:lll_futex_wake

It seems to work without -x. However, that is only when the path is started with /.
And also the usage is not described by --help.

$ ./perf probe --markers ./perf
Semantic error :File always requires line number or lazy pattern.
Error: Parse Error. (-22)

So, to avoid confusion, --markers should abort without -x(--exec) option.

> - And then use perf probe to add a probe point :
>
> # perf probe -x ./user_app -a '%user_app:foo_start'

And also, without -x, -a also exits with an odd error message.

$ ./perf probe -a %libm:slowexp_p6
Failed to open (null)
Error: Failed to add events. (-1)

I'll review each patch soon.

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2013-10-08 08:57:52

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] SDT markers listing by perf:

Hi Hemant,

On Mon, 07 Oct 2013 12:17:57 +0530, Hemant Kumar wrote:
> This patch will enable perf to list all the sdt markers present
> in an elf file. The markers are present in the .note.stapsdt section
> of the elf. We can traverse through this section and collect the
> required info about the markers.
> We can use '-S/--markers' with perf to view the SDT notes.
>
> Currently, the sdt notes which have their semaphores enabled, are being
> ignored silently. But, they will be supported soon.
>
> I think wrapping this inside #ifdef LIBELF_SUPPORT pair is not required,
> because, if NO_LIBELF = 1, then 'probe' command of perf is itself disabled.
>
> Usage:
> perf probe --markers -x /lib64/libc.so.6
>
> Output :
> %libc:setjmp
> %libc:longjmp
> %libc:longjmp_target
> %libc:lll_futex_wake
> %libc:lll_lock_wait_private
> %libc:longjmp
> %libc:longjmp_target
> %libc:lll_futex_wake
>
> Signed-off-by: Hemant Kumar Shaw <[email protected]>
> ---
> tools/perf/builtin-probe.c | 24 +++++-
> tools/perf/util/probe-event.c | 39 +++++++++
> tools/perf/util/probe-event.h | 2
> tools/perf/util/symbol-elf.c | 176 +++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/symbol.h | 18 ++++
> 5 files changed, 257 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index e8a66f9..cbd2383 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -55,6 +55,7 @@ static struct {
> bool show_funcs;
> bool mod_events;
> bool uprobes;
> + bool sdt;
> int nevents;
> struct perf_probe_event events[MAX_PROBES];
> struct strlist *dellist;
> @@ -325,6 +326,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> opt_set_filter),
> OPT_CALLBACK('x', "exec", NULL, "executable|path",
> "target executable name or path", opt_set_target),
> + OPT_BOOLEAN('S', "markers", &params.sdt, "Show probe-able sdt notes"),
> OPT_END()
> };
> int ret;
> @@ -347,7 +349,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> params.max_probe_points = MAX_PROBES;
>
> if ((!params.nevents && !params.dellist && !params.list_events &&
> - !params.show_lines && !params.show_funcs))
> + !params.show_lines && !params.show_funcs && !params.sdt))
> usage_with_options(probe_usage, options);
>
> /*
> @@ -355,6 +357,26 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> */
> symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
>
> + if (params.sdt) {
> + if (params.show_lines) {
> + pr_err("Error: Don't use --markers with --lines.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.show_vars) {
> + pr_err("Error: Don't use --markers with --vars.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.show_funcs) {
> + pr_err("Error: Don't use --markers with --funcs.\n");
> + usage_with_options(probe_usage, options);
> + }
> + ret = show_sdt_notes(params.target);
> + if (ret < 0) {
> + pr_err(" Error : Failed to find SDT markers!"
> + "(%d)\n", ret);
> + }
> + return ret;
> + }
> if (params.list_events) {
> if (params.mod_events) {
> pr_err(" Error: Don't use --list with --add/--del.\n");
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index aa04bf9..4e94092 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -1909,6 +1909,20 @@ static int __add_probe_trace_events(struct perf_probe_event *pev,
> return ret;
> }
>
> +static void cleanup_sdt_note_list(struct list_head *sdt_notes)
> +{
> + struct sdt_note *tmp;
> + struct list_head *pos, *s;
> +
> + list_for_each_safe(pos, s, sdt_notes) {
> + tmp = list_entry(pos, struct sdt_note, note_list);

You might use list_for_each_entry_safe() instead.


> + list_del(pos);
> + free(tmp->name);
> + free(tmp->provider);
> + free(tmp);
> + }
> +}
> +
> static int convert_to_probe_trace_events(struct perf_probe_event *pev,
> struct probe_trace_event **tevs,
> int max_tevs, const char *target)
> @@ -2372,3 +2386,28 @@ out:
> free(name);
> return ret;
> }
> +
> +static void display_sdt_note_info(struct list_head *start)
> +{
> + struct list_head *pos;
> + struct sdt_note *tmp;
> +
> + list_for_each(pos, start) {
> + tmp = list_entry(pos, struct sdt_note, note_list);

Ditto. list_for_each_entry().


> + printf("%%%s:%s\n", tmp->provider, tmp->name);
> + }
> +}
> +
> +int show_sdt_notes(const char *target)
> +{
> + struct list_head sdt_notes;
> + int ret = -1;
> +
> + INIT_LIST_HEAD(&sdt_notes);

You can use LIST_HEAD(sdt_notes) here.


> + ret = get_sdt_note_list(&sdt_notes, target);
> + if (!ret) {
> + display_sdt_note_info(&sdt_notes);
> + cleanup_sdt_note_list(&sdt_notes);
> + }
> + return ret;
> +}
> diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
> index f9f3de8..b8a9347 100644
> --- a/tools/perf/util/probe-event.h
> +++ b/tools/perf/util/probe-event.h
> @@ -133,7 +133,7 @@ extern int show_available_vars(struct perf_probe_event *pevs, int npevs,
> struct strfilter *filter, bool externs);
> extern int show_available_funcs(const char *module, struct strfilter *filter,
> bool user);
> -
> +int show_sdt_notes(const char *target);
> /* Maximum index number of event-name postfix */
> #define MAX_EVENT_INDEX 1024
>
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 4b12bf8..6b17260 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -846,6 +846,182 @@ out_elf_end:
> return err;
> }
>
> +/*
> + * Populate the name, type, offset in the SDT note structure and
> + * ignore the argument fields (for now)
> + */
> +static struct sdt_note *populate_sdt_note(Elf **elf, const char *data,
> + size_t len, int type)
> +{
> + const char *provider, *name;
> + struct sdt_note *note;
> +
> + /*
> + * Three addresses need to be obtained :
> + * Marker location, address of base section and semaphore location
> + */
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } buf;
> +
> + /*
> + * dst and src are required for translation from file to memory
> + * representation
> + */
> + Elf_Data dst = {
> + .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
> + .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
> + .d_off = 0, .d_align = 0
> + };
> +
> + Elf_Data src = {
> + .d_buf = (void *) data, .d_type = ELF_T_ADDR,
> + .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
> + .d_align = 0
> + };
> +
> + /* Check the type of each of the notes */
> + if (type != SDT_NOTE_TYPE)
> + goto out_ret;
> +
> + note = (struct sdt_note *)zalloc(sizeof(struct sdt_note));
> + if (note == NULL) {
> + pr_err("Memory allocation error\n");
> + goto out_ret;
> + }
> + INIT_LIST_HEAD(&note->note_list);
> +
> + if (len < dst.d_size + 3)
> + goto out_free;
> +
> + /* Translation from file representation to memory representation */
> + if (gelf_xlatetom(*elf, &dst, &src,
> + elf_getident(*elf, NULL)[EI_DATA]) == NULL)
> + pr_debug("gelf_xlatetom : %s", elf_errmsg(-1));

I doubt this translation is really needed as we only deal with SDTs on a
local system only, right?

> +
> + /* Populate the fields of sdt_note */
> + provider = data + dst.d_size;
> +
> + name = (const char *)memchr(provider, '\0', data + len - provider);
> + if (name++ == NULL)
> + goto out_free;
> + note->provider = strdup(provider);
> + note->name = strdup(name);

You need to check the return value of strdup's and it should also be
freed when an error occurres after this.

> +
> + /* Obtain the addresses and ignore notes with semaphores set*/
> + if (gelf_getclass(*elf) == ELFCLASS32) {
> + note->addr.a32[0] = buf.a32[0];
> + note->addr.a32[1] = buf.a32[1];
> + note->addr.a32[2] = buf.a32[2];
> + note->bit32 = true;
> + if (buf.a32[2] != 0)
> + goto out_free;
> + } else {
> + note->addr.a64[0] = buf.a64[0];
> + note->addr.a64[1] = buf.a64[1];
> + note->addr.a64[2] = buf.a64[2];
> + note->bit32 = false;
> + if (buf.a64[2] != 0)
> + goto out_free;
> + }
> + return note;
> +
> +out_free:
> + free(note);
> +out_ret:
> + return NULL;
> +}
> +
> +static struct list_head *construct_sdt_notes_list(Elf *elf)
> +{
> + GElf_Ehdr ehdr;
> + Elf_Scn *scn = NULL;
> + Elf_Data *data;
> + GElf_Shdr shdr;
> + size_t shstrndx;
> + size_t next;
> + GElf_Nhdr nhdr;
> + size_t name_off, desc_off, offset;
> + struct sdt_note *tmp = NULL, *note = NULL;
> +
> + if (gelf_getehdr(elf, &ehdr) == NULL) {
> + pr_debug("%s: Can't get elf header.\n", __func__);
> + goto out_err;
> + }
> + if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
> + pr_debug("getshdrstrndx failed\n");
> + goto out_err;
> + }
> +
> + /*
> + * Look for section type = SHT_NOTE, flags = no SHF_ALLOC
> + * and name = .note.stapsdt
> + */
> + scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
> + if (!scn) {
> + printf("SDT markers not present!\n");
> + goto out_err;
> + }
> + if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
> + goto out_err;
> +
> + data = elf_getdata(scn, NULL);
> +
> + /* Get the SDT notes */
> + for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
> + &desc_off)) > 0; offset = next) {
> + if (nhdr.n_namesz == sizeof(SDT_NOTE_NAME) &&
> + !memcmp(data->d_buf + name_off, SDT_NOTE_NAME,
> + sizeof(SDT_NOTE_NAME))) {
> + tmp = populate_sdt_note(&elf, (const char *)
> + ((long)(data->d_buf) +
> + (long)desc_off),

It seems that data->d_buf is type of void *, so these casts can go away,
I guess.

> + nhdr.n_descsz, nhdr.n_type);
> + if (!note && tmp)
> + note = tmp;
> + else if (tmp)
> + list_add_tail(&tmp->note_list,
> + &(note->note_list));
> + }
> + }
> + if (tmp)
> + return &tmp->note_list;

This list handling code looks very strange to me. Why not just passing
a list head and connect notes to it?


> +out_err:
> + return NULL;
> +}
> +
> +int get_sdt_note_list(struct list_head *head, const char *target)
> +{
> + Elf *elf;
> + int fd, ret = -1;
> + struct list_head *tmp = NULL;
> +
> + fd = open(target, O_RDONLY);
> + if (fd < 0) {
> + pr_err("Failed to open %s\n", target);
> + goto out_ret;
> + }
> +
> + symbol__elf_init();
> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
> + if (!elf) {
> + pr_err("%s: cannot read %s ELF file.\n", __func__, target);
> + goto out_close;
> + }
> + tmp = construct_sdt_notes_list(elf);
> + if (tmp) {
> + list_add(head, tmp);

Look like the params are exchanged?

/**
* list_add - add a new entry
* @new: new entry to be added
* @head: list head to add it after
*
* Insert a new entry after the specified head.
* This is good for implementing stacks.
*/
static inline void list_add(struct list_head *new, struct list_head *head)
{
__list_add(new, head, head->next);
}


> + ret = 0;
> + }
> + elf_end(elf);
> +
> +out_close:
> + close(fd);
> +out_ret:
> + return ret;
> +}
> +
> void symbol__elf_init(void)
> {
> elf_version(EV_CURRENT);
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 5f720dc..037185c 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -197,6 +197,17 @@ struct symsrc {
> #endif
> };
>
> +struct sdt_note {
> + char *name;
> + char *provider;
> + bool bit32; /* 32 or 64 bit flag */
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } addr;
> + struct list_head note_list;
> +};
> +
> void symsrc__destroy(struct symsrc *ss);
> int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
> enum dso_binary_type type);
> @@ -247,4 +258,11 @@ void symbols__fixup_duplicate(struct rb_root *symbols);
> void symbols__fixup_end(struct rb_root *symbols);
> void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
>
> +/* Specific to SDT notes */
> +int get_sdt_note_list(struct list_head *head, const char *target);
> +
> +#define SDT_NOTE_TYPE 3
> +#define NOTE_SCN ".note.stapsdt"

What about SDT_NOTE_SCN for consistency?

Thanks,
Namhyung


> +#define SDT_NOTE_NAME "stapsdt"
> +
> #endif /* __PERF_SYMBOL */

2013-10-08 09:09:36

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] Support for perf to probe into SDT markers:

On Mon, 07 Oct 2013 12:18:29 +0530, Hemant Kumar wrote:
> This allows perf to probe into the sdt markers/notes present in
> the libraries and executables. We try to find the associated location
> and handle prelinking (since, stapsdt notes section is not allocated
> during runtime). Prelinking is handled with the help of base
> section which is allocated during runtime. This address can be compared
> with the address retrieved from the notes' description. If its different,
> we can take this difference and then add to the note's location.
>
> We can use existing '-a/--add' option to add events for sdt markers.
> Also, we can add multiple events at once using the same '-a' option.
>
> Usage:
> perf probe -x /lib64/libc.so.6 -a 'my_event=%libc:setjmp'
>
> or
>
> perf probe -x /lib64/libc.so.6 %libc:setjmp
>
> Output (corresponding to the first usage):
> Added new event:
> libc:my_event (on 0x35981)
>
> You can now use it in all perf tools, such as:
>
> perf record -e libc:my_event -aR sleep 1
>
>
> Signed-off-by: Hemant Kumar Shaw <[email protected]>
> ---
> tools/perf/builtin-probe.c | 11 +++++
> tools/perf/util/probe-event.c | 89 +++++++++++++++++++++++++++++++++++++----
> tools/perf/util/probe-event.h | 2 +
> tools/perf/util/symbol-elf.c | 80 +++++++++++++++++++++++++++++++++++++
> tools/perf/util/symbol.h | 3 +
> 5 files changed, 177 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index cbd2383..6f09723 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -370,6 +370,17 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> pr_err("Error: Don't use --markers with --funcs.\n");
> usage_with_options(probe_usage, options);
> }
> + if (params.mod_events) {
> + ret = add_perf_probe_events(params.events,
> + params.nevents,
> + params.max_probe_points,
> + params.target,
> + params.force_add);
> + if (ret < 0) {
> + pr_err(" Error: Failed to add events. "
> + " (%d)\n", ret);
> + }
> + }
> ret = show_sdt_notes(params.target);
> if (ret < 0) {
> pr_err(" Error : Failed to find SDT markers!"
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 4e94092..43f8a69 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -817,6 +817,35 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
> arg = tmp;
> }
>
> + /* Check for SDT marker */
> + if (*arg == '%') {
> + ptr = strchr(++arg, ':');
> + if (!ptr) {
> + semantic_error("Provider name must follow an event "
> + "name\n");
> + return -EINVAL;
> + }
> + *ptr++ = '\0';
> + tmp = strdup(arg);
> + if (!tmp)
> + return -ENOMEM;
> +
> + pev->point.note = (struct sdt_note *)
> + zalloc(sizeof(struct sdt_note));
> + if (!pev->point.note)
> + return -ENOMEM;
> + pev->point.note->provider = tmp;
> +
> + tmp = strdup(ptr);
> + if (!tmp)
> + return -ENOMEM;

These -ENOMEM returning should free all memory region allocated
previously.


> + pev->point.note->name = tmp;
> + pev->group = pev->point.note->provider;
> + if (!pev->event)
> + pev->event = pev->point.note->name;
> + pev->sdt = true;
> + return 0;
> + }
> ptr = strpbrk(arg, ";:+@%");
> if (ptr) {
> nc = *ptr;
> @@ -1270,6 +1299,13 @@ static char *synthesize_perf_probe_point(struct perf_probe_point *pp)
> ret = e_snprintf(buf, MAX_CMDLEN, "%s%s%s%s%s", pp->function,
> offs, pp->retprobe ? "%return" : "", line,
> file);
> + else if (pp->note)
> + if (pp->note->bit32)
> + ret = e_snprintf(buf, MAX_CMDLEN, "0x%x",
> + pp->note->addr.a32[0]);
> + else
> + ret = e_snprintf(buf, MAX_CMDLEN, "0x%lx",
> + pp->note->addr.a64[0]);

This kind of code tends to cause 32/64-bit build problem. Did you test
it on a 32-bit system too? Anyway, I think things like below should
work:

unsigned long long addr;

if (pp->note->bit32)
addr = pp->note->addr.a32[0];
else
addr = pp->note->addr.a64[0];

ret = e_snprintf(buf, MAX_CMDLEN, "0x%llx", addr);


> else
> ret = e_snprintf(buf, MAX_CMDLEN, "%s%s", file, line);
> if (ret <= 0)
> @@ -1923,6 +1959,19 @@ static void cleanup_sdt_note_list(struct list_head *sdt_notes)
> }
> }
>
> +static int try_to_find_sdt_notes(struct perf_probe_event *pev,
> + const char *target)
> +{
> + struct list_head sdt_notes;
> + int ret = -1;
> +
> + INIT_LIST_HEAD(&sdt_notes);

You can use just LIST_HEAD(sdt_notes) here.


> + ret = search_sdt_note(pev->point.note, &sdt_notes, target);
> + if (!list_empty(&sdt_notes))
> + cleanup_sdt_note_list(&sdt_notes);
> + return ret;
> +}
> +
> static int convert_to_probe_trace_events(struct perf_probe_event *pev,
> struct probe_trace_event **tevs,
> int max_tevs, const char *target)
> @@ -1930,11 +1979,20 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
> struct symbol *sym;
> int ret = 0, i;
> struct probe_trace_event *tev;
> + char *buf;
>
> - /* Convert perf_probe_event with debuginfo */
> - ret = try_to_find_probe_trace_events(pev, tevs, max_tevs, target);
> - if (ret != 0)
> - return ret; /* Found in debuginfo or got an error */
> + if (pev->sdt) {
> + ret = try_to_find_sdt_notes(pev, target);
> + if (ret)
> + return ret;
> + } else {
> + /* Convert perf_probe_event with debuginfo */
> + ret = try_to_find_probe_trace_events(pev, tevs, max_tevs,
> + target);
> + /* Found in debuginfo or got an error */
> + if (ret != 0)
> + return ret;
> + }
>
> /* Allocate trace event buffer */
> tev = *tevs = zalloc(sizeof(struct probe_trace_event));
> @@ -1942,10 +2000,25 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
> return -ENOMEM;
>
> /* Copy parameters */
> - tev->point.symbol = strdup(pev->point.function);
> - if (tev->point.symbol == NULL) {
> - ret = -ENOMEM;
> - goto error;
> + if (pev->sdt) {
> + buf = (char *)zalloc(sizeof(char) * MAX_CMDLEN);
> + if (!buf) {
> + ret = -ENOMEM;
> + goto error;
> + }
> + if (pev->point.note->bit32)
> + sprintf(buf, "0x%x",
> + (unsigned)pev->point.note->addr.a32[0]);
> + else
> + sprintf(buf, "0x%lx",
> + (unsigned long)pev->point.note->addr.a64[0]);

Please see my comment above.


> + tev->point.symbol = buf;
> + } else {
> + tev->point.symbol = strdup(pev->point.function);
> + if (tev->point.symbol == NULL) {
> + ret = -ENOMEM;
> + goto error;
> + }
> }
>
> if (target) {
> diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
> index b8a9347..4bd50cc 100644
> --- a/tools/perf/util/probe-event.h
> +++ b/tools/perf/util/probe-event.h
> @@ -47,6 +47,7 @@ struct perf_probe_point {
> bool retprobe; /* Return probe flag */
> char *lazy_line; /* Lazy matching pattern */
> unsigned long offset; /* Offset from function entry */
> + struct sdt_note *note;

I guess what you need is a 'struct list_head'.


> };
>
> /* Perf probe probing argument field chain */
> @@ -72,6 +73,7 @@ struct perf_probe_event {
> struct perf_probe_point point; /* Probe point */
> int nargs; /* Number of arguments */
> bool uprobes;
> + bool sdt;
> struct perf_probe_arg *args; /* Arguments */
> };
>
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 6b17260..d0e7a66 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1022,6 +1022,86 @@ out_ret:
> return ret;
> }
>
> +static void adjust_note_addr(struct sdt_note *tmp, struct sdt_note *key,
> + Elf *elf)
> +{
> + GElf_Ehdr ehdr;
> + GElf_Addr base_off = 0;
> + GElf_Shdr shdr;
> +
> + if (!gelf_getehdr(elf, &ehdr)) {
> + pr_debug("%s : cannot get elf header.\n", __func__);
> + return;
> + }
> +
> + /*
> + * Find out the .stapsdt.base section.
> + * This scn will help us to handle prelinking (if present).
> + * Compare the retrieved file offset of the base section with the
> + * base address in the description of the SDT note. If its different,
> + * then accordingly, adjust the note location.
> + */
> + if (elf_section_by_name(elf, &ehdr, &shdr, SDT_BASE, NULL))
> + base_off = shdr.sh_offset;
> + if (base_off) {
> + if (tmp->bit32)
> + key->addr.a32[0] = tmp->addr.a32[0] + base_off -
> + tmp->addr.a32[1];
> + else
> + key->addr.a64[0] = tmp->addr.a64[0] + base_off -
> + tmp->addr.a64[1];
> + }
> + key->bit32 = tmp->bit32;
> +}
> +
> +int search_sdt_note(struct sdt_note *key, struct list_head *sdt_notes,
> + const char *target)
> +{
> + Elf *elf;
> + int fd, ret = -1;
> + struct list_head *pos = NULL, *head = NULL;
> + struct sdt_note *tmp = NULL;
> +
> + fd = open(target, O_RDONLY);
> + if (fd < 0) {
> + pr_err("Failed to open %s\n", target);
> + goto out_ret;
> + }
> +
> + symbol__elf_init();
> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
> + if (!elf) {
> + pr_err("%s : can't read %s ELF file.\n", __func__, target);
> + goto out_close;
> + }
> +
> + head = construct_sdt_notes_list(elf);
> + if (!head)
> + goto out_end;
> +
> + list_add(sdt_notes, head);

This list code looks strange.


> +
> + /* Iterate through the notes and retrieve the required note */
> + list_for_each(pos, sdt_notes) {
> + tmp = list_entry(pos, struct sdt_note, note_list);
> + if (!strcmp(key->name, tmp->name) &&
> + !strcmp(key->provider, tmp->provider)) {
> + adjust_note_addr(tmp, key, elf);
> + ret = 0;
> + break;
> + }
> + }
> + if (ret)
> + printf("%%%s:%s not found!\n", key->provider, key->name);
> +
> +out_end:
> + elf_end(elf);
> +out_close:
> + close(fd);
> +out_ret:
> + return ret;
> +}
> +
> void symbol__elf_init(void)
> {
> elf_version(EV_CURRENT);
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 037185c..0b0545b 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -260,9 +260,12 @@ void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
>
> /* Specific to SDT notes */
> int get_sdt_note_list(struct list_head *head, const char *target);
> +int search_sdt_note(struct sdt_note *key, struct list_head *head,
> + const char *target);
>
> #define SDT_NOTE_TYPE 3
> #define NOTE_SCN ".note.stapsdt"
> #define SDT_NOTE_NAME "stapsdt"
> +#define SDT_BASE ".stapsdt.base"

What about SDT_BASE_SCN ?

Thanks,
Namhyung


>
> #endif /* __PERF_SYMBOL */

2013-10-08 09:10:55

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Perf support to SDT markers

Hi Frank,

On Mon, 07 Oct 2013 11:47:09 -0400, Frank Ch. Eigler wrote:
> Hemant Kumar <[email protected]> writes:
>
>> [...]
>> A simple example to show this follows.
>> - Create a file with .d extension and mention the probe names in it with
>> provider name and marker name.
>> [...]
>> - Now create the probes.h and probes.o file :
>> $ dtrace -C -h -s probes.d -o probes.h
>> $ dtrace -C -G -s probes.d -o probes.o
>> [...]
>
> It may be worthwhile to document an even-simpler case:
>
> - no .d file
> - no invocation of the dtrace python script
> - no generated .h or .o file
> - in the C file, just add:
>
> #include <sys/sdt.h>
>
> void main () {
> /* ... */
> STAP_PROBE(provider_name,probe_name);
> /* ... */
> }
>
> - gcc file.c
> - stap -l 'process("./a.out").mark("*")' to list

Yes, looks much simpler and better.

Hemant, would you add it to the doc too?

Thanks,
Namhyung

Subject: Re: [PATCH v2 1/3] SDT markers listing by perf:

(2013/10/07 15:47), Hemant Kumar wrote:
> @@ -325,6 +326,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> opt_set_filter),
> OPT_CALLBACK('x', "exec", NULL, "executable|path",
> "target executable name or path", opt_set_target),
> + OPT_BOOLEAN('S', "markers", &params.sdt, "Show probe-able sdt notes"),

BTW, "-S" for short of "--markers" option is a bit odd. Would we better use -M ? :)

[...]
> @@ -355,6 +357,26 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> */
> symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
>
> + if (params.sdt) {
> + if (params.show_lines) {
> + pr_err("Error: Don't use --markers with --lines.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.show_vars) {
> + pr_err("Error: Don't use --markers with --vars.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.show_funcs) {
> + pr_err("Error: Don't use --markers with --funcs.\n");
> + usage_with_options(probe_usage, options);
> + }
> + ret = show_sdt_notes(params.target);

You have to ensure params.target is set before use it.

> + if (ret < 0) {
> + pr_err(" Error : Failed to find SDT markers!"
> + "(%d)\n", ret);
> + }
> + return ret;
> + }
> if (params.list_events) {
> if (params.mod_events) {
> pr_err(" Error: Don't use --list with --add/--del.\n");

[...]
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 4b12bf8..6b17260 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -846,6 +846,182 @@ out_elf_end:
> return err;
> }
>
> +/*
> + * Populate the name, type, offset in the SDT note structure and
> + * ignore the argument fields (for now)
> + */
> +static struct sdt_note *populate_sdt_note(Elf **elf, const char *data,
> + size_t len, int type)
> +{
> + const char *provider, *name;
> + struct sdt_note *note;
> +
> + /*
> + * Three addresses need to be obtained :
> + * Marker location, address of base section and semaphore location
> + */
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } buf;
> +
> + /*
> + * dst and src are required for translation from file to memory
> + * representation
> + */
> + Elf_Data dst = {
> + .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
> + .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
> + .d_off = 0, .d_align = 0
> + };
> +
> + Elf_Data src = {
> + .d_buf = (void *) data, .d_type = ELF_T_ADDR,
> + .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
> + .d_align = 0
> + };
> +
> + /* Check the type of each of the notes */
> + if (type != SDT_NOTE_TYPE)
> + goto out_ret;
> +
> + note = (struct sdt_note *)zalloc(sizeof(struct sdt_note));
> + if (note == NULL) {
> + pr_err("Memory allocation error\n");

No need to state out of memory for each place...
If you hit an out of memory for such small piece, you'd better
abort execution. For that purpose, I recommend you to return
errno from each function, and caller must check it.

> + goto out_ret;
> + }
> + INIT_LIST_HEAD(&note->note_list);
> +
> + if (len < dst.d_size + 3)
> + goto out_free;
> +
> + /* Translation from file representation to memory representation */
> + if (gelf_xlatetom(*elf, &dst, &src,
> + elf_getident(*elf, NULL)[EI_DATA]) == NULL)
> + pr_debug("gelf_xlatetom : %s", elf_errmsg(-1));
> +
> + /* Populate the fields of sdt_note */
> + provider = data + dst.d_size;
> +
> + name = (const char *)memchr(provider, '\0', data + len - provider);
> + if (name++ == NULL)
> + goto out_free;
> + note->provider = strdup(provider);
> + note->name = strdup(name);
> +
> + /* Obtain the addresses and ignore notes with semaphores set*/
> + if (gelf_getclass(*elf) == ELFCLASS32) {
> + note->addr.a32[0] = buf.a32[0];
> + note->addr.a32[1] = buf.a32[1];
> + note->addr.a32[2] = buf.a32[2];
> + note->bit32 = true;
> + if (buf.a32[2] != 0)
> + goto out_free;
> + } else {
> + note->addr.a64[0] = buf.a64[0];
> + note->addr.a64[1] = buf.a64[1];
> + note->addr.a64[2] = buf.a64[2];
> + note->bit32 = false;
> + if (buf.a64[2] != 0)
> + goto out_free;
> + }
> + return note;
> +
> +out_free:
> + free(note);
> +out_ret:
> + return NULL;
> +}
> +
> +static struct list_head *construct_sdt_notes_list(Elf *elf)

This also would better return errno.

> +{
> + GElf_Ehdr ehdr;
> + Elf_Scn *scn = NULL;
> + Elf_Data *data;
> + GElf_Shdr shdr;
> + size_t shstrndx;
> + size_t next;
> + GElf_Nhdr nhdr;
> + size_t name_off, desc_off, offset;
> + struct sdt_note *tmp = NULL, *note = NULL;
> +
> + if (gelf_getehdr(elf, &ehdr) == NULL) {
> + pr_debug("%s: Can't get elf header.\n", __func__);
> + goto out_err;
> + }
> + if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
> + pr_debug("getshdrstrndx failed\n");
> + goto out_err;
> + }
> +
> + /*
> + * Look for section type = SHT_NOTE, flags = no SHF_ALLOC
> + * and name = .note.stapsdt
> + */
> + scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
> + if (!scn) {
> + printf("SDT markers not present!\n");

Why you don't use pr_err() here ? :)
And also, as I commented, I'd like to know which binary doesn't have SDT markers,
at least with -v (verbose) option.

> + goto out_err;
> + }
> + if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
> + goto out_err;
> +
> + data = elf_getdata(scn, NULL);
> +
> + /* Get the SDT notes */
> + for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
> + &desc_off)) > 0; offset = next) {
> + if (nhdr.n_namesz == sizeof(SDT_NOTE_NAME) &&
> + !memcmp(data->d_buf + name_off, SDT_NOTE_NAME,
> + sizeof(SDT_NOTE_NAME))) {
> + tmp = populate_sdt_note(&elf, (const char *)
> + ((long)(data->d_buf) +
> + (long)desc_off),
> + nhdr.n_descsz, nhdr.n_type);
> + if (!note && tmp)
> + note = tmp;
> + else if (tmp)
> + list_add_tail(&tmp->note_list,
> + &(note->note_list));
> + }
> + }
> + if (tmp)
> + return &tmp->note_list;
> +out_err:
> + return NULL;
> +}
> +

Thank you,


--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: [PATCH v2 2/3] Support for perf to probe into SDT markers:

(2013/10/07 15:48), Hemant Kumar wrote:
> This allows perf to probe into the sdt markers/notes present in
> the libraries and executables. We try to find the associated location
> and handle prelinking (since, stapsdt notes section is not allocated
> during runtime). Prelinking is handled with the help of base
> section which is allocated during runtime. This address can be compared
> with the address retrieved from the notes' description. If its different,
> we can take this difference and then add to the note's location.
>
> We can use existing '-a/--add' option to add events for sdt markers.
> Also, we can add multiple events at once using the same '-a' option.
>
> Usage:
> perf probe -x /lib64/libc.so.6 -a 'my_event=%libc:setjmp'
>
> or
>
> perf probe -x /lib64/libc.so.6 %libc:setjmp
>
> Output (corresponding to the first usage):
> Added new event:
> libc:my_event (on 0x35981)
>
> You can now use it in all perf tools, such as:
>
> perf record -e libc:my_event -aR sleep 1
>
>
> Signed-off-by: Hemant Kumar Shaw <[email protected]>
> ---
> tools/perf/builtin-probe.c | 11 +++++
> tools/perf/util/probe-event.c | 89 +++++++++++++++++++++++++++++++++++++----
> tools/perf/util/probe-event.h | 2 +
> tools/perf/util/symbol-elf.c | 80 +++++++++++++++++++++++++++++++++++++
> tools/perf/util/symbol.h | 3 +
> 5 files changed, 177 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index cbd2383..6f09723 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -370,6 +370,17 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> pr_err("Error: Don't use --markers with --funcs.\n");
> usage_with_options(probe_usage, options);
> }
> + if (params.mod_events) {
> + ret = add_perf_probe_events(params.events,
> + params.nevents,
> + params.max_probe_points,
> + params.target,
> + params.force_add);
> + if (ret < 0) {
> + pr_err(" Error: Failed to add events. "
> + " (%d)\n", ret);
> + }
> + }

What is this code for? params.sdt is true only if "--markers" is set, and that
should not be used with --add and --del, because it's an action "query markers".
We should give an error and abort here.

Other points are covered by Namhyung's review(thanks!).

Thank you!

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2013-10-08 12:54:13

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Perf support to SDT markers

On 10/08/2013 11:33 AM, Masami Hiramatsu wrote:
> Hi,

Hi Masami and thanks a lot for testing these patches.

>
> (2013/10/07 15:46), Hemant Kumar wrote:
>> This patchset helps in probing dtrace style markers(SDT) present in user space
>> applications through perf. Notes/markes are placed at important places by the
>> developers. They have a negligible overhead when not enabled. We can enable
>> them and probe at these places and find some important information like the
>> arguments' values, etc.
> Thank you for updating!
> By the way, could you rebase this series to the latest -tip kernel next time?
> It is more convenient for us to test & merge it.

Yes, I will rebase this series next time.

>
> [...]
>> - Now use perf to list the markers in the app:
>> # perf probe --markers -x ./user_app
>>
>> %user_app:foo_start
>> %user_app:fun_start
> Here, what I got.
>
> $ ./perf probe --markers -x /lib64/librt.so
> %librt:lll_futex_wake
>
> OK, it lists up markers.
>
> $ ./perf probe --markers -x ./perf
> SDT markers not present!
> Error : Failed to find SDT markers!(-1)
>
> OK, this shows an error, INHO, it'd better show which binary doesn't have markers.

OK, will include the binary name too.

>
> $ ./perf probe --markers
> Failed to open (null)
> Error : Failed to find SDT markers!(-1)
>
> Hmm, it's a bit odd error. It should check if there is an elf binary is specified.

Yeah, correct.

>
> $ ./perf probe --markers /lib64/librt.so
> %librt:lll_futex_wake
>
> It seems to work without -x. However, that is only when the path is started with /.
> And also the usage is not described by --help.
>
> $ ./perf probe --markers ./perf
> Semantic error :File always requires line number or lazy pattern.
> Error: Parse Error. (-22)
>
> So, to avoid confusion, --markers should abort without -x(--exec) option.

Yes, we can restrict action of --markers for -x option only. I thought
of keeping the same behavior as in case of function listing but I guess
restricting it would be better since, --markers should only be available
in user space.

>
>> - And then use perf probe to add a probe point :
>>
>> # perf probe -x ./user_app -a '%user_app:foo_start'
> And also, without -x, -a also exits with an odd error message.
>
> $ ./perf probe -a %libm:slowexp_p6
> Failed to open (null)
> Error: Failed to add events. (-1)

Ok, will rectify this behavior.

>
> I'll review each patch soon.

Thanks, looking forward to that.

- Hemant

2013-10-08 13:04:45

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] SDT markers listing by perf:

On 10/08/2013 02:27 PM, Namhyung Kim wrote:
> Hi Hemant,

Hi and thanks for reviewing the patches...

>
> On Mon, 07 Oct 2013 12:17:57 +0530, Hemant Kumar wrote:
>> [...]
>> +static void cleanup_sdt_note_list(struct list_head *sdt_notes)
>> +{
>> + struct sdt_note *tmp;
>> + struct list_head *pos, *s;
>> +
>> + list_for_each_safe(pos, s, sdt_notes) {
>> + tmp = list_entry(pos, struct sdt_note, note_list);
> You might use list_for_each_entry_safe() instead.

Ok, will use that.

>
>> + list_del(pos);
>> + free(tmp->name);
>> + free(tmp->provider);
>> + free(tmp);
>> + }
>> +}
>> +
>> static int convert_to_probe_trace_events(struct perf_probe_event *pev,
>> struct probe_trace_event **tevs,
>> int max_tevs, const char *target)
>> @@ -2372,3 +2386,28 @@ out:
>> free(name);
>> return ret;
>> }
>> +
>> +static void display_sdt_note_info(struct list_head *start)
>> +{
>> + struct list_head *pos;
>> + struct sdt_note *tmp;
>> +
>> + list_for_each(pos, start) {
>> + tmp = list_entry(pos, struct sdt_note, note_list);
> Ditto. list_for_each_entry().

Ok...

>
>> + printf("%%%s:%s\n", tmp->provider, tmp->name);
>> + }
>> +}
>> +
>> +int show_sdt_notes(const char *target)
>> +{
>> + struct list_head sdt_notes;
>> + int ret = -1;
>> +
>> + INIT_LIST_HEAD(&sdt_notes);
> You can use LIST_HEAD(sdt_notes) here.

Yes. Will use LIST_HEAD() instead.

>
>> + ret = get_sdt_note_list(&sdt_notes, target);
>> + if (!ret) {
>> + display_sdt_note_info(&sdt_notes);
>> + cleanup_sdt_note_list(&sdt_notes);
>> + }
>> + return ret;
>> +}
>> diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
>> index f9f3de8..b8a9347 100644
>> --- a/tools/perf/util/probe-event.h
>> +++ b/tools/perf/util/probe-event.h
>> @@ -133,7 +133,7 @@ extern int show_available_vars(struct perf_probe_event *pevs, int npevs,
>> struct strfilter *filter, bool externs);
>> extern int show_available_funcs(const char *module, struct strfilter *filter,
>> bool user);
>> -
>> +int show_sdt_notes(const char *target);
>> /* Maximum index number of event-name postfix */
>> #define MAX_EVENT_INDEX 1024
>>
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index 4b12bf8..6b17260 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -846,6 +846,182 @@ out_elf_end:
>> return err;
>> }
>>
>> +/*
>> + * Populate the name, type, offset in the SDT note structure and
>> + * ignore the argument fields (for now)
>> + */
>> +static struct sdt_note *populate_sdt_note(Elf **elf, const char *data,
>> + size_t len, int type)
>> +{
>> + const char *provider, *name;
>> + struct sdt_note *note;
>> +
>> + /*
>> + * Three addresses need to be obtained :
>> + * Marker location, address of base section and semaphore location
>> + */
>> + union {
>> + Elf64_Addr a64[3];
>> + Elf32_Addr a32[3];
>> + } buf;
>> +
>> + /*
>> + * dst and src are required for translation from file to memory
>> + * representation
>> + */
>> + Elf_Data dst = {
>> + .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
>> + .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
>> + .d_off = 0, .d_align = 0
>> + };
>> +
>> + Elf_Data src = {
>> + .d_buf = (void *) data, .d_type = ELF_T_ADDR,
>> + .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
>> + .d_align = 0
>> + };
>> +
>> + /* Check the type of each of the notes */
>> + if (type != SDT_NOTE_TYPE)
>> + goto out_ret;
>> +
>> + note = (struct sdt_note *)zalloc(sizeof(struct sdt_note));
>> + if (note == NULL) {
>> + pr_err("Memory allocation error\n");
>> + goto out_ret;
>> + }
>> + INIT_LIST_HEAD(&note->note_list);
>> +
>> + if (len < dst.d_size + 3)
>> + goto out_free;
>> +
>> + /* Translation from file representation to memory representation */
>> + if (gelf_xlatetom(*elf, &dst, &src,
>> + elf_getident(*elf, NULL)[EI_DATA]) == NULL)
>> + pr_debug("gelf_xlatetom : %s", elf_errmsg(-1));
> I doubt this translation is really needed as we only deal with SDTs on a
> local system only, right?

In case of SDTs on a local system, we don't need gelf_xlatetom().

>> +
>> + /* Populate the fields of sdt_note */
>> + provider = data + dst.d_size;
>> +
>> + name = (const char *)memchr(provider, '\0', data + len - provider);
>> + if (name++ == NULL)
>> + goto out_free;
>> + note->provider = strdup(provider);
>> + note->name = strdup(name);
> You need to check the return value of strdup's and it should also be
> freed when an error occurres after this.

Missed that. Thanks for pointing that out.

>> [...]
>> + /* Get the SDT notes */
>> + for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
>> + &desc_off)) > 0; offset = next) {
>> + if (nhdr.n_namesz == sizeof(SDT_NOTE_NAME) &&
>> + !memcmp(data->d_buf + name_off, SDT_NOTE_NAME,
>> + sizeof(SDT_NOTE_NAME))) {
>> + tmp = populate_sdt_note(&elf, (const char *)
>> + ((long)(data->d_buf) +
>> + (long)desc_off),
> It seems that data->d_buf is type of void *, so these casts can go away,
> I guess.

Yeah correct, we don't need these casts.

>> + nhdr.n_descsz, nhdr.n_type);
>> + if (!note && tmp)
>> + note = tmp;
>> + else if (tmp)
>> + list_add_tail(&tmp->note_list,
>> + &(note->note_list));
>> + }
>> + }
>> + if (tmp)
>> + return &tmp->note_list;
> This list handling code looks very strange to me. Why not just passing
> a list head and connect notes to it?
>

Yes it will be better to use list_head...

>> +out_err:
>> + return NULL;
>> +}
>> +
>> +int get_sdt_note_list(struct list_head *head, const char *target)
>> +{
>> + Elf *elf;
>> + int fd, ret = -1;
>> + struct list_head *tmp = NULL;
>> +
>> + fd = open(target, O_RDONLY);
>> + if (fd < 0) {
>> + pr_err("Failed to open %s\n", target);
>> + goto out_ret;
>> + }
>> +
>> + symbol__elf_init();
>> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
>> + if (!elf) {
>> + pr_err("%s: cannot read %s ELF file.\n", __func__, target);
>> + goto out_close;
>> + }
>> + tmp = construct_sdt_notes_list(elf);
>> + if (tmp) {
>> + list_add(head, tmp);
> Look like the params are exchanged?
>
> /**
> * list_add - add a new entry
> * @new: new entry to be added
> * @head: list head to add it after
> *
> * Insert a new entry after the specified head.
> * This is good for implementing stacks.
> */
> static inline void list_add(struct list_head *new, struct list_head *head)
> {
> __list_add(new, head, head->next);
> }
>


I guess they won't be exchanged... tmp will be added to head, right?
Anyway, this won't be needed if we go with list_head in struct
probe_point instead of sdt_note. Will make the required changes.

>> + ret = 0;
>> + }
>> + elf_end(elf);
>> +
>> +out_close:
>> + close(fd);
>> +out_ret:
>> + return ret;
>> +}
>> +
>> void symbol__elf_init(void)
>> {
>> elf_version(EV_CURRENT);
>> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
>> index 5f720dc..037185c 100644
>> --- a/tools/perf/util/symbol.h
>> +++ b/tools/perf/util/symbol.h
>> @@ -197,6 +197,17 @@ struct symsrc {
>> #endif
>> };
>>
>> +struct sdt_note {
>> + char *name;
>> + char *provider;
>> + bool bit32; /* 32 or 64 bit flag */
>> + union {
>> + Elf64_Addr a64[3];
>> + Elf32_Addr a32[3];
>> + } addr;
>> + struct list_head note_list;
>> +};
>> +
>> void symsrc__destroy(struct symsrc *ss);
>> int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
>> enum dso_binary_type type);
>> @@ -247,4 +258,11 @@ void symbols__fixup_duplicate(struct rb_root *symbols);
>> void symbols__fixup_end(struct rb_root *symbols);
>> void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
>>
>> +/* Specific to SDT notes */
>> +int get_sdt_note_list(struct list_head *head, const char *target);
>> +
>> +#define SDT_NOTE_TYPE 3
>> +#define NOTE_SCN ".note.stapsdt"
> What about SDT_NOTE_SCN for consistency?

Seems better. Will change to that.

--
Thanks
Hemant

2013-10-08 13:09:00

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] Support for perf to probe into SDT markers:

On 10/08/2013 02:39 PM, Namhyung Kim wrote:
> [...]
> +
> + tmp = strdup(ptr);
> + if (!tmp)
> + return -ENOMEM;
> These -ENOMEM returning should free all memory region allocated
> previously.

Yes, missed that.

>
>> + pev->point.note->name = tmp;
>> + pev->group = pev->point.note->provider;
>> + if (!pev->event)
>> + pev->event = pev->point.note->name;
>> + pev->sdt = true;
>> + return 0;
>> + }
>> ptr = strpbrk(arg, ";:+@%");
>> if (ptr) {
>> nc = *ptr;
>> @@ -1270,6 +1299,13 @@ static char *synthesize_perf_probe_point(struct perf_probe_point *pp)
>> ret = e_snprintf(buf, MAX_CMDLEN, "%s%s%s%s%s", pp->function,
>> offs, pp->retprobe ? "%return" : "", line,
>> file);
>> + else if (pp->note)
>> + if (pp->note->bit32)
>> + ret = e_snprintf(buf, MAX_CMDLEN, "0x%x",
>> + pp->note->addr.a32[0]);
>> + else
>> + ret = e_snprintf(buf, MAX_CMDLEN, "0x%lx",
>> + pp->note->addr.a64[0]);
> This kind of code tends to cause 32/64-bit build problem. Did you test
> it on a 32-bit system too? Anyway, I think things like below should
> work:
>
> unsigned long long addr;
>
> if (pp->note->bit32)
> addr = pp->note->addr.a32[0];
> else
> addr = pp->note->addr.a64[0];
>
> ret = e_snprintf(buf, MAX_CMDLEN, "0x%llx", addr);
>

Looks better, thanks, will make the required changes.

>> else
>> ret = e_snprintf(buf, MAX_CMDLEN, "%s%s", file, line);
>> if (ret <= 0)
>> @@ -1923,6 +1959,19 @@ static void cleanup_sdt_note_list(struct list_head *sdt_notes)
>> }
>> }
>>
>> +static int try_to_find_sdt_notes(struct perf_probe_event *pev,
>> + const char *target)
>> +{
>> + struct list_head sdt_notes;
>> + int ret = -1;
>> +
>> + INIT_LIST_HEAD(&sdt_notes);
> You can use just LIST_HEAD(sdt_notes) here.
>

Ok.

>> + ret = search_sdt_note(pev->point.note, &sdt_notes, target);
>> + if (!list_empty(&sdt_notes))
>> + cleanup_sdt_note_list(&sdt_notes);
>> + return ret;
>> +}
>> +
>> static int convert_to_probe_trace_events(struct perf_probe_event *pev,
>> struct probe_trace_event **tevs,
>> int max_tevs, const char *target)
>> @@ -1930,11 +1979,20 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
>> struct symbol *sym;
>> int ret = 0, i;
>> struct probe_trace_event *tev;
>> + char *buf;
>>
>> - /* Convert perf_probe_event with debuginfo */
>> - ret = try_to_find_probe_trace_events(pev, tevs, max_tevs, target);
>> - if (ret != 0)
>> - return ret; /* Found in debuginfo or got an error */
>> + if (pev->sdt) {
>> + ret = try_to_find_sdt_notes(pev, target);
>> + if (ret)
>> + return ret;
>> + } else {
>> + /* Convert perf_probe_event with debuginfo */
>> + ret = try_to_find_probe_trace_events(pev, tevs, max_tevs,
>> + target);
>> + /* Found in debuginfo or got an error */
>> + if (ret != 0)
>> + return ret;
>> + }
>>
>> /* Allocate trace event buffer */
>> tev = *tevs = zalloc(sizeof(struct probe_trace_event));
>> @@ -1942,10 +2000,25 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev,
>> return -ENOMEM;
>>
>> /* Copy parameters */
>> - tev->point.symbol = strdup(pev->point.function);
>> - if (tev->point.symbol == NULL) {
>> - ret = -ENOMEM;
>> - goto error;
>> + if (pev->sdt) {
>> + buf = (char *)zalloc(sizeof(char) * MAX_CMDLEN);
>> + if (!buf) {
>> + ret = -ENOMEM;
>> + goto error;
>> + }
>> + if (pev->point.note->bit32)
>> + sprintf(buf, "0x%x",
>> + (unsigned)pev->point.note->addr.a32[0]);
>> + else
>> + sprintf(buf, "0x%lx",
>> + (unsigned long)pev->point.note->addr.a64[0]);
> Please see my comment above.

Ok. Will change that too.

>
>> + tev->point.symbol = buf;
>> + } else {
>> + tev->point.symbol = strdup(pev->point.function);
>> + if (tev->point.symbol == NULL) {
>> + ret = -ENOMEM;
>> + goto error;
>> + }
>> }
>>
>> if (target) {
>> diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
>> index b8a9347..4bd50cc 100644
>> --- a/tools/perf/util/probe-event.h
>> +++ b/tools/perf/util/probe-event.h
>> @@ -47,6 +47,7 @@ struct perf_probe_point {
>> bool retprobe; /* Return probe flag */
>> char *lazy_line; /* Lazy matching pattern */
>> unsigned long offset; /* Offset from function entry */
>> + struct sdt_note *note;
> I guess what you need is a 'struct list_head'.

Yes, struct list_head will be a better choice in struct perf_probe_point.

>
>> };
>>
>> /* Perf probe probing argument field chain */
>> @@ -72,6 +73,7 @@ struct perf_probe_event {
>> struct perf_probe_point point; /* Probe point */
>> int nargs; /* Number of arguments */
>> bool uprobes;
>> + bool sdt;
>> struct perf_probe_arg *args; /* Arguments */
>> };
>>
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index 6b17260..d0e7a66 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -1022,6 +1022,86 @@ out_ret:
>> return ret;
>> }
>>
>> +static void adjust_note_addr(struct sdt_note *tmp, struct sdt_note *key,
>> + Elf *elf)
>> +{
>> + GElf_Ehdr ehdr;
>> + GElf_Addr base_off = 0;
>> + GElf_Shdr shdr;
>> +
>> + if (!gelf_getehdr(elf, &ehdr)) {
>> + pr_debug("%s : cannot get elf header.\n", __func__);
>> + return;
>> + }
>> +
>> + /*
>> + * Find out the .stapsdt.base section.
>> + * This scn will help us to handle prelinking (if present).
>> + * Compare the retrieved file offset of the base section with the
>> + * base address in the description of the SDT note. If its different,
>> + * then accordingly, adjust the note location.
>> + */
>> + if (elf_section_by_name(elf, &ehdr, &shdr, SDT_BASE, NULL))
>> + base_off = shdr.sh_offset;
>> + if (base_off) {
>> + if (tmp->bit32)
>> + key->addr.a32[0] = tmp->addr.a32[0] + base_off -
>> + tmp->addr.a32[1];
>> + else
>> + key->addr.a64[0] = tmp->addr.a64[0] + base_off -
>> + tmp->addr.a64[1];
>> + }
>> + key->bit32 = tmp->bit32;
>> +}
>> +
>> +int search_sdt_note(struct sdt_note *key, struct list_head *sdt_notes,
>> + const char *target)
>> +{
>> + Elf *elf;
>> + int fd, ret = -1;
>> + struct list_head *pos = NULL, *head = NULL;
>> + struct sdt_note *tmp = NULL;
>> +
>> + fd = open(target, O_RDONLY);
>> + if (fd < 0) {
>> + pr_err("Failed to open %s\n", target);
>> + goto out_ret;
>> + }
>> +
>> + symbol__elf_init();
>> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
>> + if (!elf) {
>> + pr_err("%s : can't read %s ELF file.\n", __func__, target);
>> + goto out_close;
>> + }
>> +
>> + head = construct_sdt_notes_list(elf);
>> + if (!head)
>> + goto out_end;
>> +
>> + list_add(sdt_notes, head);
> This list code looks strange.
>

Won't need that after making the required changes.

>> +
>> + /* Iterate through the notes and retrieve the required note */
>> + list_for_each(pos, sdt_notes) {
>> + tmp = list_entry(pos, struct sdt_note, note_list);
>> + if (!strcmp(key->name, tmp->name) &&
>> + !strcmp(key->provider, tmp->provider)) {
>> + adjust_note_addr(tmp, key, elf);
>> + ret = 0;
>> + break;
>> + }
>> + }
>> + if (ret)
>> + printf("%%%s:%s not found!\n", key->provider, key->name);
>> +
>> +out_end:
>> + elf_end(elf);
>> +out_close:
>> + close(fd);
>> +out_ret:
>> + return ret;
>> +}
>> +
>> void symbol__elf_init(void)
>> {
>> elf_version(EV_CURRENT);
>> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
>> index 037185c..0b0545b 100644
>> --- a/tools/perf/util/symbol.h
>> +++ b/tools/perf/util/symbol.h
>> @@ -260,9 +260,12 @@ void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
>>
>> /* Specific to SDT notes */
>> int get_sdt_note_list(struct list_head *head, const char *target);
>> +int search_sdt_note(struct sdt_note *key, struct list_head *head,
>> + const char *target);
>>
>> #define SDT_NOTE_TYPE 3
>> #define NOTE_SCN ".note.stapsdt"
>> #define SDT_NOTE_NAME "stapsdt"
>> +#define SDT_BASE ".stapsdt.base"
> What about SDT_BASE_SCN ?

Seems better.

--
Thanks
Hemant

2013-10-08 13:10:05

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Perf support to SDT markers

On 10/08/2013 02:40 PM, Namhyung Kim wrote:
> Hi Frank,
>
> On Mon, 07 Oct 2013 11:47:09 -0400, Frank Ch. Eigler wrote:
>> Hemant Kumar <[email protected]> writes:
>>
>>> [...]
>>> A simple example to show this follows.
>>> - Create a file with .d extension and mention the probe names in it with
>>> provider name and marker name.
>>> [...]
>>> - Now create the probes.h and probes.o file :
>>> $ dtrace -C -h -s probes.d -o probes.h
>>> $ dtrace -C -G -s probes.d -o probes.o
>>> [...]
>> It may be worthwhile to document an even-simpler case:
>>
>> - no .d file
>> - no invocation of the dtrace python script
>> - no generated .h or .o file
>> - in the C file, just add:
>>
>> #include <sys/sdt.h>
>>
>> void main () {
>> /* ... */
>> STAP_PROBE(provider_name,probe_name);
>> /* ... */
>> }
>>
>> - gcc file.c
>> - stap -l 'process("./a.out").mark("*")' to list
> Yes, looks much simpler and better.
>
> Hemant, would you add it to the doc too?
>
> Thanks,
> Namhyung
>

Yes, it will be good to add this simpler example in the doc.

--
Thanks
Hemant

2013-10-08 13:25:43

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] SDT markers listing by perf:

On 10/08/2013 05:09 PM, Masami Hiramatsu wrote:
> (2013/10/07 15:47), Hemant Kumar wrote:
>> @@ -325,6 +326,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>> opt_set_filter),
>> OPT_CALLBACK('x', "exec", NULL, "executable|path",
>> "target executable name or path", opt_set_target),
>> + OPT_BOOLEAN('S', "markers", &params.sdt, "Show probe-able sdt notes"),
> BTW, "-S" for short of "--markers" option is a bit odd. Would we better use -M ? :)

I thought of -S as SDT. :) Anyways, on second thought, will change that
to -M.

>
> [...]
>> @@ -355,6 +357,26 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>> */
>> symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
>>
>> + if (params.sdt) {
>> + if (params.show_lines) {
>> + pr_err("Error: Don't use --markers with --lines.\n");
>> + usage_with_options(probe_usage, options);
>> + }
>> + if (params.show_vars) {
>> + pr_err("Error: Don't use --markers with --vars.\n");
>> + usage_with_options(probe_usage, options);
>> + }
>> + if (params.show_funcs) {
>> + pr_err("Error: Don't use --markers with --funcs.\n");
>> + usage_with_options(probe_usage, options);
>> + }
>> + ret = show_sdt_notes(params.target);
> You have to ensure params.target is set before use it.

Yes, will do that.

>> + if (ret < 0) {
>> + pr_err(" Error : Failed to find SDT markers!"
>> + "(%d)\n", ret);
>> + }
>> + return ret;
>> + }
>> if (params.list_events) {
>> if (params.mod_events) {
>> pr_err(" Error: Don't use --list with --add/--del.\n");
> [...]
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index 4b12bf8..6b17260 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -846,6 +846,182 @@ out_elf_end:
>> return err;
>> }
>>
>> +/*
>> + * Populate the name, type, offset in the SDT note structure and
>> + * ignore the argument fields (for now)
>> + */
>> +static struct sdt_note *populate_sdt_note(Elf **elf, const char *data,
>> + size_t len, int type)
>> +{
>> + const char *provider, *name;
>> + struct sdt_note *note;
>> +
>> + /*
>> + * Three addresses need to be obtained :
>> + * Marker location, address of base section and semaphore location
>> + */
>> + union {
>> + Elf64_Addr a64[3];
>> + Elf32_Addr a32[3];
>> + } buf;
>> +
>> + /*
>> + * dst and src are required for translation from file to memory
>> + * representation
>> + */
>> + Elf_Data dst = {
>> + .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
>> + .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
>> + .d_off = 0, .d_align = 0
>> + };
>> +
>> + Elf_Data src = {
>> + .d_buf = (void *) data, .d_type = ELF_T_ADDR,
>> + .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
>> + .d_align = 0
>> + };
>> +
>> + /* Check the type of each of the notes */
>> + if (type != SDT_NOTE_TYPE)
>> + goto out_ret;
>> +
>> + note = (struct sdt_note *)zalloc(sizeof(struct sdt_note));
>> + if (note == NULL) {
>> + pr_err("Memory allocation error\n");
> No need to state out of memory for each place...
> If you hit an out of memory for such small piece, you'd better
> abort execution. For that purpose, I recommend you to return
> errno from each function, and caller must check it.

Yeah this will be better... will return errno from each function.

>> + goto out_ret;
>> + }
>> + INIT_LIST_HEAD(&note->note_list);
>> +
>> + if (len < dst.d_size + 3)
>> + goto out_free;
>> +
>> + /* Translation from file representation to memory representation */
>> + if (gelf_xlatetom(*elf, &dst, &src,
>> + elf_getident(*elf, NULL)[EI_DATA]) == NULL)
>> + pr_debug("gelf_xlatetom : %s", elf_errmsg(-1));
>> +
>> + /* Populate the fields of sdt_note */
>> + provider = data + dst.d_size;
>> +
>> + name = (const char *)memchr(provider, '\0', data + len - provider);
>> + if (name++ == NULL)
>> + goto out_free;
>> + note->provider = strdup(provider);
>> + note->name = strdup(name);
>> +
>> + /* Obtain the addresses and ignore notes with semaphores set*/
>> + if (gelf_getclass(*elf) == ELFCLASS32) {
>> + note->addr.a32[0] = buf.a32[0];
>> + note->addr.a32[1] = buf.a32[1];
>> + note->addr.a32[2] = buf.a32[2];
>> + note->bit32 = true;
>> + if (buf.a32[2] != 0)
>> + goto out_free;
>> + } else {
>> + note->addr.a64[0] = buf.a64[0];
>> + note->addr.a64[1] = buf.a64[1];
>> + note->addr.a64[2] = buf.a64[2];
>> + note->bit32 = false;
>> + if (buf.a64[2] != 0)
>> + goto out_free;
>> + }
>> + return note;
>> +
>> +out_free:
>> + free(note);
>> +out_ret:
>> + return NULL;
>> +}
>> +
>> +static struct list_head *construct_sdt_notes_list(Elf *elf)
> This also would better return errno.

Ok.

>> +{
>> + GElf_Ehdr ehdr;
>> + Elf_Scn *scn = NULL;
>> + Elf_Data *data;
>> + GElf_Shdr shdr;
>> + size_t shstrndx;
>> + size_t next;
>> + GElf_Nhdr nhdr;
>> + size_t name_off, desc_off, offset;
>> + struct sdt_note *tmp = NULL, *note = NULL;
>> +
>> + if (gelf_getehdr(elf, &ehdr) == NULL) {
>> + pr_debug("%s: Can't get elf header.\n", __func__);
>> + goto out_err;
>> + }
>> + if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
>> + pr_debug("getshdrstrndx failed\n");
>> + goto out_err;
>> + }
>> +
>> + /*
>> + * Look for section type = SHT_NOTE, flags = no SHF_ALLOC
>> + * and name = .note.stapsdt
>> + */
>> + scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
>> + if (!scn) {
>> + printf("SDT markers not present!\n");
> Why you don't use pr_err() here ? :)

Because, I thought it isn't an error that SDT markers are not present.
It won't make much of a difference though. :)

> And also, as I commented, I'd like to know which binary doesn't have SDT markers,
> at least with -v (verbose) option.

Yes, I guess it would be better to show the binary name even without -v
option. Will make the changes.

--
Thanks
Hemant

2013-10-08 13:30:57

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] Support for perf to probe into SDT markers:

On 10/08/2013 05:17 PM, Masami Hiramatsu wrote:
> (2013/10/07 15:48), Hemant Kumar wrote:
>> [...]
>> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
>> index cbd2383..6f09723 100644
>> --- a/tools/perf/builtin-probe.c
>> +++ b/tools/perf/builtin-probe.c
>> @@ -370,6 +370,17 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>> pr_err("Error: Don't use --markers with --funcs.\n");
>> usage_with_options(probe_usage, options);
>> }
>> + if (params.mod_events) {
>> + ret = add_perf_probe_events(params.events,
>> + params.nevents,
>> + params.max_probe_points,
>> + params.target,
>> + params.force_add);
>> + if (ret < 0) {
>> + pr_err(" Error: Failed to add events. "
>> + " (%d)\n", ret);
>> + }
>> + }
> What is this code for? params.sdt is true only if "--markers" is set, and that
> should not be used with --add and --del, because it's an action "query markers".
> We should give an error and abort here.

Yeah, I see your point. We should not add an event in this case. Instead
an error should be displayed. Thanks for pointing that. And we already
have add_perf_probe_events() call. That should be called instead.

>
> Other points are covered by Namhyung's review(thanks!).
>
> Thank you!
>
Yeah, will make the required changes and post the next iteration ASAP.

--
Thanks
Hemant

2013-10-08 13:32:37

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] Perf support to SDT markers

Hi,

On 10/07/2013 09:17 PM, Frank Ch. Eigler wrote:
> Hemant Kumar <[email protected]> writes:
>
>> [...]
>> A simple example to show this follows.
>> - Create a file with .d extension and mention the probe names in it with
>> provider name and marker name.
>> [...]
>> - Now create the probes.h and probes.o file :
>> $ dtrace -C -h -s probes.d -o probes.h
>> $ dtrace -C -G -s probes.d -o probes.o
>> [...]
> It may be worthwhile to document an even-simpler case:
>
> - no .d file
> - no invocation of the dtrace python script
> - no generated .h or .o file
> - in the C file, just add:
>
> #include <sys/sdt.h>
>
> void main () {
> /* ... */
> STAP_PROBE(provider_name,probe_name);
> /* ... */
> }
>
> - gcc file.c
> - stap -l 'process("./a.out").mark("*")' to list
>
>
> - FChE
>

Will add this example to the doc too.

--
Thanks
Hemant