2013-09-03 07:36:57

by Hemant Kumar

[permalink] [raw]
Subject: [RFC PATCH 0/2] Perf support to SDT markers

This series adds support to perf to list and probe into the SDT markers.
The first patch implements listing of all the SDT markers present in
the ELFs (executables or libraries). The SDT markers are present in the
.note.stapsdt section of the elf. That section can be traversed to list
all the markers. Recognition of markers follows the SystemTap approach.

The second patch will allow perf to probe into these markers. This is
done by writing the marker name and its offset into the
uprobe_events file in the tracing directory.
Then, perf tools can be used to analyze perf.data file.

---

Hemant Kumar (2):
SDT markers listing by perf
Support to perf to probe on SDT markers


tools/perf/builtin-probe.c | 33 ++++
tools/perf/util/probe-event.c | 17 ++
tools/perf/util/symbol-elf.c | 317 +++++++++++++++++++++++++++++++++++++++++
tools/perf/util/symbol.h | 24 +++
4 files changed, 391 insertions(+)

--


2013-09-03 07:37:10

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH 1/2] SDT markers listing by perf

This patch will enable perf to list all the sdt markers present
in an elf file. The markers are present in the .note.stapsdt section
of the elf. We can traverse through this section and collect the
required info about the markers.
This hasn't been thoroughly tested with the other
options of perf.
----
Usage :
./perf probe --list -x /lib64/libc.so.6

Output :
setjmp
longjmp
longjmp_target
lll_futex_wake
lll_lock_wait_private
longjmp
longjmp_target
lll_futex_wake

Total markers = 8
----
Signed-off-by: Hemant Kumar Shaw <[email protected]>
---
tools/perf/builtin-probe.c | 26 +++++
tools/perf/util/probe-event.c | 6 +
tools/perf/util/symbol-elf.c | 205 +++++++++++++++++++++++++++++++++++++++++
tools/perf/util/symbol.h | 19 ++++
4 files changed, 256 insertions(+)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index e8a66f9..3d8dcdf 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -55,6 +55,7 @@ static struct {
bool show_funcs;
bool mod_events;
bool uprobes;
+ bool sdt;
int nevents;
struct perf_probe_event events[MAX_PROBES];
struct strlist *dellist;
@@ -325,6 +326,8 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
opt_set_filter),
OPT_CALLBACK('x', "exec", NULL, "executable|path",
"target executable name or path", opt_set_target),
+ OPT_BOOLEAN('S', "sdt", &params.sdt,
+ "Show and probe on the SDT markers"),
OPT_END()
};
int ret;
@@ -355,6 +358,28 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
*/
symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);

+ if (params.sdt) {
+ if (params.show_lines) {
+ pr_err(" Error: Don't use --sdt with --line.\n");
+ usage_with_options(probe_usage, options);
+ }
+ if (params.show_vars) {
+ pr_err(" Error: Don't use --sdt with --vars.\n");
+ usage_with_options(probe_usage, options);
+ }
+ if (params.show_funcs) {
+ pr_err(" Error: Don't use --sdt with --funcs.\n");
+ usage_with_options(probe_usage, options);
+ }
+ if (params.list_events) {
+ ret = show_available_markers(params.target);
+ if (ret < 0)
+ pr_err(" Error: Failed to show markers."
+ " (%d)\n", ret);
+ return ret;
+ }
+ }
+
if (params.list_events) {
if (params.mod_events) {
pr_err(" Error: Don't use --list with --add/--del.\n");
@@ -382,6 +407,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
ret);
return ret;
}
+
if (params.show_funcs) {
if (params.nevents != 0 || params.dellist) {
pr_err(" Error: Don't use --funcs with"
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index aa04bf9..7f846f9 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2372,3 +2372,9 @@ out:
free(name);
return ret;
}
+
+int show_available_markers(const char *target)
+{
+ setup_pager();
+ return list_markers(target);
+}
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4b12bf8..f3630f2 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -846,6 +846,211 @@ out_elf_end:
return err;
}

+/* Populate the name, type, offset and argument fields in the note structure */
+static struct sdt_note *populate_note(Elf **elf, const char *data, size_t len,
+ int type)
+{
+ const char *provider, *name, *args;
+ struct sdt_note *note;
+
+ /*
+ * There are 3 address values to be obtained: marker offset, base address
+ * and semaphore
+ */
+ union {
+ Elf64_Addr a64[3];
+ Elf32_Addr a32[3];
+ } buf;
+
+ /*
+ * dst and src (of Elf_Data) are required for translation from file
+ * to memory representation
+ */
+ Elf_Data dst = {
+ .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
+ .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
+ .d_off = 0, .d_align = 0
+ };
+
+ Elf_Data src = {
+ .d_buf = (void *) data, .d_type = ELF_T_ADDR,
+ .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
+ .d_align = 0
+ };
+
+ if (type != SDT_NOTE_TYPE)
+ goto out_ret;
+
+ note = zalloc(sizeof(struct sdt_note));
+ if (note == NULL) {
+ pr_err("memory allocation error\n");
+ goto out_ret;
+ }
+ note->next = NULL;
+
+ if (len < dst.d_size + 3)
+ goto out_free;
+ /*
+ * Translating from file representation to memory representation
+ * Data now points to the address (offset)
+ */
+ if (gelf_xlatetom((*elf), &dst, &src,
+ elf_getident((*elf), NULL)[EI_DATA]) == NULL)
+ pr_debug("gelf_xlatetom : %s", elf_errmsg(-1));
+
+ /* Populate the fields of sdt_note */
+ provider = data + dst.d_size;
+
+ name = (const char *)memchr(provider, '\0', data + len - provider);
+ if (name++ == NULL)
+ goto out_ret;
+
+ args = (const char *)memchr(name, '\0', data + len - name);
+ if (args++ == NULL ||
+ memchr(args, '\0', data + len - name) != data + len - 1)
+ goto out_ret;
+ note->provider = provider;
+ note->name = name;
+
+ /* Three addr's for location, base and semaphore addresses */
+ note->addr.a64[0] = (buf.a64[0]);
+ note->addr.a64[1] = (buf.a64[1]);
+ note->addr.a64[2] = (buf.a64[2]);
+ return note;
+
+out_free:
+ free(note);
+out_ret:
+ return NULL;
+}
+
+/*
+ * Traverse the elf of the object, find out the .note.stapsdt section
+ * and accordingly initialize the notes' list head
+ */
+static struct sdt_note *get_elf_markers(Elf *elf, bool *exe, bool probe)
+{
+ Elf_Scn *scn = NULL;
+ Elf_Data *data;
+ GElf_Shdr shdr;
+ GElf_Ehdr ehdr;
+ size_t shstrndx;
+ size_t next;
+ GElf_Nhdr nhdr;
+ size_t name_off, desc_off, offset;
+ struct sdt_note *note = NULL, *tmp, *head = NULL;
+
+ if (gelf_getehdr(elf, &ehdr) == NULL) {
+ pr_debug("%s: cannot get elf header.\n", __func__);
+ goto out_end;
+ }
+
+ if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
+ pr_debug("getshdrstrndx failed\n");
+ goto out_end;
+ }
+
+ /* library or exec */
+ if (probe) {
+ if (ehdr.e_type == ET_EXEC)
+ *exe = true;
+ }
+
+ /*
+ * Look for Section type = SHT_NOTE, flags = no SHF_ALLOC
+ * and name = .note.stapsdt
+ */
+ scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
+ if (scn == NULL) {
+ pr_err("%s section not found!\n", NOTE_SCN);
+ goto out_end;
+ }
+
+ if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
+ goto out_end;
+
+ data = elf_getdata(scn, NULL);
+
+ /* Get the notes */
+ for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
+ &desc_off)) > 0; offset = next) {
+ tmp = populate_note(&elf, (const char *)((long)(data->d_buf) +
+ (long)desc_off),
+ nhdr.n_descsz, nhdr.n_type);
+ /* For first note */
+ if (!head) {
+ note = tmp;
+ head = note;
+ continue;
+ }
+ note->next = tmp;
+ note = note->next;
+ }
+
+ if (!head)
+ pr_err("No markers found\n");
+
+out_end:
+ return head;
+}
+
+static void print_notes(struct sdt_note *start)
+{
+ struct sdt_note *temp;
+ int c;
+
+ for (temp = start, c = 0; temp != NULL; temp = temp->next, c++)
+ printf("%s:%s\n", temp->provider, temp->name);
+
+ printf("\nTotal markers = %d\n", c);
+}
+
+int list_markers(const char *name)
+{
+ int fd, ret = -1;
+ Elf *elf;
+ struct sdt_note *head = NULL;
+
+ fd = open(name, O_RDONLY);
+ if (fd < 0) {
+ pr_err("Failed to open %s\n", name);
+ goto out_ret;
+ }
+
+ symbol__elf_init();
+ elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
+ if (elf == NULL) {
+ pr_err("%s: cannot read %s ELF file.\n", __func__, name);
+ goto out_close;
+ }
+
+ head = get_elf_markers(elf, NULL, false);
+ if (head) {
+ print_notes(head);
+ cleanup_notes(head);
+ ret = 0;
+ } else {
+ pr_err("No SDT markers found in %s\n", name);
+ }
+ elf_end(elf);
+
+out_close:
+ close(fd);
+out_ret:
+ return ret;
+}
+
+void cleanup_notes(struct sdt_note *start)
+{
+ struct sdt_note *tmp;
+
+ while (start) {
+ tmp = start->next;
+ free(start);
+ start = tmp;
+ }
+}
+
void symbol__elf_init(void)
{
elf_version(EV_CURRENT);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 5f720dc..f2d17b7 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -197,6 +197,17 @@ struct symsrc {
#endif
};

+/* Note structure */
+struct sdt_note {
+ const char *name;
+ const char *provider;
+ union {
+ Elf64_Addr a64[3];
+ Elf32_Addr a32[3];
+ } addr;
+ struct sdt_note *next;
+};
+
void symsrc__destroy(struct symsrc *ss);
int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
enum dso_binary_type type);
@@ -247,4 +258,12 @@ void symbols__fixup_duplicate(struct rb_root *symbols);
void symbols__fixup_end(struct rb_root *symbols);
void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);

+/* For SDT markers */
+int show_available_markers(const char *module);
+int list_markers(const char *name);
+void cleanup_notes(struct sdt_note *start);
+
+#define SDT_NOTE_TYPE 3
+#define NOTE_SCN ".note.stapsdt"
+
#endif /* __PERF_SYMBOL */

2013-09-03 07:37:17

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH 2/2] Support to perf to probe on SDT markers:

This patch enables perf to probe on the marker name specified on the command line.
---
tools/perf/builtin-probe.c | 7 +++
tools/perf/util/probe-event.c | 11 ++++
tools/perf/util/symbol-elf.c | 112 +++++++++++++++++++++++++++++++++++++++++
tools/perf/util/symbol.h | 5 ++
4 files changed, 135 insertions(+)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 3d8dcdf..8382853 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -378,6 +378,13 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
" (%d)\n", ret);
return ret;
}
+ params.uprobes = true;
+ ret = probe_marker(params.target,
+ params.events[0].point.function);
+ if (ret < 0)
+ pr_err("Could not probe at %s marker\n",
+ params.events[0].point.function);
+ return ret;
}

if (params.list_events) {
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 7f846f9..014d642 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2378,3 +2378,14 @@ int show_available_markers(const char *target)
setup_pager();
return list_markers(target);
}
+
+int probe_marker(const char *name, char *mark)
+{
+ int fd;
+
+ fd = open_uprobe_events(true);
+ if (fd == -1)
+ return fd;
+ else
+ return probe__marker(name, mark, fd);
+}
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index f3630f2..60938a5 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1040,6 +1040,118 @@ out_ret:
return ret;
}

+static void extract_first_name(const char *target, char *fname)
+{
+ int i, len;
+ char *file;
+
+ file = strrchr(target, DIR_SEP);
+ file++;
+ len = strlen(file);
+ for (i = 0; i <= len; i++) {
+ if (!isalpha(file[i]))
+ break;
+ fname[i] = file[i];
+ }
+ fname[i] = '\0';
+}
+
+static int probe_at_note(struct sdt_note *r_note, const char *target, bool exec,
+ int fd)
+{
+ char buf[MAX_CMDLEN];
+ int len, err = -1;
+ Elf64_Addr offset;
+ char *fname = NULL;
+
+ if (exec)
+ offset = r_note->addr.a64[0] - TEXT_SCN;
+ else
+ offset = r_note->addr.a64[0];
+
+ fname = (char *)zalloc(sizeof(char) * strlen(target));
+ if (fname == NULL) {
+ pr_err("Error in allocating memory to fname\n");
+ goto out_ret;
+ }
+
+ extract_first_name(target, fname);
+ len = snprintf(buf, MAX_CMDLEN, "%c:%s%s/%s %s:0x%x", 'p', "probe_",
+ fname, r_note->name, target, (unsigned)offset);
+
+ len = write(fd, buf, MAX_CMDLEN);
+ if (len < 0) {
+ pr_err("Couldn't write into uprobe_events!\n");
+ goto out_close;
+ } else {
+ printf("Added new event :\n");
+ printf("event = %s \t (on 0x%x)\n\n", r_note->name,
+ (unsigned)offset);
+ printf("You can now use it on all perf tools such as :\n\n");
+ printf("\t perf record -e %s%s:%s -aR sleep 1\n\n", "probe_", fname, r_note->name);
+ err = 0;
+ }
+
+out_close:
+ close(fd);
+ free(fname);
+out_ret:
+ return err;
+}
+
+static int search_and_probe_at_note(char *key, struct sdt_note **start,
+ const char *target, bool exec, int fd)
+{
+ int ret = -1;
+ struct sdt_note *req;
+
+ for (req = (*start); req != NULL; req = req->next) {
+ if (!strcmp(key, req->name))
+ break;
+ }
+ if (!req) {
+ pr_err("Could not find marker %s\n", key);
+ return ret;
+ }
+
+ ret = probe_at_note(req, target, exec, fd);
+ return ret;
+}
+
+int probe__marker(const char *name, char *mark, int evfd)
+{
+ int ret = -1, fd;
+ Elf *elf;
+ bool exec = false;
+ struct sdt_note *head = NULL;
+
+ fd = open(name, O_RDONLY);
+ if (fd < 0) {
+ pr_err("Failed to open the file\n");
+ goto out_ret;
+ }
+
+ symbol__elf_init();
+ elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
+ if (elf == NULL) {
+ pr_debug("%s: cannot read %s ELF file.\n", __func__, name);
+ goto out_close;
+ }
+
+ head = get_elf_markers(elf, &exec, true);
+ if (head) {
+ ret = search_and_probe_at_note(mark, &head, name, exec, evfd);
+ cleanup_notes(head);
+ }
+
+ elf_end(elf);
+
+out_close:
+ close(fd);
+out_ret:
+ return ret;
+}
+
void cleanup_notes(struct sdt_note *start)
{
struct sdt_note *tmp;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index f2d17b7..95289fd 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -262,8 +262,13 @@ void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
int show_available_markers(const char *module);
int list_markers(const char *name);
void cleanup_notes(struct sdt_note *start);
+int probe_marker(const char *name, char *mark);
+int probe__marker(const char *name, char *mark, int fd);

#define SDT_NOTE_TYPE 3
#define NOTE_SCN ".note.stapsdt"
+#define TEXT_SCN 0x400000
+#define DIR_SEP '/'
+#define MAX_CMDLEN 256

#endif /* __PERF_SYMBOL */

2013-09-03 08:19:49

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

On 09/03/2013 01:06 PM, Hemant Kumar wrote:
> This patch will enable perf to list all the sdt markers present
> in an elf file. The markers are present in the .note.stapsdt section
> of the elf. We can traverse through this section and collect the
> required info about the markers.
> This hasn't been thoroughly tested with the other
> options of perf.
> ----
> Usage :
> ./perf probe --list -x /lib64/libc.so.6

There is a mismatch between usage and the patch here. This should be:
perf probe --list -S /lib64/libc.so.6

>
> Output :
> setjmp
> longjmp
> longjmp_target
> lll_futex_wake
> lll_lock_wait_private
> longjmp
> longjmp_target
> lll_futex_wake
>
> Total markers = 8

And the correct Output should be :

libc:setjmp
libc:longjmp
libc:longjmp_target
libc:lll_futex_wake
libc:lll_lock_wait_private
libc:longjmp
libc:longjmp_target
libc:lll_futex_wake

Total markers = 8

> ----
> Signed-off-by: Hemant Kumar Shaw <[email protected]>
> ---
> tools/perf/builtin-probe.c | 26 +++++
> tools/perf/util/probe-event.c | 6 +
> tools/perf/util/symbol-elf.c | 205 +++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/symbol.h | 19 ++++
> 4 files changed, 256 insertions(+)
>
> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index e8a66f9..3d8dcdf 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -55,6 +55,7 @@ static struct {
> bool show_funcs;
> bool mod_events;
> bool uprobes;
> + bool sdt;
> int nevents;
> struct perf_probe_event events[MAX_PROBES];
> struct strlist *dellist;
> @@ -325,6 +326,8 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> opt_set_filter),
> OPT_CALLBACK('x', "exec", NULL, "executable|path",
> "target executable name or path", opt_set_target),
> + OPT_BOOLEAN('S', "sdt", &params.sdt,
> + "Show and probe on the SDT markers"),
> OPT_END()
> };
> int ret;
> @@ -355,6 +358,28 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> */
> symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
>
> + if (params.sdt) {
> + if (params.show_lines) {
> + pr_err(" Error: Don't use --sdt with --line.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.show_vars) {
> + pr_err(" Error: Don't use --sdt with --vars.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.show_funcs) {
> + pr_err(" Error: Don't use --sdt with --funcs.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.list_events) {
> + ret = show_available_markers(params.target);
> + if (ret < 0)
> + pr_err(" Error: Failed to show markers."
> + " (%d)\n", ret);
> + return ret;
> + }
> + }
> +
> if (params.list_events) {
> if (params.mod_events) {
> pr_err(" Error: Don't use --list with --add/--del.\n");
> @@ -382,6 +407,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> ret);
> return ret;
> }
> +
> if (params.show_funcs) {
> if (params.nevents != 0 || params.dellist) {
> pr_err(" Error: Don't use --funcs with"
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index aa04bf9..7f846f9 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2372,3 +2372,9 @@ out:
> free(name);
> return ret;
> }
> +
> +int show_available_markers(const char *target)
> +{
> + setup_pager();
> + return list_markers(target);
> +}
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 4b12bf8..f3630f2 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -846,6 +846,211 @@ out_elf_end:
> return err;
> }
>
> +/* Populate the name, type, offset and argument fields in the note structure */
> +static struct sdt_note *populate_note(Elf **elf, const char *data, size_t len,
> + int type)
> +{
> + const char *provider, *name, *args;
> + struct sdt_note *note;
> +
> + /*
> + * There are 3 address values to be obtained: marker offset, base address
> + * and semaphore
> + */
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } buf;
> +
> + /*
> + * dst and src (of Elf_Data) are required for translation from file
> + * to memory representation
> + */
> + Elf_Data dst = {
> + .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
> + .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
> + .d_off = 0, .d_align = 0
> + };
> +
> + Elf_Data src = {
> + .d_buf = (void *) data, .d_type = ELF_T_ADDR,
> + .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
> + .d_align = 0
> + };
> +
> + if (type != SDT_NOTE_TYPE)
> + goto out_ret;
> +
> + note = zalloc(sizeof(struct sdt_note));
> + if (note == NULL) {
> + pr_err("memory allocation error\n");
> + goto out_ret;
> + }
> + note->next = NULL;
> +
> + if (len < dst.d_size + 3)
> + goto out_free;
> + /*
> + * Translating from file representation to memory representation
> + * Data now points to the address (offset)
> + */
> + if (gelf_xlatetom((*elf), &dst, &src,
> + elf_getident((*elf), NULL)[EI_DATA]) == NULL)
> + pr_debug("gelf_xlatetom : %s", elf_errmsg(-1));
> +
> + /* Populate the fields of sdt_note */
> + provider = data + dst.d_size;
> +
> + name = (const char *)memchr(provider, '\0', data + len - provider);
> + if (name++ == NULL)
> + goto out_ret;
> +
> + args = (const char *)memchr(name, '\0', data + len - name);
> + if (args++ == NULL ||
> + memchr(args, '\0', data + len - name) != data + len - 1)
> + goto out_ret;
> + note->provider = provider;
> + note->name = name;
> +
> + /* Three addr's for location, base and semaphore addresses */
> + note->addr.a64[0] = (buf.a64[0]);
> + note->addr.a64[1] = (buf.a64[1]);
> + note->addr.a64[2] = (buf.a64[2]);
> + return note;
> +
> +out_free:
> + free(note);
> +out_ret:
> + return NULL;
> +}
> +
> +/*
> + * Traverse the elf of the object, find out the .note.stapsdt section
> + * and accordingly initialize the notes' list head
> + */
> +static struct sdt_note *get_elf_markers(Elf *elf, bool *exe, bool probe)
> +{
> + Elf_Scn *scn = NULL;
> + Elf_Data *data;
> + GElf_Shdr shdr;
> + GElf_Ehdr ehdr;
> + size_t shstrndx;
> + size_t next;
> + GElf_Nhdr nhdr;
> + size_t name_off, desc_off, offset;
> + struct sdt_note *note = NULL, *tmp, *head = NULL;
> +
> + if (gelf_getehdr(elf, &ehdr) == NULL) {
> + pr_debug("%s: cannot get elf header.\n", __func__);
> + goto out_end;
> + }
> +
> + if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
> + pr_debug("getshdrstrndx failed\n");
> + goto out_end;
> + }
> +
> + /* library or exec */
> + if (probe) {
> + if (ehdr.e_type == ET_EXEC)
> + *exe = true;
> + }
> +
> + /*
> + * Look for Section type = SHT_NOTE, flags = no SHF_ALLOC
> + * and name = .note.stapsdt
> + */
> + scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
> + if (scn == NULL) {
> + pr_err("%s section not found!\n", NOTE_SCN);
> + goto out_end;
> + }
> +
> + if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
> + goto out_end;
> +
> + data = elf_getdata(scn, NULL);
> +
> + /* Get the notes */
> + for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
> + &desc_off)) > 0; offset = next) {
> + tmp = populate_note(&elf, (const char *)((long)(data->d_buf) +
> + (long)desc_off),
> + nhdr.n_descsz, nhdr.n_type);
> + /* For first note */
> + if (!head) {
> + note = tmp;
> + head = note;
> + continue;
> + }
> + note->next = tmp;
> + note = note->next;
> + }
> +
> + if (!head)
> + pr_err("No markers found\n");
> +
> +out_end:
> + return head;
> +}
> +
> +static void print_notes(struct sdt_note *start)
> +{
> + struct sdt_note *temp;
> + int c;
> +
> + for (temp = start, c = 0; temp != NULL; temp = temp->next, c++)
> + printf("%s:%s\n", temp->provider, temp->name);
> +
> + printf("\nTotal markers = %d\n", c);
> +}
> +
> +int list_markers(const char *name)
> +{
> + int fd, ret = -1;
> + Elf *elf;
> + struct sdt_note *head = NULL;
> +
> + fd = open(name, O_RDONLY);
> + if (fd < 0) {
> + pr_err("Failed to open %s\n", name);
> + goto out_ret;
> + }
> +
> + symbol__elf_init();
> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
> + if (elf == NULL) {
> + pr_err("%s: cannot read %s ELF file.\n", __func__, name);
> + goto out_close;
> + }
> +
> + head = get_elf_markers(elf, NULL, false);
> + if (head) {
> + print_notes(head);
> + cleanup_notes(head);
> + ret = 0;
> + } else {
> + pr_err("No SDT markers found in %s\n", name);
> + }
> + elf_end(elf);
> +
> +out_close:
> + close(fd);
> +out_ret:
> + return ret;
> +}
> +
> +void cleanup_notes(struct sdt_note *start)
> +{
> + struct sdt_note *tmp;
> +
> + while (start) {
> + tmp = start->next;
> + free(start);
> + start = tmp;
> + }
> +}
> +
> void symbol__elf_init(void)
> {
> elf_version(EV_CURRENT);
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 5f720dc..f2d17b7 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -197,6 +197,17 @@ struct symsrc {
> #endif
> };
>
> +/* Note structure */
> +struct sdt_note {
> + const char *name;
> + const char *provider;
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } addr;
> + struct sdt_note *next;
> +};
> +
> void symsrc__destroy(struct symsrc *ss);
> int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
> enum dso_binary_type type);
> @@ -247,4 +258,12 @@ void symbols__fixup_duplicate(struct rb_root *symbols);
> void symbols__fixup_end(struct rb_root *symbols);
> void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
>
> +/* For SDT markers */
> +int show_available_markers(const char *module);
> +int list_markers(const char *name);
> +void cleanup_notes(struct sdt_note *start);
> +
> +#define SDT_NOTE_TYPE 3
> +#define NOTE_SCN ".note.stapsdt"
> +
> #endif /* __PERF_SYMBOL */
>
>

2013-09-03 08:25:11

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers


* Hemant Kumar Shaw <[email protected]> wrote:

> This series adds support to perf to list and probe into the SDT markers.
> The first patch implements listing of all the SDT markers present in
> the ELFs (executables or libraries). The SDT markers are present in the
> .note.stapsdt section of the elf. That section can be traversed to list
> all the markers. Recognition of markers follows the SystemTap approach.
>
> The second patch will allow perf to probe into these markers. This is
> done by writing the marker name and its offset into the
> uprobe_events file in the tracing directory.
> Then, perf tools can be used to analyze perf.data file.

Please provide a better high level description that explains the history
and scope of SDT markers, how SDT markers get into binaries, how they can
be used for probing, a real-life usage example that shows something
interesting not possible via other ways, etc.

Thanks,

Ingo

Subject: Re: [PATCH 2/2] Support to perf to probe on SDT markers:

(2013/09/03 16:37), Hemant Kumar wrote:
> This patch enables perf to probe on the marker name specified on the command line.

At least you must show how this works, and how to use in the comment.

Thank you,

> ---
> tools/perf/builtin-probe.c | 7 +++
> tools/perf/util/probe-event.c | 11 ++++
> tools/perf/util/symbol-elf.c | 112 +++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/symbol.h | 5 ++
> 4 files changed, 135 insertions(+)
>
> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index 3d8dcdf..8382853 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -378,6 +378,13 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> " (%d)\n", ret);
> return ret;
> }
> + params.uprobes = true;
> + ret = probe_marker(params.target,
> + params.events[0].point.function);
> + if (ret < 0)
> + pr_err("Could not probe at %s marker\n",
> + params.events[0].point.function);
> + return ret;
> }
>
> if (params.list_events) {
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 7f846f9..014d642 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2378,3 +2378,14 @@ int show_available_markers(const char *target)
> setup_pager();
> return list_markers(target);
> }
> +
> +int probe_marker(const char *name, char *mark)
> +{
> + int fd;
> +
> + fd = open_uprobe_events(true);
> + if (fd == -1)
> + return fd;
> + else
> + return probe__marker(name, mark, fd);
> +}
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index f3630f2..60938a5 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1040,6 +1040,118 @@ out_ret:
> return ret;
> }
>
> +static void extract_first_name(const char *target, char *fname)
> +{
> + int i, len;
> + char *file;
> +
> + file = strrchr(target, DIR_SEP);
> + file++;
> + len = strlen(file);
> + for (i = 0; i <= len; i++) {
> + if (!isalpha(file[i]))
> + break;
> + fname[i] = file[i];
> + }
> + fname[i] = '\0';
> +}
> +
> +static int probe_at_note(struct sdt_note *r_note, const char *target, bool exec,
> + int fd)
> +{
> + char buf[MAX_CMDLEN];
> + int len, err = -1;
> + Elf64_Addr offset;
> + char *fname = NULL;
> +
> + if (exec)
> + offset = r_note->addr.a64[0] - TEXT_SCN;
> + else
> + offset = r_note->addr.a64[0];
> +
> + fname = (char *)zalloc(sizeof(char) * strlen(target));
> + if (fname == NULL) {
> + pr_err("Error in allocating memory to fname\n");
> + goto out_ret;
> + }
> +
> + extract_first_name(target, fname);
> + len = snprintf(buf, MAX_CMDLEN, "%c:%s%s/%s %s:0x%x", 'p', "probe_",
> + fname, r_note->name, target, (unsigned)offset);
> +
> + len = write(fd, buf, MAX_CMDLEN);
> + if (len < 0) {
> + pr_err("Couldn't write into uprobe_events!\n");
> + goto out_close;
> + } else {
> + printf("Added new event :\n");
> + printf("event = %s \t (on 0x%x)\n\n", r_note->name,
> + (unsigned)offset);
> + printf("You can now use it on all perf tools such as :\n\n");
> + printf("\t perf record -e %s%s:%s -aR sleep 1\n\n", "probe_", fname, r_note->name);
> + err = 0;
> + }
> +
> +out_close:
> + close(fd);
> + free(fname);
> +out_ret:
> + return err;
> +}
> +
> +static int search_and_probe_at_note(char *key, struct sdt_note **start,
> + const char *target, bool exec, int fd)
> +{
> + int ret = -1;
> + struct sdt_note *req;
> +
> + for (req = (*start); req != NULL; req = req->next) {
> + if (!strcmp(key, req->name))
> + break;
> + }
> + if (!req) {
> + pr_err("Could not find marker %s\n", key);
> + return ret;
> + }
> +
> + ret = probe_at_note(req, target, exec, fd);
> + return ret;
> +}
> +
> +int probe__marker(const char *name, char *mark, int evfd)
> +{
> + int ret = -1, fd;
> + Elf *elf;
> + bool exec = false;
> + struct sdt_note *head = NULL;
> +
> + fd = open(name, O_RDONLY);
> + if (fd < 0) {
> + pr_err("Failed to open the file\n");
> + goto out_ret;
> + }
> +
> + symbol__elf_init();
> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
> + if (elf == NULL) {
> + pr_debug("%s: cannot read %s ELF file.\n", __func__, name);
> + goto out_close;
> + }
> +
> + head = get_elf_markers(elf, &exec, true);
> + if (head) {
> + ret = search_and_probe_at_note(mark, &head, name, exec, evfd);
> + cleanup_notes(head);
> + }
> +
> + elf_end(elf);
> +
> +out_close:
> + close(fd);
> +out_ret:
> + return ret;
> +}
> +
> void cleanup_notes(struct sdt_note *start)
> {
> struct sdt_note *tmp;
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index f2d17b7..95289fd 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -262,8 +262,13 @@ void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
> int show_available_markers(const char *module);
> int list_markers(const char *name);
> void cleanup_notes(struct sdt_note *start);
> +int probe_marker(const char *name, char *mark);
> +int probe__marker(const char *name, char *mark, int fd);
>
> #define SDT_NOTE_TYPE 3
> #define NOTE_SCN ".note.stapsdt"
> +#define TEXT_SCN 0x400000
> +#define DIR_SEP '/'
> +#define MAX_CMDLEN 256
>
> #endif /* __PERF_SYMBOL */
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>


--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: Re: [RFC PATCH 0/2] Perf support to SDT markers

(2013/09/03 17:25), Ingo Molnar wrote:
>
> * Hemant Kumar Shaw <[email protected]> wrote:
>
>> This series adds support to perf to list and probe into the SDT markers.
>> The first patch implements listing of all the SDT markers present in
>> the ELFs (executables or libraries). The SDT markers are present in the
>> .note.stapsdt section of the elf. That section can be traversed to list
>> all the markers. Recognition of markers follows the SystemTap approach.
>>
>> The second patch will allow perf to probe into these markers. This is
>> done by writing the marker name and its offset into the
>> uprobe_events file in the tracing directory.
>> Then, perf tools can be used to analyze perf.data file.
>
> Please provide a better high level description that explains the history
> and scope of SDT markers, how SDT markers get into binaries, how they can
> be used for probing, a real-life usage example that shows something
> interesting not possible via other ways, etc.

Indeed, and also I'd like to know what versions of SDT this support,
and where we can see the technical document of that. As far as I know,
the previous(?) SDT implementation also involves ugly semaphores.
Have that already gone?

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2013-09-03 13:23:25

by Hemant Kumar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On 09/03/2013 02:47 PM, Masami Hiramatsu wrote:
> (2013/09/03 17:25), Ingo Molnar wrote:
>> * Hemant Kumar Shaw <[email protected]> wrote:
>>
>>> This series adds support to perf to list and probe into the SDT markers.
>>> The first patch implements listing of all the SDT markers present in
>>> the ELFs (executables or libraries). The SDT markers are present in the
>>> .note.stapsdt section of the elf. That section can be traversed to list
>>> all the markers. Recognition of markers follows the SystemTap approach.
>>>
>>> The second patch will allow perf to probe into these markers. This is
>>> done by writing the marker name and its offset into the
>>> uprobe_events file in the tracing directory.
>>> Then, perf tools can be used to analyze perf.data file.
>> Please provide a better high level description that explains the history
>> and scope of SDT markers, how SDT markers get into binaries, how they can
>> be used for probing, a real-life usage example that shows something
>> interesting not possible via other ways, etc.
> Indeed, and also I'd like to know what versions of SDT this support,
> and where we can see the technical document of that. As far as I know,
> the previous(?) SDT implementation also involves ugly semaphores.
> Have that already gone?
>
> Thank you,
>
>

Here is an overview and a high-level-description:
Goal:
Probe dtrace style markers(SDT) present in user space applications.

Scope:
Put probe points at SDT markers in user space and also probe them using
perf.

Why supprt SDT markers? :
We have lots of applications which use SDT markers today like:
Postgresql, MySql, Mozilla, Perl, Python, Java, Ruby, libvirt, QEMU, glib

These markers are placed at important places by the developers. Now,
these markers
have a negligible overhead when not enabled. We can enable them and probe at
these places and find some important information like the arguments'
values, etc.

How to add SDT markers into user applications:
We need to have this header sys/sdt.h present.
sys/sdt.h used is version 3.
If not present, install systemtap-sdt-devel package.

I will show this through a simple example.
- Create a file with .d extension and mention the probe names in it with
provider name and marker name.

$ cat probes.d
provider user_app {
probe foo_start();
probe fun_start();
};

- Now create the probes.h and probes.o file :
$ dtrace -C -h -s probes.d -o probes.h
$ dtrace -C -G -s probes.d -o probes.o

- A program using the markers:

$ cat user_app.c

#include <stdio.h>
#include "probes.h"

void foo(void)
{
USER_APP_FOO_START();
printf("This is foo\n");
}

void fun(void)
{
USER_APP_FUN_START();
printf("Inside fun\n");
}
int main(void)
{
printf("In main\n");
foo();
fun();
return 0;
}

- Compile it and also provide probes.o file to linker:
$ gcc user_app.c probes.o -o user_app

- Now use perf to list the markers in the app:
# perf probe --list -S -x ./user_app

user_app:foo_start
user_app:fun_start

Total markers = 2

- And then use perf probe to add a probe point :

# perf probe -S -x ./user_app foo_start

Added new event :
event = foo_start (on 0x530)

You can now use it on all perf tools such as :

perf record -e probe_user:foo_start -aR sleep 1

# perf record -e probe_user:foo_start -aR ./user_app
In main
This is foo
Inside fun
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.235 MB perf.data (~10279 samples) ]

- Then use perf tools to analyze it.
# perf report --stdio

# ========
# captured on: Tue Sep 3 16:19:55 2013
# hostname : hemant-fedora
# os release : 3.11.0-rc3+
# perf version : 3.9.4-200.fc18.x86_64
# arch : x86_64
# nrcpus online : 2
# nrcpus avail : 2
# cpudesc : QEMU Virtual CPU version 1.2.2
# cpuid : GenuineIntel,6,2,3
# total memory : 2051912 kBIf these are not enabled, they are present in
the ELF as nop.

# cmdline : /usr/bin/perf record -e probe_user:foo_start -aR ./user_app
# event : name = probe_user:foo_start, type = 2, config = 0x38e, config1
= 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0,
excl_guest = 1, precise_ip = 0
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: software = 1, tracepoint = 2, breakpoint = 5
# ========
#
# Samples: 1 of event 'probe_user:foo_start'
# Event count (approx.): 1
#
# Overhead Command Shared Object Symbol
# ........ ........ ............. .......
#
100.00% user_app user_app [.] foo


#
# (For a higher level overview, try: perf report --sort comm,dso)
#


We can see the markers in libvirt (if it is compiled with --with-dtrace
option) :
# perf probe -l -S -x /lib64/libvirt.so.0.10.2

libvirt:event_poll_purge_timeout
libvirt:event_poll_purge_handle
libvirt:event_poll_remove_handle
libvirt:event_poll_add_timeout
libvirt:event_poll_update_timeout
libvirt:event_poll_remove_timeout
libvirt:event_poll_update_handle
libvirt:event_poll_add_handle
libvirt:event_poll_run
libvirt:event_poll_dispatch_timeout
libvirt:event_poll_dispatch_handle
libvirt:object_new
libvirt:object_unref
libvirt:object_dispose
libvirt:object_ref
libvirt:rpc_client_msg_tx_queue
libvirt:rpc_client_msg_rx
libvirt:rpc_client_dispose
libvirt:rpc_client_new
libvirt:rpc_client_msg_tx_queue
libvirt:rpc_server_client_new
libvirt:rpc_server_client_dispose
libvirt:rpc_server_client_msg_tx_queue
libvirt:rpc_server_client_msg_rx
libvirt:rpc_keepalive_dispose
libvirt:rpc_keepalive_send
libvirt:rpc_keepalive_timeout
libvirt:rpc_keepalive_new
libvirt:rpc_keepalive_start
libvirt:rpc_keepalive_stop
libvirt:rpc_keepalive_received
libvirt:rpc_socket_new
libvirt:rpc_socket_dispose
libvirt:rpc_socket_send_fd
libvirt:rpc_socket_recv_fd

- And then use perf to probe into any marker:

# perf probe -S -x /lib64/libvirt.so.0.10.2 rpc_client_msg_tx_queue

Added new event :
event = rpc_client_msg_tx_queue (on 0x1462d9)

You can now use it on all perf tools such as :

perf record -e probe_libvirt:rpc_client_msg_tx_queue -aR sleep 1

This link shows an example of marker probing with Systemtap:
https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps

- Markers in binaries :
These SDT markers are present in the ELF in the section named
".note.stapsdt".
Here, the name of the marker, its provider, type, location, base
address, semaphore address, arguments are present.
We can retrieve these values using the members name_off and desc_off in
Nhdr structure. If these are not enabled, they are present in the ELF as
nop.

Thanks
Hemant Kumar Shaw

2013-09-03 14:22:05

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers


* Hemant <[email protected]> wrote:

> Here is an overview and a high-level-description:

Thanks, looks like a pretty useful feature - especially if SDT probes are
already widely present in various server binaries on a typical Linux
distro (are they?).

Would be nice to stick some or all of that info into
tools/perf/Documentation/sdt-probes.txt, and put the most useful part of
that info into the man page as well - or something like that.

Thanks,

Ingo

2013-09-03 15:12:56

by Mark Wielaard

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On Tue, 2013-09-03 at 16:21 +0200, Ingo Molnar wrote:
> * Hemant <[email protected]> wrote:
>
> > Here is an overview and a high-level-description:
>
> Thanks, looks like a pretty useful feature - especially if SDT probes are
> already widely present in various server binaries on a typical Linux
> distro (are they?).

They are becoming fairly common, especially now that glibc has adopted
them. And they were designed to be source compatible with DTRACE markers
[*]. So any program that has probes for dtrace can just be recompiled on
with sys/sdt.h to get them.

My Fedora 19 box already has a couple of applications and libraries
installed that support them:

$ stap -l 'process("/usr/*/*").mark("*")' | cut -f2 -d\" | uniq -c
9 /usr/bin/Xorg
9 /usr/bin/Xvfb
4 /usr/bin/c++
4 /usr/bin/cpp
4 /usr/bin/g++
4 /usr/bin/gcc
1 /usr/bin/gcov
4 /usr/bin/gfortran
3 /usr/bin/qemu-ga
75 /usr/bin/qemu-img
75 /usr/bin/qemu-io
75 /usr/bin/qemu-nbd
575 /usr/bin/qemu-system-i386
575 /usr/bin/qemu-system-x86_64
29 /usr/bin/stap
3 /usr/bin/stapdyn
3 /usr/bin/virtfs-proxy-helper
4 /usr/bin/x86_64-redhat-linux-c++
4 /usr/bin/x86_64-redhat-linux-g++
4 /usr/bin/x86_64-redhat-linux-gcc
13 /usr/lib/ld-2.17.so
1 /usr/lib/libanl-2.17.so
5 /usr/lib/libc-2.17.so
1 /usr/lib/libgcc_s-4.8.1-20130603.so.1
22 /usr/lib/libpthread-2.17.so
1 /usr/lib/librt-2.17.so
3 /usr/lib/libstdc++.so.6.0.18
13 /usr/lib64/ld-2.17.so
1 /usr/lib64/libanl-2.17.so
5 /usr/lib64/libc-2.17.so
3 /usr/lib64/libcacard.so.0.0.0
1 /usr/lib64/libgcc_s-4.8.1-20130603.so.1
6 /usr/lib64/libglib-2.0.so.0.3600.3
11 /usr/lib64/libgobject-2.0.so.0.3600.3
4 /usr/lib64/libm-2.17.so
23 /usr/lib64/libpthread-2.17.so
2 /usr/lib64/libpython2.7.so.1.0
2 /usr/lib64/libpython3.3m.so.1.0
1 /usr/lib64/librt-2.17.so
3 /usr/lib64/libstdc++.so.6.0.18
15 /usr/lib64/libtcl8.5.so
41 /usr/lib64/libvirt.so.0.1000.5
37 /usr/libexec/libvirt_lxc
12 /usr/sbin/ldconfig
3 /usr/sbin/libvirtd
34 /usr/sbin/prelink
12 /usr/sbin/sln
37 /usr/sbin/virtlockd

[*] http://tromey.com/blog/?p=687

2013-09-03 15:24:40

by Hemant Kumar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On 09/03/2013 07:51 PM, Ingo Molnar wrote:
> * Hemant <[email protected]> wrote:
>
>> Here is an overview and a high-level-description:
> Thanks, looks like a pretty useful feature - especially if SDT probes are
> already widely present in various server binaries on a typical Linux
> distro (are they?).

Yes, as far as I know, SDT markers are present in these binaries on a
typical linux distro.

>
> Would be nice to stick some or all of that info into
> tools/perf/Documentation/sdt-probes.txt, and put the most useful part of
> that info into the man page as well - or something like that.
>
> Thanks,
>
> Ingo
>

Yes, will write the required documentation.

Thanks
Hemant Kumar Shaw

2013-09-03 19:24:16

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers


* Mark Wielaard <[email protected]> wrote:

> On Tue, 2013-09-03 at 16:21 +0200, Ingo Molnar wrote:
> > * Hemant <[email protected]> wrote:
> >
> > > Here is an overview and a high-level-description:
> >
> > Thanks, looks like a pretty useful feature - especially if SDT probes are
> > already widely present in various server binaries on a typical Linux
> > distro (are they?).
>
> They are becoming fairly common, especially now that glibc has adopted
> them. And they were designed to be source compatible with DTRACE markers
> [*]. So any program that has probes for dtrace can just be recompiled on
> with sys/sdt.h to get them.
>
> My Fedora 19 box already has a couple of applications and libraries
> installed that support them:
>
> $ stap -l 'process("/usr/*/*").mark("*")' | cut -f2 -d\" | uniq -c
> 9 /usr/bin/Xorg
> 9 /usr/bin/Xvfb
> 4 /usr/bin/c++
> 4 /usr/bin/cpp
> 4 /usr/bin/g++
> 4 /usr/bin/gcc
> 1 /usr/bin/gcov
> 4 /usr/bin/gfortran
> 3 /usr/bin/qemu-ga
> 75 /usr/bin/qemu-img
> 75 /usr/bin/qemu-io
> 75 /usr/bin/qemu-nbd
> 575 /usr/bin/qemu-system-i386
> 575 /usr/bin/qemu-system-x86_64
> 29 /usr/bin/stap
> 3 /usr/bin/stapdyn
> 3 /usr/bin/virtfs-proxy-helper
> 4 /usr/bin/x86_64-redhat-linux-c++
> 4 /usr/bin/x86_64-redhat-linux-g++
> 4 /usr/bin/x86_64-redhat-linux-gcc
> 13 /usr/lib/ld-2.17.so
> 1 /usr/lib/libanl-2.17.so
> 5 /usr/lib/libc-2.17.so
> 1 /usr/lib/libgcc_s-4.8.1-20130603.so.1
> 22 /usr/lib/libpthread-2.17.so
> 1 /usr/lib/librt-2.17.so
> 3 /usr/lib/libstdc++.so.6.0.18
> 13 /usr/lib64/ld-2.17.so
> 1 /usr/lib64/libanl-2.17.so
> 5 /usr/lib64/libc-2.17.so
> 3 /usr/lib64/libcacard.so.0.0.0
> 1 /usr/lib64/libgcc_s-4.8.1-20130603.so.1
> 6 /usr/lib64/libglib-2.0.so.0.3600.3
> 11 /usr/lib64/libgobject-2.0.so.0.3600.3
> 4 /usr/lib64/libm-2.17.so
> 23 /usr/lib64/libpthread-2.17.so
> 2 /usr/lib64/libpython2.7.so.1.0
> 2 /usr/lib64/libpython3.3m.so.1.0
> 1 /usr/lib64/librt-2.17.so
> 3 /usr/lib64/libstdc++.so.6.0.18
> 15 /usr/lib64/libtcl8.5.so
> 41 /usr/lib64/libvirt.so.0.1000.5
> 37 /usr/libexec/libvirt_lxc
> 12 /usr/sbin/ldconfig
> 3 /usr/sbin/libvirtd
> 34 /usr/sbin/prelink
> 12 /usr/sbin/sln
> 37 /usr/sbin/virtlockd

Nice!

Thanks,

Ingo

2013-09-04 06:08:59

by Namhyung Kim

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

Hi Hemant,

On Tue, 03 Sep 2013 13:06:40 +0530, Hemant Kumar Shaw wrote:
> This series adds support to perf to list and probe into the SDT markers.
> The first patch implements listing of all the SDT markers present in
> the ELFs (executables or libraries). The SDT markers are present in the
> .note.stapsdt section of the elf. That section can be traversed to list
> all the markers. Recognition of markers follows the SystemTap approach.
>
> The second patch will allow perf to probe into these markers. This is
> done by writing the marker name and its offset into the
> uprobe_events file in the tracing directory.
> Then, perf tools can be used to analyze perf.data file.

First, thanks for your work!

In fact, I was working on the same topic of support SDT markers. But I
started it by adding support for accessing arguments in uprobes [1].
The patchset is floating around the list for now but I hope it'll get
merged soon. I guess you want to have it too :).

Anyway, it'd great if you add me to CC next time.

Thanks,
Namhyung


[1] http://thread.gmane.org/gmane.linux.kernel/1555044

2013-09-04 06:42:31

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

On Tue, 03 Sep 2013 13:06:55 +0530, Hemant Kumar wrote:

[SNIP]

> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index e8a66f9..3d8dcdf 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -55,6 +55,7 @@ static struct {
> bool show_funcs;
> bool mod_events;
> bool uprobes;
> + bool sdt;
> int nevents;
> struct perf_probe_event events[MAX_PROBES];
> struct strlist *dellist;
> @@ -325,6 +326,8 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> opt_set_filter),
> OPT_CALLBACK('x', "exec", NULL, "executable|path",
> "target executable name or path", opt_set_target),
> + OPT_BOOLEAN('S', "sdt", &params.sdt,
> + "Show and probe on the SDT markers"),

You need to add it to Documentation/perf-probe.txt too. In addition if
the --sdt option is only able to work with libelf, it should be wrapped
into the #ifdef LIBELF_SUPPORT pair.

And I'm not sure that it's a good idea to have two behavior on a single
option (S) - show and probe (add). Maybe it can be separated into two
or the S option can be used as a flag with existing --list and --add
option?


> OPT_END()
> };
> int ret;
> @@ -355,6 +358,28 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> */
> symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
>
> + if (params.sdt) {
> + if (params.show_lines) {
> + pr_err(" Error: Don't use --sdt with --line.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.show_vars) {
> + pr_err(" Error: Don't use --sdt with --vars.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.show_funcs) {
> + pr_err(" Error: Don't use --sdt with --funcs.\n");
> + usage_with_options(probe_usage, options);
> + }
> + if (params.list_events) {
> + ret = show_available_markers(params.target);
> + if (ret < 0)
> + pr_err(" Error: Failed to show markers."
> + " (%d)\n", ret);
> + return ret;
> + }
> + }
> +
> if (params.list_events) {
> if (params.mod_events) {
> pr_err(" Error: Don't use --list with --add/--del.\n");
> @@ -382,6 +407,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> ret);
> return ret;
> }
> +
> if (params.show_funcs) {
> if (params.nevents != 0 || params.dellist) {
> pr_err(" Error: Don't use --funcs with"
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index aa04bf9..7f846f9 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2372,3 +2372,9 @@ out:
> free(name);
> return ret;
> }
> +
> +int show_available_markers(const char *target)
> +{
> + setup_pager();
> + return list_markers(target);
> +}

Did you build it with NO_LIBELF=1 ? I guess not. At least you need a
dummy list_markers() function which just returns -1 for such case.


> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 4b12bf8..f3630f2 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -846,6 +846,211 @@ out_elf_end:
> return err;
> }
>
> +/* Populate the name, type, offset and argument fields in the note structure */
> +static struct sdt_note *populate_note(Elf **elf, const char *data, size_t len,
> + int type)

Hmm.. I think the name needs to be changed more specific like
populate_sdt_note() or something?


> +{
> + const char *provider, *name, *args;
> + struct sdt_note *note;
> +
> + /*
> + * There are 3 address values to be obtained: marker offset, base address
> + * and semaphore
> + */
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } buf;
> +
> + /*
> + * dst and src (of Elf_Data) are required for translation from file
> + * to memory representation
> + */
> + Elf_Data dst = {
> + .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
> + .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
> + .d_off = 0, .d_align = 0
> + };
> +
> + Elf_Data src = {
> + .d_buf = (void *) data, .d_type = ELF_T_ADDR,
> + .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
> + .d_align = 0
> + };
> +
> + if (type != SDT_NOTE_TYPE)
> + goto out_ret;
> +
> + note = zalloc(sizeof(struct sdt_note));
> + if (note == NULL) {
> + pr_err("memory allocation error\n");
> + goto out_ret;
> + }
> + note->next = NULL;
> +
> + if (len < dst.d_size + 3)
> + goto out_free;
> + /*
> + * Translating from file representation to memory representation
> + * Data now points to the address (offset)
> + */
> + if (gelf_xlatetom((*elf), &dst, &src,
> + elf_getident((*elf), NULL)[EI_DATA]) == NULL)
> + pr_debug("gelf_xlatetom : %s", elf_errmsg(-1));
> +
> + /* Populate the fields of sdt_note */
> + provider = data + dst.d_size;
> +
> + name = (const char *)memchr(provider, '\0', data + len - provider);
> + if (name++ == NULL)
> + goto out_ret;
> +
> + args = (const char *)memchr(name, '\0', data + len - name);
> + if (args++ == NULL ||
> + memchr(args, '\0', data + len - name) != data + len - 1)
> + goto out_ret;
> + note->provider = provider;
> + note->name = name;
> +
> + /* Three addr's for location, base and semaphore addresses */
> + note->addr.a64[0] = (buf.a64[0]);
> + note->addr.a64[1] = (buf.a64[1]);
> + note->addr.a64[2] = (buf.a64[2]);
> + return note;
> +
> +out_free:
> + free(note);
> +out_ret:
> + return NULL;
> +}
> +
> +/*
> + * Traverse the elf of the object, find out the .note.stapsdt section
> + * and accordingly initialize the notes' list head
> + */
> +static struct sdt_note *get_elf_markers(Elf *elf, bool *exe, bool probe)

Ditto. How about get_sdt_markers() or get_sdt_notes()? I can see that
there are places those 'marker' and 'note' are used here and there. If
they indicate same thing it'd better to unify the term.


> +{
> + Elf_Scn *scn = NULL;
> + Elf_Data *data;
> + GElf_Shdr shdr;
> + GElf_Ehdr ehdr;
> + size_t shstrndx;
> + size_t next;
> + GElf_Nhdr nhdr;
> + size_t name_off, desc_off, offset;
> + struct sdt_note *note = NULL, *tmp, *head = NULL;
> +
> + if (gelf_getehdr(elf, &ehdr) == NULL) {
> + pr_debug("%s: cannot get elf header.\n", __func__);
> + goto out_end;
> + }
> +
> + if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
> + pr_debug("getshdrstrndx failed\n");
> + goto out_end;
> + }
> +
> + /* library or exec */
> + if (probe) {
> + if (ehdr.e_type == ET_EXEC)
> + *exe = true;
> + }
> +
> + /*
> + * Look for Section type = SHT_NOTE, flags = no SHF_ALLOC
> + * and name = .note.stapsdt
> + */
> + scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
> + if (scn == NULL) {
> + pr_err("%s section not found!\n", NOTE_SCN);
> + goto out_end;
> + }
> +
> + if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
> + goto out_end;
> +
> + data = elf_getdata(scn, NULL);
> +
> + /* Get the notes */
> + for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
> + &desc_off)) > 0; offset = next) {
> + tmp = populate_note(&elf, (const char *)((long)(data->d_buf) +
> + (long)desc_off),
> + nhdr.n_descsz, nhdr.n_type);

Shouldn't we check the name of note being "stapsdt" as well as version
(type) 3?

Thanks,
Namhyung


> + /* For first note */
> + if (!head) {
> + note = tmp;
> + head = note;
> + continue;
> + }
> + note->next = tmp;
> + note = note->next;
> + }
> +
> + if (!head)
> + pr_err("No markers found\n");
> +
> +out_end:
> + return head;
> +}
> +
> +static void print_notes(struct sdt_note *start)
> +{
> + struct sdt_note *temp;
> + int c;
> +
> + for (temp = start, c = 0; temp != NULL; temp = temp->next, c++)
> + printf("%s:%s\n", temp->provider, temp->name);
> +
> + printf("\nTotal markers = %d\n", c);
> +}
> +
> +int list_markers(const char *name)
> +{
> + int fd, ret = -1;
> + Elf *elf;
> + struct sdt_note *head = NULL;
> +
> + fd = open(name, O_RDONLY);
> + if (fd < 0) {
> + pr_err("Failed to open %s\n", name);
> + goto out_ret;
> + }
> +
> + symbol__elf_init();
> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
> + if (elf == NULL) {
> + pr_err("%s: cannot read %s ELF file.\n", __func__, name);
> + goto out_close;
> + }
> +
> + head = get_elf_markers(elf, NULL, false);
> + if (head) {
> + print_notes(head);
> + cleanup_notes(head);
> + ret = 0;
> + } else {
> + pr_err("No SDT markers found in %s\n", name);
> + }
> + elf_end(elf);
> +
> +out_close:
> + close(fd);
> +out_ret:
> + return ret;
> +}
> +
> +void cleanup_notes(struct sdt_note *start)
> +{
> + struct sdt_note *tmp;
> +
> + while (start) {
> + tmp = start->next;
> + free(start);
> + start = tmp;
> + }
> +}
> +
> void symbol__elf_init(void)
> {
> elf_version(EV_CURRENT);
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 5f720dc..f2d17b7 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -197,6 +197,17 @@ struct symsrc {
> #endif
> };
>
> +/* Note structure */
> +struct sdt_note {
> + const char *name;
> + const char *provider;
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } addr;
> + struct sdt_note *next;
> +};
> +
> void symsrc__destroy(struct symsrc *ss);
> int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
> enum dso_binary_type type);
> @@ -247,4 +258,12 @@ void symbols__fixup_duplicate(struct rb_root *symbols);
> void symbols__fixup_end(struct rb_root *symbols);
> void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
>
> +/* For SDT markers */
> +int show_available_markers(const char *module);
> +int list_markers(const char *name);
> +void cleanup_notes(struct sdt_note *start);
> +
> +#define SDT_NOTE_TYPE 3
> +#define NOTE_SCN ".note.stapsdt"
> +
> #endif /* __PERF_SYMBOL */

2013-09-04 06:43:16

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

On Tue, 03 Sep 2013 13:49:35 +0530, Hemant wrote:
> On 09/03/2013 01:06 PM, Hemant Kumar wrote:
>> This patch will enable perf to list all the sdt markers present
>> in an elf file. The markers are present in the .note.stapsdt section
>> of the elf. We can traverse through this section and collect the
>> required info about the markers.
>> This hasn't been thoroughly tested with the other
>> options of perf.
>> ----
>> Usage :
>> ./perf probe --list -x /lib64/libc.so.6
>
> There is a mismatch between usage and the patch here. This should be:
> perf probe --list -S /lib64/libc.so.6

I guess it should be

perf probe -S -x /lib64/libc.so.6


Thanks,
Namhyung

2013-09-04 06:49:30

by Namhyung Kim

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On Tue, 03 Sep 2013 18:53:17 +0530, Hemant wrote:
> On 09/03/2013 02:47 PM, Masami Hiramatsu wrote:
>> (2013/09/03 17:25), Ingo Molnar wrote:
>>> * Hemant Kumar Shaw <[email protected]> wrote:
>>>
>>>> This series adds support to perf to list and probe into the SDT markers.
>>>> The first patch implements listing of all the SDT markers present in
>>>> the ELFs (executables or libraries). The SDT markers are present in the
>>>> .note.stapsdt section of the elf. That section can be traversed to list
>>>> all the markers. Recognition of markers follows the SystemTap approach.
>>>>
>>>> The second patch will allow perf to probe into these markers. This is
>>>> done by writing the marker name and its offset into the
>>>> uprobe_events file in the tracing directory.
>>>> Then, perf tools can be used to analyze perf.data file.
>>> Please provide a better high level description that explains the history
>>> and scope of SDT markers, how SDT markers get into binaries, how they can
>>> be used for probing, a real-life usage example that shows something
>>> interesting not possible via other ways, etc.
>> Indeed, and also I'd like to know what versions of SDT this support,
>> and where we can see the technical document of that. As far as I know,
>> the previous(?) SDT implementation also involves ugly semaphores.
>> Have that already gone?

It seems it's not. I see the SDT v3 document still mentions semaphores.


> This link shows an example of marker probing with Systemtap:
> https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps

I think the link below would be more helpful for us :)

http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation

Thanks,
Namhyung

>
> - Markers in binaries :
> These SDT markers are present in the ELF in the section named
> ".note.stapsdt".
> Here, the name of the marker, its provider, type, location, base
> address, semaphore address, arguments are present.
> We can retrieve these values using the members name_off and desc_off in
> Nhdr structure. If these are not enabled, they are present in the ELF
> as nop.

2013-09-04 07:00:50

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 2/2] Support to perf to probe on SDT markers:

On Tue, 03 Sep 2013 13:07:03 +0530, Hemant Kumar wrote:
> This patch enables perf to probe on the marker name specified on the command line.

It looks like you didn't consider prelinked libraries. You need to check
the address of .stapsdt.base section too. And obviously this patch
ignores any argument the SDT has which I think pretty important info.
But we can add it later once the uprobes arg fetch patches are in.

Also please see my previous comment on mixed usage of 'note' and
'marker'.

Thanks,
Namhyung


> ---
> tools/perf/builtin-probe.c | 7 +++
> tools/perf/util/probe-event.c | 11 ++++
> tools/perf/util/symbol-elf.c | 112 +++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/symbol.h | 5 ++
> 4 files changed, 135 insertions(+)
>
> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index 3d8dcdf..8382853 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -378,6 +378,13 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
> " (%d)\n", ret);
> return ret;
> }
> + params.uprobes = true;
> + ret = probe_marker(params.target,
> + params.events[0].point.function);
> + if (ret < 0)
> + pr_err("Could not probe at %s marker\n",
> + params.events[0].point.function);
> + return ret;
> }
>
> if (params.list_events) {
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 7f846f9..014d642 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -2378,3 +2378,14 @@ int show_available_markers(const char *target)
> setup_pager();
> return list_markers(target);
> }
> +
> +int probe_marker(const char *name, char *mark)
> +{
> + int fd;
> +
> + fd = open_uprobe_events(true);
> + if (fd == -1)
> + return fd;
> + else
> + return probe__marker(name, mark, fd);
> +}
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index f3630f2..60938a5 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1040,6 +1040,118 @@ out_ret:
> return ret;
> }
>
> +static void extract_first_name(const char *target, char *fname)
> +{
> + int i, len;
> + char *file;
> +
> + file = strrchr(target, DIR_SEP);
> + file++;
> + len = strlen(file);
> + for (i = 0; i <= len; i++) {
> + if (!isalpha(file[i]))
> + break;
> + fname[i] = file[i];
> + }
> + fname[i] = '\0';
> +}
> +
> +static int probe_at_note(struct sdt_note *r_note, const char *target, bool exec,
> + int fd)
> +{
> + char buf[MAX_CMDLEN];
> + int len, err = -1;
> + Elf64_Addr offset;
> + char *fname = NULL;
> +
> + if (exec)
> + offset = r_note->addr.a64[0] - TEXT_SCN;
> + else
> + offset = r_note->addr.a64[0];
> +
> + fname = (char *)zalloc(sizeof(char) * strlen(target));
> + if (fname == NULL) {
> + pr_err("Error in allocating memory to fname\n");
> + goto out_ret;
> + }
> +
> + extract_first_name(target, fname);
> + len = snprintf(buf, MAX_CMDLEN, "%c:%s%s/%s %s:0x%x", 'p', "probe_",
> + fname, r_note->name, target, (unsigned)offset);
> +
> + len = write(fd, buf, MAX_CMDLEN);
> + if (len < 0) {
> + pr_err("Couldn't write into uprobe_events!\n");
> + goto out_close;
> + } else {
> + printf("Added new event :\n");
> + printf("event = %s \t (on 0x%x)\n\n", r_note->name,
> + (unsigned)offset);
> + printf("You can now use it on all perf tools such as :\n\n");
> + printf("\t perf record -e %s%s:%s -aR sleep 1\n\n", "probe_", fname, r_note->name);
> + err = 0;
> + }
> +
> +out_close:
> + close(fd);
> + free(fname);
> +out_ret:
> + return err;
> +}
> +
> +static int search_and_probe_at_note(char *key, struct sdt_note **start,
> + const char *target, bool exec, int fd)
> +{
> + int ret = -1;
> + struct sdt_note *req;
> +
> + for (req = (*start); req != NULL; req = req->next) {
> + if (!strcmp(key, req->name))
> + break;
> + }
> + if (!req) {
> + pr_err("Could not find marker %s\n", key);
> + return ret;
> + }
> +
> + ret = probe_at_note(req, target, exec, fd);
> + return ret;
> +}
> +
> +int probe__marker(const char *name, char *mark, int evfd)
> +{
> + int ret = -1, fd;
> + Elf *elf;
> + bool exec = false;
> + struct sdt_note *head = NULL;
> +
> + fd = open(name, O_RDONLY);
> + if (fd < 0) {
> + pr_err("Failed to open the file\n");
> + goto out_ret;
> + }
> +
> + symbol__elf_init();
> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
> + if (elf == NULL) {
> + pr_debug("%s: cannot read %s ELF file.\n", __func__, name);
> + goto out_close;
> + }
> +
> + head = get_elf_markers(elf, &exec, true);
> + if (head) {
> + ret = search_and_probe_at_note(mark, &head, name, exec, evfd);
> + cleanup_notes(head);
> + }
> +
> + elf_end(elf);
> +
> +out_close:
> + close(fd);
> +out_ret:
> + return ret;
> +}
> +
> void cleanup_notes(struct sdt_note *start)
> {
> struct sdt_note *tmp;
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index f2d17b7..95289fd 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -262,8 +262,13 @@ void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
> int show_available_markers(const char *module);
> int list_markers(const char *name);
> void cleanup_notes(struct sdt_note *start);
> +int probe_marker(const char *name, char *mark);
> +int probe__marker(const char *name, char *mark, int fd);
>
> #define SDT_NOTE_TYPE 3
> #define NOTE_SCN ".note.stapsdt"
> +#define TEXT_SCN 0x400000
> +#define DIR_SEP '/'
> +#define MAX_CMDLEN 256
>
> #endif /* __PERF_SYMBOL */

Subject: Re: [PATCH 1/2] SDT markers listing by perf

(2013/09/03 16:36), Hemant Kumar wrote:
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 5f720dc..f2d17b7 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -197,6 +197,17 @@ struct symsrc {
> #endif
> };
>
> +/* Note structure */
> +struct sdt_note {
> + const char *name;
> + const char *provider;
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } addr;
> + struct sdt_note *next;
> +};

Hmm, could you use struct list_head for listing up the data?

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: [PATCH 1/2] SDT markers listing by perf

(2013/09/04 15:42), Namhyung Kim wrote:
> On Tue, 03 Sep 2013 13:06:55 +0530, Hemant Kumar wrote:
>
> [SNIP]
>
>> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
>> index e8a66f9..3d8dcdf 100644
>> --- a/tools/perf/builtin-probe.c
>> +++ b/tools/perf/builtin-probe.c
>> @@ -55,6 +55,7 @@ static struct {
>> bool show_funcs;
>> bool mod_events;
>> bool uprobes;
>> + bool sdt;
>> int nevents;
>> struct perf_probe_event events[MAX_PROBES];
>> struct strlist *dellist;
>> @@ -325,6 +326,8 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>> opt_set_filter),
>> OPT_CALLBACK('x', "exec", NULL, "executable|path",
>> "target executable name or path", opt_set_target),
>> + OPT_BOOLEAN('S', "sdt", &params.sdt,
>> + "Show and probe on the SDT markers"),
>
> You need to add it to Documentation/perf-probe.txt too. In addition if
> the --sdt option is only able to work with libelf, it should be wrapped
> into the #ifdef LIBELF_SUPPORT pair.
>
> And I'm not sure that it's a good idea to have two behavior on a single
> option (S) - show and probe (add). Maybe it can be separated into two
> or the S option can be used as a flag with existing --list and --add
> option?
>

Good catch! :)
No, that is really bad idea. All probes must be added by "--add" action.
So we need a new probe syntax for specifying sdt marker.

How about the below syntax?

[EVENT=]%PROVIDER:MARKER [ARG ...]

Of course, this will require to list up all markers with "%" prefix for
continuity.

And since --list option is to list up all existing(defined) probe events,
I think --markers (as like as --funcs) is better for listing it up.

Thank you!

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: Re: [PATCH 2/2] Support to perf to probe on SDT markers:

(2013/09/04 16:00), Namhyung Kim wrote:
> On Tue, 03 Sep 2013 13:07:03 +0530, Hemant Kumar wrote:
>> This patch enables perf to probe on the marker name specified on the command line.
>
> It looks like you didn't consider prelinked libraries. You need to check
> the address of .stapsdt.base section too. And obviously this patch
> ignores any argument the SDT has which I think pretty important info.
> But we can add it later once the uprobes arg fetch patches are in.

Agreed, I'd like to see the argument support afterward.
Thanks!

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: Re: [RFC PATCH 0/2] Perf support to SDT markers

(2013/09/04 15:49), Namhyung Kim wrote:
> On Tue, 03 Sep 2013 18:53:17 +0530, Hemant wrote:
>> On 09/03/2013 02:47 PM, Masami Hiramatsu wrote:
>>> (2013/09/03 17:25), Ingo Molnar wrote:
>>>> * Hemant Kumar Shaw <[email protected]> wrote:
>>>>
>>>>> This series adds support to perf to list and probe into the SDT markers.
>>>>> The first patch implements listing of all the SDT markers present in
>>>>> the ELFs (executables or libraries). The SDT markers are present in the
>>>>> .note.stapsdt section of the elf. That section can be traversed to list
>>>>> all the markers. Recognition of markers follows the SystemTap approach.
>>>>>
>>>>> The second patch will allow perf to probe into these markers. This is
>>>>> done by writing the marker name and its offset into the
>>>>> uprobe_events file in the tracing directory.
>>>>> Then, perf tools can be used to analyze perf.data file.
>>>> Please provide a better high level description that explains the history
>>>> and scope of SDT markers, how SDT markers get into binaries, how they can
>>>> be used for probing, a real-life usage example that shows something
>>>> interesting not possible via other ways, etc.
>>> Indeed, and also I'd like to know what versions of SDT this support,
>>> and where we can see the technical document of that. As far as I know,
>>> the previous(?) SDT implementation also involves ugly semaphores.
>>> Have that already gone?
>
> It seems it's not. I see the SDT v3 document still mentions semaphores.

Uh, how the current implementation can avoid those semaphores?
(It seems that this patch doesn't mention it.)

I had discussed that the semaphore problem 3 years ago with systemtap
guys, it should be there to keep the main process away from "heavy"
argument processing when the marker is disabled, especially for python vm.
If those are still there, we can't enable markers without tweaking the
"semaphore" by using systemtap or a debugger.

>> This link shows an example of marker probing with Systemtap:
>> https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
>
> I think the link below would be more helpful for us :)
>
> http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation

Thank you :)


--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2013-09-04 08:26:25

by Mark Wielaard

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On Wed, 2013-09-04 at 15:49 +0900, Namhyung Kim wrote:
> On Tue, 03 Sep 2013 18:53:17 +0530, Hemant wrote:
> > On 09/03/2013 02:47 PM, Masami Hiramatsu wrote:
> >> Indeed, and also I'd like to know what versions of SDT this support,
> >> and where we can see the technical document of that. As far as I know,
> >> the previous(?) SDT implementation also involves ugly semaphores.
> >> Have that already gone?
>
> It seems it's not. I see the SDT v3 document still mentions semaphores.

It mentions them, but should normally not be used. They are there for
dtrace (source) compatibility. And you don't have to use them.

Since normally a SDT probe marker is just a NOP it doesn't have any
overhead. But if you want to add complicated arguments that you would
normally not generate in your code, then you might want to add a
semaphore. That way you can have probes with a bit more overhead that
still have zero overhead when not being probed.

Note that if you use the normal DTRACE_PROBE macros no semaphore will be
inserted. And you can opt to not support probes that have a semaphore in
perf if you think that is easier (just check the semaphore link-time
address for the probe, it should normally be zero). Just warn: "No way I
am going to probe something that might have a little extra overhead! I
am no debugger..." :)

> > This link shows an example of marker probing with Systemtap:
> > https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
>
> I think the link below would be more helpful for us :)
>
> http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation

Yes, that should be the canonical description.

Cheers,

Mark

Subject: Re: Re: [RFC PATCH 0/2] Perf support to SDT markers

(2013/09/04 17:25), Mark Wielaard wrote:
> On Wed, 2013-09-04 at 15:49 +0900, Namhyung Kim wrote:
>> On Tue, 03 Sep 2013 18:53:17 +0530, Hemant wrote:
>>> On 09/03/2013 02:47 PM, Masami Hiramatsu wrote:
>>>> Indeed, and also I'd like to know what versions of SDT this support,
>>>> and where we can see the technical document of that. As far as I know,
>>>> the previous(?) SDT implementation also involves ugly semaphores.
>>>> Have that already gone?
>>
>> It seems it's not. I see the SDT v3 document still mentions semaphores.
>
> It mentions them, but should normally not be used. They are there for
> dtrace (source) compatibility. And you don't have to use them.
>
> Since normally a SDT probe marker is just a NOP it doesn't have any
> overhead. But if you want to add complicated arguments that you would
> normally not generate in your code, then you might want to add a
> semaphore. That way you can have probes with a bit more overhead that
> still have zero overhead when not being probed.
>
> Note that if you use the normal DTRACE_PROBE macros no semaphore will be
> inserted. And you can opt to not support probes that have a semaphore in
> perf if you think that is easier (just check the semaphore link-time
> address for the probe, it should normally be zero). Just warn: "No way I
> am going to probe something that might have a little extra overhead! I
> am no debugger..." :)

OK, I see. And in that case, we'd better filter out the markers which
use a semaphore when list it up, since we can not enable it.

Thank you,


--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2013-09-04 17:08:18

by Hemant Kumar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On 09/04/2013 11:38 AM, Namhyung Kim wrote:
> Hi Hemant,
>
> On Tue, 03 Sep 2013 13:06:40 +0530, Hemant Kumar Shaw wrote:
>> This series adds support to perf to list and probe into the SDT markers.
>> The first patch implements listing of all the SDT markers present in
>> the ELFs (executables or libraries). The SDT markers are present in the
>> .note.stapsdt section of the elf. That section can be traversed to list
>> all the markers. Recognition of markers follows the SystemTap approach.
>>
>> The second patch will allow perf to probe into these markers. This is
>> done by writing the marker name and its offset into the
>> uprobe_events file in the tracing directory.
>> Then, perf tools can be used to analyze perf.data file.
> First, thanks for your work!
>
> In fact, I was working on the same topic of support SDT markers. But I
> started it by adding support for accessing arguments in uprobes [1].

I was thinking of adding support to SDT markers' arguments too!! I
already have your patches and did some basic testing. I will add support
to arguments in the next iteration of my patches.

> The patchset is floating around the list for now but I hope it'll get
> merged soon. I guess you want to have it too :).

Of course, that would be great!

>
> Anyway, it'd great if you add me to CC next time.

Sure, will do that.

Thanks
Hemant
>
> Thanks,
> Namhyung
>
>
> [1] http://thread.gmane.org/gmane.linux.kernel/1555044
>

2013-09-04 17:38:08

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

On 09/04/2013 12:12 PM, Namhyung Kim wrote:
> On Tue, 03 Sep 2013 13:06:55 +0530, Hemant Kumar wrote:
>
> [SNIP]
>
>> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
>> index e8a66f9..3d8dcdf 100644
>> --- a/tools/perf/builtin-probe.c
>> +++ b/tools/perf/builtin-probe.c
>> @@ -55,6 +55,7 @@ static struct {
>> bool show_funcs;
>> bool mod_events;
>> bool uprobes;
>> + bool sdt;
>> int nevents;
>> struct perf_probe_event events[MAX_PROBES];
>> struct strlist *dellist;
>> @@ -325,6 +326,8 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>> opt_set_filter),
>> OPT_CALLBACK('x', "exec", NULL, "executable|path",
>> "target executable name or path", opt_set_target),
>> + OPT_BOOLEAN('S', "sdt", &params.sdt,
>> + "Show and probe on the SDT markers"),
> You need to add it to Documentation/perf-probe.txt too. In addition if
> the --sdt option is only able to work with libelf, it should be wrapped
> into the #ifdef LIBELF_SUPPORT pair.
First of all, thanks for the review.
Yes, will add it to the Documentation. And, yes, it should be wrapped
around #ifdef LIBELF_SUPPORT pair... Will do that in the next iteration.

>
> And I'm not sure that it's a good idea to have two behavior on a single
> option (S) - show and probe (add). Maybe it can be separated into two
> or the S option can be used as a flag with existing --list and --add
> option?

Hmmm, I will have to think more on this issue. But I think, adding -S
(--sdt) flag to the existing --list option will be a bit confusing. I
thought of keeping the actions related to sdt markers separate. Let me
do some more thinking on this issue.

>
>
>> OPT_END()
>> };
>> int ret;
>> @@ -355,6 +358,28 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>> */
>> symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
>>
>> + if (params.sdt) {
>> + if (params.show_lines) {
>> + pr_err(" Error: Don't use --sdt with --line.\n");
>> + usage_with_options(probe_usage, options);
>> + }
>> + if (params.show_vars) {
>> + pr_err(" Error: Don't use --sdt with --vars.\n");
>> + usage_with_options(probe_usage, options);
>> + }
>> + if (params.show_funcs) {
>> + pr_err(" Error: Don't use --sdt with --funcs.\n");
>> + usage_with_options(probe_usage, options);
>> + }
>> + if (params.list_events) {
>> + ret = show_available_markers(params.target);
>> + if (ret < 0)
>> + pr_err(" Error: Failed to show markers."
>> + " (%d)\n", ret);
>> + return ret;
>> + }
>> + }
>> +
>> if (params.list_events) {
>> if (params.mod_events) {
>> pr_err(" Error: Don't use --list with --add/--del.\n");
>> @@ -382,6 +407,7 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>> ret);
>> return ret;
>> }
>> +
>> if (params.show_funcs) {
>> if (params.nevents != 0 || params.dellist) {
>> pr_err(" Error: Don't use --funcs with"
>> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
>> index aa04bf9..7f846f9 100644
>> --- a/tools/perf/util/probe-event.c
>> +++ b/tools/perf/util/probe-event.c
>> @@ -2372,3 +2372,9 @@ out:
>> free(name);
>> return ret;
>> }
>> +
>> +int show_available_markers(const char *target)
>> +{
>> + setup_pager();
>> + return list_markers(target);
>> +}
> Did you build it with NO_LIBELF=1 ? I guess not. At least you need a
> dummy list_markers() function which just returns -1 for such case.

No, I didn't. Ok, will add one dummy list_markers() function in the next
iteration.

>
>
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index 4b12bf8..f3630f2 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -846,6 +846,211 @@ out_elf_end:
>> return err;
>> }
>>
>> +/* Populate the name, type, offset and argument fields in the note structure */
>> +static struct sdt_note *populate_note(Elf **elf, const char *data, size_t len,
>> + int type)
> Hmm.. I think the name needs to be changed more specific like
> populate_sdt_note() or something?

Hmm, seems reasonable. Will do that.

>
>
>> +{
>> + const char *provider, *name, *args;
>> + struct sdt_note *note;
>> +
>> + /*
>> + * There are 3 address values to be obtained: marker offset, base address
>> + * and semaphore
>> + */
>> + union {
>> + Elf64_Addr a64[3];
>> + Elf32_Addr a32[3];
>> + } buf;
>> +
>> + /*
>> + * dst and src (of Elf_Data) are required for translation from file
>> + * to memory representation
>> + */
>> + Elf_Data dst = {
>> + .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
>> + .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
>> + .d_off = 0, .d_align = 0
>> + };
>> +
>> + Elf_Data src = {
>> + .d_buf = (void *) data, .d_type = ELF_T_ADDR,
>> + .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
>> + .d_align = 0
>> + };
>> +
>> + if (type != SDT_NOTE_TYPE)
>> + goto out_ret;
>> +
>> + note = zalloc(sizeof(struct sdt_note));
>> + if (note == NULL) {
>> + pr_err("memory allocation error\n");
>> + goto out_ret;
>> + }
>> + note->next = NULL;
>> +
>> + if (len < dst.d_size + 3)
>> + goto out_free;
>> + /*
>> + * Translating from file representation to memory representation
>> + * Data now points to the address (offset)
>> + */
>> + if (gelf_xlatetom((*elf), &dst, &src,
>> + elf_getident((*elf), NULL)[EI_DATA]) == NULL)
>> + pr_debug("gelf_xlatetom : %s", elf_errmsg(-1));
>> +
>> + /* Populate the fields of sdt_note */
>> + provider = data + dst.d_size;
>> +
>> + name = (const char *)memchr(provider, '\0', data + len - provider);
>> + if (name++ == NULL)
>> + goto out_ret;
>> +
>> + args = (const char *)memchr(name, '\0', data + len - name);
>> + if (args++ == NULL ||
>> + memchr(args, '\0', data + len - name) != data + len - 1)
>> + goto out_ret;
>> + note->provider = provider;
>> + note->name = name;
>> +
>> + /* Three addr's for location, base and semaphore addresses */
>> + note->addr.a64[0] = (buf.a64[0]);
>> + note->addr.a64[1] = (buf.a64[1]);
>> + note->addr.a64[2] = (buf.a64[2]);
>> + return note;
>> +
>> +out_free:
>> + free(note);
>> +out_ret:
>> + return NULL;
>> +}
>> +
>> +/*
>> + * Traverse the elf of the object, find out the .note.stapsdt section
>> + * and accordingly initialize the notes' list head
>> + */
>> +static struct sdt_note *get_elf_markers(Elf *elf, bool *exe, bool probe)
> Ditto. How about get_sdt_markers() or get_sdt_notes()? I can see that
> there are places those 'marker' and 'note' are used here and there. If
> they indicate same thing it'd better to unify the term.

Yes, unifying them will be a better approach. Will modify these.

>
>
>> +{
>> + Elf_Scn *scn = NULL;
>> + Elf_Data *data;
>> + GElf_Shdr shdr;
>> + GElf_Ehdr ehdr;
>> + size_t shstrndx;
>> + size_t next;
>> + GElf_Nhdr nhdr;
>> + size_t name_off, desc_off, offset;
>> + struct sdt_note *note = NULL, *tmp, *head = NULL;
>> +
>> + if (gelf_getehdr(elf, &ehdr) == NULL) {
>> + pr_debug("%s: cannot get elf header.\n", __func__);
>> + goto out_end;
>> + }
>> +
>> + if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
>> + pr_debug("getshdrstrndx failed\n");
>> + goto out_end;
>> + }
>> +
>> + /* library or exec */
>> + if (probe) {
>> + if (ehdr.e_type == ET_EXEC)
>> + *exe = true;
>> + }
>> +
>> + /*
>> + * Look for Section type = SHT_NOTE, flags = no SHF_ALLOC
>> + * and name = .note.stapsdt
>> + */
>> + scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
>> + if (scn == NULL) {
>> + pr_err("%s section not found!\n", NOTE_SCN);
>> + goto out_end;
>> + }
>> +
>> + if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
>> + goto out_end;
>> +
>> + data = elf_getdata(scn, NULL);
>> +
>> + /* Get the notes */
>> + for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
>> + &desc_off)) > 0; offset = next) {
>> + tmp = populate_note(&elf, (const char *)((long)(data->d_buf) +
>> + (long)desc_off),
>> + nhdr.n_descsz, nhdr.n_type);
> Shouldn't we check the name of note being "stapsdt" as well as version
> (type) 3?

Since, we are already fetching the section NOTE_SCN (".note.stapsdt")
and then we check for the type being SHT_NOTE and SHF_ALLOC, is it
required to do the same for the individual notes?

Thanks
Hemant
>
> Thanks,
> Namhyung
>
>
>> + /* For first note */
>> + if (!head) {
>> + note = tmp;
>> + head = note;
>> + continue;
>> + }
>> + note->next = tmp;
>> + note = note->next;
>> + }
>> +
>> + if (!head)
>> + pr_err("No markers found\n");
>> +
>> +out_end:
>> + return head;
>> +}
>> +
>> +static void print_notes(struct sdt_note *start)
>> +{
>> + struct sdt_note *temp;
>> + int c;
>> +
>> + for (temp = start, c = 0; temp != NULL; temp = temp->next, c++)
>> + printf("%s:%s\n", temp->provider, temp->name);
>> +
>> + printf("\nTotal markers = %d\n", c);
>> +}
>> +
>> +int list_markers(const char *name)
>> +{
>> + int fd, ret = -1;
>> + Elf *elf;
>> + struct sdt_note *head = NULL;
>> +
>> + fd = open(name, O_RDONLY);
>> + if (fd < 0) {
>> + pr_err("Failed to open %s\n", name);
>> + goto out_ret;
>> + }
>> +
>> + symbol__elf_init();
>> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
>> + if (elf == NULL) {
>> + pr_err("%s: cannot read %s ELF file.\n", __func__, name);
>> + goto out_close;
>> + }
>> +
>> + head = get_elf_markers(elf, NULL, false);
>> + if (head) {
>> + print_notes(head);
>> + cleanup_notes(head);
>> + ret = 0;
>> + } else {
>> + pr_err("No SDT markers found in %s\n", name);
>> + }
>> + elf_end(elf);
>> +
>> +out_close:
>> + close(fd);
>> +out_ret:
>> + return ret;
>> +}
>> +
>> +void cleanup_notes(struct sdt_note *start)
>> +{
>> + struct sdt_note *tmp;
>> +
>> + while (start) {
>> + tmp = start->next;
>> + free(start);
>> + start = tmp;
>> + }
>> +}
>> +
>> void symbol__elf_init(void)
>> {
>> elf_version(EV_CURRENT);
>> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
>> index 5f720dc..f2d17b7 100644
>> --- a/tools/perf/util/symbol.h
>> +++ b/tools/perf/util/symbol.h
>> @@ -197,6 +197,17 @@ struct symsrc {
>> #endif
>> };
>>
>> +/* Note structure */
>> +struct sdt_note {
>> + const char *name;
>> + const char *provider;
>> + union {
>> + Elf64_Addr a64[3];
>> + Elf32_Addr a32[3];
>> + } addr;
>> + struct sdt_note *next;
>> +};
>> +
>> void symsrc__destroy(struct symsrc *ss);
>> int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
>> enum dso_binary_type type);
>> @@ -247,4 +258,12 @@ void symbols__fixup_duplicate(struct rb_root *symbols);
>> void symbols__fixup_end(struct rb_root *symbols);
>> void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
>>
>> +/* For SDT markers */
>> +int show_available_markers(const char *module);
>> +int list_markers(const char *name);
>> +void cleanup_notes(struct sdt_note *start);
>> +
>> +#define SDT_NOTE_TYPE 3
>> +#define NOTE_SCN ".note.stapsdt"
>> +
>> #endif /* __PERF_SYMBOL */

2013-09-04 17:40:11

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

On 09/04/2013 12:13 PM, Namhyung Kim wrote:
> On Tue, 03 Sep 2013 13:49:35 +0530, Hemant wrote:
>> On 09/03/2013 01:06 PM, Hemant Kumar wrote:
>>> This patch will enable perf to list all the sdt markers present
>>> in an elf file. The markers are present in the .note.stapsdt section
>>> of the elf. We can traverse through this section and collect the
>>> required info about the markers.
>>> This hasn't been thoroughly tested with the other
>>> options of perf.
>>> ----
>>> Usage :
>>> ./perf probe --list -x /lib64/libc.so.6
>> There is a mismatch between usage and the patch here. This should be:
>> perf probe --list -S /lib64/libc.so.6
> I guess it should be
>
> perf probe -S -x /lib64/libc.so.6

Yes, missed the -x option.

Thanks
Hemant

>
>
> Thanks,
> Namhyung
>

2013-09-04 17:50:23

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 2/2] Support to perf to probe on SDT markers:

On 09/04/2013 12:30 PM, Namhyung Kim wrote:
> On Tue, 03 Sep 2013 13:07:03 +0530, Hemant Kumar wrote:
>> This patch enables perf to probe on the marker name specified on the command line.
> It looks like you didn't consider prelinked libraries. You need to check
> the address of .stapsdt.base section too. And obviously this patch

Will make the required changes to handle prelinking as well in the next
iteration.

> ignores any argument the SDT has which I think pretty important info.
> But we can add it later once the uprobes arg fetch patches are in.

Yes, will add the arguments' support too in the next iteration.
> Also please see my previous comment on mixed usage of 'note' and
> 'marker'.

Yeah, made a note of that and make the required modifications.

Thanks
Hemant
>
> Thanks,
> Namhyung
>
>
>> ---
>> tools/perf/builtin-probe.c | 7 +++
>> tools/perf/util/probe-event.c | 11 ++++
>> tools/perf/util/symbol-elf.c | 112 +++++++++++++++++++++++++++++++++++++++++
>> tools/perf/util/symbol.h | 5 ++
>> 4 files changed, 135 insertions(+)
>>
>> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
>> index 3d8dcdf..8382853 100644
>> --- a/tools/perf/builtin-probe.c
>> +++ b/tools/perf/builtin-probe.c
>> @@ -378,6 +378,13 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>> " (%d)\n", ret);
>> return ret;
>> }
>> + params.uprobes = true;
>> + ret = probe_marker(params.target,
>> + params.events[0].point.function);
>> + if (ret < 0)
>> + pr_err("Could not probe at %s marker\n",
>> + params.events[0].point.function);
>> + return ret;
>> }
>>
>> if (params.list_events) {
>> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
>> index 7f846f9..014d642 100644
>> --- a/tools/perf/util/probe-event.c
>> +++ b/tools/perf/util/probe-event.c
>> @@ -2378,3 +2378,14 @@ int show_available_markers(const char *target)
>> setup_pager();
>> return list_markers(target);
>> }
>> +
>> +int probe_marker(const char *name, char *mark)
>> +{
>> + int fd;
>> +
>> + fd = open_uprobe_events(true);
>> + if (fd == -1)
>> + return fd;
>> + else
>> + return probe__marker(name, mark, fd);
>> +}
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index f3630f2..60938a5 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -1040,6 +1040,118 @@ out_ret:
>> return ret;
>> }
>>
>> +static void extract_first_name(const char *target, char *fname)
>> +{
>> + int i, len;
>> + char *file;
>> +
>> + file = strrchr(target, DIR_SEP);
>> + file++;
>> + len = strlen(file);
>> + for (i = 0; i <= len; i++) {
>> + if (!isalpha(file[i]))
>> + break;
>> + fname[i] = file[i];
>> + }
>> + fname[i] = '\0';
>> +}
>> +
>> +static int probe_at_note(struct sdt_note *r_note, const char *target, bool exec,
>> + int fd)
>> +{
>> + char buf[MAX_CMDLEN];
>> + int len, err = -1;
>> + Elf64_Addr offset;
>> + char *fname = NULL;
>> +
>> + if (exec)
>> + offset = r_note->addr.a64[0] - TEXT_SCN;
>> + else
>> + offset = r_note->addr.a64[0];
>> +
>> + fname = (char *)zalloc(sizeof(char) * strlen(target));
>> + if (fname == NULL) {
>> + pr_err("Error in allocating memory to fname\n");
>> + goto out_ret;
>> + }
>> +
>> + extract_first_name(target, fname);
>> + len = snprintf(buf, MAX_CMDLEN, "%c:%s%s/%s %s:0x%x", 'p', "probe_",
>> + fname, r_note->name, target, (unsigned)offset);
>> +
>> + len = write(fd, buf, MAX_CMDLEN);
>> + if (len < 0) {
>> + pr_err("Couldn't write into uprobe_events!\n");
>> + goto out_close;
>> + } else {
>> + printf("Added new event :\n");
>> + printf("event = %s \t (on 0x%x)\n\n", r_note->name,
>> + (unsigned)offset);
>> + printf("You can now use it on all perf tools such as :\n\n");
>> + printf("\t perf record -e %s%s:%s -aR sleep 1\n\n", "probe_", fname, r_note->name);
>> + err = 0;
>> + }
>> +
>> +out_close:
>> + close(fd);
>> + free(fname);
>> +out_ret:
>> + return err;
>> +}
>> +
>> +static int search_and_probe_at_note(char *key, struct sdt_note **start,
>> + const char *target, bool exec, int fd)
>> +{
>> + int ret = -1;
>> + struct sdt_note *req;
>> +
>> + for (req = (*start); req != NULL; req = req->next) {
>> + if (!strcmp(key, req->name))
>> + break;
>> + }
>> + if (!req) {
>> + pr_err("Could not find marker %s\n", key);
>> + return ret;
>> + }
>> +
>> + ret = probe_at_note(req, target, exec, fd);
>> + return ret;
>> +}
>> +
>> +int probe__marker(const char *name, char *mark, int evfd)
>> +{
>> + int ret = -1, fd;
>> + Elf *elf;
>> + bool exec = false;
>> + struct sdt_note *head = NULL;
>> +
>> + fd = open(name, O_RDONLY);
>> + if (fd < 0) {
>> + pr_err("Failed to open the file\n");
>> + goto out_ret;
>> + }
>> +
>> + symbol__elf_init();
>> + elf = elf_begin(fd, PERF_ELF_C_READ_MMAP, NULL);
>> + if (elf == NULL) {
>> + pr_debug("%s: cannot read %s ELF file.\n", __func__, name);
>> + goto out_close;
>> + }
>> +
>> + head = get_elf_markers(elf, &exec, true);
>> + if (head) {
>> + ret = search_and_probe_at_note(mark, &head, name, exec, evfd);
>> + cleanup_notes(head);
>> + }
>> +
>> + elf_end(elf);
>> +
>> +out_close:
>> + close(fd);
>> +out_ret:
>> + return ret;
>> +}
>> +
>> void cleanup_notes(struct sdt_note *start)
>> {
>> struct sdt_note *tmp;
>> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
>> index f2d17b7..95289fd 100644
>> --- a/tools/perf/util/symbol.h
>> +++ b/tools/perf/util/symbol.h
>> @@ -262,8 +262,13 @@ void __map_groups__fixup_end(struct map_groups *mg, enum map_type type);
>> int show_available_markers(const char *module);
>> int list_markers(const char *name);
>> void cleanup_notes(struct sdt_note *start);
>> +int probe_marker(const char *name, char *mark);
>> +int probe__marker(const char *name, char *mark, int fd);
>>
>> #define SDT_NOTE_TYPE 3
>> #define NOTE_SCN ".note.stapsdt"
>> +#define TEXT_SCN 0x400000
>> +#define DIR_SEP '/'
>> +#define MAX_CMDLEN 256
>>
>> #endif /* __PERF_SYMBOL */

2013-09-04 17:52:28

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

On 09/04/2013 12:51 PM, Masami Hiramatsu wrote:
> (2013/09/03 16:36), Hemant Kumar wrote:
>> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
>> index 5f720dc..f2d17b7 100644
>> --- a/tools/perf/util/symbol.h
>> +++ b/tools/perf/util/symbol.h
>> @@ -197,6 +197,17 @@ struct symsrc {
>> #endif
>> };
>>
>> +/* Note structure */
>> +struct sdt_note {
>> + const char *name;
>> + const char *provider;
>> + union {
>> + Elf64_Addr a64[3];
>> + Elf32_Addr a32[3];
>> + } addr;
>> + struct sdt_note *next;
>> +};
> Hmm, could you use struct list_head for listing up the data?
>
> Thank you,
>

Yes, it will be better to use struct list_head for fetching the markers
in a list. Will do that in the next iteration.

Thanks
Hemant

2013-09-04 17:59:09

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

On 09/04/2013 01:31 PM, Masami Hiramatsu wrote:
> (2013/09/04 15:42), Namhyung Kim wrote:
>> On Tue, 03 Sep 2013 13:06:55 +0530, Hemant Kumar wrote:
>>
>> [SNIP]
>>
>>> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
>>> index e8a66f9..3d8dcdf 100644
>>> --- a/tools/perf/builtin-probe.c
>>> +++ b/tools/perf/builtin-probe.c
>>> @@ -55,6 +55,7 @@ static struct {
>>> bool show_funcs;
>>> bool mod_events;
>>> bool uprobes;
>>> + bool sdt;
>>> int nevents;
>>> struct perf_probe_event events[MAX_PROBES];
>>> struct strlist *dellist;
>>> @@ -325,6 +326,8 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>>> opt_set_filter),
>>> OPT_CALLBACK('x', "exec", NULL, "executable|path",
>>> "target executable name or path", opt_set_target),
>>> + OPT_BOOLEAN('S', "sdt", &params.sdt,
>>> + "Show and probe on the SDT markers"),
>> You need to add it to Documentation/perf-probe.txt too. In addition if
>> the --sdt option is only able to work with libelf, it should be wrapped
>> into the #ifdef LIBELF_SUPPORT pair.
>>
>> And I'm not sure that it's a good idea to have two behavior on a single
>> option (S) - show and probe (add). Maybe it can be separated into two
>> or the S option can be used as a flag with existing --list and --add
>> option?
>>
> Good catch! :)
> No, that is really bad idea. All probes must be added by "--add" action.
> So we need a new probe syntax for specifying sdt marker.
>
> How about the below syntax?
>
> [EVENT=]%PROVIDER:MARKER [ARG ...]

Yes, looks good.

>
> Of course, this will require to list up all markers with "%" prefix for
> continuity.

Ok.

>
> And since --list option is to list up all existing(defined) probe events,
> I think --markers (as like as --funcs) is better for listing it up.

Hmmm, I think --markers is a good idea for listing the markers.

Thanks
Hemant
>
> Thank you!
>

2013-09-04 18:00:12

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 2/2] Support to perf to probe on SDT markers:

On 09/04/2013 01:40 PM, Masami Hiramatsu wrote:
> (2013/09/04 16:00), Namhyung Kim wrote:
>> On Tue, 03 Sep 2013 13:07:03 +0530, Hemant Kumar wrote:
>>> This patch enables perf to probe on the marker name specified on the command line.
>> It looks like you didn't consider prelinked libraries. You need to check
>> the address of .stapsdt.base section too. And obviously this patch
>> ignores any argument the SDT has which I think pretty important info.
>> But we can add it later once the uprobes arg fetch patches are in.
> Agreed, I'd like to see the argument support afterward.
> Thanks!
>

Will add the argument support in the next iteration.

Thanks
Hemant

2013-09-04 18:02:09

by Hemant Kumar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On 09/04/2013 12:19 PM, Namhyung Kim wrote:
> On Tue, 03 Sep 2013 18:53:17 +0530, Hemant wrote:
>> On 09/03/2013 02:47 PM, Masami Hiramatsu wrote:
>>> (2013/09/03 17:25), Ingo Molnar wrote:
>>>> * Hemant Kumar Shaw <[email protected]> wrote:
>>>>
>>>>> This series adds support to perf to list and probe into the SDT markers.
>>>>> The first patch implements listing of all the SDT markers present in
>>>>> the ELFs (executables or libraries). The SDT markers are present in the
>>>>> .note.stapsdt section of the elf. That section can be traversed to list
>>>>> all the markers. Recognition of markers follows the SystemTap approach.
>>>>>
>>>>> The second patch will allow perf to probe into these markers. This is
>>>>> done by writing the marker name and its offset into the
>>>>> uprobe_events file in the tracing directory.
>>>>> Then, perf tools can be used to analyze perf.data file.
>>>> Please provide a better high level description that explains the history
>>>> and scope of SDT markers, how SDT markers get into binaries, how they can
>>>> be used for probing, a real-life usage example that shows something
>>>> interesting not possible via other ways, etc.
>>> Indeed, and also I'd like to know what versions of SDT this support,
>>> and where we can see the technical document of that. As far as I know,
>>> the previous(?) SDT implementation also involves ugly semaphores.
>>> Have that already gone?
> It seems it's not. I see the SDT v3 document still mentions semaphores.

Yes, It still mentions semaphores and I will be handling these semaphores.

Thanks
Hemant
>
>
>> This link shows an example of marker probing with Systemtap:
>> https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
> I think the link below would be more helpful for us :)
>
> http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
>
> Thanks,
> Namhyung
>
>> - Markers in binaries :
>> These SDT markers are present in the ELF in the section named
>> ".note.stapsdt".
>> Here, the name of the marker, its provider, type, location, base
>> address, semaphore address, arguments are present.
>> We can retrieve these values using the members name_off and desc_off in
>> Nhdr structure. If these are not enabled, they are present in the ELF
>> as nop.

2013-09-04 18:08:33

by Hemant Kumar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On 09/04/2013 02:09 PM, Masami Hiramatsu wrote:
> (2013/09/04 17:25), Mark Wielaard wrote:
>> On Wed, 2013-09-04 at 15:49 +0900, Namhyung Kim wrote:
>>> On Tue, 03 Sep 2013 18:53:17 +0530, Hemant wrote:
>>>> On 09/03/2013 02:47 PM, Masami Hiramatsu wrote:
>>>>> Indeed, and also I'd like to know what versions of SDT this support,
>>>>> and where we can see the technical document of that. As far as I know,
>>>>> the previous(?) SDT implementation also involves ugly semaphores.
>>>>> Have that already gone?
>>> It seems it's not. I see the SDT v3 document still mentions semaphores.
>> It mentions them, but should normally not be used. They are there for
>> dtrace (source) compatibility. And you don't have to use them.
>>
>> Since normally a SDT probe marker is just a NOP it doesn't have any
>> overhead. But if you want to add complicated arguments that you would
>> normally not generate in your code, then you might want to add a
>> semaphore. That way you can have probes with a bit more overhead that
>> still have zero overhead when not being probed.
>>
>> Note that if you use the normal DTRACE_PROBE macros no semaphore will be
>> inserted. And you can opt to not support probes that have a semaphore in
>> perf if you think that is easier (just check the semaphore link-time
>> address for the probe, it should normally be zero). Just warn: "No way I
>> am going to probe something that might have a little extra overhead! I
>> am no debugger..." :)
> OK, I see. And in that case, we'd better filter out the markers which
> use a semaphore when list it up, since we can not enable it.
>
> Thank you,
>
>

But isn't it a better idea to handle semaphores, because there can be
many important markers using semaphores and people may still want to
probe on them?

Thanks
Hemant

2013-09-04 18:12:34

by Hemant Kumar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On 09/04/2013 01:55 PM, Mark Wielaard wrote:
> On Wed, 2013-09-04 at 15:49 +0900, Namhyung Kim wrote:
>> On Tue, 03 Sep 2013 18:53:17 +0530, Hemant wrote:
>>> On 09/03/2013 02:47 PM, Masami Hiramatsu wrote:
>>>> Indeed, and also I'd like to know what versions of SDT this support,
>>>> and where we can see the technical document of that. As far as I know,
>>>> the previous(?) SDT implementation also involves ugly semaphores.
>>>> Have that already gone?
>> It seems it's not. I see the SDT v3 document still mentions semaphores.
> It mentions them, but should normally not be used. They are there for
> dtrace (source) compatibility. And you don't have to use them.
>
> Since normally a SDT probe marker is just a NOP it doesn't have any
> overhead. But if you want to add complicated arguments that you would
> normally not generate in your code, then you might want to add a
> semaphore. That way you can have probes with a bit more overhead that
> still have zero overhead when not being probed.
>
> Note that if you use the normal DTRACE_PROBE macros no semaphore will be
> inserted. And you can opt to not support probes that have a semaphore in
> perf if you think that is easier (just check the semaphore link-time
> address for the probe, it should normally be zero). Just warn: "No way I
> am going to probe something that might have a little extra overhead! I
> am no debugger..." :)

I agree. There will be an extra overhead but there may be some important
markers (on which we need to probe) may be worth this overhead?

Thanks
Hemant

>
>>> This link shows an example of marker probing with Systemtap:
>>> https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
>> I think the link below would be more helpful for us :)
>>
>> http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
> Yes, that should be the canonical description.
>
> Cheers,
>
> Mark
>

2013-09-04 18:52:53

by Mark Wielaard

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On Wed, 2013-09-04 at 23:42 +0530, Hemant wrote:
> On 09/04/2013 01:55 PM, Mark Wielaard wrote:
> > Note that if you use the normal DTRACE_PROBE macros no semaphore will be
> > inserted. And you can opt to not support probes that have a semaphore in
> > perf if you think that is easier (just check the semaphore link-time
> > address for the probe, it should normally be zero). Just warn: "No way I
> > am going to probe something that might have a little extra overhead! I
> > am no debugger..." :)
>
> I agree. There will be an extra overhead but there may be some important
> markers (on which we need to probe) may be worth this overhead?

Yes, there maybe. And gdb and stap do support them. But it means not
just setting the probe, but also incrementing (and decrementing) the
semaphore. See "Semaphore Handling" under
https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation

Which is extra work, so for a minimal implementation that just supports
normal (no-overhead) probes you might want to skip the extra work
required to support them. I believe they are normally not used. I
wouldn't recommend them and when I have added SDT probes myself I never
used/needed them, but I haven't actually looked what others do.

Cheers,

Mark

2013-09-04 20:39:41

by Hemant Kumar

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

On 09/05/2013 12:22 AM, Mark Wielaard wrote:
> On Wed, 2013-09-04 at 23:42 +0530, Hemant wrote:
>> On 09/04/2013 01:55 PM, Mark Wielaard wrote:
>>> Note that if you use the normal DTRACE_PROBE macros no semaphore will be
>>> inserted. And you can opt to not support probes that have a semaphore in
>>> perf if you think that is easier (just check the semaphore link-time
>>> address for the probe, it should normally be zero). Just warn: "No way I
>>> am going to probe something that might have a little extra overhead! I
>>> am no debugger..." :)
>> I agree. There will be an extra overhead but there may be some important
>> markers (on which we need to probe) may be worth this overhead?
> Yes, there maybe. And gdb and stap do support them. But it means not
> just setting the probe, but also incrementing (and decrementing) the
> semaphore. See "Semaphore Handling" under
> https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
>
> Which is extra work, so for a minimal implementation that just supports
> normal (no-overhead) probes you might want to skip the extra work
> required to support them. I believe they are normally not used. I
> wouldn't recommend them and when I have added SDT probes myself I never
> used/needed them, but I haven't actually looked what others do.

Hmm, I agree as normally they aren't used. Also in normal usage, they
aren't needed. Avoiding this seems the right choice for now. Will just
filter out them as suggested by Masami.

Thanks
Hemant

>
> Cheers,
>
> Mark
>

2013-09-04 23:42:11

by Andi Kleen

[permalink] [raw]
Subject: Re: [RFC PATCH 0/2] Perf support to SDT markers

Hemant Kumar Shaw <[email protected]> writes:

> This series adds support to perf to list and probe into the SDT markers.
> The first patch implements listing of all the SDT markers present in
> the ELFs (executables or libraries). The SDT markers are present in the
> .note.stapsdt section of the elf. That section can be traversed to list
> all the markers. Recognition of markers follows the SystemTap approach.
>
> The second patch will allow perf to probe into these markers. This is
> done by writing the marker name and its offset into the
> uprobe_events file in the tracing directory.
> Then, perf tools can be used to analyze perf.data file.

Nice feature

-Andi

--
[email protected] -- Speaking for myself only

Subject: Re: Re: [RFC PATCH 0/2] Perf support to SDT markers

(2013/09/05 3:08), Hemant wrote:
> On 09/04/2013 02:09 PM, Masami Hiramatsu wrote:
>> (2013/09/04 17:25), Mark Wielaard wrote:
>>> On Wed, 2013-09-04 at 15:49 +0900, Namhyung Kim wrote:
>>>> On Tue, 03 Sep 2013 18:53:17 +0530, Hemant wrote:
>>>>> On 09/03/2013 02:47 PM, Masami Hiramatsu wrote:
>>>>>> Indeed, and also I'd like to know what versions of SDT this support,
>>>>>> and where we can see the technical document of that. As far as I know,
>>>>>> the previous(?) SDT implementation also involves ugly semaphores.
>>>>>> Have that already gone?
>>>> It seems it's not. I see the SDT v3 document still mentions semaphores.
>>> It mentions them, but should normally not be used. They are there for
>>> dtrace (source) compatibility. And you don't have to use them.
>>>
>>> Since normally a SDT probe marker is just a NOP it doesn't have any
>>> overhead. But if you want to add complicated arguments that you would
>>> normally not generate in your code, then you might want to add a
>>> semaphore. That way you can have probes with a bit more overhead that
>>> still have zero overhead when not being probed.
>>>
>>> Note that if you use the normal DTRACE_PROBE macros no semaphore will be
>>> inserted. And you can opt to not support probes that have a semaphore in
>>> perf if you think that is easier (just check the semaphore link-time
>>> address for the probe, it should normally be zero). Just warn: "No way I
>>> am going to probe something that might have a little extra overhead! I
>>> am no debugger..." :)
>> OK, I see. And in that case, we'd better filter out the markers which
>> use a semaphore when list it up, since we can not enable it.
>>
>> Thank you,
>>
>>
>
> But isn't it a better idea to handle semaphores, because there can be
> many important markers using semaphores and people may still want to
> probe on them?

Yeah, I didn't mean that all markers with semaphore keep disabled forever.
Instead, I meant that should not be for the first step. And in the first step,
we'd better support only markers without semaphore, and to avoid confusing
users, it should list only probable markers at this point.

I agree that we need to continue discussing how we handle those semaphores.

Thank you!

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2013-09-06 06:37:59

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

Hi Hemant,

On Wed, 04 Sep 2013 23:07:57 +0530, Hemant wrote:
> On 09/04/2013 12:12 PM, Namhyung Kim wrote:
>> On Tue, 03 Sep 2013 13:06:55 +0530, Hemant Kumar wrote:
>>> + /*
>>> + * Look for Section type = SHT_NOTE, flags = no SHF_ALLOC
>>> + * and name = .note.stapsdt
>>> + */
>>> + scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
>>> + if (scn == NULL) {
>>> + pr_err("%s section not found!\n", NOTE_SCN);
>>> + goto out_end;
>>> + }
>>> +
>>> + if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
>>> + goto out_end;
>>> +
>>> + data = elf_getdata(scn, NULL);
>>> +
>>> + /* Get the notes */
>>> + for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
>>> + &desc_off)) > 0; offset = next) {
>>> + tmp = populate_note(&elf, (const char *)((long)(data->d_buf) +
>>> + (long)desc_off),
>>> + nhdr.n_descsz, nhdr.n_type);
>> Shouldn't we check the name of note being "stapsdt" as well as version
>> (type) 3?
>
> Since, we are already fetching the section NOTE_SCN (".note.stapsdt")
> and then we check for the type being SHT_NOTE and SHF_ALLOC, is it
> required to do the same for the individual notes?

I don't know. Now it seems only includes SDT notes with name being
"stapsdt" and type being 3. But things can be changed in future..

Thanks,
Namhyung

2013-09-06 08:41:49

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

Hi Namhyung,

On 09/06/2013 12:07 PM, Namhyung Kim wrote:
> Hi Hemant,
>
> On Wed, 04 Sep 2013 23:07:57 +0530, Hemant wrote:
>> On 09/04/2013 12:12 PM, Namhyung Kim wrote:
>>> On Tue, 03 Sep 2013 13:06:55 +0530, Hemant Kumar wrote:
>>>> + /*
>>>> + * Look for Section type = SHT_NOTE, flags = no SHF_ALLOC
>>>> + * and name = .note.stapsdt
>>>> + */
>>>> + scn = elf_section_by_name(elf, &ehdr, &shdr, NOTE_SCN, NULL);
>>>> + if (scn == NULL) {
>>>> + pr_err("%s section not found!\n", NOTE_SCN);
>>>> + goto out_end;
>>>> + }
>>>> +
>>>> + if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC))
>>>> + goto out_end;
>>>> +
>>>> + data = elf_getdata(scn, NULL);
>>>> +
>>>> + /* Get the notes */
>>>> + for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
>>>> + &desc_off)) > 0; offset = next) {
>>>> + tmp = populate_note(&elf, (const char *)((long)(data->d_buf) +
>>>> + (long)desc_off),
>>>> + nhdr.n_descsz, nhdr.n_type);
>>> Shouldn't we check the name of note being "stapsdt" as well as version
>>> (type) 3?
>> Since, we are already fetching the section NOTE_SCN (".note.stapsdt")
>> and then we check for the type being SHT_NOTE and SHF_ALLOC, is it
>> required to do the same for the individual notes?
> I don't know. Now it seems only includes SDT notes with name being
> "stapsdt" and type being 3. But things can be changed in future..
>
> Thanks,
> Namhyung
>

Yeah that may be a case. Ok, will put a check on the individual markers
for their type.

Thanks
Hemant

2013-09-15 11:28:21

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

Hi Masami,

On 09/04/2013 01:31 PM, Masami Hiramatsu wrote:
> (2013/09/04 15:42), Namhyung Kim wrote:
>> On Tue, 03 Sep 2013 13:06:55 +0530, Hemant Kumar wrote:
>>
>> [SNIP]
>>
>>> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
>>> index e8a66f9..3d8dcdf 100644
>>> --- a/tools/perf/builtin-probe.c
>>> +++ b/tools/perf/builtin-probe.c
>>> @@ -55,6 +55,7 @@ static struct {
>>> bool show_funcs;
>>> bool mod_events;
>>> bool uprobes;
>>> + bool sdt;
>>> int nevents;
>>> struct perf_probe_event events[MAX_PROBES];
>>> struct strlist *dellist;
>>> @@ -325,6 +326,8 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>>> opt_set_filter),
>>> OPT_CALLBACK('x', "exec", NULL, "executable|path",
>>> "target executable name or path", opt_set_target),
>>> + OPT_BOOLEAN('S', "sdt", &params.sdt,
>>> + "Show and probe on the SDT markers"),
>> You need to add it to Documentation/perf-probe.txt too. In addition if
>> the --sdt option is only able to work with libelf, it should be wrapped
>> into the #ifdef LIBELF_SUPPORT pair.
>>
>> And I'm not sure that it's a good idea to have two behavior on a single
>> option (S) - show and probe (add). Maybe it can be separated into two
>> or the S option can be used as a flag with existing --list and --add
>> option?
>>
> Good catch! :)
> No, that is really bad idea. All probes must be added by "--add" action.
> So we need a new probe syntax for specifying sdt marker.
>
> How about the below syntax?
>
> [EVENT=]%PROVIDER:MARKER [ARG ...]
>
> Of course, this will require to list up all markers with "%" prefix for
> continuity.
>
> And since --list option is to list up all existing(defined) probe events,
> I think --markers (as like as --funcs) is better for listing it up.
>
> Thank you!
>

I have one doubt here. Why do we need [ARG ...] in the syntax you
specified? I believe these args are to fetched from the sdt notes'
section of the elf of the executable/library. Or am I taking this in a
wrong way and this suggested syntax is actually for the uprobe_events
file in the tracing directory?

--
Thanks
Hemant

Subject: Re: [PATCH 1/2] SDT markers listing by perf

(2013/09/15 20:28), Hemant wrote:
> Hi Masami,

Hi, and sorry for replying so late. I missed this in my mailbox.

> On 09/04/2013 01:31 PM, Masami Hiramatsu wrote:
>> (2013/09/04 15:42), Namhyung Kim wrote:
>>> On Tue, 03 Sep 2013 13:06:55 +0530, Hemant Kumar wrote:
>>>
>>> [SNIP]
>>>
>>>> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
>>>> index e8a66f9..3d8dcdf 100644
>>>> --- a/tools/perf/builtin-probe.c
>>>> +++ b/tools/perf/builtin-probe.c
>>>> @@ -55,6 +55,7 @@ static struct {
>>>> bool show_funcs;
>>>> bool mod_events;
>>>> bool uprobes;
>>>> + bool sdt;
>>>> int nevents;
>>>> struct perf_probe_event events[MAX_PROBES];
>>>> struct strlist *dellist;
>>>> @@ -325,6 +326,8 @@ int cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
>>>> opt_set_filter),
>>>> OPT_CALLBACK('x', "exec", NULL, "executable|path",
>>>> "target executable name or path", opt_set_target),
>>>> + OPT_BOOLEAN('S', "sdt", &params.sdt,
>>>> + "Show and probe on the SDT markers"),
>>> You need to add it to Documentation/perf-probe.txt too. In addition if
>>> the --sdt option is only able to work with libelf, it should be wrapped
>>> into the #ifdef LIBELF_SUPPORT pair.
>>>
>>> And I'm not sure that it's a good idea to have two behavior on a single
>>> option (S) - show and probe (add). Maybe it can be separated into two
>>> or the S option can be used as a flag with existing --list and --add
>>> option?
>>>
>> Good catch! :)
>> No, that is really bad idea. All probes must be added by "--add" action.
>> So we need a new probe syntax for specifying sdt marker.
>>
>> How about the below syntax?
>>
>> [EVENT=]%PROVIDER:MARKER [ARG ...]
>>
>> Of course, this will require to list up all markers with "%" prefix for
>> continuity.
>>
>> And since --list option is to list up all existing(defined) probe events,
>> I think --markers (as like as --funcs) is better for listing it up.
>>
>> Thank you!
>>
>
> I have one doubt here. Why do we need [ARG ...] in the syntax you
> specified? I believe these args are to fetched from the sdt notes'
> section of the elf of the executable/library. Or am I taking this in a
> wrong way and this suggested syntax is actually for the uprobe_events
> file in the tracing directory?

Hm, indeed. Since all the arguments of the marker is defined in sdt notes,
we actually don't need to specify each of them. However, other probe syntax
has those arguments. I'd like to keep the same syntax style in the
same command (action) for avoiding confusion.
I recommend this way; at the first step, we just find the marker address from
sdt. And next, we will make the argument available. And eventually,
it is better to introduce "$args" meta argument to fetch all the arguments
of the marker.

At this point, we can do

perf probe %foo:bar $args

to trace full information from the marker foo:bar.

Thank you,

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2013-09-25 06:04:08

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/2] SDT markers listing by perf

On 09/25/2013 10:07 AM, Masami Hiramatsu wrote:
> (2013/09/15 20:28), Hemant wrote:
>> Hi Masami,
> Hi, and sorry for replying so late. I missed this in my mailbox.
>
>> On 09/04/2013 01:31 PM, Masami Hiramatsu wrote:
>>> (2013/09/04 15:42), Namhyung Kim wrote:
>>>> [SNIP]
>>>> You need to add it to Documentation/perf-probe.txt too. In addition if
>>>> the --sdt option is only able to work with libelf, it should be wrapped
>>>> into the #ifdef LIBELF_SUPPORT pair.
>>>>
>>>> And I'm not sure that it's a good idea to have two behavior on a single
>>>> option (S) - show and probe (add). Maybe it can be separated into two
>>>> or the S option can be used as a flag with existing --list and --add
>>>> option?
>>>>
>>> Good catch! :)
>>> No, that is really bad idea. All probes must be added by "--add" action.
>>> So we need a new probe syntax for specifying sdt marker.
>>>
>>> How about the below syntax?
>>>
>>> [EVENT=]%PROVIDER:MARKER [ARG ...]
>>>
>>> Of course, this will require to list up all markers with "%" prefix for
>>> continuity.
>>>
>>> And since --list option is to list up all existing(defined) probe events,
>>> I think --markers (as like as --funcs) is better for listing it up.
>>>
>>> Thank you!
>>>
>> I have one doubt here. Why do we need [ARG ...] in the syntax you
>> specified? I believe these args are to fetched from the sdt notes'
>> section of the elf of the executable/library. Or am I taking this in a
>> wrong way and this suggested syntax is actually for the uprobe_events
>> file in the tracing directory?
> Hm, indeed. Since all the arguments of the marker is defined in sdt notes,
> we actually don't need to specify each of them. However, other probe syntax
> has those arguments. I'd like to keep the same syntax style in the
> same command (action) for avoiding confusion.

Hmm, got it.

> I recommend this way; at the first step, we just find the marker address from
> sdt. And next, we will make the argument available. And eventually,
> it is better to introduce "$args" meta argument to fetch all the arguments
> of the marker.
>
> At this point, we can do
>
> perf probe %foo:bar $args

So, at first step (ignoring the arguments), we can go with :
perf probe %foo:bar

And, once, the argument support is enabled (all the arguments will be
fetched at the marker location), we can go with:
perf probe %foo:bar $args

>
> to trace full information from the marker foo:bar.
>
> Thank you,
>


--
Thanks
Hemant Kumar Shaw

Subject: Re: [PATCH 1/2] SDT markers listing by perf

(2013/09/25 15:03), Hemant wrote:
>>> I have one doubt here. Why do we need [ARG ...] in the syntax you
>>> specified? I believe these args are to fetched from the sdt notes'
>>> section of the elf of the executable/library. Or am I taking this in a
>>> wrong way and this suggested syntax is actually for the uprobe_events
>>> file in the tracing directory?
>> Hm, indeed. Since all the arguments of the marker is defined in sdt notes,
>> we actually don't need to specify each of them. However, other probe syntax
>> has those arguments. I'd like to keep the same syntax style in the
>> same command (action) for avoiding confusion.
>
> Hmm, got it.
>
>> I recommend this way; at the first step, we just find the marker address from
>> sdt. And next, we will make the argument available. And eventually,
>> it is better to introduce "$args" meta argument to fetch all the arguments
>> of the marker.
>>
>> At this point, we can do
>>
>> perf probe %foo:bar $args
>
> So, at first step (ignoring the arguments), we can go with :
> perf probe %foo:bar

Right,

> And, once, the argument support is enabled (all the arguments will be
> fetched at the marker location), we can go with:
> perf probe %foo:bar $args

Correct ;). In my plan, $parms and $vars will be also introduced for
accessing all function parameters and local variables correspondingly.

Thank you!

--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]