2014-07-18 10:18:01

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH v2 0/3] perf/sdt : Support for SDT markers

This patchset helps in listing dtrace style markers(SDT) present in user space
applications through perf.
Notes/markers are placed at important places by the
developers. They have a negligible overhead when not enabled.
We can enable them and probe at these places and find some important information
like the arguments' values, etc.

We have lots of applications which use SDT markers today, like:
Postgresql, MySql, Mozilla, Perl, Python, Java, Ruby, libvirt, QEMU, glib

To add SDT markers into user applications:
We need to have this header sys/sdt.h present.
sys/sdt.h used is version 3.
If not present, install systemtap-sdt-devel package (for fedora-18).

Please refer to the Documentation patch to see how the SDT markers are added into
a program.

With this patchset,
- Use perf to list the markers in the app:
# perf list sdt ./user_app

./user_app :
%user_app:foo_start
%user_app:fun_start

- Also, we can see the SDT markers present in our system in the usual binaries.
These usual binaries are libraries (dsos) listed by ldconfig --print-cache and some
binaries present in PATH environment variable.

First, scan the binaries using :
# perf list sdt --scan

Creating a cache of SDT markers...
perf sdt cache created!
Use : "perf list sdt"
to see the SDT markers

After the sdt cache file is created, use perf list to view the markers :
# perf list sdt

%rtld : init_start
%rtld : init_complete
%rtld : map_failed
%rtld : map_start
%rtld : lll_futex_wake
...
...
%libgcc : unwind
%libvirt : rpc_server_client_auth_allow
%libvirt : rpc_server_client_auth_fail
%libvirt : rpc_server_client_auth_deny

Also, by using "perf list", we can view all the SDT markers as events along with all
the other events present in the system.
Alternatively, one can view the /var/cache/perf/perf-sdt.cache directly.

This link shows an example of marker probing with Systemtap:
https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps

Also, this link provides important info regarding SDT notes:
http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation

- Markers in binaries :
These SDT markers are present in the ELF in the section named
".note.stapsdt".
Here, the name of the marker, its provider, type, location, base
address, semaphore address.
We can retrieve these values using the members name_off and desc_off in
Nhdr structure. If these are not enabled, they are present in the ELF as nop.

Changes since last series :
- Made the changes as discussed with Masami and Namhyung.
- Now, "perf list" will also display the SDT markers as events along with other events.
- Made some optimazations.

TODO:
- Add support to probe these SDT markers as if they are events.
- Recognizing arguments and support to probe on them.
---

Hemant Kumar (3):
This patch enables perf to list the SDT markers present in a system. It looks
This patch enables perf to look for SDT markers in a single file.
This patch adds the required documentation for perf support to SDT markers.


tools/perf/Documentation/SDT-markers.txt | 123 +++++
tools/perf/Documentation/perf-list.txt | 7
tools/perf/Makefile.perf | 1
tools/perf/builtin-list.c | 6
tools/perf/util/parse-events.h | 3
tools/perf/util/sdt.c | 778 ++++++++++++++++++++++++++++++
tools/perf/util/symbol-elf.c | 229 +++++++++
tools/perf/util/symbol.h | 19 +
8 files changed, 1164 insertions(+), 2 deletions(-)
create mode 100644 tools/perf/Documentation/SDT-markers.txt
create mode 100644 tools/perf/util/sdt.c

--


2014-07-18 10:19:33

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH v2 1/3] perf/sdt : Listing of SDT markers by perf

This patch enables perf to list the SDT markers present in a system. It looks
in dsos given by ldconfig --print-cache and for other binaries, it looks into
the PATH environment variable. After preparing a list of the binaries, then
it starts searching for SDT markers in them.
To find the SDT markers, first an elf section named .note.stapsdt is searched
for. And then the SDT notes are retreived one by one from that section.
To counter the effect of prelinking, the section ".stapsdt.base" is searched.
If its found, then the location of the SDT marker is adjusted.

All these markers' info is written into a cache file
"/var/cache/perf/perf-sdt.cache".
Since, the presence of SDT markers is quite common these days, hence, its better
to make them visible to a user easily. Also, creating a cache file will help a user
to probe (to be implemented) these markers without much hussle. This cache file will
hold most of the SDT markers.

The format of each SDT cache entry is -

%provider:marker:file_path:build_id:location:semaphore_loc

% - marks the beginning of each entry.
provider - The provider name of the SDT marker.
marker - The marker name of the SDT marker.
file_path - Full/absolute path of the file in which this marker is present.
location : Adjusted location of the SDT marker inside the program.
semaphore_loc : The semaphore address if present otherwise 0x0.

This format should help when probing will be implemented. The adjusted address
from the required entry can be directly used in probing if the build_id matches.

To scan the system for SDT markers invoke :
# perf list sdt --scan

"--scan" should be used for the first time and whenever there is any change in
the files containing the SDT markers. It looks into the common binaries available
in a system.

And then use :
# perf list sdt

This displays the list of SDT markers recorded in the SDT cache.
This shows the SDT markers present in the common binaries stored in the system,
present in PATH variable and the /lib/ and /lib64/ directories.

Or use:
# perf list

It should display all events including the SDT events.

Signed-off-by : [email protected]
---
tools/perf/Makefile.perf | 1
tools/perf/builtin-list.c | 6
tools/perf/util/parse-events.h | 3
tools/perf/util/sdt.c | 503 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/symbol-elf.c | 229 ++++++++++++++++++
tools/perf/util/symbol.h | 19 ++
6 files changed, 760 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/util/sdt.c

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 9670a16..e098dcd 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -373,6 +373,7 @@ LIB_OBJS += $(OUTPUT)util/stat.o
LIB_OBJS += $(OUTPUT)util/record.o
LIB_OBJS += $(OUTPUT)util/srcline.o
LIB_OBJS += $(OUTPUT)util/data.o
+LIB_OBJS += $(OUTPUT)util/sdt.o

LIB_OBJS += $(OUTPUT)ui/setup.o
LIB_OBJS += $(OUTPUT)ui/helpline.o
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 011195e..e1e654b 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -23,7 +23,7 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_END()
};
const char * const list_usage[] = {
- "perf list [hw|sw|cache|tracepoint|pmu|event_glob]",
+ "perf list [hw|sw|cache|tracepoint|pmu|event_glob|sdt]",
NULL
};

@@ -34,6 +34,8 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)

if (argc == 0) {
print_events(NULL, false);
+ printf("\n\nSDT events :\n");
+ sdt_cache__display();
return 0;
}

@@ -55,6 +57,8 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
print_pmu_events(NULL, false);
else if (strcmp(argv[i], "--raw-dump") == 0)
print_events(NULL, true);
+ else if (strcmp(argv[i], "sdt") == 0)
+ print_sdt_events(argv[++i]);
else {
char *sep = strchr(argv[i], ':'), *s;
int sep_idx;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index df094b4..486d230 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -109,4 +109,7 @@ extern int is_valid_tracepoint(const char *event_string);

extern int valid_debugfs_mount(const char *debugfs);

+void print_sdt_events(const char *arg);
+extern void sdt_cache__display(void);
+
#endif /* __PERF_PARSE_EVENTS_H */
diff --git a/tools/perf/util/sdt.c b/tools/perf/util/sdt.c
new file mode 100644
index 0000000..f5bfdbd
--- /dev/null
+++ b/tools/perf/util/sdt.c
@@ -0,0 +1,503 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <linux/limits.h>
+#include <dirent.h>
+#include <errno.h>
+#include <sys/mman.h>
+
+#include "linux/list.h"
+#include "symbol.h"
+#include "build-id.h"
+#include "include/linux/kernel.h"
+#include "parse-events.h"
+
+#define LIB_COMM "ldconfig --print-cache"
+
+#define SDT_CACHE_DIR "/var/cache/perf/"
+#define SDT_CACHE "perf-sdt.cache"
+#define SDT_CACHE_TMP "perf-sdt.cache.tmp"
+
+#define DELIM ':'
+
+struct path_list {
+ char path[PATH_MAX];
+ struct list_head list;
+} execs;
+
+/* Write operation for cache */
+static void write_cache(FILE *cache, char *buffer)
+{
+ fprintf(cache, "%s", buffer);
+}
+
+/* copies from src to target till 'delim' */
+static int copy_delim(char *src, char *target, char delim, size_t size)
+{
+ int i;
+
+ memset(target, '\0', size);
+ for (i = 0; src[i] != delim; i++)
+ target[i] = src[i];
+
+ return i + 1;
+}
+
+/*
+ * get_sdt_note_info() is the function actually responsible for
+ * flushing the SDT notes info into the "cache" file or to the
+ * stdout if "cache" points to NULL. Also, this function finds out
+ * the build-id of an ELF to be written into "cache".
+ */
+static void get_sdt_note_info(struct list_head *start, const char *target,
+ FILE *cache)
+{
+ struct sdt_note *pos;
+ u8 build_id[BUILD_ID_SIZE];
+ char sbuild_id[BUILD_ID_SIZE * 2 + 1];
+ char buffer[2 * PATH_MAX];
+
+ if (list_empty(start))
+ return;
+
+ /* Read the build id of the file */
+ if (filename__read_build_id(target, &build_id,
+ sizeof(build_id)) < 0) {
+ pr_debug("Couldn't read build-id in %s\n", target);
+ return;
+ }
+ /* Convert the build id into a string */
+ build_id__sprintf(build_id, sizeof(build_id), sbuild_id);
+
+ list_for_each_entry(pos, start, note_list) {
+ sprintf(buffer, "%%%s:%s:%s:%s:0x%lx:0x%lx",
+ pos->provider, pos->name, target, sbuild_id,
+ pos->bit32 ? pos->addr.a32[0] :
+ pos->addr.a64[0],
+ pos->bit32 ? pos->addr.a32[2] :
+ pos->addr.a64[2]);
+
+ /*
+ * Format of any line of this sdt-cache :
+ * %provider:marker:filename:build-id:location:semaphore
+ */
+ write_cache(cache, buffer);
+ }
+}
+
+/* Free the sdt note list */
+static void cleanup_sdt_note_list(struct list_head *sdt_notes)
+{
+ struct sdt_note *tmp, *pos;
+
+ if (list_empty(sdt_notes))
+ return;
+
+ list_for_each_entry_safe(pos, tmp, sdt_notes, note_list) {
+ list_del(&pos->note_list);
+ free(pos->name);
+ free(pos->provider);
+ free(pos);
+ }
+}
+
+/*
+ * filename__find_sdt() looks for sdt markers and the list is stored
+ * in sdt_notes. The fd passed here is the file in which the info
+ * about the SDT markers is filled up.
+ */
+static int filename__find_sdt(const char *target, FILE *cache)
+{
+ int ret;
+
+ LIST_HEAD(sdt_notes);
+
+ ret = get_sdt_note_list(&sdt_notes, target);
+ if (!ret)
+ get_sdt_note_info(&sdt_notes, target, cache);
+
+ cleanup_sdt_note_list(&sdt_notes);
+ return ret;
+}
+
+/*
+ * Finds out the libraries present in a system as shown by the command
+ * "ldconfig --print-cache". Uses "=>" and '/' to find out the start of a
+ * dso path.
+ */
+static char *parse_lib_name(char *str)
+{
+ char *ptr, *q, *r;
+
+ while (str != NULL) {
+ /* look for "=>" and then '/' */
+ ptr = strstr(str, "=>");
+ if (ptr) {
+ ptr++;
+ q = strchr(ptr, '/');
+ if (!q)
+ return NULL;
+ r = strchr(ptr, '\n');
+ *r = '\0';
+ return q;
+ } else if (ptr == NULL) {
+ return NULL;
+ } else {
+ str = ptr + 1;
+ continue;
+ }
+ }
+
+ return NULL;
+}
+
+/*
+ * Checks if a path is already present in the list.
+ * Returns 'true' if present and 'false' otherwise.
+ */
+static bool is_present_in_list(struct list_head *path_list, char *path)
+{
+ struct path_list *tmp;
+
+ list_for_each_entry(tmp, path_list, list) {
+ if (!strcmp(path, tmp->path))
+ return true;
+ }
+
+ return false;
+}
+
+static inline void append_path(char *path, struct list_head *list)
+{
+ char *res_path = NULL;
+ struct path_list *tmp = NULL;
+
+ res_path = (char *)malloc(sizeof(char) * PATH_MAX);
+
+ if (!res_path)
+ return;
+
+ memset(res_path, '\0', PATH_MAX);
+
+ if (realpath(path, res_path) && !is_present_in_list(list, res_path)) {
+ tmp = (struct path_list *) malloc(sizeof(struct path_list));
+ if (!tmp) {
+ free(res_path);
+ return;
+ }
+ strcpy(tmp->path, res_path);
+ list_add(&(tmp->list), list);
+ if (res_path)
+ free(res_path);
+ }
+}
+
+static void flush_sdt_list(struct list_head *head, FILE *cache)
+{
+ struct path_list *tmp;
+
+ list_for_each_entry(tmp, head, list)
+ filename__find_sdt(tmp->path, cache);
+}
+
+static void cleanup_path_list(struct list_head *head)
+{
+ struct path_list *tmp, *pos;
+
+ list_for_each_entry_safe(tmp, pos, head, list) {
+ list_del(&tmp->list);
+ free(tmp);
+ }
+}
+
+/*
+ * Finds SDT markers in the dsos which are present in a system.
+ * "cache" is the stream which is filled up with all these SDT info.
+ */
+static void find_sdt_in_dsos(FILE *cache)
+{
+ FILE *fp;
+ size_t len = 0;
+ char *line, *path = NULL;
+ struct path_list lib_list;
+
+ INIT_LIST_HEAD(&lib_list.list);
+
+ /*
+ * Run the command ldconfig to find out the libraries in
+ * a system from the file ld.so.cache
+ */
+ fp = popen(LIB_COMM, "r");
+ if (fp == NULL) {
+ printf("Error in running %s\n", LIB_COMM);
+ return;
+ }
+
+ /*
+ * Take a line from the o/p of ldconfig and
+ * parse it to find a library path
+ */
+ while (getline(&line, &len, fp) != -1) {
+ path = parse_lib_name(line);
+ if (path)
+ append_path(path, &lib_list.list);
+ }
+ /* line and fp no more required */
+ free(line);
+ pclose(fp);
+
+ /* Find and write the SDT markers in the perf-sdt cache*/
+ flush_sdt_list(&lib_list.list, cache);
+
+ cleanup_path_list(&lib_list.list);
+ return;
+}
+
+/*
+ * Go through all the files inside "path".
+ * But don't go into sub-directories.
+ */
+static void walk_through_dir(char *path)
+{
+ struct dirent *entry;
+ DIR *dir;
+ struct stat sb;
+ int ret = 0;
+ char *swd;
+
+ dir = opendir(path);
+ if (!dir)
+ return;
+
+ /* save the current working directory */
+ swd = getcwd(NULL, 0);
+ if (!swd) {
+ pr_err("getcwd : error");
+ return;
+ }
+
+ ret = chdir(path);
+ if (ret) {
+ pr_err("chdir : error in %s", path);
+ return;
+ }
+ while ((entry = readdir(dir)) != NULL) {
+
+ ret = stat(entry->d_name, &sb);
+ if (ret == -1) {
+ pr_debug("%s : error in stat!\n", entry->d_name);
+ continue;
+ }
+
+ /* Not pursuing sub-directories */
+ if ((sb.st_mode & S_IFMT) != S_IFDIR)
+ if (sb.st_mode & S_IXUSR)
+ append_path(entry->d_name, &execs.list);
+ }
+
+ closedir(dir);
+
+ /* return to the saved working directory */
+ ret = chdir(swd);
+ if (ret) {
+ pr_err("chdir : error");
+ return;
+ }
+}
+
+/*
+ * parse_list() parses the list obtained from PATH and for each
+ * path, we call walk_through_dir() to traverse through the this
+ * path's subtree and obtain the list of all Elfs.
+ */
+static void parse_list(char *exe_list)
+{
+ struct path_list dir_exe_list, *tmp;
+ char *path = exe_list;
+ char *next_path;
+
+ INIT_LIST_HEAD(&dir_exe_list.list);
+
+ if (!exe_list)
+ return;
+
+ while (path) {
+ /* Parse the string based on ':' */
+ next_path = strchr(path, DELIM);
+ if (next_path) {
+ *next_path = '\0';
+ next_path++;
+ }
+
+ append_path(path, &(dir_exe_list.list));
+ path = next_path;
+ }
+
+ /* Traversing the directory paths */
+ list_for_each_entry(tmp, &dir_exe_list.list, list)
+ walk_through_dir(tmp->path);
+
+ cleanup_path_list(&dir_exe_list.list);
+}
+
+/*
+ * Obtain the list of paths from the PATH env variable
+ */
+static int find_sdt_in_execs(FILE *cache)
+{
+ char *list = NULL;
+ const char *env = "PATH";
+
+ INIT_LIST_HEAD(&execs.list);
+
+ list = getenv(env);
+ if (list)
+ parse_list(list);
+
+ flush_sdt_list(&execs.list, cache);
+ cleanup_path_list(&execs.list);
+
+ return 0;
+}
+
+/*
+ * find_sdt_in_bins() : this function calls diff functions to find
+ * out the SDT markers in different binaries and writes it to a
+ * temporary file.
+ * If everything goes smooth, this temporary file is renamed to the
+ * original cache file name.
+ */
+static int find_sdt_in_bins(void)
+{
+ FILE *cache;
+ struct stat buf;
+ int ret;
+
+ ret = stat(SDT_CACHE_DIR, &buf);
+ if (ret) {
+ ret = mkdir(SDT_CACHE_DIR, buf.st_mode);
+ if (ret) {
+ pr_err("%s : %s\n", SDT_CACHE_DIR, strerror(errno));
+ return -errno;
+ }
+ }
+
+ cache = fopen(SDT_CACHE_DIR SDT_CACHE_TMP, "w");
+ if (!cache) {
+ printf("Error in creating/opening the %s file!\n",
+ SDT_CACHE_DIR SDT_CACHE_TMP);
+ return -errno;
+ }
+
+ find_sdt_in_dsos(cache);
+ find_sdt_in_execs(cache);
+
+ if (!fclose(cache)) {
+ ret = rename(SDT_CACHE_DIR SDT_CACHE_TMP,
+ SDT_CACHE_DIR SDT_CACHE);
+ if (ret)
+ pr_err("%s : %s\n", SDT_CACHE_TMP, strerror(errno));
+
+ } else {
+ pr_err("%s : %s\n", SDT_CACHE_TMP, strerror(errno));
+ ret = -errno;
+ }
+
+ return ret;
+}
+
+/* Take the entries from the sdt cache and display them */
+void sdt_cache__display(void)
+{
+ struct stat sb;
+ int len;
+ int fd;
+ int ret;
+ char *data, *ptr, *p;
+ char provider[PATH_MAX], marker[PATH_MAX], file_name[PATH_MAX];
+
+ fd = open(SDT_CACHE_DIR SDT_CACHE, O_RDONLY);
+ if (fd == -1) {
+ pr_err("Error in opening %s\n", SDT_CACHE_DIR SDT_CACHE);
+ pr_err("Please run\n perf list sdt --scan\n");
+ return;
+ }
+
+ ret = fstat(fd, &sb);
+ if (ret == -1) {
+ pr_err("Error in fstat\n");
+ return;
+ }
+
+ if (!S_ISREG(sb.st_mode)) {
+ pr_err("%s is not a file\n", SDT_CACHE_DIR SDT_CACHE);
+ return;
+ }
+
+ data = mmap(0, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
+ if (data == MAP_FAILED) {
+ pr_err("Error in mmap\n");
+ return;
+ }
+
+ ret = madvise(data, sb.st_size, MADV_SEQUENTIAL);
+ if (ret == -1) {
+ pr_debug("Error in madvise\n");
+ return;
+ }
+
+ ret = close(fd);
+ if (ret == -1) {
+ pr_err("Error in close\n");
+ return;
+ }
+
+ ptr = strchr(data, '%');
+ while (ptr) {
+ p = ++ptr;
+ len = copy_delim(p, provider, ':', PATH_MAX);
+ p += len;
+ len = copy_delim(p, marker, ':', PATH_MAX);
+ p += len;
+ len = copy_delim(p, file_name, ':', PATH_MAX);
+ p += len;
+ printf("%%%s : %s \t \t (@%s)\n", provider, marker, file_name);
+ ptr = strchr(ptr, '%');
+ }
+
+ ret = munmap(data, sb.st_size);
+ if (ret == -1) {
+ pr_err("Error in munmap\n");
+ return;
+ }
+
+}
+
+void print_sdt_events(const char *arg)
+{
+ int ret;
+
+ if (arg) {
+ if (!strcmp(arg, "--scan")) {
+ ret = find_sdt_in_bins();
+ if (!ret) {
+ printf("perf sdt cache created!\n Use : ");
+ printf("\"perf list sdt\""
+ "\n to see the SDT markers\n");
+ } else {
+ pr_err("Error in creating the cache"
+ " for sdt markers\n");
+ }
+ } else {
+ pr_err("%s: Unknown argument\n", arg);
+ return;
+ }
+ } else {
+ sdt_cache__display();
+ }
+ return;
+}
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 6864661..7173c5b 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1619,6 +1619,235 @@ void kcore_extract__delete(struct kcore_extract *kce)
unlink(kce->extract_filename);
}

+static int populate_sdt_note(Elf **elf, const char *data, size_t len, int type,
+ struct sdt_note **note)
+{
+ const char *provider, *name;
+ struct sdt_note *tmp = NULL;
+ GElf_Ehdr ehdr;
+ GElf_Addr base_off = 0;
+ GElf_Shdr shdr;
+ int ret = -1;
+ int i;
+
+ union {
+ Elf64_Addr a64[3];
+ Elf32_Addr a32[3];
+ } buf;
+
+ Elf_Data dst = {
+ .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
+ .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
+ .d_off = 0, .d_align = 0
+ };
+ Elf_Data src = {
+ .d_buf = (void *) data, .d_type = ELF_T_ADDR,
+ .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
+ .d_align = 0
+ };
+
+ /* Check the type of each of the notes */
+ if (type != SDT_NOTE_TYPE)
+ goto out_err;
+
+ tmp = (struct sdt_note *)calloc(1, sizeof(struct sdt_note));
+ if (!tmp) {
+ ret = -ENOMEM;
+ goto out_err;
+ }
+
+ INIT_LIST_HEAD(&tmp->note_list);
+
+ if (len < dst.d_size + 3)
+ goto out_free_note;
+
+ /* Translation from file representation to memory representation */
+ if (gelf_xlatetom(*elf, &dst, &src,
+ elf_getident(*elf, NULL)[EI_DATA]) == NULL)
+ printf("gelf_xlatetom : %s\n", elf_errmsg(-1));
+
+ /* Populate the fields of sdt_note */
+ provider = data + dst.d_size;
+
+ name = (const char *)memchr(provider, '\0', data + len - provider);
+ if (name++ == NULL)
+ goto out_free_note;
+
+ tmp->provider = strdup(provider);
+ if (!tmp->provider) {
+ ret = -ENOMEM;
+ goto out_free_note;
+ }
+ tmp->name = strdup(name);
+ if (!tmp->name) {
+ ret = -ENOMEM;
+ goto out_free_prov;
+ }
+
+ /* Obtain the addresses */
+ if (gelf_getclass(*elf) == ELFCLASS32) {
+ for (i = 0; i < 3; i++)
+ tmp->addr.a32[i] = buf.a32[i];
+ tmp->bit32 = true;
+ } else {
+ for (i = 0; i < 3; i++)
+ tmp->addr.a64[i] = buf.a64[i];
+ tmp->bit32 = false;
+ }
+
+ /* Now Adjust the prelink effect */
+ if (!gelf_getehdr(*elf, &ehdr)) {
+ pr_debug("%s : cannot get elf header.\n", __func__);
+ ret = -EBADF;
+ goto out_free_name;
+ }
+
+ /*
+ * Find out the .stapsdt.base section.
+ * This scn will help us to handle prelinking (if present).
+ * Compare the retrieved file offset of the base section with the
+ * base address in the description of the SDT note. If its different,
+ * then accordingly, adjust the note location.
+ */
+ if (elf_section_by_name(*elf, &ehdr, &shdr, SDT_BASE_SCN, NULL)) {
+ base_off = shdr.sh_offset;
+ if (base_off) {
+ if (tmp->bit32)
+ tmp->addr.a32[0] = tmp->addr.a32[0] + base_off -
+ tmp->addr.a32[1];
+ else
+ tmp->addr.a64[0] = tmp->addr.a64[0] + base_off -
+ tmp->addr.a64[1];
+ }
+ }
+
+ *note = tmp;
+ return 0;
+
+out_free_name:
+ free(tmp->name);
+out_free_prov:
+ free(tmp->provider);
+out_free_note:
+ free(tmp);
+out_err:
+ return ret;
+}
+
+static int construct_sdt_notes_list(Elf *elf, struct list_head *sdt_notes)
+{
+ GElf_Ehdr ehdr;
+ Elf_Scn *scn = NULL;
+ Elf_Data *data;
+ GElf_Shdr shdr;
+ size_t shstrndx, next;
+ GElf_Nhdr nhdr;
+ size_t name_off, desc_off, offset;
+ struct sdt_note *tmp = NULL;
+ int ret = 0, val = 0;
+
+ if (gelf_getehdr(elf, &ehdr) == NULL) {
+ ret = -EBADF;
+ goto out_ret;
+ }
+ if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
+ ret = -EBADF;
+ goto out_ret;
+ }
+
+ /* Look for the required section */
+ scn = elf_section_by_name(elf, &ehdr, &shdr, SDT_NOTE_SCN, NULL);
+ if (!scn) {
+ ret = -ENOENT;
+ goto out_ret;
+ }
+
+ if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC)) {
+ ret = -ENOENT;
+ goto out_ret;
+ }
+
+ data = elf_getdata(scn, NULL);
+
+ /* Get the SDT notes */
+ for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
+ &desc_off)) > 0; offset = next) {
+ if (nhdr.n_namesz == sizeof(SDT_NOTE_NAME) &&
+ !memcmp(data->d_buf + name_off, SDT_NOTE_NAME,
+ sizeof(SDT_NOTE_NAME))) {
+ val = populate_sdt_note(&elf, ((data->d_buf) + desc_off),
+ nhdr.n_descsz, nhdr.n_type,
+ &tmp);
+ if (!val)
+ list_add_tail(&tmp->note_list, sdt_notes);
+ if (val == -ENOMEM) {
+ ret = val;
+ goto out_ret;
+ }
+ }
+ }
+ if (list_empty(sdt_notes))
+ ret = -ENOENT;
+
+out_ret:
+ return ret;
+}
+
+int get_sdt_note_list(struct list_head *head, const char *target)
+{
+ Elf *elf;
+ int fd, ret;
+
+ fd = open(target, O_RDONLY);
+ if (fd < 0)
+ return -errno;
+
+ symbol__elf_init();
+ elf = elf_begin(fd, ELF_C_READ, NULL);
+ if (!elf) {
+ ret = -EBADF;
+ goto out_close;
+ }
+ ret = construct_sdt_notes_list(elf, head);
+ elf_end(elf);
+
+out_close:
+ close(fd);
+ return ret;
+}
+
+/*
+ * Returns 'true' if the file is an elf and 'false' otherwise
+ */
+bool is_an_elf(char *file)
+{
+ int fd;
+ Elf *elf;
+ bool ret = true;
+
+ fd = open(file, O_RDONLY);
+ if (fd < 0) {
+ ret = false;
+ goto out_ret;
+ }
+
+ symbol__elf_init();
+ elf = elf_begin(fd, ELF_C_READ, NULL);
+ if (!elf) {
+ ret = false;
+ goto out_close;
+ }
+ if (elf_kind(elf) != ELF_K_ELF)
+ ret = false;
+
+ elf_end(elf);
+
+out_close:
+ close(fd);
+out_ret:
+ return ret;
+}
+
void symbol__elf_init(void)
{
elf_version(EV_CURRENT);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 615c752..83be31a 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -294,4 +294,23 @@ int compare_proc_modules(const char *from, const char *to);
int setup_list(struct strlist **list, const char *list_str,
const char *list_name);

+struct sdt_note {
+ char *name;
+ char *provider;
+ bool bit32;
+ union {
+ Elf64_Addr a64[3];
+ Elf32_Addr a32[3];
+ } addr;
+ struct list_head note_list;
+};
+
+int get_sdt_note_list(struct list_head *head, const char *target);
+bool is_an_elf(char *file);
+
+#define SDT_BASE_SCN ".stapsdt.base"
+#define SDT_NOTE_SCN ".note.stapsdt"
+#define SDT_NOTE_TYPE 3
+#define SDT_NOTE_NAME "stapsdt"
+
#endif /* __PERF_SYMBOL */

2014-07-18 10:20:19

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH v2 2/3] perf/sdt: Listing SDT markers for a single file

This patch enables perf to look for SDT markers in a single file.
The previous patch looks for SDT markers in a set of default paths and records
them in a SDT cache. However, not all the SDT markers are present in the
SDT cache file.
An individual file argument can be given to "perf list" to find out the SDT markers
present in that file. Usage is as below :

# perf list sdt /home/hemant/tmp

/home/hemant/tmp:
%user : foo
%user : bar

On using this command, the entries for this file name is searched in the cache, and
the build-ids are comapred. If build-ids don't match, the entries in the cache related
to this file name are modified. The previous entries will be deleted and new entries
will be added. If previously, there were no entries for a particular filename, then
entries for that file will be added and accordingly modified.

Signed-off-by : [email protected]
---
tools/perf/util/sdt.c | 317 ++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 296 insertions(+), 21 deletions(-)

diff --git a/tools/perf/util/sdt.c b/tools/perf/util/sdt.c
index f5bfdbd..57ec767 100644
--- a/tools/perf/util/sdt.c
+++ b/tools/perf/util/sdt.c
@@ -29,6 +29,11 @@ struct path_list {
struct list_head list;
} execs;

+struct update_buf {
+ char data[2 * PATH_MAX];
+ struct list_head list;
+};
+
/* Write operation for cache */
static void write_cache(FILE *cache, char *buffer)
{
@@ -48,6 +53,239 @@ static int copy_delim(char *src, char *target, char delim, size_t size)
}

/*
+ * Takes input the beginning of a line of the cache and returns the filename,
+ * build id and the length of that line
+ */
+static int find_file_build(char *p, char *file_name, char *bid, char *end)
+{
+ int i, len;
+ char *eol, *sol = p;
+
+ /* Skip two ':' to get to the file name */
+ for (i = 0; i < 2; i++) {
+ p = strchr(p, ':');
+ p++;
+ }
+
+ len = copy_delim(p, file_name, ':', PATH_MAX);
+ p += len;
+
+ /* Just after the file name, lies the build id */
+ if (len)
+ len = copy_delim(p, bid, ':', BUILD_ID_SIZE * 2 + 1);
+
+ eol = strchr(p, '%');
+ if (!eol)
+ eol = end;
+ return eol - sol;
+}
+
+/* Prepares a list of buffers to be written into the sdt cache */
+static int prepare_buffer(const char *target, struct list_head *start,
+ char *build_id, struct list_head *update_list)
+{
+ struct update_buf *buf;
+ struct sdt_note *tmp;
+ int count, len = 0;
+
+ list_for_each_entry(tmp, start, note_list) {
+ buf = (struct update_buf *)malloc(sizeof(struct update_buf));
+ if (!buf) {
+ pr_debug("prepare_buf: Error in calloc\n");
+ return 0;
+ }
+ INIT_LIST_HEAD(&buf->list);
+ memset(buf->data, '\0', 2 * PATH_MAX);
+
+ count = sprintf(buf->data, "%%%s:%s:%s:%s:0x%lx:0x%lx",
+ tmp->provider, tmp->name, target, build_id,
+ tmp->bit32 ? tmp->addr.a32[0] :
+ tmp->addr.a64[0],
+ tmp->bit32 ? tmp->addr.a32[2] :
+ tmp->addr.a64[2]);
+ list_add(&buf->list, update_list);
+
+ len += count;
+ }
+
+ return len;
+}
+
+static void cleanup_buffer_list(struct list_head *head)
+{
+ struct update_buf *tmp, *pos;
+
+ list_for_each_entry_safe(tmp, pos, head, list) {
+ list_del(&tmp->list);
+ free(tmp);
+ }
+}
+
+/*
+ * Update the sdt cache with the new info.
+ * First part of this function searches for the entry needed to be modified.
+ * Second part focusses on modifying the lines.
+ */
+static void sdt_cache__update(struct list_head *start, const char *file)
+{
+ char *data, *ptr;
+ struct stat sb;
+ int fd, ret, i = 0, update_count = 0, count = 0;
+ char file_name[PATH_MAX], *p;
+ u8 build_id[BUILD_ID_SIZE];
+ char sbuild_id[BUILD_ID_SIZE * 2 + 1], bid[BUILD_ID_SIZE * 2 + 1];
+ int len = 0, offset = 0, final_size = 0, diff;
+ char *beg = NULL, target[PATH_MAX];
+ bool update = false;
+ struct update_buf upd, *pos;
+
+ INIT_LIST_HEAD(&upd.list);
+
+ /* Resolve the target to canonical path first */
+ memset(target, '\0', PATH_MAX);
+ if (!realpath(file, target)) {
+ pr_debug("sdt_cache__update : realpath() failed\n");
+ return;
+ }
+
+ /* Read the build id of the file */
+ if (filename__read_build_id(target, &build_id,
+ sizeof(build_id)) < 0) {
+ pr_err("Couldn't read build-id in %s\n", target);
+ return;
+ }
+ /* Convert the build id into a string */
+ build_id__sprintf(build_id, sizeof(build_id), sbuild_id);
+
+ fd = open(SDT_CACHE_DIR SDT_CACHE, O_RDWR);
+ if (fd == -1) {
+ pr_err("Error in opening %s\n", SDT_CACHE_DIR SDT_CACHE);
+ return;
+ }
+
+ ret = fstat(fd, &sb);
+ if (ret == -1) {
+ pr_err("Error in fstat\n");
+ return;
+ }
+
+ if (!S_ISREG(sb.st_mode)) {
+ pr_err("%s is not a file\n", SDT_CACHE_DIR SDT_CACHE);
+ return;
+ }
+
+ data = mmap(0, sb.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ if (data == MAP_FAILED) {
+ pr_err("Error in mmap\n");
+ return;
+ }
+
+ ptr = strchr(data, '%');
+ while (ptr) {
+ p = ptr;
+ memset(file_name, '\0', PATH_MAX);
+ /* offset contains the length of one '%' to next '%' */
+ offset = find_file_build(p, file_name, bid, data + sb.st_size);
+
+ if (!strcmp(target, file_name)) {
+ if (!strcmp(bid, sbuild_id)) {
+ /* No need to go further if the build ids match*/
+ break;
+ } else {
+ /*
+ * Build ids don't match!
+ * Add the length of this line to 'len'
+ */
+ if (!count) {
+ beg = ptr;
+ update = true;
+ len += offset;
+ count++;
+ } else {
+ len += offset;
+ count++;
+ }
+ }
+ } else if (update) {
+ break;
+ }
+
+ ptr++;
+ ptr = strchr(ptr, '%');
+ }
+
+ final_size = sb.st_size;
+
+ /* For a file which isn't present in the cache already */
+ if (!update && !ptr) {
+ beg = data + sb.st_size;
+ update = true;
+ }
+ if (update) {
+ /*
+ * 'len' keeps the account of total characters neeeded to be
+ * removed. This shifts all the characters starting from 'beg'.
+ */
+ i = 0;
+ while (beg[len + i]) {
+ beg[i] = beg[len + i];
+ i++;
+ }
+ /* Clear out the end extraneous bytes */
+ for (i = 0; i <= len; i++)
+ data[sb.st_size - i] = '\0';
+
+ /* Build the string to be updated */
+ if (start)
+ update_count = prepare_buffer(target, start, sbuild_id,
+ &upd.list);
+ diff = len;
+ /* Calculate the new size of the file */
+ final_size = sb.st_size - diff + update_count;
+
+ /* Reduce/Increase the size of the file */
+ ret = truncate(SDT_CACHE_DIR SDT_CACHE, final_size);
+ if (ret == -1) {
+ pr_debug("Error in truncate\n");
+ return;
+ }
+ /* remap the cache due to the change in size */
+ data = mremap(data, sb.st_size, final_size, MREMAP_MAYMOVE);
+ if (data == MAP_FAILED) {
+ pr_debug("Error in mremap\n");
+ return;
+ }
+
+ len = 0;
+
+ if (start)
+ /* Now update that update_buf into cache */
+ list_for_each_entry(pos, &upd.list, list) {
+ strcpy(data + sb.st_size - diff + len,
+ pos->data);
+ len += strlen(pos->data);
+ }
+
+ if (!start && (data[final_size - 1] == '%'))
+ ret = truncate(SDT_CACHE_DIR SDT_CACHE, final_size - 1);
+ /* Cleanup the buffer */
+ cleanup_buffer_list(&upd.list);
+ }
+ ret = munmap(data, final_size);
+ if (ret == -1) {
+ pr_err("Error in munmap this\n");
+ return;
+ }
+ ret = close(fd);
+ if (ret == -1) {
+ pr_err("Error in close\n");
+ return;
+ }
+
+ return;
+}
+
+/*
* get_sdt_note_info() is the function actually responsible for
* flushing the SDT notes info into the "cache" file or to the
* stdout if "cache" points to NULL. Also, this function finds out
@@ -64,29 +302,38 @@ static void get_sdt_note_info(struct list_head *start, const char *target,
if (list_empty(start))
return;

- /* Read the build id of the file */
- if (filename__read_build_id(target, &build_id,
- sizeof(build_id)) < 0) {
- pr_debug("Couldn't read build-id in %s\n", target);
- return;
+ if (cache) {
+ /* Read the build id of the file */
+ if (filename__read_build_id(target, &build_id,
+ sizeof(build_id)) < 0) {
+ pr_debug("Couldn't read build-id in %s\n", target);
+ return;
+ }
+ /* Convert the build id into a string */
+ build_id__sprintf(build_id, sizeof(build_id), sbuild_id);
+ } else {
+ printf("%s :\n", target);
}
- /* Convert the build id into a string */
- build_id__sprintf(build_id, sizeof(build_id), sbuild_id);
-
list_for_each_entry(pos, start, note_list) {
- sprintf(buffer, "%%%s:%s:%s:%s:0x%lx:0x%lx",
- pos->provider, pos->name, target, sbuild_id,
- pos->bit32 ? pos->addr.a32[0] :
- pos->addr.a64[0],
- pos->bit32 ? pos->addr.a32[2] :
- pos->addr.a64[2]);
-
- /*
- * Format of any line of this sdt-cache :
- * %provider:marker:filename:build-id:location:semaphore
- */
- write_cache(cache, buffer);
+ if (cache) {
+ sprintf(buffer, "%%%s:%s:%s:%s:0x%lx:0x%lx",
+ pos->provider, pos->name, target, sbuild_id,
+ pos->bit32 ? pos->addr.a32[0] :
+ pos->addr.a64[0],
+ pos->bit32 ? pos->addr.a32[2] :
+ pos->addr.a64[2]);
+
+ /*
+ * Format of any line of this sdt-cache :
+ * %provider:marker:filename:build-id:location:semaphore
+ */
+ write_cache(cache, buffer);
+ } else {
+ printf("%%%s : %s\n", pos->provider, pos->name);
+ }
}
+ if (!cache)
+ sdt_cache__update(start, target);
}

/* Free the sdt note list */
@@ -106,6 +353,30 @@ static void cleanup_sdt_note_list(struct list_head *sdt_notes)
}

/*
+ * Error displayed in case of query of a
+ * single file for SDT markers
+ */
+static int sdt_err(int val, const char *target)
+{
+ switch (-val) {
+ case 0:
+ break;
+ case ENOENT:
+ /* Absence of SDT markers */
+ printf("%s : No SDT markers found\n", target);
+ break;
+ case EBADF:
+ printf("%s : Bad file name\n", target);
+ break;
+ default:
+ printf("%s\n", strerror(val));
+ }
+
+ return val;
+}
+
+
+/*
* filename__find_sdt() looks for sdt markers and the list is stored
* in sdt_notes. The fd passed here is the file in which the info
* about the SDT markers is filled up.
@@ -119,6 +390,10 @@ static int filename__find_sdt(const char *target, FILE *cache)
ret = get_sdt_note_list(&sdt_notes, target);
if (!ret)
get_sdt_note_info(&sdt_notes, target, cache);
+ else if (!cache) /* using cache as flag */
+ sdt_err(ret, target);
+ if (ret == -ENOENT && !cache)
+ sdt_cache__update(NULL, target);

cleanup_sdt_note_list(&sdt_notes);
return ret;
@@ -493,7 +768,7 @@ void print_sdt_events(const char *arg)
" for sdt markers\n");
}
} else {
- pr_err("%s: Unknown argument\n", arg);
+ filename__find_sdt(arg, NULL);
return;
}
} else {

2014-07-18 10:20:45

by Hemant Kumar

[permalink] [raw]
Subject: [PATCH v2 3/3] perf/sdt: Documentation

This patch adds the required documentation for perf support to SDT markers.

---
tools/perf/Documentation/SDT-markers.txt | 123 ++++++++++++++++++++++++++++++
tools/perf/Documentation/perf-list.txt | 7 +-
2 files changed, 129 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/Documentation/SDT-markers.txt

diff --git a/tools/perf/Documentation/SDT-markers.txt b/tools/perf/Documentation/SDT-markers.txt
new file mode 100644
index 0000000..420ba2e
--- /dev/null
+++ b/tools/perf/Documentation/SDT-markers.txt
@@ -0,0 +1,123 @@
+Support to perf for listing the SDT markers :
+
+This helps in listing dtrace style markers(SDT) present in user space
+applications through perf. Notes/markers are placed at important places by the
+developers. They have a negligible overhead when not enabled.
+We can enable them and probe at these places and find some important information
+like the arguments' values, etc.
+
+How to add SDT markers into user applications:
+We need to have this header sys/sdt.h present.
+sys/sdt.h used is version 3.
+If not present, install systemtap-sdt-devel package (for fedora-18).
+
+A very simple example:
+
+$ cat user_app.c
+
+#include <sys/sdt.h>
+
+void main () {
+ /* ... */
+ /*
+ * user_app is the provider name
+ * test_probe is the marker name
+ */
+ STAP_PROBE(user_app, test_mark);
+ /* ... */
+}
+
+$ gcc user_app.c
+$ perf list sdt ./a.out
+./a.out:
+%user_app:test_mark
+
+A different example to show the same:
+- Create a file with .d extension and mention the probe names in it with
+provider name and marker name.
+
+$ cat probes.d
+provider user_app {
+ probe foo_start();
+ probe fun_start();
+};
+
+- Now create the probes.h and probes.o file :
+$ dtrace -C -h -s probes.d -o probes.h
+$ dtrace -C -G -s probes.d -o probes.o
+
+- A program using the markers:
+
+$ cat user_app.c
+
+#include <stdio.h>
+#include "probes.h"
+
+void foo(void)
+{
+ USER_APP_FOO_START();
+ printf("This is foo\n");
+}
+
+void fun(void)
+{
+ USER_APP_FUN_START();
+ printf("Inside fun\n");
+}
+int main(void)
+{
+ printf("In main\n");
+ foo();
+ fun();
+ return 0;
+}
+- Compile it and also provide probes.o file to linker:
+$ gcc user_app.c probes.o -o user_app
+
+- Now use perf to list the markers in the app:
+# perf list sdt ./user_app
+
+./user_app :
+%user_app:foo_start
+%user_app:fun_start
+
+Also, we can see the SDT markers present in our system in the usual binaries.
+First, scan the binaries using :
+# perf list sdt --scan
+
+Creating a cache of SDT markers...
+perf sdt cache created!
+ Use : "perf list sdt"
+ to see the SDT markers
+
+After the sdt cache file is created, use perf list to view the markers :
+# perf list sdt
+
+%rtld : init_start
+%rtld : init_complete
+%rtld : map_failed
+%rtld : map_start
+%rtld : lll_futex_wake
+...
+...
+%libgcc : unwind
+%libvirt : rpc_server_client_auth_allow
+%libvirt : rpc_server_client_auth_fail
+%libvirt : rpc_server_client_auth_deny
+
+Using "perf list sdt <file-name>" after this will update the entries related to that
+<file-name> in the cache.
+
+Also, this link provides important info regarding SDT notes:
+http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
+
+This link shows an example of marker probing with Systemtap:
+https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
+
+- Markers in binaries :
+These SDT markers are present in the ELF in the section named
+".note.stapsdt".
+Here, the name of the marker, its provider, type, location, base
+address, semaphore address.
+We can retrieve these values using the members name_off and desc_off in
+Nhdr structure. If these are not enabled, they are present in the ELF as nop.
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 6fce6a6..f146e53 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -8,7 +8,7 @@ perf-list - List all symbolic event types
SYNOPSIS
--------
[verse]
-'perf list' [hw|sw|cache|tracepoint|pmu|event_glob]
+'perf list' [hw|sw|cache|tracepoint|pmu|event_glob|sdt]

DESCRIPTION
-----------
@@ -108,6 +108,11 @@ To limit the list use:

. 'pmu' to print the kernel supplied PMU events.

+. 'sdt' to print the SDT markers present in dsos and binaries. An additional
+ argument of filename will instruct perf to look for SDT markers only in that
+ file. If that file is already present in the cache and if there is any change,
+ then the entries for that file will be updated in the cache.
+
. If none of the above is matched, it will apply the supplied glob to all
events, printing the ones that match.

Subject: Re: [PATCH v2 0/3] perf/sdt : Support for SDT markers

Hi Hemant,

(2014/07/17 14:53), Hemant Kumar wrote:
> This patchset helps in listing dtrace style markers(SDT) present in user space
> applications through perf.
> Notes/markers are placed at important places by the
> developers. They have a negligible overhead when not enabled.
> We can enable them and probe at these places and find some important information
> like the arguments' values, etc.

Thanks for your work! This actually helps us a lot :)

> We have lots of applications which use SDT markers today, like:
> Postgresql, MySql, Mozilla, Perl, Python, Java, Ruby, libvirt, QEMU, glib
>
> To add SDT markers into user applications:
> We need to have this header sys/sdt.h present.
> sys/sdt.h used is version 3.
> If not present, install systemtap-sdt-devel package (for fedora-18).
>
> Please refer to the Documentation patch to see how the SDT markers are added into
> a program.
>
> With this patchset,
> - Use perf to list the markers in the app:
> # perf list sdt ./user_app
>
> ./user_app :
> %user_app:foo_start
> %user_app:fun_start
>
> - Also, we can see the SDT markers present in our system in the usual binaries.
> These usual binaries are libraries (dsos) listed by ldconfig --print-cache and some
> binaries present in PATH environment variable.
>
> First, scan the binaries using :
> # perf list sdt --scan

At a glance, maybe we'd better have perf sdt-cache as like as perf buildid-cache
for manage sdt information. what would you think?


>
> Creating a cache of SDT markers...
> perf sdt cache created!
> Use : "perf list sdt"
> to see the SDT markers
>
> After the sdt cache file is created, use perf list to view the markers :
> # perf list sdt
>
> %rtld : init_start
> %rtld : init_complete
> %rtld : map_failed
> %rtld : map_start
> %rtld : lll_futex_wake
> ...
> ...
> %libgcc : unwind
> %libvirt : rpc_server_client_auth_allow
> %libvirt : rpc_server_client_auth_fail
> %libvirt : rpc_server_client_auth_deny

Looks good :)
It seems very useful for perf users.

Thank you,

> Also, by using "perf list", we can view all the SDT markers as events along with all
> the other events present in the system.
> Alternatively, one can view the /var/cache/perf/perf-sdt.cache directly.
>
> This link shows an example of marker probing with Systemtap:
> https://sourceware.org/systemtap/wiki/AddingUserSpaceProbingToApps
>
> Also, this link provides important info regarding SDT notes:
> http://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
>
> - Markers in binaries :
> These SDT markers are present in the ELF in the section named
> ".note.stapsdt".
> Here, the name of the marker, its provider, type, location, base
> address, semaphore address.
> We can retrieve these values using the members name_off and desc_off in
> Nhdr structure. If these are not enabled, they are present in the ELF as nop.
>
> Changes since last series :
> - Made the changes as discussed with Masami and Namhyung.
> - Now, "perf list" will also display the SDT markers as events along with other events.
> - Made some optimazations.
>
> TODO:
> - Add support to probe these SDT markers as if they are events.
> - Recognizing arguments and support to probe on them.
> ---
>
> Hemant Kumar (3):
> This patch enables perf to list the SDT markers present in a system. It looks
> This patch enables perf to look for SDT markers in a single file.
> This patch adds the required documentation for perf support to SDT markers.
>
>
> tools/perf/Documentation/SDT-markers.txt | 123 +++++
> tools/perf/Documentation/perf-list.txt | 7
> tools/perf/Makefile.perf | 1
> tools/perf/builtin-list.c | 6
> tools/perf/util/parse-events.h | 3
> tools/perf/util/sdt.c | 778 ++++++++++++++++++++++++++++++
> tools/perf/util/symbol-elf.c | 229 +++++++++
> tools/perf/util/symbol.h | 19 +
> 8 files changed, 1164 insertions(+), 2 deletions(-)
> create mode 100644 tools/perf/Documentation/SDT-markers.txt
> create mode 100644 tools/perf/util/sdt.c
>


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2014-07-18 17:50:55

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] perf/sdt : Listing of SDT markers by perf

Hemant Kumar <[email protected]> writes:

First I should say supporting these probes is very useful. Thanks for
working on this.

> +
> +#define SDT_CACHE_DIR "/var/cache/perf/"

This requires running perf as root, right?

It would be better to use the $HOME cache dir, like the recent JSON patches.

> +#define SDT_CACHE "perf-sdt.cache"
> +#define SDT_CACHE_TMP "perf-sdt.cache.tmp"
> +
> +#define DELIM ':'
> +
> +struct path_list {
> + char path[PATH_MAX];
> + struct list_head list;
> +} execs;
> +
> +/* Write operation for cache */
> +static void write_cache(FILE *cache, char *buffer)
> +{
> + fprintf(cache, "%s", buffer);
> +}
> +

The function seems redundant.


> +/*
> + * get_sdt_note_info() is the function actually responsible for
> + * flushing the SDT notes info into the "cache" file or to the
> + * stdout if "cache" points to NULL. Also, this function finds out
> + * the build-id of an ELF to be written into "cache".
> + */
> +static void get_sdt_note_info(struct list_head *start, const char *target,
> + FILE *cache)
> +{
> + struct sdt_note *pos;
> + u8 build_id[BUILD_ID_SIZE];
> + char sbuild_id[BUILD_ID_SIZE * 2 + 1];
> + char buffer[2 * PATH_MAX];
> +
> + if (list_empty(start))
> + return;
> +
> + /* Read the build id of the file */
> + if (filename__read_build_id(target, &build_id,
> + sizeof(build_id)) < 0) {
> + pr_debug("Couldn't read build-id in %s\n", target);

pr_info ?

> +
> +/*
> + * Finds out the libraries present in a system as shown by the command
> + * "ldconfig --print-cache". Uses "=>" and '/' to find out the start of a
> + * dso path.
> + */

This seems like a hack. How would that handle chroot, containers etc. ?

> +static inline void append_path(char *path, struct list_head *list)
> +{
> + char *res_path = NULL;
> + struct path_list *tmp = NULL;
> +
> + res_path = (char *)malloc(sizeof(char) * PATH_MAX);
> +
> + if (!res_path)
> + return;
> +
> + memset(res_path, '\0', PATH_MAX);
> +
> + if (realpath(path, res_path) && !is_present_in_list(list, res_path)) {


O^2 algorithm ?

> +/*
> + * Obtain the list of paths from the PATH env variable
> + */

Same as above. This probably needs to be more configurable to handle
more ways to find binaries.


-Andi

--
[email protected] -- Speaking for myself only

2014-07-19 17:33:23

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] perf/sdt : Support for SDT markers


On 07/18/2014 04:53 PM, Masami Hiramatsu wrote:
> Hi Hemant,
>
> (2014/07/17 14:53), Hemant Kumar wrote:
>> This patchset helps in listing dtrace style markers(SDT) present in user space
>> applications through perf.
>> Notes/markers are placed at important places by the
>> developers. They have a negligible overhead when not enabled.
>> We can enable them and probe at these places and find some important information
>> like the arguments' values, etc.
> Thanks for your work! This actually helps us a lot :)

Thanks a lot for the appreciation. :)

>
>> We have lots of applications which use SDT markers today, like:
>> Postgresql, MySql, Mozilla, Perl, Python, Java, Ruby, libvirt, QEMU, glib
>>
>> To add SDT markers into user applications:
>> We need to have this header sys/sdt.h present.
>> sys/sdt.h used is version 3.
>> If not present, install systemtap-sdt-devel package (for fedora-18).
>>
>> Please refer to the Documentation patch to see how the SDT markers are added into
>> a program.
>>
>> With this patchset,
>> - Use perf to list the markers in the app:
>> # perf list sdt ./user_app
>>
>> ./user_app :
>> %user_app:foo_start
>> %user_app:fun_start
>>
>> - Also, we can see the SDT markers present in our system in the usual binaries.
>> These usual binaries are libraries (dsos) listed by ldconfig --print-cache and some
>> binaries present in PATH environment variable.
>>
>> First, scan the binaries using :
>> # perf list sdt --scan
> At a glance, maybe we'd better have perf sdt-cache as like as perf buildid-cache
> for manage sdt information. what would you think?
>

I agree with you having perf sdt-cache similar to perf buildid-cache.
But I think if the functionality of perf sdt-cache is only to build the
cache, then we can
go with the perf list sdt --scan. Since, "perf list sdt" is used for
other purposes too, it
should be less confusing for the users to just add another option
(--scan) to create/modify
the cache. What do you suggest?

>> Creating a cache of SDT markers...
>> perf sdt cache created!
>> Use : "perf list sdt"
>> to see the SDT markers
>>
>> After the sdt cache file is created, use perf list to view the markers :
>> # perf list sdt
>>
>> %rtld : init_start
>> %rtld : init_complete
>> %rtld : map_failed
>> %rtld : map_start
>> %rtld : lll_futex_wake
>> ...
>> ...
>> %libgcc : unwind
>> %libvirt : rpc_server_client_auth_allow
>> %libvirt : rpc_server_client_auth_fail
>> %libvirt : rpc_server_client_auth_deny
> Looks good :)
> It seems very useful for perf users.
>
> Thank you,

Thanks a lot for going through this.

--
Thanks,
Hemant Kumar

Subject: Re: Re: [PATCH v2 0/3] perf/sdt : Support for SDT markers

(2014/07/20 2:32), Hemant Kumar wrote:
>>> We have lots of applications which use SDT markers today, like:
>>> Postgresql, MySql, Mozilla, Perl, Python, Java, Ruby, libvirt, QEMU, glib
>>>
>>> To add SDT markers into user applications:
>>> We need to have this header sys/sdt.h present.
>>> sys/sdt.h used is version 3.
>>> If not present, install systemtap-sdt-devel package (for fedora-18).
>>>
>>> Please refer to the Documentation patch to see how the SDT markers are added into
>>> a program.
>>>
>>> With this patchset,
>>> - Use perf to list the markers in the app:
>>> # perf list sdt ./user_app
>>>
>>> ./user_app :
>>> %user_app:foo_start
>>> %user_app:fun_start
>>>
>>> - Also, we can see the SDT markers present in our system in the usual binaries.
>>> These usual binaries are libraries (dsos) listed by ldconfig --print-cache and some
>>> binaries present in PATH environment variable.
>>>
>>> First, scan the binaries using :
>>> # perf list sdt --scan
>> At a glance, maybe we'd better have perf sdt-cache as like as perf buildid-cache
>> for manage sdt information. what would you think?
>>
>
> I agree with you having perf sdt-cache similar to perf buildid-cache.
> But I think if the functionality of perf sdt-cache is only to build the
> cache, then we can
> go with the perf list sdt --scan. Since, "perf list sdt" is used for
> other purposes too, it
> should be less confusing for the users to just add another option
> (--scan) to create/modify
> the cache. What do you suggest?

I think there may be some other cases, for example adding user local build
binary to the cache, or remove/update it locally. :)

And also, in user's mental model of perf-list, it doesn't take an "active"
action, that always does "passive" action. So adding such "active" scan option
will be more confusing.

But I also think it is OK that if the sdt is never scanned, the perf-list
automatically scans in background (without any option) or suggests user
to run "perf sdt-cache --scan". (it depends on how long time it may take)

To summarize it, I'd like to suggest adding below functions;

perf list sdt : shows all cached SDT events
perf list sdt <file> : shows SDT events in <file>
perf sdt-cache --scan/-s : scans all system binaries/libraries + added files
perf sdt-cache --add/-a <file(s)> : add SDT events in <file> to cache
perf sdt-cache --remove/-r <file(s)>: remove SDT events in <file> from cache

And if perf list can't find sdt-cache, it would suggest to run
perf sdt-cache --scan or run it silently. :)

What would you think about this?

Thank you,

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

Subject: Re: [PATCH v2 1/3] perf/sdt : Listing of SDT markers by perf

(2014/07/19 2:50), Andi Kleen wrote:
> Hemant Kumar <[email protected]> writes:
>
> First I should say supporting these probes is very useful. Thanks for
> working on this.
>
>> +
>> +#define SDT_CACHE_DIR "/var/cache/perf/"
>
> This requires running perf as root, right?
>
> It would be better to use the $HOME cache dir, like the recent JSON patches.

Even though, we have to enhance uprobe-tracer that allows non-root user
to add/remove events. :)
Anyway, I guess we can use ~/.debug for this purpose as buildid-cache does.

Thank you,


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2014-07-21 02:29:43

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] perf/sdt : Support for SDT markers

Hi Masami and Hemant,

On Sun, 20 Jul 2014 12:16:46 +0900, Masami Hiramatsu wrote:
> (2014/07/20 2:32), Hemant Kumar wrote:
>>>> We have lots of applications which use SDT markers today, like:
>>>> Postgresql, MySql, Mozilla, Perl, Python, Java, Ruby, libvirt, QEMU, glib
>>>>
>>>> To add SDT markers into user applications:
>>>> We need to have this header sys/sdt.h present.
>>>> sys/sdt.h used is version 3.
>>>> If not present, install systemtap-sdt-devel package (for fedora-18).
>>>>
>>>> Please refer to the Documentation patch to see how the SDT markers are added into
>>>> a program.
>>>>
>>>> With this patchset,
>>>> - Use perf to list the markers in the app:
>>>> # perf list sdt ./user_app
>>>>
>>>> ./user_app :
>>>> %user_app:foo_start
>>>> %user_app:fun_start
>>>>
>>>> - Also, we can see the SDT markers present in our system in the usual binaries.
>>>> These usual binaries are libraries (dsos) listed by ldconfig --print-cache and some
>>>> binaries present in PATH environment variable.
>>>>
>>>> First, scan the binaries using :
>>>> # perf list sdt --scan
>>> At a glance, maybe we'd better have perf sdt-cache as like as perf buildid-cache
>>> for manage sdt information. what would you think?
>>>
>>
>> I agree with you having perf sdt-cache similar to perf buildid-cache.
>> But I think if the functionality of perf sdt-cache is only to build the
>> cache, then we can
>> go with the perf list sdt --scan. Since, "perf list sdt" is used for
>> other purposes too, it
>> should be less confusing for the users to just add another option
>> (--scan) to create/modify
>> the cache. What do you suggest?
>
> I think there may be some other cases, for example adding user local build
> binary to the cache, or remove/update it locally. :)
>
> And also, in user's mental model of perf-list, it doesn't take an "active"
> action, that always does "passive" action. So adding such "active" scan option
> will be more confusing.
>
> But I also think it is OK that if the sdt is never scanned, the perf-list
> automatically scans in background (without any option) or suggests user
> to run "perf sdt-cache --scan". (it depends on how long time it may take)
>
> To summarize it, I'd like to suggest adding below functions;
>
> perf list sdt : shows all cached SDT events
> perf list sdt <file> : shows SDT events in <file>
> perf sdt-cache --scan/-s : scans all system binaries/libraries + added files
> perf sdt-cache --add/-a <file(s)> : add SDT events in <file> to cache
> perf sdt-cache --remove/-r <file(s)>: remove SDT events in <file> from cache
>
> And if perf list can't find sdt-cache, it would suggest to run
> perf sdt-cache --scan or run it silently. :)
>
> What would you think about this?

I agree with Masami's idea. Maybe we can add a config option for the
auto-scan thing.

And I think this series can be split to pieces for ease review. :)
IOW, please consider make it something like below steps:

1. add raw SDT parsing code
2. support perf list to print SDT markers from a single file
3. implement perf sdt-cache like Masami suggested
4. improve perf list sdt to use cached result


I guess it's easy to make a progress on step 1 and 2. And we can
discuss further how to shape/improve step 3 and 4 later.

Thanks,
Namhyung

2014-07-21 02:38:43

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] perf/sdt : Listing of SDT markers by perf

Hi Andi,

On Fri, 18 Jul 2014 10:50:45 -0700, Andi Kleen wrote:
> Hemant Kumar <[email protected]> writes:
>> +/*
>> + * Finds out the libraries present in a system as shown by the command
>> + * "ldconfig --print-cache". Uses "=>" and '/' to find out the start of a
>> + * dso path.
>> + */
>
> This seems like a hack. How would that handle chroot, containers
> etc. ?

[SNIP]
>> +/*
>> + * Obtain the list of paths from the PATH env variable
>> + */
>
> Same as above. This probably needs to be more configurable to handle
> more ways to find binaries.

Agreed. What about not to be smart? IOW, just let users specify
directories and/or files to be scanned. Maybe we can use it like:

perf list sdt --scan $PATH

or

perf sdt-cache --scan /lib:/lib64


We can add some wrapper or default directory later if needed.


Thanks,
Namhyung

2014-07-21 03:01:39

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] perf/sdt : Listing of SDT markers by perf

Hi Hemant,

On Thu, 17 Jul 2014 11:25:12 +0530, Hemant Kumar wrote:
> This patch enables perf to list the SDT markers present in a system. It looks
> in dsos given by ldconfig --print-cache and for other binaries, it looks into
> the PATH environment variable. After preparing a list of the binaries, then
> it starts searching for SDT markers in them.
> To find the SDT markers, first an elf section named .note.stapsdt is searched
> for. And then the SDT notes are retreived one by one from that section.
> To counter the effect of prelinking, the section ".stapsdt.base" is searched.
> If its found, then the location of the SDT marker is adjusted.
>
> All these markers' info is written into a cache file
> "/var/cache/perf/perf-sdt.cache".
> Since, the presence of SDT markers is quite common these days, hence, its better
> to make them visible to a user easily. Also, creating a cache file will help a user
> to probe (to be implemented) these markers without much hussle. This cache file will
> hold most of the SDT markers.
>
> The format of each SDT cache entry is -
>
> %provider:marker:file_path:build_id:location:semaphore_loc
>
> % - marks the beginning of each entry.
> provider - The provider name of the SDT marker.
> marker - The marker name of the SDT marker.
> file_path - Full/absolute path of the file in which this marker is present.
> location : Adjusted location of the SDT marker inside the program.
> semaphore_loc : The semaphore address if present otherwise 0x0.
>
> This format should help when probing will be implemented. The adjusted address
> from the required entry can be directly used in probing if the build_id matches.
>
> To scan the system for SDT markers invoke :
> # perf list sdt --scan
>
> "--scan" should be used for the first time and whenever there is any change in
> the files containing the SDT markers. It looks into the common binaries available
> in a system.
>
> And then use :
> # perf list sdt
>
> This displays the list of SDT markers recorded in the SDT cache.
> This shows the SDT markers present in the common binaries stored in the system,
> present in PATH variable and the /lib/ and /lib64/ directories.
>
> Or use:
> # perf list
>
> It should display all events including the SDT events.
>
> Signed-off-by : [email protected]

Missing your real name in the sob line. (other patchse too)


> ---
> tools/perf/Makefile.perf | 1
> tools/perf/builtin-list.c | 6
> tools/perf/util/parse-events.h | 3
> tools/perf/util/sdt.c | 503 ++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/symbol-elf.c | 229 ++++++++++++++++++
> tools/perf/util/symbol.h | 19 ++
> 6 files changed, 760 insertions(+), 1 deletion(-)
> create mode 100644 tools/perf/util/sdt.c
>
> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
> index 9670a16..e098dcd 100644
> --- a/tools/perf/Makefile.perf
> +++ b/tools/perf/Makefile.perf
> @@ -373,6 +373,7 @@ LIB_OBJS += $(OUTPUT)util/stat.o
> LIB_OBJS += $(OUTPUT)util/record.o
> LIB_OBJS += $(OUTPUT)util/srcline.o
> LIB_OBJS += $(OUTPUT)util/data.o
> +LIB_OBJS += $(OUTPUT)util/sdt.o
>
> LIB_OBJS += $(OUTPUT)ui/setup.o
> LIB_OBJS += $(OUTPUT)ui/helpline.o
> diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
> index 011195e..e1e654b 100644
> --- a/tools/perf/builtin-list.c
> +++ b/tools/perf/builtin-list.c
> @@ -23,7 +23,7 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
> OPT_END()
> };
> const char * const list_usage[] = {
> - "perf list [hw|sw|cache|tracepoint|pmu|event_glob]",
> + "perf list [hw|sw|cache|tracepoint|pmu|event_glob|sdt]",

Hmm.. I think it'd be better like below

"perf list [hw|sw|cache|tracepoint|pmu|sdt|<event_glob>]",


> NULL
> };
>
> @@ -34,6 +34,8 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
>
> if (argc == 0) {
> print_events(NULL, false);
> + printf("\n\nSDT events :\n");
> + sdt_cache__display();

What about make print_events() to print SDT events also?


> return 0;
> }
>
> @@ -55,6 +57,8 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
> print_pmu_events(NULL, false);
> else if (strcmp(argv[i], "--raw-dump") == 0)
> print_events(NULL, true);
> + else if (strcmp(argv[i], "sdt") == 0)
> + print_sdt_events(argv[++i]);

Please move this above the --raw-dump block, it's a special marker to
help shell completion for event names so I'd like to keep it last.


> else {
> char *sep = strchr(argv[i], ':'), *s;
> int sep_idx;


[SNIP]
> +/*
> + * Finds out the libraries present in a system as shown by the command
> + * "ldconfig --print-cache". Uses "=>" and '/' to find out the start of a
> + * dso path.
> + */

If we received the list of libraries from user directly, it can go
away IMHO.


> +static char *parse_lib_name(char *str)
> +{
> + char *ptr, *q, *r;
> +
> + while (str != NULL) {
> + /* look for "=>" and then '/' */
> + ptr = strstr(str, "=>");
> + if (ptr) {
> + ptr++;
> + q = strchr(ptr, '/');
> + if (!q)
> + return NULL;
> + r = strchr(ptr, '\n');
> + *r = '\0';
> + return q;
> + } else if (ptr == NULL) {
> + return NULL;
> + } else {
> + str = ptr + 1;
> + continue;
> + }
> + }
> +
> + return NULL;
> +}
> +
> +/*
> + * Checks if a path is already present in the list.
> + * Returns 'true' if present and 'false' otherwise.
> + */
> +static bool is_present_in_list(struct list_head *path_list, char *path)
> +{
> + struct path_list *tmp;
> +
> + list_for_each_entry(tmp, path_list, list) {
> + if (!strcmp(path, tmp->path))
> + return true;
> + }

As Andi pointed out, you can use a hashtable for this.


> +
> + return false;
> +}
> +
> +static inline void append_path(char *path, struct list_head *list)
> +{
> + char *res_path = NULL;
> + struct path_list *tmp = NULL;
> +
> + res_path = (char *)malloc(sizeof(char) * PATH_MAX);
> +
> + if (!res_path)
> + return;
> +
> + memset(res_path, '\0', PATH_MAX);
> +
> + if (realpath(path, res_path) && !is_present_in_list(list, res_path)) {
> + tmp = (struct path_list *) malloc(sizeof(struct path_list));

Hmm... why not reusing res_path and make struct path_list to just have a
pointer? Also you can use readpath(path, NULL) rather than allocating
a PATH_MAX buffer and zero'ing it (can use calloc for it anyway).


> + if (!tmp) {
> + free(res_path);
> + return;
> + }
> + strcpy(tmp->path, res_path);
> + list_add(&(tmp->list), list);
> + if (res_path)
> + free(res_path);
> + }
> +}


[SNIP]
> +/*
> + * Go through all the files inside "path".
> + * But don't go into sub-directories.
> + */
> +static void walk_through_dir(char *path)
> +{
> + struct dirent *entry;
> + DIR *dir;
> + struct stat sb;
> + int ret = 0;
> + char *swd;
> +
> + dir = opendir(path);
> + if (!dir)
> + return;
> +
> + /* save the current working directory */
> + swd = getcwd(NULL, 0);
> + if (!swd) {
> + pr_err("getcwd : error");
> + return;
> + }
> +
> + ret = chdir(path);
> + if (ret) {
> + pr_err("chdir : error in %s", path);
> + return;
> + }

Is this really needed? I guess opendir() already covers possible
failure cases, if not, we might use stat() anyway.


> + while ((entry = readdir(dir)) != NULL) {
> +
> + ret = stat(entry->d_name, &sb);
> + if (ret == -1) {
> + pr_debug("%s : error in stat!\n", entry->d_name);
> + continue;
> + }
> +
> + /* Not pursuing sub-directories */
> + if ((sb.st_mode & S_IFMT) != S_IFDIR)
> + if (sb.st_mode & S_IXUSR)
> + append_path(entry->d_name, &execs.list);
> + }
> +
> + closedir(dir);
> +
> + /* return to the saved working directory */
> + ret = chdir(swd);
> + if (ret) {
> + pr_err("chdir : error");
> + return;
> + }
> +}


[SNIP]
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1619,6 +1619,235 @@ void kcore_extract__delete(struct kcore_extract *kce)
> unlink(kce->extract_filename);
> }
>

Like I said earlier in a different thread, the below code can be a
sepatate change.

Thanks,
Namhyung


> +static int populate_sdt_note(Elf **elf, const char *data, size_t len, int type,
> + struct sdt_note **note)
> +{
> + const char *provider, *name;
> + struct sdt_note *tmp = NULL;
> + GElf_Ehdr ehdr;
> + GElf_Addr base_off = 0;
> + GElf_Shdr shdr;
> + int ret = -1;
> + int i;
> +
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } buf;
> +
> + Elf_Data dst = {
> + .d_buf = &buf, .d_type = ELF_T_ADDR, .d_version = EV_CURRENT,
> + .d_size = gelf_fsize((*elf), ELF_T_ADDR, 3, EV_CURRENT),
> + .d_off = 0, .d_align = 0
> + };
> + Elf_Data src = {
> + .d_buf = (void *) data, .d_type = ELF_T_ADDR,
> + .d_version = EV_CURRENT, .d_size = dst.d_size, .d_off = 0,
> + .d_align = 0
> + };
> +
> + /* Check the type of each of the notes */
> + if (type != SDT_NOTE_TYPE)
> + goto out_err;
> +
> + tmp = (struct sdt_note *)calloc(1, sizeof(struct sdt_note));
> + if (!tmp) {
> + ret = -ENOMEM;
> + goto out_err;
> + }
> +
> + INIT_LIST_HEAD(&tmp->note_list);
> +
> + if (len < dst.d_size + 3)
> + goto out_free_note;
> +
> + /* Translation from file representation to memory representation */
> + if (gelf_xlatetom(*elf, &dst, &src,
> + elf_getident(*elf, NULL)[EI_DATA]) == NULL)
> + printf("gelf_xlatetom : %s\n", elf_errmsg(-1));
> +
> + /* Populate the fields of sdt_note */
> + provider = data + dst.d_size;
> +
> + name = (const char *)memchr(provider, '\0', data + len - provider);
> + if (name++ == NULL)
> + goto out_free_note;
> +
> + tmp->provider = strdup(provider);
> + if (!tmp->provider) {
> + ret = -ENOMEM;
> + goto out_free_note;
> + }
> + tmp->name = strdup(name);
> + if (!tmp->name) {
> + ret = -ENOMEM;
> + goto out_free_prov;
> + }
> +
> + /* Obtain the addresses */
> + if (gelf_getclass(*elf) == ELFCLASS32) {
> + for (i = 0; i < 3; i++)
> + tmp->addr.a32[i] = buf.a32[i];
> + tmp->bit32 = true;
> + } else {
> + for (i = 0; i < 3; i++)
> + tmp->addr.a64[i] = buf.a64[i];
> + tmp->bit32 = false;
> + }
> +
> + /* Now Adjust the prelink effect */
> + if (!gelf_getehdr(*elf, &ehdr)) {
> + pr_debug("%s : cannot get elf header.\n", __func__);
> + ret = -EBADF;
> + goto out_free_name;
> + }
> +
> + /*
> + * Find out the .stapsdt.base section.
> + * This scn will help us to handle prelinking (if present).
> + * Compare the retrieved file offset of the base section with the
> + * base address in the description of the SDT note. If its different,
> + * then accordingly, adjust the note location.
> + */
> + if (elf_section_by_name(*elf, &ehdr, &shdr, SDT_BASE_SCN, NULL)) {
> + base_off = shdr.sh_offset;
> + if (base_off) {
> + if (tmp->bit32)
> + tmp->addr.a32[0] = tmp->addr.a32[0] + base_off -
> + tmp->addr.a32[1];
> + else
> + tmp->addr.a64[0] = tmp->addr.a64[0] + base_off -
> + tmp->addr.a64[1];
> + }
> + }
> +
> + *note = tmp;
> + return 0;
> +
> +out_free_name:
> + free(tmp->name);
> +out_free_prov:
> + free(tmp->provider);
> +out_free_note:
> + free(tmp);
> +out_err:
> + return ret;
> +}
> +
> +static int construct_sdt_notes_list(Elf *elf, struct list_head *sdt_notes)
> +{
> + GElf_Ehdr ehdr;
> + Elf_Scn *scn = NULL;
> + Elf_Data *data;
> + GElf_Shdr shdr;
> + size_t shstrndx, next;
> + GElf_Nhdr nhdr;
> + size_t name_off, desc_off, offset;
> + struct sdt_note *tmp = NULL;
> + int ret = 0, val = 0;
> +
> + if (gelf_getehdr(elf, &ehdr) == NULL) {
> + ret = -EBADF;
> + goto out_ret;
> + }
> + if (elf_getshdrstrndx(elf, &shstrndx) != 0) {
> + ret = -EBADF;
> + goto out_ret;
> + }
> +
> + /* Look for the required section */
> + scn = elf_section_by_name(elf, &ehdr, &shdr, SDT_NOTE_SCN, NULL);
> + if (!scn) {
> + ret = -ENOENT;
> + goto out_ret;
> + }
> +
> + if (!(shdr.sh_type == SHT_NOTE) || (shdr.sh_flags & SHF_ALLOC)) {
> + ret = -ENOENT;
> + goto out_ret;
> + }
> +
> + data = elf_getdata(scn, NULL);
> +
> + /* Get the SDT notes */
> + for (offset = 0; (next = gelf_getnote(data, offset, &nhdr, &name_off,
> + &desc_off)) > 0; offset = next) {
> + if (nhdr.n_namesz == sizeof(SDT_NOTE_NAME) &&
> + !memcmp(data->d_buf + name_off, SDT_NOTE_NAME,
> + sizeof(SDT_NOTE_NAME))) {
> + val = populate_sdt_note(&elf, ((data->d_buf) + desc_off),
> + nhdr.n_descsz, nhdr.n_type,
> + &tmp);
> + if (!val)
> + list_add_tail(&tmp->note_list, sdt_notes);
> + if (val == -ENOMEM) {
> + ret = val;
> + goto out_ret;
> + }
> + }
> + }
> + if (list_empty(sdt_notes))
> + ret = -ENOENT;
> +
> +out_ret:
> + return ret;
> +}
> +
> +int get_sdt_note_list(struct list_head *head, const char *target)
> +{
> + Elf *elf;
> + int fd, ret;
> +
> + fd = open(target, O_RDONLY);
> + if (fd < 0)
> + return -errno;
> +
> + symbol__elf_init();
> + elf = elf_begin(fd, ELF_C_READ, NULL);
> + if (!elf) {
> + ret = -EBADF;
> + goto out_close;
> + }
> + ret = construct_sdt_notes_list(elf, head);
> + elf_end(elf);
> +
> +out_close:
> + close(fd);
> + return ret;
> +}
> +
> +/*
> + * Returns 'true' if the file is an elf and 'false' otherwise
> + */
> +bool is_an_elf(char *file)
> +{
> + int fd;
> + Elf *elf;
> + bool ret = true;
> +
> + fd = open(file, O_RDONLY);
> + if (fd < 0) {
> + ret = false;
> + goto out_ret;
> + }
> +
> + symbol__elf_init();
> + elf = elf_begin(fd, ELF_C_READ, NULL);
> + if (!elf) {
> + ret = false;
> + goto out_close;
> + }
> + if (elf_kind(elf) != ELF_K_ELF)
> + ret = false;
> +
> + elf_end(elf);
> +
> +out_close:
> + close(fd);
> +out_ret:
> + return ret;
> +}
> +
> void symbol__elf_init(void)
> {
> elf_version(EV_CURRENT);
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 615c752..83be31a 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -294,4 +294,23 @@ int compare_proc_modules(const char *from, const char *to);
> int setup_list(struct strlist **list, const char *list_str,
> const char *list_name);
>
> +struct sdt_note {
> + char *name;
> + char *provider;
> + bool bit32;
> + union {
> + Elf64_Addr a64[3];
> + Elf32_Addr a32[3];
> + } addr;
> + struct list_head note_list;
> +};
> +
> +int get_sdt_note_list(struct list_head *head, const char *target);
> +bool is_an_elf(char *file);
> +
> +#define SDT_BASE_SCN ".stapsdt.base"
> +#define SDT_NOTE_SCN ".note.stapsdt"
> +#define SDT_NOTE_TYPE 3
> +#define SDT_NOTE_NAME "stapsdt"
> +
> #endif /* __PERF_SYMBOL */

2014-07-21 09:41:11

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] perf/sdt : Listing of SDT markers by perf

Hi Andi and Namhyung,

On 07/21/2014 08:08 AM, Namhyung Kim wrote:
> Hi Andi,
>
> On Fri, 18 Jul 2014 10:50:45 -0700, Andi Kleen wrote:
>> Hemant Kumar <[email protected]> writes:
>>> +/*
>>> + * Finds out the libraries present in a system as shown by the command
>>> + * "ldconfig --print-cache". Uses "=>" and '/' to find out the start of a
>>> + * dso path.
>>> + */
>> This seems like a hack. How would that handle chroot, containers
>> etc. ?
> [SNIP]
>>> +/*
>>> + * Obtain the list of paths from the PATH env variable
>>> + */
>> Same as above. This probably needs to be more configurable to handle
>> more ways to find binaries.
> Agreed. What about not to be smart? IOW, just let users specify
> directories and/or files to be scanned. Maybe we can use it like:
>
> perf list sdt --scan $PATH
>
> or
>
> perf sdt-cache --scan /lib:/lib64

Agreed, that we need to have options to add more binaries present in
directories other than the default directories.

>
> We can add some wrapper or default directory later if needed.
>
>

But wasn't the whole point of changing to this patchset was to make SDT
events more
prominent, i.e., to display the SDTs present in the whole system?

So, what I am trying to say is that, we already have the PATH bins and
libs' dsos, if
we need to add more, we can use :
perf sdt-cache --scan [dir1][:dir2]...

What would you suggest we do?

--
Thanks,
Hemant Kumar

2014-07-21 12:24:57

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] perf/sdt : Support for SDT markers


On 07/20/2014 08:46 AM, Masami Hiramatsu wrote:
> (2014/07/20 2:32), Hemant Kumar wrote:
>>>> [SNIP]
>>>> First, scan the binaries using :
>>>> # perf list sdt --scan
>>> At a glance, maybe we'd better have perf sdt-cache as like as perf buildid-cache
>>> for manage sdt information. what would you think?
>>>
>> I agree with you having perf sdt-cache similar to perf buildid-cache.
>> But I think if the functionality of perf sdt-cache is only to build the
>> cache, then we can
>> go with the perf list sdt --scan. Since, "perf list sdt" is used for
>> other purposes too, it
>> should be less confusing for the users to just add another option
>> (--scan) to create/modify
>> the cache. What do you suggest?
> I think there may be some other cases, for example adding user local build
> binary to the cache, or remove/update it locally. :)
>
> And also, in user's mental model of perf-list, it doesn't take an "active"
> action, that always does "passive" action. So adding such "active" scan option
> will be more confusing.

Ok, I understand now.

> But I also think it is OK that if the sdt is never scanned, the perf-list
> automatically scans in background (without any option) or suggests user
> to run "perf sdt-cache --scan". (it depends on how long time it may take)
>
> To summarize it, I'd like to suggest adding below functions;
>
> perf list sdt : shows all cached SDT events
> perf list sdt <file> : shows SDT events in <file>
> perf sdt-cache --scan/-s : scans all system binaries/libraries + added files
> perf sdt-cache --add/-a <file(s)> : add SDT events in <file> to cache
> perf sdt-cache --remove/-r <file(s)>: remove SDT events in <file> from cache

Yeah, I agree with the above mentioned functions.

So, according to this, if perf list sdt <file> can't find the SDT events
for that file
in the SDT cache, should it say "use perf sdt-cache --add <file> to add
the SDT
events for that file to the cache", or silently, should add that file's
SDT events
to SDT cache?

> And if perf list can't find sdt-cache, it would suggest to run
> perf sdt-cache --scan or run it silently. :)
>
> [...]

--
Thanks,
Hemant Kumar

Subject: Re: [PATCH v2 0/3] perf/sdt : Support for SDT markers

(2014/07/21 21:24), Hemant Kumar wrote:
>
> On 07/20/2014 08:46 AM, Masami Hiramatsu wrote:
>> (2014/07/20 2:32), Hemant Kumar wrote:
>>>>> [SNIP]
>>>>> First, scan the binaries using :
>>>>> # perf list sdt --scan
>>>> At a glance, maybe we'd better have perf sdt-cache as like as perf buildid-cache
>>>> for manage sdt information. what would you think?
>>>>
>>> I agree with you having perf sdt-cache similar to perf buildid-cache.
>>> But I think if the functionality of perf sdt-cache is only to build the
>>> cache, then we can
>>> go with the perf list sdt --scan. Since, "perf list sdt" is used for
>>> other purposes too, it
>>> should be less confusing for the users to just add another option
>>> (--scan) to create/modify
>>> the cache. What do you suggest?
>> I think there may be some other cases, for example adding user local build
>> binary to the cache, or remove/update it locally. :)
>>
>> And also, in user's mental model of perf-list, it doesn't take an "active"
>> action, that always does "passive" action. So adding such "active" scan option
>> will be more confusing.
>
> Ok, I understand now.
>
>> But I also think it is OK that if the sdt is never scanned, the perf-list
>> automatically scans in background (without any option) or suggests user
>> to run "perf sdt-cache --scan". (it depends on how long time it may take)
>>
>> To summarize it, I'd like to suggest adding below functions;
>>
>> perf list sdt : shows all cached SDT events
>> perf list sdt <file> : shows SDT events in <file>
>> perf sdt-cache --scan/-s : scans all system binaries/libraries + added files
>> perf sdt-cache --add/-a <file(s)> : add SDT events in <file> to cache
>> perf sdt-cache --remove/-r <file(s)>: remove SDT events in <file> from cache
>
> Yeah, I agree with the above mentioned functions.
>
> So, according to this, if perf list sdt <file> can't find the SDT events
> for that file
> in the SDT cache, should it say "use perf sdt-cache --add <file> to add
> the SDT
> events for that file to the cache", or silently, should add that file's
> SDT events
> to SDT cache?

Hmm, it's a good question. Since the SDT events will be used as a normal
events, perf-record may NOT take an option of execfile in where SDT events are.
This means that if the given events are not cached, perf-record always fails.
Thus, I think we have 2 options, one is just removing "perf-list sdt <file>"
support, or another is "perf-list sdt <file>" silently caches the SDT in <file>.

IMHO, at the first version, we'd better just removes "perf-list sdt <file>"
support, since it is very simple model. And if someone asks supporting that,
we can add that afterwords as an enhancement. ;)

Thank you,


--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]

2014-07-22 11:33:51

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] perf/sdt : Listing of SDT markers by perf

Hi Namhyung,

On 07/21/2014 08:31 AM, Namhyung Kim wrote:
> Hi Hemant,
>
> On Thu, 17 Jul 2014 11:25:12 +0530, Hemant Kumar wrote:
>> This patch enables perf to list the SDT markers present in a system. It looks
>> in dsos given by ldconfig --print-cache and for other binaries, it looks into
>> the PATH environment variable. After preparing a list of the binaries, then
>> it starts searching for SDT markers in them.
>> To find the SDT markers, first an elf section named .note.stapsdt is searched
>> for. And then the SDT notes are retreived one by one from that section.
>> To counter the effect of prelinking, the section ".stapsdt.base" is searched.
>> If its found, then the location of the SDT marker is adjusted.
>>
>> All these markers' info is written into a cache file
>> "/var/cache/perf/perf-sdt.cache".
>> Since, the presence of SDT markers is quite common these days, hence, its better
>> to make them visible to a user easily. Also, creating a cache file will help a user
>> to probe (to be implemented) these markers without much hussle. This cache file will
>> hold most of the SDT markers.
>>
>> The format of each SDT cache entry is -
>>
>> %provider:marker:file_path:build_id:location:semaphore_loc
>>
>> % - marks the beginning of each entry.
>> provider - The provider name of the SDT marker.
>> marker - The marker name of the SDT marker.
>> file_path - Full/absolute path of the file in which this marker is present.
>> location : Adjusted location of the SDT marker inside the program.
>> semaphore_loc : The semaphore address if present otherwise 0x0.
>>
>> This format should help when probing will be implemented. The adjusted address
>> from the required entry can be directly used in probing if the build_id matches.
>>
>> To scan the system for SDT markers invoke :
>> # perf list sdt --scan
>>
>> "--scan" should be used for the first time and whenever there is any change in
>> the files containing the SDT markers. It looks into the common binaries available
>> in a system.
>>
>> And then use :
>> # perf list sdt
>>
>> This displays the list of SDT markers recorded in the SDT cache.
>> This shows the SDT markers present in the common binaries stored in the system,
>> present in PATH variable and the /lib/ and /lib64/ directories.
>>
>> Or use:
>> # perf list
>>
>> It should display all events including the SDT events.
>>
>> Signed-off-by : [email protected]
> Missing your real name in the sob line. (other patchse too)

Oops. Will add that!

>
>> ---
>> tools/perf/Makefile.perf | 1
>> tools/perf/builtin-list.c | 6
>> tools/perf/util/parse-events.h | 3
>> tools/perf/util/sdt.c | 503 ++++++++++++++++++++++++++++++++++++++++
>> tools/perf/util/symbol-elf.c | 229 ++++++++++++++++++
>> tools/perf/util/symbol.h | 19 ++
>> 6 files changed, 760 insertions(+), 1 deletion(-)
>> create mode 100644 tools/perf/util/sdt.c
>>
>> diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
>> index 9670a16..e098dcd 100644
>> --- a/tools/perf/Makefile.perf
>> +++ b/tools/perf/Makefile.perf
>> @@ -373,6 +373,7 @@ LIB_OBJS += $(OUTPUT)util/stat.o
>> LIB_OBJS += $(OUTPUT)util/record.o
>> LIB_OBJS += $(OUTPUT)util/srcline.o
>> LIB_OBJS += $(OUTPUT)util/data.o
>> +LIB_OBJS += $(OUTPUT)util/sdt.o
>>
>> LIB_OBJS += $(OUTPUT)ui/setup.o
>> LIB_OBJS += $(OUTPUT)ui/helpline.o
>> diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
>> index 011195e..e1e654b 100644
>> --- a/tools/perf/builtin-list.c
>> +++ b/tools/perf/builtin-list.c
>> @@ -23,7 +23,7 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
>> OPT_END()
>> };
>> const char * const list_usage[] = {
>> - "perf list [hw|sw|cache|tracepoint|pmu|event_glob]",
>> + "perf list [hw|sw|cache|tracepoint|pmu|event_glob|sdt]",
> Hmm.. I think it'd be better like below
>
> "perf list [hw|sw|cache|tracepoint|pmu|sdt|<event_glob>]",
>

Ok, looks good.

>> NULL
>> };
>>
>> @@ -34,6 +34,8 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
>>
>> if (argc == 0) {
>> print_events(NULL, false);
>> + printf("\n\nSDT events :\n");
>> + sdt_cache__display();
> What about make print_events() to print SDT events also?

Hmm, alright.

>
>> return 0;
>> }
>>
>> @@ -55,6 +57,8 @@ int cmd_list(int argc, const char **argv, const char *prefix __maybe_unused)
>> print_pmu_events(NULL, false);
>> else if (strcmp(argv[i], "--raw-dump") == 0)
>> print_events(NULL, true);
>> + else if (strcmp(argv[i], "sdt") == 0)
>> + print_sdt_events(argv[++i]);
> Please move this above the --raw-dump block, it's a special marker to
> help shell completion for event names so I'd like to keep it last.

Oh! will move that.

>
>> else {
>> char *sep = strchr(argv[i], ':'), *s;
>> int sep_idx;
>
> [SNIP]
>> +/*
>> + * Finds out the libraries present in a system as shown by the command
>> + * "ldconfig --print-cache". Uses "=>" and '/' to find out the start of a
>> + * dso path.
>> + */
> If we received the list of libraries from user directly, it can go
> away IMHO.
>

I may be wrong but I think until we close the pipe, the data will stay,
right?

>> +static char *parse_lib_name(char *str)
>> +{
>> + char *ptr, *q, *r;
>> +
>> + while (str != NULL) {
>> + /* look for "=>" and then '/' */
>> + ptr = strstr(str, "=>");
>> + if (ptr) {
>> + ptr++;
>> + q = strchr(ptr, '/');
>> + if (!q)
>> + return NULL;
>> + r = strchr(ptr, '\n');
>> + *r = '\0';
>> + return q;
>> + } else if (ptr == NULL) {
>> + return NULL;
>> + } else {
>> + str = ptr + 1;
>> + continue;
>> + }
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +/*
>> + * Checks if a path is already present in the list.
>> + * Returns 'true' if present and 'false' otherwise.
>> + */
>> +static bool is_present_in_list(struct list_head *path_list, char *path)
>> +{
>> + struct path_list *tmp;
>> +
>> + list_for_each_entry(tmp, path_list, list) {
>> + if (!strcmp(path, tmp->path))
>> + return true;
>> + }
> As Andi pointed out, you can use a hashtable for this.

Ok, will use a hashtable for this.

>
>> +
>> + return false;
>> +}
>> +
>> +static inline void append_path(char *path, struct list_head *list)
>> +{
>> + char *res_path = NULL;
>> + struct path_list *tmp = NULL;
>> +
>> + res_path = (char *)malloc(sizeof(char) * PATH_MAX);
>> +
>> + if (!res_path)
>> + return;
>> +
>> + memset(res_path, '\0', PATH_MAX);
>> +
>> + if (realpath(path, res_path) && !is_present_in_list(list, res_path)) {
>> + tmp = (struct path_list *) malloc(sizeof(struct path_list));
> Hmm... why not reusing res_path and make struct path_list to just have a
> pointer? Also you can use readpath(path, NULL) rather than allocating
> a PATH_MAX buffer and zero'ing it (can use calloc for it anyway).
>

Ah! ok, will make the changes.

>> + if (!tmp) {
>> + free(res_path);
>> + return;
>> + }
>> + strcpy(tmp->path, res_path);
>> + list_add(&(tmp->list), list);
>> + if (res_path)
>> + free(res_path);
>> + }
>> +}
>
> [SNIP]
>> +/*
>> + * Go through all the files inside "path".
>> + * But don't go into sub-directories.
>> + */
>> +static void walk_through_dir(char *path)
>> +{
>> + struct dirent *entry;
>> + DIR *dir;
>> + struct stat sb;
>> + int ret = 0;
>> + char *swd;
>> +
>> + dir = opendir(path);
>> + if (!dir)
>> + return;
>> +
>> + /* save the current working directory */
>> + swd = getcwd(NULL, 0);
>> + if (!swd) {
>> + pr_err("getcwd : error");
>> + return;
>> + }
>> +
>> + ret = chdir(path);
>> + if (ret) {
>> + pr_err("chdir : error in %s", path);
>> + return;
>> + }
> Is this really needed? I guess opendir() already covers possible
> failure cases, if not, we might use stat() anyway.
>
>
>> + while ((entry = readdir(dir)) != NULL) {
>> +
>> + ret = stat(entry->d_name, &sb);
>> + if (ret == -1) {
>> + pr_debug("%s : error in stat!\n", entry->d_name);
>> + continue;
>> + }
>> +
>> + /* Not pursuing sub-directories */
>> + if ((sb.st_mode & S_IFMT) != S_IFDIR)
>> + if (sb.st_mode & S_IXUSR)
>> + append_path(entry->d_name, &execs.list);
>> + }
>> +
>> + closedir(dir);
>> +
>> + /* return to the saved working directory */
>> + ret = chdir(swd);
>> + if (ret) {
>> + pr_err("chdir : error");
>> + return;
>> + }
>> +}
>
> [SNIP]
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -1619,6 +1619,235 @@ void kcore_extract__delete(struct kcore_extract *kce)
>> unlink(kce->extract_filename);
>> }
>>
> Like I said earlier in a different thread, the below code can be a
> sepatate change.

Ok, will move that to a separate patch.

Thanks a lot for the review!

>
>
>> +static int populate_sdt_note(Elf **elf, const char *data, size_t len, int type,
>> + struct sdt_note **note)
>> +{
>> + const char *provider, *name;
>> + struct sdt_note *tmp = NULL;
>> + GElf_Ehdr ehdr;
>> + GElf_Addr base_off = 0;
>> [SNIP]

--
Thanks,
Hemant Kumar

2014-07-22 11:54:03

by Hemant Kumar

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] perf/sdt : Listing of SDT markers by perf

Hi Andi,

On 07/18/2014 11:20 PM, Andi Kleen wrote:
> Hemant Kumar <[email protected]> writes:
>
> First I should say supporting these probes is very useful. Thanks for
> working on this.

Thanks a lot for the appreciation.

>
>> +
>> +#define SDT_CACHE_DIR "/var/cache/perf/"
> This requires running perf as root, right?

Yes!

>
> It would be better to use the $HOME cache dir, like the recent JSON patches.

Hmm, seems like a good idea! We can use ~/.debug as Masami suggested.

>
>> +#define SDT_CACHE "perf-sdt.cache"
>> +#define SDT_CACHE_TMP "perf-sdt.cache.tmp"
>> +
>> +#define DELIM ':'
>> +
>> +struct path_list {
>> + char path[PATH_MAX];
>> + struct list_head list;
>> +} execs;
>> +
>> +/* Write operation for cache */
>> +static void write_cache(FILE *cache, char *buffer)
>> +{
>> + fprintf(cache, "%s", buffer);
>> +}
>> +
> The function seems redundant.
>

Yeah, right. Will remove it.

>> +/*
>> + * get_sdt_note_info() is the function actually responsible for
>> + * flushing the SDT notes info into the "cache" file or to the
>> + * stdout if "cache" points to NULL. Also, this function finds out
>> + * the build-id of an ELF to be written into "cache".
>> + */
>> +static void get_sdt_note_info(struct list_head *start, const char *target,
>> + FILE *cache)
>> +{
>> + struct sdt_note *pos;
>> + u8 build_id[BUILD_ID_SIZE];
>> + char sbuild_id[BUILD_ID_SIZE * 2 + 1];
>> + char buffer[2 * PATH_MAX];
>> +
>> + if (list_empty(start))
>> + return;
>> +
>> + /* Read the build id of the file */
>> + if (filename__read_build_id(target, &build_id,
>> + sizeof(build_id)) < 0) {
>> + pr_debug("Couldn't read build-id in %s\n", target);
> pr_info ?

Ok, pr_info will be better.

>> +
>> +/*
>> + * Finds out the libraries present in a system as shown by the command
>> + * "ldconfig --print-cache". Uses "=>" and '/' to find out the start of a
>> + * dso path.
>> + */
> This seems like a hack. How would that handle chroot, containers etc. ?

Right now, it doesn't handle chroot, containers and other path related
stuff. We will have to figure that out!

>> +static inline void append_path(char *path, struct list_head *list)
>> +{
>> + char *res_path = NULL;
>> + struct path_list *tmp = NULL;
>> +
>> + res_path = (char *)malloc(sizeof(char) * PATH_MAX);
>> +
>> + if (!res_path)
>> + return;
>> +
>> + memset(res_path, '\0', PATH_MAX);
>> +
>> + if (realpath(path, res_path) && !is_present_in_list(list, res_path)) {
>
> O^2 algorithm ?

:) Will improve that by using a hash table.

>> +/*
>> + * Obtain the list of paths from the PATH env variable
>> + */
> Same as above. This probably needs to be more configurable to handle
> more ways to find binaries.
>

We can make it more configurable, but by default, it should go for
binaries in these directories. What would you suggest?

Thanks a lot for reviewing the patch. :)

--
Thanks,
Hemant Kumar