Subject: Re: [PATCH 3/4] perf tools: report: introduce --map-adjustment argument.
From: pi3orama
To: Wang Nan
Date: Wed, 1 Apr 2015 22:23:36 +0800
Message-Id: <52FBBE19-660F-4A91-B9DF-2949655AD2C7@163.com>
In-Reply-To: <1427884395-241111-4-git-send-email-wangnan0@huawei.com>
References: <1427884395-241111-1-git-send-email-wangnan0@huawei.com> <1427884395-241111-4-git-send-email-wangnan0@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Sent from my iPhone

> On Apr 1, 2015, at 6:33 PM, Wang Nan wrote:
>
> This patch introduces a --map-adjustment argument for perf report. The
> goal of this option is to deal with the private dynamic loaders used in
> some special programs.
>
> Some programs write their private dynamic loader instead of glibc ld for
> different reasons. They mmap() executable memory area, assemble code
> from different '.so' and '.o' files then do the relocation and code
> fixing by themselves.
> The memory area is not file-backed so perf is
> unable to handle symbol information in those files.
>
> This patch allows the user to give perf report hints directly using the
> '--map-adjustment' argument. Perf report will regard such mapping as
> file-backed mapping and treat them as dso instead of private mapping
> area.
>
> The main part of this patch resides in util/machine.c. struct map_adj is
> introduced to represent each adjustment. They are sorted and linked
> together to map_adj_list linked list. When a real MMAP event arrives,
> perf checks such adjustments before calling map__new() and
> thread__insert_map(), then sets up filename and pgoff according to user
> hints. It also splits MMAP events when necessary.
>
> Usage of --map-adjustment is appended into Documentation/perf-report.txt.
>
> Here is an example:
>
>  $ perf report --map-adjustment=./libtest.so@0x7fa52fcb1000,0x4000,0x21000,92051 \
>        --no-children
>
> Where 0x7fa52fcb1000 is private map area got through:
>
>  mmap(NULL, 4096 * 4, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE,
>       -1, 0);
>
> And its contents are copied from libtest.so.
>
> Signed-off-by: Wang Nan
> ---
>  tools/perf/Documentation/perf-report.txt |  11 ++
>  tools/perf/builtin-report.c              |   2 +
>  tools/perf/util/machine.c                | 276 ++++++++++++++++++++++++++++++-
>  tools/perf/util/machine.h                |   2 +
>  4 files changed, 288 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 4879cf6..e19349c 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -323,6 +323,17 @@ OPTIONS
>  --header-only::
>  	Show only perf.data header (forces --stdio).
>
> +--map-adjustment=objfile@start,length[,pgoff[,pid]]::
> +	Give memory layout hints for specific or all processes. This makes
> +	perf regard the provided range of memory as mapped from the provided
> +	file instead of its original attributes found in perf.data.
> +	start and length should be hexadecimal values representing the
> +	address range. pgoff should be a hexadecimal value representing the
> +	mapping offset (in pages) of that file. Default pgoff value is
> +	0 (map from start of the file). If pid is omitted, such
> +	adjustment will be applied to all processes in this trace. This
> +	should be used when perf.data contains only 1 process.
> +
>  SEE ALSO
>  --------
>  linkperf:perf-stat[1], linkperf:perf-annotate[1]
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index b5b2ad4..9fdfb05 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -717,6 +717,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
>  		     "Don't show entries under that percent", parse_percent_limit),
>  	OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
>  		     "how to display percentage of filtered entries", parse_filter_percentage),
> +	OPT_CALLBACK(0, "map-adjustment", NULL, "objfile@start,length[,pgoff[,pid]]",
> +		     "Provide map adjustment hinting", parse_map_adjustment),
>  	OPT_END()
>  	};
>  	struct perf_data_file file = {
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 051883a..dc9e91e 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1155,21 +1155,291 @@ out_problem:
>  	return -1;
>  }
>
> +/*
> + * Users are allowed to provide map adjustment setting for the case
> + * that an address range is actually privately mapped but known to be
> + * ELF object file backed. Like this:
> + *
> + * |<- copied from libx.so ->|  |<- copied from liby.so ->|
> + * |<-------------------- MMAP area --------------------->|
> + *
> + * When dealing with such mmap events, try to obey user adjustment.
> + * Such adjustment settings are not allowed to overlap.
> + * Adjustments won't be considered as valid code until real MMAP events
> + * take place.
> + * Therefore, users are allowed to provide adjustments which
> + * cover never mapped areas, like:
> + *
> + * |<- libx.so ->| |<- liby.so ->|
> + * |<-- MMAP area -->|
> + *
> + * This feature is useful when dealing with private dynamic linkers,
> + * which assemble code pieces from different ELF objects.
> + *
> + * map_adj_list is an ordered linked list. Order of two adjustments is
> + * first defined by their pid, and then by their start address.
> + * Therefore, adjustments for specific pids are grouped together
> + * naturally.
> + */
> +static LIST_HEAD(map_adj_list);
> +struct map_adj {
> +	u32 pid;
> +	u64 start;
> +	u64 len;
> +	u64 pgoff;
> +	struct list_head list;
> +	char filename[PATH_MAX];
> +};
> +
> +enum map_adj_cross {
> +	MAP_ADJ_LEFT_PID,
> +	MAP_ADJ_LEFT,
> +	MAP_ADJ_CROSS,
> +	MAP_ADJ_RIGHT,
> +	MAP_ADJ_RIGHT_PID,
> +};
> +
> +/*
> + * Check whether two map_adj cross over each other. This function is
> + * used for comparing adjustments. For overlapping adjustments, it
> + * reports the difference between the two start addresses and the
> + * length of the overlapping area. The sign of pgoff_diff can be used
> + * to determine which one is the left one.
> + *
> + * If either r or l has pid set to -1, don't consider pid.
> + */
> +static enum map_adj_cross
> +check_map_adj_cross(struct map_adj* l, struct map_adj* r,
> +		    int *pgoff_diff, u64 *cross_len)
> +{
> +	bool swapped = false;
> +
> +	if ((l->pid != (u32)(-1)) && (r->pid != (u32)(-1))
> +			&& (l->pid != r->pid))
> +		return (l->pid < r->pid) ? MAP_ADJ_LEFT_PID : MAP_ADJ_RIGHT_PID;
> +
> +	if (l->start > r->start) {
> +		struct map_adj *t = l;
> +		swapped = true;
> +		l = r;
> +		r = t;
> +	}
> +
> +	if (l->start + l->len > r->start) {
> +		if (pgoff_diff)
> +			*pgoff_diff = ((r->start - l->start) / page_size) *
> +				(swapped ? -1 : 1);
> +		if (cross_len) {
> +			u64 cross_start = r->start;
> +			u64 l_end = l->start + l->len;
> +			u64 r_end = r->start + r->len;
> +
> +			*cross_len = (l_end < r_end ? l_end : r_end) -
> +				cross_start;
> +		}
> +		return MAP_ADJ_CROSS;
> +	}
> +
> +	return swapped ? MAP_ADJ_RIGHT : MAP_ADJ_LEFT;
> +}
> +
> +static int machine_add_map_adj(u32 pid, u64 start, u64 len,
> +			       u64 pgoff, const char *filename)
> +{
> +	struct map_adj *pos;
> +	struct map_adj *new;
> +	struct map_adj tmp = {
> +		.pid = pid,
> +		.start = start,
> +		.len = len,
> +	};
> +
> +	if (!filename)
> +		return -EINVAL;
> +
> +	if ((start % page_size) || (len % page_size)) {
> +		pr_err("Map adjustment is not page aligned for %d%s.\n", pid,
> +		       pid == (u32)(-1) ? " (all pids)" : "");
> +		return -EINVAL;
> +	}
> +
> +	if ((pid != (u32)(-1)) && (!list_empty(&map_adj_list))) {
> +		/*
> +		 * Don't allow mixing (u32)(-1) (for all pids) and
> +		 * normal pid.
> +		 *
> +		 * During sorting, (u32)(-1) should be considered as
> +		 * the largest pid.
> +		 */
> +		struct map_adj *largest = list_entry(map_adj_list.prev,
> +						     struct map_adj, list);
> +
> +		if (largest->pid == (u32)(-1)) {
> +			pr_err("Providing both system-wide and pid specific map adjustments is forbidden.\n");
> +			return -EINVAL;
> +		}
> +	}
> +
> +	/*
> +	 * Find the first one which is larger than tmp and insert new
> +	 * adj prior to it.
> +	 */
> +	list_for_each_entry(pos, &map_adj_list, list) {
> +		enum map_adj_cross cross;
> +
> +		cross = check_map_adj_cross(&tmp, pos, NULL, NULL);
> +		if (cross < MAP_ADJ_CROSS)
> +			break;
> +		if (cross == MAP_ADJ_CROSS) {
> +			pr_err("Overlapping map adjustments provided for pid %d%s\n", pid,
> +			       pid == (u32)(-1) ? " (all pids)" : "");
> +			return -EINVAL;
> +		}
> +	}
> +
> +	new = malloc(sizeof(*new));
> +	if (!new)
> +		return -EINVAL;
> +
> +	new->pid = pid;
> +	new->start = start;
> +	new->len = len;
> +	new->pgoff = pgoff;
> +	strncpy(new->filename, filename, PATH_MAX);
> +	list_add(&new->list, pos->list.prev);
> +	return 0;
> +}
> +
>  static int machine_map_new(struct machine *machine, u64 start, u64 len,
>  			   u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
>  			   u64 ino_gen, u32 prot, u32 flags, char *filename,
>  			   enum map_type type, struct thread *thread)
>  {
> +	struct map_adj *pos;
>  	struct map *map;
>
> -	map = map__new(machine, start, len, pgoff, pid, d_maj, d_min,
> -		       ino, ino_gen, prot, flags, filename, type, thread);
> +	list_for_each_entry(pos, &map_adj_list, list) {
> +		u64 adj_start, adj_len, adj_pgoff, cross_len;
> +		enum map_adj_cross cross;
> +		struct map_adj tmp;
> +		int pgoff_diff;
> +
> +again:
> +		if (len == 0)
> +			break;
> +
> +		tmp.pid = pid;
> +		tmp.start = start;
> +		tmp.len = len;
> +
> +		cross = check_map_adj_cross(&tmp,
> +				pos, &pgoff_diff, &cross_len);
> +
> +		if (cross < MAP_ADJ_CROSS)
> +			break;
> +		if (cross > MAP_ADJ_CROSS)
> +			continue;
> +
> +		if (pgoff_diff <= 0) {
> +			/*
> +			 * |<----- tmp ----->|
> +			 * |<----- pos ----->|
> +			 */
> +
> +			adj_start = tmp.start;
> +			adj_len = cross_len;
> +			adj_pgoff = pos->pgoff + (-pgoff_diff);
> +			map = map__new(machine, adj_start, adj_len, adj_pgoff,
> +					pid, 0, 0, 0, 0, prot, flags,
> +					pos->filename, type, thread);
> +		} else {
> +			/*
> +			 * |<----- tmp ----->|
> +			 * |<-- X -->|<----- pos ----->|
> +			 * In this case, only deal with tmp part X. goto again
> +			 * instead of next pos.
> +			 */
> +			adj_start = tmp.start;
> +			adj_len = tmp.len - cross_len;
> +			adj_pgoff = tmp.pgoff;
> +			map = map__new(machine, adj_start, adj_len, adj_pgoff,
> +					pid, d_maj, d_min, ino, ino_gen, prot,
> +					flags, filename, type, thread);
> +
> +		}
> +
> +		if (map == NULL)
> +			goto error;
> +
> +		thread__insert_map(thread, map);
> +
> +		pgoff += adj_len / page_size;
> +		start = tmp.start + adj_len;
> +		len -= adj_len;
> +		if (pgoff_diff > 0)
> +			goto again;
> +	}
> +
> +	map = map__new(machine, start, len, pgoff,
> +			pid, d_maj, d_min, ino, ino_gen, prot,
> +			flags, filename, type, thread);

We'd better check the value of len here, and only create this final mapping
if len is not 0. The loop above can consume the whole range and leave len
at 0.

>  	if (map == NULL)
> -		return -1;
> +		goto error;
>
>  	thread__insert_map(thread, map);
> +
>  	return 0;
> +error:
> +	return -1;
> +}
> +
> +int parse_map_adjustment(const struct option *opt __maybe_unused,
> +			 const char *arg, int unset __maybe_unused)
> +{
> +	const char *ptr;
> +	char *sep;
> +	int err;
> +	u64 start, len, pgoff = 0;
> +	u32 pid = (u32)(-1);
> +	char filename[PATH_MAX];
> +
> +	sep = strchr(arg, '@');
> +	if (sep == NULL)
> +		goto err;
> +
> +	strncpy(filename, arg, sep - arg);
> +
> +	ptr = sep + 1; /* Skip '@' */
> +
> +	/* start */
> +	start = strtoll(ptr, &sep, 16);
> +	if (*sep != ',')
> +		goto err;
> +	ptr = sep + 1;
> +
> +	/* len */
> +	len = strtoll(ptr, &sep, 16);
> +	if (*sep == ',') {
> +		/* pgoff */
> +		ptr = sep + 1;
> +		pgoff = strtoll(ptr, &sep, 16);
> +
> +		if (*sep == ',') {
> +			/* pid */
> +			ptr = sep + 1;
> +			pid = strtol(ptr, &sep, 10);
> +		}
> +	}
> +
> +	if (*sep != '\0')
> +		goto err;
> +
> +	err = machine_add_map_adj(pid, start, len, pgoff, filename);
> +	return err;
> +
> +err:
> +	fprintf(stderr, "invalid map adjustment setting: %s\n", arg);
> +	return -1;
>  }
>
>  int machine__process_mmap2_event(struct machine *machine,
> diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
> index e2faf3b..73b49e4 100644
> --- a/tools/perf/util/machine.h
> +++ b/tools/perf/util/machine.h
> @@ -223,4 +223,6 @@ pid_t machine__get_current_tid(struct machine *machine, int cpu);
>  int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid,
>  			     pid_t tid);
>
> +int parse_map_adjustment(const struct option *opt, const char *arg, int unset);
> +
>  #endif /* __PERF_MACHINE_H */
> --
> 1.8.3.4
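
To illustrate the len check mentioned in the inline comment, here is a
minimal standalone sketch of the range splitting with an empty tail skipped.
It is not the patch's code: struct adj_range, struct piece and split_mmap()
are hypothetical names, pid and pgoff handling is left out, and the
adjustments are assumed to be sorted by start address and non-overlapping.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct map_adj; only the range matters here. */
struct adj_range {
	uint64_t start;
	uint64_t len;
};

/* One output piece; 'adjusted' is 1 when the piece is covered by a hint. */
struct piece {
	uint64_t start;
	uint64_t len;
	int adjusted;
};

/*
 * Split [start, start + len) against adjustments that are assumed to be
 * sorted by start address and non-overlapping. Returns the number of
 * pieces written to 'out', never more than 'max'.
 */
static size_t split_mmap(uint64_t start, uint64_t len,
			 const struct adj_range *adj, size_t nr_adj,
			 struct piece *out, size_t max)
{
	size_t n = 0;
	size_t i;

	for (i = 0; i < nr_adj && len != 0; i++) {
		uint64_t end = start + len;
		uint64_t a_start = adj[i].start;
		uint64_t a_end = adj[i].start + adj[i].len;
		uint64_t piece_end;

		if (a_end <= start)	/* adjustment lies entirely to the left */
			continue;
		if (a_start >= end)	/* entirely to the right: nothing more to do */
			break;

		if (a_start > start) {
			/* Unadjusted gap in front of the adjustment. */
			if (n == max)
				return n;
			out[n++] = (struct piece){ start, a_start - start, 0 };
			len -= a_start - start;
			start = a_start;
		}

		/* Overlapping part, clipped to the end of the mmap range. */
		piece_end = a_end < end ? a_end : end;
		if (n == max)
			return n;
		out[n++] = (struct piece){ start, piece_end - start, 1 };
		len -= piece_end - start;
		start = piece_end;
	}

	/* The check under discussion: do not emit a zero-length tail. */
	if (len != 0 && n < max)
		out[n++] = (struct piece){ start, len, 0 };

	return n;
}
```

With an adjustment that covers the mmap range exactly, this yields a single
adjusted piece and no trailing zero-length piece, which is the case the
check is meant to handle.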