Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755102AbcLBVVo (ORCPT ); Fri, 2 Dec 2016 16:21:44 -0500
Received: from mga05.intel.com ([192.55.52.43]:26610 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752156AbcLBVUn (ORCPT ); Fri, 2 Dec 2016 16:20:43 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.33,288,1477983600"; d="scan'208";a="1093847526"
From: kan.liang@intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org
Cc: alexander.shishkin@linux.intel.com, tglx@linutronix.de, namhyung@kernel.org, jolsa@kernel.org, adrian.hunter@intel.com, wangnan0@huawei.com, mark.rutland@arm.com, andi@firstfloor.org, Kan Liang
Subject: [PATCH V2 07/13] perf tools: handle PERF_RECORD_OVERHEAD record type
Date: Fri, 2 Dec 2016 16:19:15 -0500
Message-Id: <1480713561-6617-8-git-send-email-kan.liang@intel.com>
X-Mailer: git-send-email 2.5.5
In-Reply-To: <1480713561-6617-1-git-send-email-kan.liang@intel.com>
References: <1480713561-6617-1-git-send-email-kan.liang@intel.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 12228
Lines: 320

From: Kan Liang

Add the infrastructure to handle the PERF_RECORD_OVERHEAD record type.

A new perf report option, --show-profiling-cost, is introduced as a knob
to show the profiling overhead information. The option also forces tty
output.

The overhead information is auxiliary information, so it should be
possible to access it together with the normal sampling information in a
single output file. However, the overhead is the overall profiling time
cost, not a per-sample/event profiling time cost. Making overhead a
standard sort key could confuse the normal samples, so the information
is printed separately at the head of the output.

Signed-off-by: Kan Liang
---
 tools/perf/Documentation/perf-report.txt |  3 +++
 tools/perf/builtin-report.c              | 16 +++++++++++++++-
 tools/perf/util/event.c                  |  9 +++++++++
 tools/perf/util/event.h                  | 11 +++++++++++
 tools/perf/util/evlist.c                 |  6 ++++++
 tools/perf/util/evlist.h                 |  1 +
 tools/perf/util/machine.c                | 10 ++++++++++
 tools/perf/util/machine.h                |  3 +++
 tools/perf/util/session.c                | 21 +++++++++++++++++++++
 tools/perf/util/session.h                |  3 +++
 tools/perf/util/symbol.h                 |  3 ++-
 tools/perf/util/tool.h                   |  1 +
 12 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 2d17462..36d196c 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -412,6 +412,9 @@ include::itrace.txt[]
 --hierarchy::
 	Enable hierarchical output.
 
+--show-profiling-cost::
+	Show extra time cost during perf profiling
+
 include::callchain-overhead-calculation.txt[]
 
 SEE ALSO
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3dfbfff..d2f5e3c 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -565,6 +565,10 @@ static int __cmd_report(struct report *rep)
 	evlist__for_each_entry(session->evlist, pos)
 		rep->nr_entries += evsel__hists(pos)->nr_entries;
 
+	if (symbol_conf.show_profiling_cost) {
+		perf_session__fprintf_overhead_info(session, stdout, rep->cpu_list, rep->cpu_bitmap);
+	}
+
 	if (use_browser == 0) {
 		if (verbose > 3)
 			perf_session__fprintf(session, stdout);
@@ -830,6 +834,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_CALLBACK_DEFAULT(0, "stdio-color", NULL, "mode",
 			     "'always' (default), 'never' or 'auto' only applicable to --stdio mode",
 			     stdio__config_color, "always"),
+	OPT_BOOLEAN(0, "show-profiling-cost", &symbol_conf.show_profiling_cost,
+		    "Show extra time cost during perf profiling"),
 	OPT_END()
 	};
 	struct perf_data_file file = {
@@ -957,7 +963,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 	}
 
 	/* Force tty output for header output and per-thread stat. */
-	if (report.header || report.header_only || report.show_threads)
+	if (report.header || report.header_only ||
+	    report.show_threads || symbol_conf.show_profiling_cost)
 		use_browser = 0;
 
 	if (strcmp(input_name, "-") != 0)
@@ -986,6 +993,13 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 			      stdout);
 	}
 
+	if (!symbol_conf.show_profiling_cost &&
+	    perf_evlist__overhead(session->evlist) &&
+	    (use_browser == 0)) {
+		fputs("# To display the profiling time cost info, please use the --show-profiling-cost option.\n#\n",
+		      stdout);
+	}
+
 	/*
 	 * Only in the TUI browser we are doing integrated annotation,
 	 * so don't allocate extra space that won't be used in the stdio
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 8ab0d7d..58d095c 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -31,6 +31,7 @@ static const char *perf_event__names[] = {
 	[PERF_RECORD_LOST_SAMPLES]		= "LOST_SAMPLES",
 	[PERF_RECORD_SWITCH]			= "SWITCH",
 	[PERF_RECORD_SWITCH_CPU_WIDE]		= "SWITCH_CPU_WIDE",
+	[PERF_RECORD_OVERHEAD]			= "OVERHEAD",
 	[PERF_RECORD_HEADER_ATTR]		= "ATTR",
 	[PERF_RECORD_HEADER_EVENT_TYPE]		= "EVENT_TYPE",
 	[PERF_RECORD_HEADER_TRACING_DATA]	= "TRACING_DATA",
@@ -1056,6 +1057,14 @@ int perf_event__process_switch(struct perf_tool *tool __maybe_unused,
 	return machine__process_switch_event(machine, event);
 }
 
+int perf_event__process_overhead(struct perf_tool *tool __maybe_unused,
+				 union perf_event *event,
+				 struct perf_sample *sample __maybe_unused,
+				 struct machine *machine)
+{
+	return machine__process_overhead_event(machine, event, sample);
+}
+
 size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp)
 {
 	return fprintf(fp, " %d/%d: [%#" PRIx64 "(%#" PRIx64 ") @ %#" PRIx64 "]: %c %s\n",
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index c735c53..d96e215 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -480,6 +480,12 @@ struct time_conv_event {
 	u64 time_zero;
 };
 
+struct perf_overhead {
+	struct perf_event_header	header;
+	u64				type;
+	struct perf_overhead_entry	entry;
+};
+
 union perf_event {
 	struct perf_event_header	header;
 	struct mmap_event		mmap;
@@ -509,6 +515,7 @@ union perf_event {
 	struct stat_event		stat;
 	struct stat_round_event		stat_round;
 	struct time_conv_event		time_conv;
+	struct perf_overhead		overhead;
 };
 
 void perf_event__print_totals(void);
@@ -587,6 +594,10 @@ int perf_event__process_switch(struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_sample *sample,
 				struct machine *machine);
+int perf_event__process_overhead(struct perf_tool *tool,
+				 union perf_event *event,
+				 struct perf_sample *sample,
+				 struct machine *machine);
 int perf_event__process_mmap(struct perf_tool *tool,
 			     union perf_event *event,
 			     struct perf_sample *sample,
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d92e020..edcf421 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1594,6 +1594,12 @@ bool perf_evlist__sample_id_all(struct perf_evlist *evlist)
 	return first->attr.sample_id_all;
 }
 
+bool perf_evlist__overhead(struct perf_evlist *evlist)
+{
+	struct perf_evsel *first = perf_evlist__first(evlist);
+	return first->attr.overhead;
+}
+
 void perf_evlist__set_selected(struct perf_evlist *evlist,
 			       struct perf_evsel *evsel)
 {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 4fd034f..6d8efa6 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -241,6 +241,7 @@ u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist);
 u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist);
 u64 perf_evlist__combined_branch_type(struct perf_evlist *evlist);
 bool perf_evlist__sample_id_all(struct perf_evlist *evlist);
+bool perf_evlist__overhead(struct perf_evlist *evlist);
 u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);
 
 int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 9b33bef..02c8f7a 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -555,6 +555,14 @@ int machine__process_switch_event(struct machine *machine __maybe_unused,
 	return 0;
 }
 
+int machine__process_overhead_event(struct machine *machine __maybe_unused,
+				    union perf_event *event,
+				    struct perf_sample *sample __maybe_unused)
+{
+	dump_printf("\tUNSUPPORT TYPE 0x%" PRIx64 "!\n", event->overhead.type);
+	return 0;
+}
+
 static void dso__adjust_kmod_long_name(struct dso *dso, const char *filename)
 {
 	const char *dup_filename;
@@ -1536,6 +1544,8 @@ int machine__process_event(struct machine *machine, union perf_event *event,
 	case PERF_RECORD_SWITCH:
 	case PERF_RECORD_SWITCH_CPU_WIDE:
 		ret = machine__process_switch_event(machine, event); break;
+	case PERF_RECORD_OVERHEAD:
+		ret = machine__process_overhead_event(machine, event, sample); break;
 	default:
 		ret = -1;
 		break;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 354de6e..7e29e28 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -97,6 +97,9 @@ int machine__process_itrace_start_event(struct machine *machine,
 					union perf_event *event);
 int machine__process_switch_event(struct machine *machine,
 				  union perf_event *event);
+int machine__process_overhead_event(struct machine *machine,
+				    union perf_event *event,
+				    struct perf_sample *sample);
 int machine__process_mmap_event(struct machine *machine, union perf_event *event,
 				struct perf_sample *sample);
 int machine__process_mmap2_event(struct machine *machine, union perf_event *event,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f268201..9de4f74 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -373,6 +373,8 @@ void perf_tool__fill_defaults(struct perf_tool *tool)
 		tool->itrace_start = perf_event__process_itrace_start;
 	if (tool->context_switch == NULL)
 		tool->context_switch = perf_event__process_switch;
+	if (tool->overhead == NULL)
+		tool->overhead = perf_event__process_overhead;
 	if (tool->read == NULL)
 		tool->read = process_event_sample_stub;
 	if (tool->throttle == NULL)
@@ -786,6 +788,7 @@ static perf_event__swap_op perf_event__swap_ops[] = {
 	[PERF_RECORD_LOST_SAMPLES]	  = perf_event__all64_swap,
 	[PERF_RECORD_SWITCH]		  = perf_event__switch_swap,
 	[PERF_RECORD_SWITCH_CPU_WIDE]	  = perf_event__switch_swap,
+	[PERF_RECORD_OVERHEAD]		  = perf_event__all64_swap,
 	[PERF_RECORD_HEADER_ATTR]	  = perf_event__hdr_attr_swap,
 	[PERF_RECORD_HEADER_EVENT_TYPE]	  = perf_event__event_type_swap,
 	[PERF_RECORD_HEADER_TRACING_DATA] = perf_event__tracing_data_swap,
@@ -1267,6 +1270,8 @@ static int machines__deliver_event(struct machines *machines,
 	case PERF_RECORD_SWITCH:
 	case PERF_RECORD_SWITCH_CPU_WIDE:
 		return tool->context_switch(tool, event, sample, machine);
+	case PERF_RECORD_OVERHEAD:
+		return tool->overhead(tool, event, sample, machine);
 	default:
 		++evlist->stats.nr_unknown_events;
 		return -1;
@@ -2033,6 +2038,22 @@ void perf_session__fprintf_info(struct perf_session *session, FILE *fp,
 	fprintf(fp, "# ========\n#\n");
 }
 
+void perf_session__fprintf_overhead_info(struct perf_session *session, FILE *fp,
+					 const char *cpu_list __maybe_unused,
+					 unsigned long *cpu_bitmap __maybe_unused)
+{
+	if (session == NULL || fp == NULL)
+		return;
+
+	if (!perf_evlist__overhead(session->evlist)) {
+		fprintf(fp, "#\n# No profiling time cost information available.\n#\n");
+		return;
+	}
+
+	fprintf(fp, "# ========\n");
+
+	fprintf(fp, "# ========\n#\n");
+}
 
 int __perf_session__set_tracepoints_handlers(struct perf_session *session,
 					     const struct perf_evsel_str_handler *assocs,
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 4bd7585..7d08867 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -102,6 +102,9 @@ int perf_session__cpu_bitmap(struct perf_session *session,
 
 void perf_session__fprintf_info(struct perf_session *s, FILE *fp, bool full);
 
+void perf_session__fprintf_overhead_info(struct perf_session *s, FILE *fp,
+					 const char *cpu_list, unsigned long *cpu_bitmap);
+
 struct perf_evsel_str_handler;
 
 int __perf_session__set_tracepoints_handlers(struct perf_session *session,
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 1bcbefc..6902171 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -118,7 +118,8 @@ struct symbol_conf {
 			show_ref_callgraph,
 			hide_unresolved,
 			raw_trace,
-			report_hierarchy;
+			report_hierarchy,
+			show_profiling_cost;
 	const char	*vmlinux_name,
 			*kallsyms_name,
 			*source_prefix,
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index ac2590a..c5bbb34 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -47,6 +47,7 @@ struct perf_tool {
 			aux,
 			itrace_start,
 			context_switch,
+			overhead,
 			throttle,
 			unthrottle;
 	event_attr_op	attr;
-- 
2.5.5
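
Illustration of the dispatch pattern extended above. The patch wires the
new record type into two places: perf_tool__fill_defaults() installs a
default handler when the tool leaves the callback NULL, and
machines__deliver_event() routes each record to that callback by type.
The standalone sketch below shows the same shape in isolation. It is not
perf code: every demo_* name, the record layout and the time_ns payload
are assumptions made for this example, and the handler here sums a value
whereas the handler in the patch only emits a debug line.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record types; the real PERF_RECORD_* values differ. */
enum demo_record_type {
	DEMO_RECORD_SWITCH = 1,
	DEMO_RECORD_OVERHEAD = 2,
};

/* Hypothetical record: a type tag plus an illustrative payload. */
struct demo_record {
	uint32_t type;
	uint64_t time_ns;
};

struct demo_tool;
typedef int (*demo_record_op)(struct demo_tool *tool,
			      const struct demo_record *rec);

/* Callback table plus a little state, loosely modelled on struct perf_tool. */
struct demo_tool {
	demo_record_op context_switch;
	demo_record_op overhead;
	uint64_t total_overhead_ns;
	unsigned int nr_unknown_events;
};

/* Default handler for switch records: accept and ignore. */
int demo_process_switch(struct demo_tool *tool, const struct demo_record *rec)
{
	(void)tool;
	(void)rec;
	return 0;
}

/* Default handler for overhead records: accumulate the payload. */
int demo_process_overhead(struct demo_tool *tool, const struct demo_record *rec)
{
	tool->total_overhead_ns += rec->time_ns;
	return 0;
}

/* Install a default for every callback the tool did not set itself. */
void demo_tool_fill_defaults(struct demo_tool *tool)
{
	if (tool->context_switch == NULL)
		tool->context_switch = demo_process_switch;
	if (tool->overhead == NULL)
		tool->overhead = demo_process_overhead;
}

/* Route one record to its callback; count and reject unknown types. */
int demo_deliver_record(struct demo_tool *tool, const struct demo_record *rec)
{
	switch (rec->type) {
	case DEMO_RECORD_SWITCH:
		return tool->context_switch(tool, rec);
	case DEMO_RECORD_OVERHEAD:
		return tool->overhead(tool, rec);
	default:
		++tool->nr_unknown_events;
		return -1;
	}
}
```

A caller that wants its own behaviour assigns tool.overhead before
calling demo_tool_fill_defaults(), so the default is only used as a
fallback. Keeping the NULL check in one place means the delivery switch
never has to test a callback pointer before calling it.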