Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751902AbcLEOrr convert rfc822-to-8bit (ORCPT ); Mon, 5 Dec 2016 09:47:47 -0500 Received: from mga04.intel.com ([192.55.52.120]:19590 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751633AbcLEOrp (ORCPT ); Mon, 5 Dec 2016 09:47:45 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,747,1477983600"; d="scan'208";a="13823185" From: "Liang, Kan" To: Jiri Olsa CC: "peterz@infradead.org" , "mingo@redhat.com" , "acme@kernel.org" , "linux-kernel@vger.kernel.org" , "alexander.shishkin@linux.intel.com" , "tglx@linutronix.de" , "namhyung@kernel.org" , "jolsa@kernel.org" , "Hunter, Adrian" , "wangnan0@huawei.com" , "mark.rutland@arm.com" , "andi@firstfloor.org" Subject: RE: [PATCH V2 09/13] perf script: show kernel overhead Thread-Topic: [PATCH V2 09/13] perf script: show kernel overhead Thread-Index: AQHSTOHYVElQ/C93HkuxBfes3MgXJqD3yhUAgAGlvDA= Date: Mon, 5 Dec 2016 14:47:24 +0000 Message-ID: <37D7C6CF3E00A74B8858931C1DB2F07750CA9982@SHSMSX103.ccr.corp.intel.com> References: <1480713561-6617-1-git-send-email-kan.liang@intel.com> <1480713561-6617-10-git-send-email-kan.liang@intel.com> <20161204212548.GA8859@krava> In-Reply-To: <20161204212548.GA8859@krava> Accept-Language: zh-CN, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiYjg4NDYzMWYtYWNkNi00MzkzLWFlZWUtNWIyNGY5ODA3ZmJkIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE1LjkuNi42IiwiVHJ1c3RlZExhYmVsSGFzaCI6Im5TY2dxeE5NSG9lVmp6VzFUQmx4d0N3c3JBdVI1TXFmWTlTS0ZaWEczaFU9In0= x-ctpclassification: CTP_IC x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8302 Lines: 235 > On Fri, Dec 02, 2016 at 04:19:17PM -0500, kan.liang@intel.com wrote: > > From: Kan Liang > > > > Shows kernel overhead in perf script. > > > > The output is as below: > > > > perf script --show-profiling-cost-events > > perf 29001 79989.093958: 1 cycles: > > ffffffff81064ca6 native_write_msr (/lib/ > > sleep 29001 79989.094282: 7661 cycles: > > ffffffff810dc433 update_blocked_averages > > sleep 29001 79989.094294: 7442 cycles: > > ffffffff81810f60 irq_work_interrupt (/li > > sleep 29001 79989.094305: 25466 cycles: > > ffffffff813ca410 radix_tree_next_chunk ( > > sleep 29001 79989.094340: 94368 cycles: > > ffffffff8180fa90 page_fault (/lib/module > > sleep 29001 79989.094459: 167362 cycles: > > ffffffff811e3f79 alloc_set_pte (/lib/mod > > sleep 29001 79989.094672: 190283 cycles: > > 7f5d7c91d8e7 _dl_addr (/usr/lib64/li > > sleep 29001 79991.094978: 194526 cycles: > > ffffffff811e0579 __tlb_remove_page_size. > > sleep 29001 79991.095061: PERF_RECORD_OVERHEAD [SAMPLE] nr: > 8 > > time: 28110 > > sleep 29001 79991.095062: PERF_RECORD_OVERHEAD [SB] nr: 24 > > time: 41397 > > hi, > got segfault by running: > > [jolsa@ibm-x3650m4-01 perf]$ catchsegv ./perf --no-pager script --show- > profiling-cost-events > Thanks for the testing. I only test it with kernel overhead event, but not redo the test after adding the user overhead event. :( The user overhead event doesn't have sample. So it's not supported for now. The patch as below will fix it. I will include the fix in V3. BTW: perf script patch is a stand along patch. It would not impact other patches. I think you can still play with the rest of V2 patches. Thanks, Kan ------ >From 03360f4083c30a19f8985ac07893e65e8de7a355 Mon Sep 17 00:00:00 2001 From: Kan Liang Date: Mon, 5 Dec 2016 08:56:28 -0500 Subject: [PATCH 09/13] perf script: show kernel overhead Shows kernel overhead in perf script. The output is as below: perf script --show-profiling-cost-events perf 29001 79989.093958: 1 cycles: ffffffff81064ca6 native_write_msr (/lib/ sleep 29001 79989.094282: 7661 cycles: ffffffff810dc433 update_blocked_averages sleep 29001 79989.094294: 7442 cycles: ffffffff81810f60 irq_work_interrupt (/li sleep 29001 79989.094305: 25466 cycles: ffffffff813ca410 radix_tree_next_chunk ( sleep 29001 79989.094340: 94368 cycles: ffffffff8180fa90 page_fault (/lib/module sleep 29001 79989.094459: 167362 cycles: ffffffff811e3f79 alloc_set_pte (/lib/mod sleep 29001 79989.094672: 190283 cycles: 7f5d7c91d8e7 _dl_addr (/usr/lib64/li sleep 29001 79991.094978: 194526 cycles: ffffffff811e0579 __tlb_remove_page_size. sleep 29001 79991.095061: PERF_RECORD_OVERHEAD [SAMPLE] nr: 8 time: 28110 sleep 29001 79991.095062: PERF_RECORD_OVERHEAD [SB] nr: 24 time: 41397 Signed-off-by: Kan Liang --- tools/perf/Documentation/perf-script.txt | 3 +++ tools/perf/builtin-script.c | 36 ++++++++++++++++++++++++++++++++ tools/perf/util/event.c | 29 +++++++++++++++++++++++++ tools/perf/util/event.h | 1 + 4 files changed, 69 insertions(+) diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt index c01904f..b371023 100644 --- a/tools/perf/Documentation/perf-script.txt +++ b/tools/perf/Documentation/perf-script.txt @@ -289,6 +289,9 @@ include::itrace.txt[] --force:: Don't do ownership validation. +--show-profiling-cost-events:: + Display perf profiling time cost related event (PERF_RECORD_OVERHEAD) + SEE ALSO -------- linkperf:perf-record[1], linkperf:perf-script-perl[1], diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index e1daff3..01019de 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -829,6 +829,7 @@ struct perf_script { bool show_task_events; bool show_mmap_events; bool show_switch_events; + bool show_profiling_cost_events; bool allocated; struct cpu_map *cpus; struct thread_map *threads; @@ -1264,6 +1265,37 @@ static int process_switch_event(struct perf_tool *tool, return 0; } +static int process_overhead_event(struct perf_tool *tool, + union perf_event *event, + struct perf_sample *sample, + struct machine *machine) +{ + struct thread *thread; + struct perf_script *script = container_of(tool, struct perf_script, tool); + struct perf_session *session = script->session; + struct perf_evsel *evsel; + + if (perf_event__process_overhead(tool, event, sample, machine) < 0) + return -1; + + /* No sample for user overhead event. Not support. */ + if (!sample) + return 0; + + evsel = perf_evlist__id2evsel(session->evlist, sample->id); + thread = machine__findnew_thread(machine, sample->pid, sample->tid); + if (thread == NULL) { + pr_debug("problem processing OVERHEAD event, skipping it.\n"); + return -1; + } + + print_sample_start(sample, thread, evsel); + perf_event__fprintf(event, stdout); + thread__put(thread); + + return 0; +} + static void sig_handler(int sig __maybe_unused) { session_done = 1; @@ -1287,6 +1319,8 @@ static int __cmd_script(struct perf_script *script) } if (script->show_switch_events) script->tool.context_switch = process_switch_event; + if (script->show_profiling_cost_events) + script->tool.overhead = process_overhead_event; ret = perf_session__process_events(script->session); @@ -2172,6 +2206,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused) "Show the mmap events"), OPT_BOOLEAN('\0', "show-switch-events", &script.show_switch_events, "Show context switch events (if recorded)"), + OPT_BOOLEAN('\0', "show-profiling-cost-events", &script.show_profiling_cost_events, + "Show perf profiling time cost events"), OPT_BOOLEAN('f', "force", &file.force, "don't complain, do it"), OPT_BOOLEAN(0, "ns", &nanosecs, "Use 9 decimal places when displaying time"), diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 58d095c..e8736dc 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -1189,6 +1189,32 @@ size_t perf_event__fprintf_switch(union perf_event *event, FILE *fp) event->context_switch.next_prev_tid); } +size_t perf_event__fprintf_overhead(union perf_event *event, FILE *fp) +{ + size_t ret; + + switch (event->overhead.type) { + case PERF_PMU_SAMPLE_OVERHEAD: + ret = fprintf(fp, " [SAMPLE] nr: %llu time: %llu\n", + event->overhead.entry.nr, + event->overhead.entry.time); + break; + case PERF_CORE_MUX_OVERHEAD: + ret = fprintf(fp, " [MUX] nr: %llu time: %llu\n", + event->overhead.entry.nr, + event->overhead.entry.time); + break; + case PERF_CORE_SB_OVERHEAD: + ret = fprintf(fp, " [SB] nr: %llu time: %llu\n", + event->overhead.entry.nr, + event->overhead.entry.time); + break; + default: + ret = fprintf(fp, " unhandled!\n"); + } + return ret; +} + size_t perf_event__fprintf(union perf_event *event, FILE *fp) { size_t ret = fprintf(fp, "PERF_RECORD_%s", @@ -1218,6 +1244,9 @@ size_t perf_event__fprintf(union perf_event *event, FILE *fp) case PERF_RECORD_SWITCH_CPU_WIDE: ret += perf_event__fprintf_switch(event, fp); break; + case PERF_RECORD_OVERHEAD: + ret += perf_event__fprintf_overhead(event, fp); + break; default: ret += fprintf(fp, "\n"); } diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h index dd4ec5c..a6c98ce 100644 --- a/tools/perf/util/event.h +++ b/tools/perf/util/event.h @@ -675,6 +675,7 @@ size_t perf_event__fprintf_switch(union perf_event *event, FILE *fp); size_t perf_event__fprintf_thread_map(union perf_event *event, FILE *fp); size_t perf_event__fprintf_cpu_map(union perf_event *event, FILE *fp); size_t perf_event__fprintf(union perf_event *event, FILE *fp); +size_t perf_event__fprintf_overhead(union perf_event *event, FILE *fp); u64 kallsyms__get_function_start(const char *kallsyms_filename, const char *symbol_name); -- 2.5.5