From: kan.liang@intel.com
To: acme@kernel.org
Cc: a.p.zijlstra@chello.nl, mingo@redhat.com, jolsa@kernel.org, namhyung@kernel.org, ak@linux.intel.com, eranian@google.com, linux-kernel@vger.kernel.org, Kan Liang
Subject: [PATCH RFC 00/10] stat read during perf sampling
Date: Tue, 18 Aug 2015 05:25:36 -0400
Message-Id: <1439889946-28986-1-git-send-email-kan.liang@intel.com>

From: Kan Liang

This patch series reads counter statistics during sampling. The
immediate benefit is that memory bandwidth can be read from an uncore
event while a CPU PMU event is sampling. In addition, perf already
supports, or plans to support, a growing number of free-running counter
events (such as freq, power, etc.), so more events could benefit.

The patch series includes 10 patches.
 - Patch 1: Fixes a potential bug when evlist and evsel have different
   CPU maps. It can be merged separately.
 - Patch 2-3: Introduce a new sort option, --socket. The user can sort
   the perf result by socket, and can also get the socket view of
   samples from perf report --stdio --socket. This feature should be
   useful for per-socket events.
 - Patch 4-10: Introduce the 'N' event/group modifier. An event with
   this modifier does counting, not sampling. If a group has this
   modifier, only the group leader does sampling. The counter
   statistics are written in the new RECORD type PERF_RECORD_STAT_READ
   and stored in perf.data.
   So perf report can present the counter statistics accordingly.

An alternative way to get counter statistics during sampling is to run
perf record and perf stat together from a script. But the script
approach has various issues, and its output is complex to parse. For
example, sophisticated bandwidth analysis requires fine granularity
(10-20ms intervals), while the perf stat interval cannot go below
100ms. It is better to record all data in perf.data, as this patchset
does.

Example:

 # perf record -e 'cycles,uncore_imc_1/cas_count_read/N' --stat-read-interval 10 -a ./tchain_edit
 [ perf record: Woken up 520 times to write data ]
 [ perf record: Captured and wrote 1.454 MB perf.data (21328 samples) ]

 $ perf report --stdio --socket
 # Samples: 21K of event 'cycles'
 # Event count (approx.): 12073951084
 #
 # Socket: 0
 #
 # Overhead  Command       Shared Object     Symbol
 # ........  ............  ................  .......................................
 #
     97.58%  tchain_edit   tchain_edit       [.] f3
      0.08%  tchain_edit   tchain_edit       [.] f2
      0.05%  swapper       [kernel.vmlinux]  [k] run_timer_softirq

 # Socket: 1
 #
 # Overhead  Command       Shared Object     Symbol
 # ........  ............  ................  .......................................
 #
      0.43%  swapper       [kernel.vmlinux]  [k] acpi_idle_do_entry
      0.24%  kworker/22:1  [kernel.vmlinux]  [k] delay_tsc
      0.17%  perf          [kernel.vmlinux]  [k] smp_call_function_single

 # Socket: 0
 #
 # Performance counter stats:
 # uncore_imc_1/cas_count_read/N    29004
 #
 # Socket: 1
 #
 # Performance counter stats:
 # uncore_imc_1/cas_count_read/N    11350

 $ perf report -D
 ...
 0x3e3a8 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0: value 29 time: 78608435366512
 0x3e3c8 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 18: value 15 time: 78608435429055
 ...
 0x3e468 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0: value 25 time: 78608445379258
 0x3e488 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 18: value 12 time: 78608445423995
 ...
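For post-processing, the PERF_RECORD_STAT_READ lines in the perf report -D dump above could be grouped per event and CPU with a short script. The following is only a minimal sketch, not part of the patchset: the regular expression is modeled on the four dump lines shown in the example, the helper names parse_stat_reads and read_intervals are hypothetical, and the time field is assumed to be in nanoseconds.

```python
import re
from collections import defaultdict

# Pattern modeled on the example dump lines; the real dump format may differ.
STAT_READ_RE = re.compile(
    r"PERF_RECORD_STAT_READ:\s+(?P<event>\S+)\s+CPU\s+(?P<cpu>\d+):\s+"
    r"value\s+(?P<value>\d+)\s+time:\s+(?P<time>\d+)"
)


def parse_stat_reads(dump_text):
    """Group (time, value) pairs by (event, cpu), in order of appearance."""
    series = defaultdict(list)
    for m in STAT_READ_RE.finditer(dump_text):
        key = (m.group("event"), int(m.group("cpu")))
        series[key].append((int(m.group("time")), int(m.group("value"))))
    return dict(series)


def read_intervals(series):
    """Time gap between consecutive reads for each (event, cpu) key."""
    return {
        key: [later[0] - earlier[0] for earlier, later in zip(reads, reads[1:])]
        for key, reads in series.items()
    }


# The four records copied from the example output in this mail.
SAMPLE = """\
0x3e3a8 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0: value 29 time: 78608435366512
0x3e3c8 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 18: value 15 time: 78608435429055
0x3e468 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 0: value 25 time: 78608445379258
0x3e488 [0x20]: PERF_RECORD_STAT_READ: uncore_imc_0/cas_count_read/N CPU 18: value 12 time: 78608445423995
"""

if __name__ == "__main__":
    series = parse_stat_reads(SAMPLE)
    for key, gaps in read_intervals(series).items():
        # Gaps are in the same unit as the time field (assumed nanoseconds).
        print(key, series[key], gaps)
```

On the four sample records, the gap between consecutive reads on each CPU comes out close to 10 million time units, which is consistent with --stat-read-interval 10 if the timestamps are indeed nanoseconds.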

Kan Liang (10):
  perf,tools: open event on evsel cpus and threads
  perf,tools: Support new sort type --socket
  perf,tools: support option --socket
  perf,tools: Add 'N' event/group modifier
  perf,tools: Enable statistic read for perf record
  perf,tools: New RECORD type PERF_RECORD_STAT_READ
  perf,tools: record counter statistics during sampling
  perf,tools: option to set stat read interval
  perf,tools: don't validate non-sample event
  perf,tools: Show STAT_READ in perf report

 tools/perf/Documentation/perf-list.txt   |   5 ++
 tools/perf/Documentation/perf-record.txt |   7 ++
 tools/perf/Documentation/perf-report.txt |   6 +-
 tools/perf/builtin-diff.c                |   2 +-
 tools/perf/builtin-record.c              | 140 ++++++++++++++++++++++++++++++-
 tools/perf/builtin-report.c              | 108 +++++++++++++++++++++++-
 tools/perf/builtin-top.c                 |   2 +-
 tools/perf/ui/stdio/hist.c               |  14 +++-
 tools/perf/util/cpumap.c                 |  35 ++++++--
 tools/perf/util/cpumap.h                 |   4 +
 tools/perf/util/event.c                  |   1 +
 tools/perf/util/event.h                  |  10 +++
 tools/perf/util/evlist.c                 |   9 ++
 tools/perf/util/evsel.h                  |   1 +
 tools/perf/util/hist.h                   |   3 +-
 tools/perf/util/parse-events.c           |   8 +-
 tools/perf/util/parse-events.l           |   2 +-
 tools/perf/util/session.c                |  15 ++++
 tools/perf/util/sort.c                   |  34 ++++++++
 tools/perf/util/sort.h                   |   1 +
 tools/perf/util/symbol.h                 |   1 +
 tools/perf/util/tool.h                   |   1 +
 22 files changed, 387 insertions(+), 22 deletions(-)

--
1.8.3.1