Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753722AbbFJC5g (ORCPT ); Tue, 9 Jun 2015 22:57:36 -0400 Received: from mail9.hitachi.co.jp ([133.145.228.44]:59004 "EHLO mail9.hitachi.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752115AbbFJC51 (ORCPT ); Tue, 9 Jun 2015 22:57:27 -0400 Message-ID: <5577A78F.3070409@hitachi.com> Date: Wed, 10 Jun 2015 11:57:19 +0900 From: Masami Hiramatsu Organization: Hitachi, Ltd., Japan User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Tom Zanussi , rostedt@goodmis.org CC: daniel.wagner@bmw-carit.de, namhyung@kernel.org, josh@joshtriplett.org, andi@firstfloor.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v7 00/10] tracing: 'hist' triggers References: In-Reply-To: Content-Type: text/plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 14108 Lines: 277 On 2015/06/09 6:31, Tom Zanussi wrote: > This is v7 of the 'hist triggers' patchset, following feedback from > v6. > > Changes from v6: > > This version adds a new 'sym-offset' modifier as requested by Masami. > I implemented it as a modifier rather than using the trace option as > suggested, in part because I wanted to keep it all self-contained and > it seemed more consistent to just add it alongside the 'sym' modifier. > Also, hist triggers arent't really a tracer and therefore don't > directly tie into the option update/callback mechanism so making use > of it isn't as simple as a normal tracer. > > I also changed the sort key specification to be stricter and signal an > error if the specified sort key wasn't found (rather than defaulting > to hitcount in those cases), also suggested by Masami. Thanks, > Masami, for your input! OK, here I've ran a quick test. ----- [root@localhost tracing]# perf probe __kmalloc caller=\$stack0 size Added new event: probe:__kmalloc (on __kmalloc with caller=$stack0 size) You can now use it in all perf tools, such as: perf record -e probe:__kmalloc -aR sleep 1 [root@localhost tracing]# echo hist:keys=caller.sym-offset,size:value=hitcount:sort=size > events/probe/__kmalloc/trigger [root@localhost tracing]# echo 1 > events/probe/__kmalloc/enable [root@localhost tracing]# cat events/probe/__kmalloc/hist # trigger info: hist:keys=caller.sym-offset,size:vals=hitcount:sort=size:size=2048 [active] { caller: [ffffffff811e1142] shmem_initxattrs+0x92/0xe0 , size: 17 } hitcount: 14 { caller: [ffffffff81296860] load_elf_binary+0x240/0x10e0 , size: 28 } hitcount: 4 { caller: [ffffffff813ed384] security_context_to_sid_core+0x64/0x280 , size: 30 } hitcount: 84 { caller: [ffffffff813ec0cc] context_struct_to_string+0x11c/0x220 , size: 36 } hitcount: 14 { caller: [ffffffff8125bf84] alloc_fdmem+0x24/0x40 , size: 64 } hitcount: 17 { caller: [ffffffff812653cc] simple_xattr_alloc+0x2c/0x70 , size: 68 } hitcount: 14 { caller: [ffffffffa0304770] __nf_ct_ext_add_length+0x160/0x1a0 [nf_conntrack] , size: 96 } hitcount: 6 { caller: [ffffffff81395ca7] kmem_alloc+0x77/0xf0 , size: 112 } hitcount: 2 { caller: [ffffffff81416058] bio_alloc_bioset+0x248/0x330 , size: 152 } hitcount: 12 { caller: [ffffffff81395ca7] kmem_alloc+0x77/0xf0 , size: 384 } hitcount: 2 { caller: [ffffffff812965f6] load_elf_phdrs+0x76/0xa0 , size: 392 } hitcount: 4 { caller: [ffffffff812965f6] load_elf_phdrs+0x76/0xa0 , size: 504 } hitcount: 4 { caller: [ffffffff8169cbdb] sk_prot_alloc+0xcb/0x1b0 , size: 1136 } hitcount: 2 { caller: [ffffffff8125bf84] alloc_fdmem+0x24/0x40 , size: 2048 } hitcount: 17 { caller: [ffffffff812628fb] seq_buf_alloc+0x1b/0x50 , size: 3024 } hitcount: 2 { caller: [ffffffff812628fb] seq_buf_alloc+0x1b/0x50 , size: 4096 } hitcount: 35 { caller: [ffffffff81196540] tracing_map_sort_entries+0x30/0x5c0 , size: 16384 } hitcount: 1 Totals: Hits: 234 Entries: 17 Dropped: 0 ----- Looks good to me :) ----- [root@localhost tracing]# echo hist:keys=caller.sym-offset,size:value=hitcount:sort=XXX > events/probe/__kmalloc/trigger -bash: echo: write error: Invalid argument [root@localhost tracing]# echo hist:keys=caller.sym-offset,size:value=abc > events/probe/__kmalloc/trigger -bash: echo: write error: Invalid argument [root@localhost tracing]# echo hist:keys=yyy > events/probe/__kmalloc/trigger -bash: echo: write error: Invalid argument ----- Also, it seems to check parse error correctly. I'll take a look for each patch. Thank you, > > Also updated the Documentation and tracing/README to reflect the > changes. > > Changes from v5: > > This version adds support for compound keys, along with the related > ability to sort using primary and secondary keys. This was mentioned > in previous versions as the last important piece that remained > unimplemented, and is now implemented. (I didn't have time to get to > the couple of enhancements suggested by Masami, but I expect to be > able to add those later on top of these.) > > Because we now support compound keys and it's not immediately clear in > the output exactly which fields correspond to keys, the key(s), > compound or not, are now enclosed by curly braces. > > The Documentation and README have been updated to reflect the changes, > and several new examples have been added to illustrate how to use > compound keys. > > Also, the code was updated to work with the new ftrace_event_file, > etc, renaming in tracing/for-next. > > Changes from v4: > > This version addresses some problems and suggestions made by Daniel > Wagner - a lot of the code was reworked to get rid of the distinction > between keys and values, and as a result, both keys and values can be > used as sort keys. As suggested, it also allows 'val=' to be absent > in a trigger command - if no 'val' is specified, hitcount is assumed > and automatically used as the only val. > > The map code was also separated out into a separate file, > tracing_map.c, allowing it to be reused. It also adds a second tracer > called function_hist that actually does reuse the code, as an RFC > patch. > > Patch 01/10 [tracing: Update cond flag when enabling or disabling..] > is a fix for a problem noticed by Daniel and that fixes a problem in > existing trigger code and should be applied regardless of whether the > rest of the patchset is merged. > > As mentioned, patch 10/10 is an RFC patch implementing a new tracer > based on the function tracer code. It's a fun little tool and is > useful for a specific problem I'm working on (and is also a nice test > of the tracing_map code), but is an RFC because first, I'm not sure it > would really be of general interest and secondly, it's POC-level > quality and I'd need to spend more time fixing it up to make it > upstreamable, but I don't want to waste my time if not. > > There are a couple of important bits of functionality that were > present in v1 but not yet reimplemented in v5. > > The first is support for compound keys. Currently, maps can only be > keyed on a single event field, whereas in v1 they could be keyed on > multiple keys. With support for compound keys, you can create much > more interesting output, such as for example per-pid lists of > syscalls or read counts e.g.: > > # echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount' > \ > /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger > > # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist > > key: common_pid:bash[3112], id:sys_write vals: count:69 > key: common_pid:bash[3112], id:sys_rt_sigprocmask vals: count:218 > > key: common_pid:update-notifier[3164], id:sys_poll vals: count:37 > key: common_pid:update-notifier[3164], id:sys_recvfrom vals: count:118 > > key: common_pid:deja-dup-monito[3194], id:sys_sendto vals: count:1 > key: common_pid:deja-dup-monito[3194], id:sys_read vals: count:4 > key: common_pid:deja-dup-monito[3194], id:sys_poll vals: count:8 > key: common_pid:deja-dup-monito[3194], id:sys_recvmsg vals: count:8 > key: common_pid:deja-dup-monito[3194], id:sys_getegid vals: count:8 > > key: common_pid:emacs[3275], id:sys_fsync vals: count:1 > key: common_pid:emacs[3275], id:sys_open vals: count:1 > key: common_pid:emacs[3275], id:sys_symlink vals: count:2 > key: common_pid:emacs[3275], id:sys_poll vals: count:23 > key: common_pid:emacs[3275], id:sys_select vals: count:23 > key: common_pid:emacs[3275], id:unknown_syscall vals: count:34 > key: common_pid:emacs[3275], id:sys_ioctl vals: count:60 > key: common_pid:emacs[3275], id:sys_rt_sigprocmask vals: count:116 > > key: common_pid:cat[3323], id:sys_munmap vals: count:1 > key: common_pid:cat[3323], id:sys_fadvise64 vals: count:1 > > Related to that is support for sorting on multiple fields. Currently, > you can sort using only a primary key. Being able to sort on multiple > or at least a secondary key is indispensible for seeing trends when > displaying multiple values. > > Changes from v3: > > v4 fixes the race in tracing_map_insert() noted in v3, where > map.val.key could be checked even if map.val wasn't yet set. The > simple fix for that in tracing_map_insert() introduces the possibility > of duplicates in the map, which though rare, need to be accounted for > in the output. To address that, duplicate-merging code was added to > the map-printing code. > > It was also pointed out that it didn't seem correct to include > module.h, but the fix for that has deeper roots and is being addressed > by a separate patchset; for now we need to continue including > module.h, though prompted by that I did some other header include > cleanup. > > The functionality remains the same as v2, but this version no longer > tries to export and use bpf_maps, and more importantly removes the > associated GFP_NOTRACE/trace event hacks and kmem macros required to > work around the bpf_map implementation. > > The tracing_map functionality is instead built on top of a simple > lock-free map algorithm originated by Dr. Cliff Click (see references > in the code for more details), which though too restrictive to be > general-purpose in its current form, functions nicely as a > special-purpose tracing map. > > v3 also moves the hist triggers code into a separate file and puts it > all behind a new config option, CONFIG_HIST_TRIGGERS. It also merges > in the sorting code rather than keeping it as a separate patch. > > This patchset also includes a couple other new and related triggers, > enable_hist and disable_hist, very similar to the existing > enable_event/disable_event triggers used to automatically enable and > disable events based on a triggering condition, but in this case > allowing hist triggers to be enabled and disabled in the same way. > > - Added an insert check for val before checking the key associated with val > - Added code to merge possible duplicates in the map > > Changes from v2: > - reimplemented tracing_map, replacing bpf_map with nmi-safe/lock-free map > - removed GPF_NOTRACE, kmalloc/free macros and event hacks needed by bpf_maps > - moved hist triggers from trace_events_trigger.c to trace_events_hist.c > - added CONFIG_HIST_TRIGGERS config option > - consolidated sorting code with main patch > > Changes from v1: > - completely rewritten on top of tracing_map (renamed and exported bpf_map) > - added map clearing and client ops to tracing_map > - changed the name from 'hash' triggers to 'hist' triggers > - added new trigger 'pause' feature > - added new enable_hist and disable_hist triggers > - added usage for hist/enable_hist/disable hist to tracing/README > - moved examples into Documentation/trace/event.txt > - added ___GFP_NOTRACE, kmalloc/kfree macros, and conditional kmem tracepoints > > The following changes since commit a497adb45b8691f7e477e711a1a4bd54748d64fe: > > ring-buffer: Add enum names for the context levels (2015-05-29 10:39:08 -0400) > > are available in the git repository at: > > git://git.yoctoproject.org/linux-yocto-contrib.git tzanussi/hist-triggers-v7 > http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-contrib/log/?h=tzanussi/hist-triggers-v7 > > Tom Zanussi (10): > tracing: Update cond flag when enabling or disabling a trigger > tracing: Make ftrace_event_field checking functions available > tracing: Add event record param to trigger_ops.func() > tracing: Add get_syscall_name() > tracing: Add a per-event-trigger 'paused' field > trace: Add lock-free tracing_map > tracing: Add 'hist' event trigger command > tracing: Add enable_hist/disable_hist triggers > tracing: Add 'hist' trigger Documentation > ftrace: Add function_hist tracer > > Documentation/trace/events.txt | 1131 ++++++++++++++++++++++++++++ > include/linux/trace_events.h | 9 +- > kernel/trace/Kconfig | 14 + > kernel/trace/Makefile | 3 + > kernel/trace/trace.c | 69 ++ > kernel/trace/trace.h | 88 ++- > kernel/trace/trace_events.c | 4 + > kernel/trace/trace_events_filter.c | 12 - > kernel/trace/trace_events_hist.c | 1407 +++++++++++++++++++++++++++++++++++ > kernel/trace/trace_events_trigger.c | 149 ++-- > kernel/trace/trace_functions_hist.c | 345 +++++++++ > kernel/trace/trace_syscalls.c | 11 + > kernel/trace/tracing_map.c | 920 +++++++++++++++++++++++ > kernel/trace/tracing_map.h | 143 ++++ > 14 files changed, 4221 insertions(+), 84 deletions(-) > create mode 100644 kernel/trace/trace_events_hist.c > create mode 100644 kernel/trace/trace_functions_hist.c > create mode 100644 kernel/trace/tracing_map.c > create mode 100644 kernel/trace/tracing_map.h > -- Masami HIRAMATSU Linux Technology Research Center, System Productivity Research Dept. Center for Technology Innovation - Systems Engineering Hitachi, Ltd., Research & Development Group E-mail: masami.hiramatsu.pt@hitachi.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/