From: Wang Nan
Subject: [RFC PATCH v2 00/37] perf tools: introduce 'perf bpf' command to load eBPF programs.
Date: Fri, 15 May 2015 07:50:53 +0000
Message-ID: <1431676290-1230-1-git-send-email-wangnan0@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This is the second version of the 'perf bpf' patch series, based on
v4.1-rc3.

The goal of this series is to integrate eBPF with perf. After applying
these patches, users can use the following command to load an eBPF
program compiled by LLVM into the kernel and then start recording with
filters on:

 # perf bpf record --object sample_bpf.o -- -a sleep 4

Compared with the previous version (which can be retrieved from lkml:
https://lkml.org/lkml/2015/4/30/264 ), the v2 series has the following
modifications:

 1. Put common eBPF and eBPF object operations into tools/lib/bpf
    instead of perf itself. Other programs, like iproute2, can use
    libbpf for their own purposes.

 2. No longer rely on the 'config' section. In v2, eBPF programs
    describe their probing points in their section names, so the
    'config' section is no longer mandatory for probing points.
    However, I still leave room for a 'config' section in libbpf for
    further expansion. See the following discussion.

 3. Kprobe points are removed after exiting.

 4. Redesign the logic of the perf bpf command.
    Unlike v1, which implements 'perf bpf' as a standalone perf
    command, in this patch series perf bpf acts as a command wrapper.
    It loads eBPF programs into the kernel and then starts other perf
    commands based on them. The first wrapped command is 'record'. In
    the example shown above, 'perf record' starts and captures
    filtered samples. Other commands, like 'perf top', could be
    wrapped in a similar way.

Because of the new logic, a design decision has to be made: do we
actually need a 'perf bpf' command? Another choice is to modify
'perf report' by introducing a new option like '--ebpf-object' and make
it load BPF programs before doing other things. I prefer keeping
'perf bpf' to group all eBPF related stuff together under a uniform
entry.

In addition, eBPF programs can act not only as filters but also as data
aggregators. It is possible to make something like 'perf bpf run' to
simply run a program and dump the result after the user hits 'C-c' or
a timeout expires. The 'config' section may be used in this case to let
'perf bpf' know how to display results.

The following is a detailed description.

Patch 1/37 - 4/37 are bugfixes. Some of them are already acked.

Patch 5/37 - 25/37 create tools/lib/bpf. Libbpf will be compiled into
libbpf.a and libbpf.so. It can be divided into 2 parts:

 1) User-kernel interface. The API is defined by tools/lib/bpf/bpf.h,
    which encapsulates map and program loading operations. To improve
    performance, bpf_load_program() doesn't use a log buffer on the
    first try, and retries with the log buffer enabled on failure.

 2) ELF operations. The structure of the eBPF object file is defined
    here. The API of this part can be found in tools/lib/bpf/libbpf.h.
    'struct bpf_map_def' is also put here.

Libbpf's API hides internal structures. Callers access data of object
files with handlers and accessors. 'struct bpf_object *' is the handler
of a whole object file. 'struct bpf_prog_handler *' is the handler and
iterator of programs.
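
As a rough illustration of the retry described for bpf_load_program()
in part 1), a minimal sketch follows. It is not the code from this
series: the callback type prog_load_fn, load_with_retry() and the two
fake loaders are hypothetical names, chosen only so that the example
compiles and runs without touching the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical loader callback standing in for the real program-load
 * syscall. Returns a non-negative fd on success and -1 on failure.
 * When log_buf is non-NULL, the callee may write a diagnostic into it.
 */
typedef int (*prog_load_fn)(const void *insns, size_t insn_cnt,
                            char *log_buf, size_t log_size);

/*
 * First attempt passes no log buffer (the cheap path). Only when that
 * attempt fails do we retry with the caller's buffer, so that the
 * diagnostic text is captured for the error report.
 */
static int load_with_retry(prog_load_fn load, const void *insns,
                           size_t insn_cnt, char *log_buf, size_t log_size)
{
    int fd;

    assert(load != NULL);

    fd = load(insns, insn_cnt, NULL, 0);
    if (fd >= 0 || !log_buf || log_size == 0)
        return fd;

    log_buf[0] = '\0';
    return load(insns, insn_cnt, log_buf, log_size);
}

/* Counts loader invocations; used only by the demonstration stubs. */
static int fake_load_calls;

/* Stub that always succeeds and returns a made-up descriptor. */
static int fake_load_ok(const void *insns, size_t insn_cnt,
                        char *log_buf, size_t log_size)
{
    (void)insns; (void)insn_cnt; (void)log_buf; (void)log_size;
    fake_load_calls++;
    return 3;
}

/* Stub that always fails and fills the log buffer when one is given. */
static int fake_load_reject(const void *insns, size_t insn_cnt,
                            char *log_buf, size_t log_size)
{
    (void)insns; (void)insn_cnt;
    fake_load_calls++;
    if (log_buf && log_size > 0)
        snprintf(log_buf, log_size, "%s", "invalid instruction");
    return -1;
}
```

With fake_load_ok the loader is called once and the buffer stays
untouched; with fake_load_reject it is called twice and the buffer holds
the message from the second attempt.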
Some accessors are defined to enable the caller to retrieve the section
name and file descriptor of a program. Further accessors can be
appended.

In the design of libbpf, I explicitly separate the full procedure into
an opening and a loading phase. Data are collected during the 'opening'
phase. BPF syscalls are called in the 'loading' phase. The separation is
designed for potential cross-object operations. It also gives the
caller a chance to adjust bytecode and/or maps before the real loading.
(The API for such operations is not provided in this version.)

During loading, fields in 'struct bpf_map_def' are also swapped if the
endianness mismatches.

Patch 26/37 - 37/37 are patches on perf, which introduce the 'perf bpf'
command and the 'perf bpf record' subcommand. As previously discussed,
'perf bpf' is not a standalone command. The usage should be:

 perf bpf [] --objects -- \

The first two patches make 'perf bpf' available and make perf depend on
libbpf. 28/37 creates 'perf bpf record' and directly passes everything
after '--' to cmd_record(). The rest resides in
tools/perf/util/bpf-loader.[ch], which are introduced in 29/37. The
following patches do the collection -> probing -> loading work step by
step. In those operations, 'perf bpf' collects all required objects
before creating kprobe points, and loads programs into the kernel after
probing finishes.

A 'bpf_unload()' is used to remove kprobe points. I use an 'atexit' hook
to ensure it is called before exiting. However, I find that atexit
hooks do not always work well, for example when the program is canceled
by SIGINT. Therefore we still need to call bpf_unload() after
cmd_record().

Patch 34 and 35 introduce a special syntax for event parsing:
'group:name|bpf_fd=%d|', which allows 'perf bpf' to pass file
descriptors of eBPF programs to evsel.

Patch 36 and 37 regenerate arguments for cmd_record(), utilizing the
'|bpf_fd=%d|' syntax to pass eBPF programs.

Wang Nan (37):
  tools perf: set vmlinux_path__nr_entries to 0 in vmlinux_path__exit.
  tools lib traceevent: install libtraceevent.a into libdir.
  tools build: Allow other override features to check.
  tools include: add __aligned_u64 to types.h.
  tools lib bpf: introduce 'bpf' library to tools.
  tools lib bpf: allow set printing function.
  tools lib bpf: defines basic interface.
  tools lib bpf: open eBPF object file and do basic validation.
  tools lib bpf: check swap according to EHDR.
  tools lib bpf: iterater over elf sections to collect information.
  tools lib bpf: collect version and license from ELF.
  tools lib bpf: collect map definitions.
  tools lib bpf: collect config section in object.
  tools lib bpf: collect symbol table in object files.
  tools lib bpf: collect bpf programs from object files.
  tools lib bpf: collect relocation sections from object file.
  tools lib bpf: collect relocation instructions for each program.
  tools lib bpf: clean elf memory after loading.
  tools lib bpf: add bpf.c/h for common bpf operations.
  tools lib bpf: create maps needed by object file.
  tools lib bpf: relocation programs.
  tools lib bpf: introduce bpf_load_program to bpf.c.
  tools lib bpf: load bpf programs in object file into kernel.
  tools lib bpf: accessors of bpf_program.
  tools lib bpf: accessors for struct bpf_object.
  tools perf: Add new 'perf bpf' command.
  tools perf: make perf depend on libbpf.
  tools perf: add 'perf bpf record' subcommand.
  tools perf: add bpf-loader and open elf object files.
  tools perf: collect all bpf programs.
  tools perf: config probe points of eBPF programs during prepartion.
  tools perf bpf: probe at kprobe points.
  tools perf bpf: load eBPF object into kernel.
  tools perf: add a bpf_wrapper global flag.
  tools perf: add bpf_fd field to evsel and introduce new event syntax.
  tools perf: generate event argv.
  tools perf bpf: passes generated arguments to cmd_record.
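
To make the 'group:name|bpf_fd=%d|' event syntax from patches 34 and 35
more concrete, here is a minimal sketch of how such an event string
could be built before it is handed on as an argument. format_bpf_event()
is a hypothetical helper, not a function from this series, and the group
and probe names used with it are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "group:name|bpf_fd=N|" into buf.
 * Returns 0 on success, or -1 if the string would not fit in size
 * bytes (including the terminating NUL) or formatting failed.
 */
static int format_bpf_event(char *buf, size_t size, const char *group,
                            const char *name, int bpf_fd)
{
    int n;

    assert(buf != NULL && group != NULL && name != NULL);

    n = snprintf(buf, size, "%s:%s|bpf_fd=%d|", group, name, bpf_fd);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

For example, calling it with group "mygroup", name "myprobe" and fd 5
fills the buffer with "mygroup:myprobe|bpf_fd=5|"; a buffer that is too
small makes it return -1 instead of silently truncating the event.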

 tools/build/Makefile.feature          |    4 +-
 tools/include/linux/types.h           |    5 +
 tools/lib/bpf/.gitignore              |    2 +
 tools/lib/bpf/Build                   |    1 +
 tools/lib/bpf/Makefile                |  191 ++++++
 tools/lib/bpf/bpf.c                   |   87 +++
 tools/lib/bpf/bpf.h                   |   24 +
 tools/lib/bpf/libbpf.c                | 1089 +++++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.h                |   66 ++
 tools/lib/traceevent/Makefile         |   20 +-
 tools/perf/Build                      |    1 +
 tools/perf/Documentation/perf-bpf.txt |   18 +
 tools/perf/Makefile.perf              |   20 +-
 tools/perf/builtin-bpf.c              |  202 ++++++
 tools/perf/builtin.h                  |    1 +
 tools/perf/command-list.txt           |    1 +
 tools/perf/perf.c                     |   10 +
 tools/perf/perf.h                     |    1 +
 tools/perf/util/Build                 |    1 +
 tools/perf/util/bpf-loader.c          |  319 ++++++++++
 tools/perf/util/bpf-loader.h          |   24 +
 tools/perf/util/debug.c               |    5 +
 tools/perf/util/debug.h               |    1 +
 tools/perf/util/evsel.c               |    1 +
 tools/perf/util/evsel.h               |    1 +
 tools/perf/util/parse-events.c        |   19 +
 tools/perf/util/parse-events.h        |    3 +
 tools/perf/util/parse-events.l        |    8 +-
 tools/perf/util/parse-events.y        |   21 +
 tools/perf/util/parse-options.c       |    8 +-
 tools/perf/util/parse-options.h       |    2 +
 tools/perf/util/symbol.c              |    1 +
 32 files changed, 2145 insertions(+), 12 deletions(-)
 create mode 100644 tools/lib/bpf/.gitignore
 create mode 100644 tools/lib/bpf/Build
 create mode 100644 tools/lib/bpf/Makefile
 create mode 100644 tools/lib/bpf/bpf.c
 create mode 100644 tools/lib/bpf/bpf.h
 create mode 100644 tools/lib/bpf/libbpf.c
 create mode 100644 tools/lib/bpf/libbpf.h
 create mode 100644 tools/perf/Documentation/perf-bpf.txt
 create mode 100644 tools/perf/builtin-bpf.c
 create mode 100644 tools/perf/util/bpf-loader.c
 create mode 100644 tools/perf/util/bpf-loader.h

-- 
1.8.3.4