Subject: Re: [RFC PATCH 00/13] perf tools: Support uBPF script
From: "Wangnan (F)"
To: Alexei Starovoitov
CC: Arnaldo Carvalho de Melo, Alexei Starovoitov, Jiri Olsa, Li Zefan
Date: Thu, 21 Apr 2016 16:17:24 +0800
Message-ID: <57188C94.4090200@huawei.com>
In-Reply-To: <20160420220609.GA38485@ast-mbp.thefacebook.com>
References: <1461175313-38310-1-git-send-email-wangnan0@huawei.com> <20160420220609.GA38485@ast-mbp.thefacebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2016/4/21 6:06, Alexei Starovoitov wrote:
> On Wed, Apr 20, 2016 at 06:01:40PM +0000, Wang Nan wrote:
>> This patch set allows to perf invoke some user space BPF scripts on some
>> point. uBPF scripts and kernel BPF scripts reside in one BPF object.
>> They communicate with each other with BPF maps. uBPF scripts can invoke
>> helper functions provided by perf.
>>
>> At least following new features can be achieved based on uBPF support:
>>
>> 1) Report statistical result:
>>    Like DTrace, perf print statistical report before quit.
>>    No need to extract data using 'perf report'. Statistical method is
>>    controlled by user.
>>
>> 2) Control perf's behavior:
>>    Dynamically adjust period of different events. Policy is defined by
>>    user.
>>
>> uBPF library is required before compile. It can be found from github:
>>
>>  https://github.com/iovisor/ubpf.git

[SNIP]

> Interesting!
> If bpf is used for both kernel and user side programs, we can allow
> almost arbitrary C code for the user side.
> There is no need to be limited to a fixed set of helpers.
> There is no verifier in user space either.
> Just call 'printf("string")' directly.

Calling 'printf("string")' would be cool, but it still needs some extra
work: the .rodata section has to be extracted, and programs have to be
relocated against it.

> Wouldn't even need to change interpreter.
> Also ubpf was written from scratch with apache2, while perf is gpl,
> so you can just link kernel/bpf/core.o directly instead of using external
> libraries.
> I really meant link .o file compiled for kernel.
> Advertise dummy kfree/kmalloc and it will link fine, since perf
> will only be calling __bpf_prog_run() which is 99% independent from kernel.
> I used to do exactly that long ago while performance tuning the interpreter.
> Another option is to fork the interpreter for perf, but I don't like it at all.
> Compiling the same bpf/core.c once for kernel and once for perf is another option,
> but imo linking core.o is easier.

I just realized we can't link an apache2 static library into perf. (Is
that true?)

The current perf build doesn't support linking directly like this,
because it would make the perf build depend on the kernel build: we
could no longer build perf before building the kernel. One possible
solution: pass a kernel build directory to the perf build and pick up
the corresponding '.o' file from it:

 $ make KBUILD_DIR=/kernel/build/dir

/lib/modules/`uname -r`/build can be used as the default location. The
JIT compiler can also be linked this way.
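
As a rough illustration of the "dummy kfree/kmalloc" idea quoted above,
the shim that perf would have to carry might look like the sketch below.
It assumes the kernel-built object only needs these two allocator
symbols resolved. The real set of undefined symbols depends on kernel
version and config, so the symbol names and the gfp_t stand-in here are
assumptions, not something taken from the patch set.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's gfp_t; assumed to be an unsigned int. */
typedef unsigned int gfp_t;

/*
 * Dummy allocator symbols for user space. They exist only so that an
 * object file compiled for the kernel can resolve its allocator
 * references at link time. The flags argument has no meaning here.
 */
void *__kmalloc(size_t size, gfp_t flags)
{
	(void)flags;
	return malloc(size);
}

void kfree(const void *ptr)
{
	/* free(NULL) is a no-op, matching kfree(NULL). */
	free((void *)ptr);
}
```

Running 'nm -u' on the object file would show which other symbols still
need a stub before the link succeeds.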

Another possible solution: use a macro trick so that bpf/core.c can be
built as part of the perf build. This is possible and simpler, but it
could be broken by later kernel changes.

> In general this set and overall bpf in user space makes sense only
> if we allow much more flexible C code for user space.
> If it's limited to ubpf_* helpers, that will quickly become suboptimal.

Yes. I tried to reimplement the tracex2 sample but found it is not easy.

However, of my two use cases (reporting and controlling), only reporting
requires flexible C code. Even with full-featured C, doing statistics
still requires relatively complex code, because of the lack of data
structure support such as associative arrays (dict in Python).

For controlling (for example, dynamic period adjustment, or outputting
perf.data when something unusual is detected), uBPF programs describe
rules and invoke actions (the actions should be provided by perf
helpers), similar to their kernel-side counterparts.

So I think that, whether or not uBPF can be 'full C', we should consider
making the ubpf helpers strong and flexible. Basic reporting methods,
such as histograms, should be provided directly by a perf helper, with
no need to rewrite them in uBPF. We can even make a ubpf helper to
bridge uBPF and lua scripts, then invoke lua scripts at perf hooks. For
example:

 const char lua_script[] SEC("UBPF-lua;perf_record_exit") = "";

Thank you.
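
For illustration only, a histogram helper of the kind suggested above
could be as small as the sketch below. It is a plain power-of-two
histogram in user-space C. The names hist_add() and hist_print() and the
fixed 64-slot layout are hypothetical and are not part of perf or of the
patch set.

```c
#include <stdint.h>
#include <stdio.h>

#define HIST_SLOTS 64	/* one slot per possible bit position of a u64 */

struct hist {
	uint64_t slot[HIST_SLOTS];
};

/* Index of the highest set bit; 0 and 1 both map to slot 0. */
static unsigned int hist_log2(uint64_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/* Count one sample in the slot covering [2^n, 2^(n+1)). */
void hist_add(struct hist *h, uint64_t value)
{
	h->slot[hist_log2(value)]++;
}

/* Print only the non-empty slots, one per line. */
void hist_print(const struct hist *h, FILE *out)
{
	unsigned int i;

	for (i = 0; i < HIST_SLOTS; i++) {
		if (!h->slot[i])
			continue;
		fprintf(out, "[2^%u, 2^%u): %llu\n", i, i + 1,
			(unsigned long long)h->slot[i]);
	}
}
```

A uBPF script would then only pass values to such a helper, and perf
would call the print routine once at exit.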