Subject: Re: [RFC PATCH 00/13] perf tools: Support uBPF script
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
References: <1461175313-38310-1-git-send-email-wangnan0@huawei.com>
 <20160420220609.GA38485@ast-mbp.thefacebook.com>
CC: <acme@kernel.org>, <jolsa@redhat.com>, <brendan.d.gregg@gmail.com>,
        <linux-kernel@vger.kernel.org>, <pi3orama@163.com>,
        Arnaldo Carvalho de Melo <acme@redhat.com>,
        Alexei Starovoitov <ast@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
        Li Zefan <lizefan@huawei.com>
From: "Wangnan (F)" <wangnan0@huawei.com>
Message-ID: <57188C94.4090200@huawei.com>
Date: Thu, 21 Apr 2016 16:17:24 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101
 Thunderbird/38.5.0
MIME-Version: 1.0
In-Reply-To: <20160420220609.GA38485@ast-mbp.thefacebook.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3936
Lines: 97


On 2016/4/21 6:06, Alexei Starovoitov wrote:
> On Wed, Apr 20, 2016 at 06:01:40PM +0000, Wang Nan wrote:
>> This patch set allows to perf invoke some user space BPF scripts on some
>> point. uBPF scripts and kernel BPF scripts reside in one BPF object.
>> They communicate with each other with BPF maps. uBPF scripts can invoke
>> helper functions provided by perf.
>>
>> At least following new features can be achieved based on uBPF support:
>>
>>   1) Report statistical result:
>>      Like DTrace, perf print statistical report before quit. No need to
>>      extract data using 'perf report'. Statistical method is controled by
>>      user.
>>
>>   2) Control perf's behavior:
>>      Dynamically adjust period of different events. Policy is defined by
>>      user.
>>
>> uBPF library is required before compile. It can be found from github:
>>
>>   https://github.com/iovisor/ubpf.git

[SNIP]

> Interesting!
> If bpf is used for both kernel and user side programs, we can allow
> almost arbitrary C code for the user side.
> There is no need to be limited to a fixed set of helpers.
> There is no verifier in user space either.
> Just call 'printf("string")' directly.

Calling 'printf("string")' would be cool, but it still needs some
extra work: the .rodata section has to be extracted, and the programs
have to be relocated against it.
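
To make the relocation step concrete, here is a minimal sketch. It assumes
the string reference shows up as a 16-byte ld_imm64 instruction whose
immediate has to be patched with the user-space address of the .rodata
copy. 'struct insn' mirrors the layout of the kernel's struct bpf_insn;
'struct rodata_reloc', apply_rodata_reloc() and ld_imm64_value() are
hypothetical names used only for illustration, not existing perf or ubpf
API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the kernel's struct bpf_insn (8 bytes per slot). */
struct insn {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

#define LD_IMM64 0x18	/* BPF_LD | BPF_IMM | BPF_DW */

/* Hypothetical record: which instruction refers to which .rodata offset. */
struct rodata_reloc {
	size_t insn_idx;
	size_t rodata_off;
};

/*
 * Patch one ld_imm64 (two instruction slots) so that it loads the
 * user-space address of rodata + rodata_off.
 * Returns 0 on success, -1 if the relocation is malformed.
 */
static int apply_rodata_reloc(struct insn *insns, size_t nr_insns,
			      const char *rodata, size_t rodata_len,
			      const struct rodata_reloc *r)
{
	uint64_t addr;

	assert(insns && rodata && r);
	if (r->insn_idx + 1 >= nr_insns || r->rodata_off >= rodata_len)
		return -1;
	if (insns[r->insn_idx].code != LD_IMM64)
		return -1;

	addr = (uint64_t)(uintptr_t)(rodata + r->rodata_off);
	/* Low 32 bits go into the first slot, high 32 bits into the second. */
	insns[r->insn_idx].imm = (int32_t)(uint32_t)addr;
	insns[r->insn_idx + 1].imm = (int32_t)(uint32_t)(addr >> 32);
	return 0;
}

/* Reassemble the 64-bit immediate the way an interpreter would. */
static uint64_t ld_imm64_value(const struct insn *insn)
{
	return (uint64_t)(uint32_t)insn[0].imm |
	       ((uint64_t)(uint32_t)insn[1].imm << 32);
}
```

Finding the relocation records in the ELF object is not shown here; that
part would live in the object loader.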

> Wouldn't even need to change interpreter.
> Also ubpf was written from scratch with apache2, while perf is gpl,
> so you can just link kernel/bpf/core.o directly instead of using external
> libraries.
> I really meant link .o file compiled for kernel.
> Advertize dummy kfree/kmalloc and it will link fine, since perf
> will only be calling __bpf_prog_run() which is 99% indepdendent from kernel.
> I used to do exactly that long ago while performance tunning the interpreter.
> Another option is to fork the interpreter for perf, but I don't like it at all.
> Compiling the same bpf/core.c once for kernel and once for perf is another option,
> but imo linking core.o is easier.

I just realized we can't link an Apache 2.0 licensed static library into
perf. (Is that true?)

The current perf build doesn't support linking directly like this,
because it would make perf depend on the kernel build, so we could no
longer build perf before building the kernel.

One possible solution: pass a kernel build directory to the perf build
and pick up the corresponding '.o' file from it:

  $ make KBUILD_DIR=/kernel/build/dir

/lib/modules/`uname -r`/build could be used as the default location.

The JIT compiler could also be linked this way.

Another possible solution: use macro tricks so that bpf/core.c can be
compiled as part of the perf build. This is possible and simpler, but it
could be broken by kernel changes.
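
For the macro approach, the user-space shim could look roughly like this.
It is only a sketch, assuming kmalloc/kfree are the main symbols that need
replacing; the real core.c pulls in more kernel definitions, which is
exactly where kernel changes could break us. The gfp_t typedef and the
flag values below are placeholders for this shim, not the kernel's
definitions:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder type and flags: only meaningful inside this shim. */
typedef unsigned int gfp_t;
#define GFP_KERNEL 0u
#define __GFP_ZERO 0x100u

/* User-space stand-in for kmalloc(): plain malloc, optional zeroing. */
static void *kmalloc(size_t size, gfp_t flags)
{
	void *p;

	/* The shim does not emulate the kernel's zero-size allocations. */
	assert(size != 0);
	p = malloc(size);
	if (p && (flags & __GFP_ZERO))
		memset(p, 0, size);
	return p;
}

/* User-space stand-in for kzalloc(). */
static void *kzalloc(size_t size, gfp_t flags)
{
	return kmalloc(size, flags | __GFP_ZERO);
}

/* User-space stand-in for kfree(); free(NULL) is a no-op, like kfree(NULL). */
static void kfree(const void *p)
{
	free((void *)p);
}
```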

> In general this set and overall bpf in user space makes sense only
> if we allow much more flexible C code for user space.
> If it's limited to ubpf_* helpers, that will quickly become suboptimal.

Yes. I tried to reimplement tracex2 from the samples but found it is not
easy. However, of my two use cases (reporting and controlling), only
reporting requires flexible C code. Even with full-featured C, computing
statistics still requires relatively complex code, because data
structures such as associative arrays (dict in Python) are missing. For
controlling (for example, dynamically adjusting periods, or writing
perf.data when something unusual is detected), uBPF programs describe
rules and invoke actions (actions should be provided by perf helpers),
similar to their kernel-side counterparts.

So I think that whether or not uBPF can be 'full C', we should consider
making the ubpf helpers powerful and flexible. Basic reporting methods,
such as histograms, should be provided by a perf helper directly, with
no need to rewrite them in uBPF. We could even add a ubpf helper that
bridges uBPF and Lua scripts, and then invoke Lua scripts at perf hooks.
For example:

  const char lua_script[] SEC("UBPF-lua;perf_record_exit") = "<lua script>";
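
As for a histogram helper provided by perf, a log2 histogram similar to
what tracex2 prints could be sketched as below. The names
(struct ubpf_hist, ubpf_hist_add, ubpf_hist_print) are hypothetical and do
not exist in this patch set; this only illustrates how small such a
helper could be:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define HIST_SLOTS 64

/* Hypothetical helper state: one counter per power-of-two bucket. */
struct ubpf_hist {
	uint64_t slot[HIST_SLOTS];
};

/* Index of the highest set bit; both 0 and 1 land in slot 0. */
static unsigned int log2_slot(uint64_t v)
{
	unsigned int s = 0;

	while (v >>= 1)
		s++;
	return s;
}

/* Count one sample in its bucket. */
static void ubpf_hist_add(struct ubpf_hist *h, uint64_t value)
{
	assert(h);
	h->slot[log2_slot(value)]++;
}

/* Print the non-empty buckets, one per line. */
static void ubpf_hist_print(const struct ubpf_hist *h, FILE *out)
{
	unsigned int i;

	for (i = 0; i < HIST_SLOTS; i++) {
		if (!h->slot[i])
			continue;
		fprintf(out, "2^%u: %llu\n", i,
			(unsigned long long)h->slot[i]);
	}
}
```

A uBPF script would then only call the add helper per sample and the
print helper at exit, instead of reimplementing the bucketing itself.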

Thank you.