Message-ID: <554832AA.5050503@plumgrid.com>
Date: Mon, 04 May 2015 20:02:02 -0700
From: Alexei Starovoitov
To: Wang Nan, davem@davemloft.net, acme@kernel.org, mingo@redhat.com, a.p.zijlstra@chello.nl, masami.hiramatsu.pt@hitachi.com, jolsa@kernel.org
CC: linux-kernel@vger.kernel.org, pi3orama@163.com, hekuang@huawei.com, bgregg@netflix.com
Subject: Re: [RFC PATCH 00/22] perf tools: introduce 'perf bpf' command to load eBPF programs.
References: <1430391165-30267-1-git-send-email-wangnan0@huawei.com> <554302F0.3070101@plumgrid.com> <55447A7D.4000205@huawei.com>
In-Reply-To: <55447A7D.4000205@huawei.com>

On 5/2/15 12:19 AM, Wang Nan wrote:
>
> I'd like to do following works in the next version (based on my experience and feedbacks):
>
> 1. Safely clean up kprobe points after unloading;
>
> 2. Add subcommand space to 'perf bpf'. Current staff should be reside in 'perf bpf load';
>
> 3. Extract eBPF ELF walking and collecting work to a separated library to help others.

That's a good list. Feedback on the existing patches:

patch 18 - since we're creating a generic library for bpf elf loading,
it would be great to do the following:
first try to load with
attr.log_buf = NULL;
attr.log_level = 0;
then, only if that fails, allocate a buffer and repeat with log_level = 1.
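A minimal sketch of that quiet-then-verbose load, for illustration only. The struct load_attr, the load_fn callback, LOG_BUF_SIZE and the fake loader are hypothetical stand-ins, not the library's API; a real loader would fill union bpf_attr and call bpf(BPF_PROG_LOAD, ...) inside the callback.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the log-related fields of union bpf_attr. */
struct load_attr {
    char *log_buf;
    unsigned int log_size;
    unsigned int log_level;
};

/* Hypothetical loader callback: returns an fd >= 0 on success, -1 on failure.
 * A real implementation would wrap the bpf(BPF_PROG_LOAD, ...) syscall. */
typedef int (*load_fn)(struct load_attr *attr, void *ctx);

#define LOG_BUF_SIZE 65536 /* assumed size, chosen only for this sketch */

/* Try a fast load with no verifier log; only on failure allocate a
 * buffer and retry with log_level = 1. On final failure *log_out holds
 * the log and must be freed by the caller. */
static int load_quiet_then_verbose(load_fn load, void *ctx, char **log_out)
{
    struct load_attr attr = { .log_buf = NULL, .log_size = 0, .log_level = 0 };
    int fd;

    assert(load != NULL && log_out != NULL);
    *log_out = NULL;

    fd = load(&attr, ctx);          /* first attempt: no verbosity */
    if (fd >= 0)
        return fd;

    attr.log_buf = calloc(1, LOG_BUF_SIZE);
    if (!attr.log_buf)
        return -1;
    attr.log_size = LOG_BUF_SIZE;
    attr.log_level = 1;

    fd = load(&attr, ctx);          /* second attempt: collect the log */
    if (fd >= 0) {
        free(attr.log_buf);
        return fd;
    }
    *log_out = attr.log_buf;
    return -1;
}

/* Hypothetical fake loader for trying the sketch without a kernel. */
struct fake_state { int fail; int calls; };

static int fake_load(struct load_attr *attr, void *ctx)
{
    struct fake_state *st = ctx;

    st->calls++;
    if (!st->fail)
        return 3;                   /* pretend file descriptor */
    if (attr->log_level && attr->log_buf && attr->log_size)
        snprintf(attr->log_buf, attr->log_size, "fake verifier log");
    return -1;
}
```

With this shape the common success path costs one load call and no allocation; the buffer and the second call are only paid for when something is already wrong.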
The reason is that it's better to have fast program loading by default,
without any verbosity emitted by the verifier.

patch 19 - I think it's unnecessary. The verifier already dumps it,
so this '-v' flag can be translated into verbose loading.
There is also the .s output from llvm for those interested in bpf asm
instructions.

> My collage He Kuang is working on variable accessing. Probing inside function body
> and accessing its local variable will be supported like this:
>
> SEC("config") char _prog_config[] = "prog: func_name:1234 vara=localvara"
> int prog(struct pt_regs *ctx, unsigned long vara) {
>     // vara is the value of localvara of function func_name
> }

That would be great. I'm not sure, though, how you can achieve that
without changing the C front-end?
This type of feature is exactly the reason why we're trying to write
our front-end. In general there are two ways to achieve a 'restricted C'
language:
- start from clang and chop all features that are not supported.
  I believe Jovi already tried to do that and it became very difficult.
- start from a simple front-end with minimal C and add all things one by one.
  That's what we're trying to do. So far we have most of the normal syntax.
  The problem with our approach is that we cannot easily do #include
  of existing .h files. We're working on that.
  It's too experimental still. Maybe we will drop it and go back to
  the first approach.

The reason for extending the front-end is your example above, where
the user would want to write:
int prog(struct pt_regs *ctx, unsigned long vara) {
    // use 'vara'
but the generated BPF should have only one 'ctx' pointer, since that's
the only thing that the verifier will accept. bpf/core and JITs expect
only one argument, etc.
So this func definition + 'vara' access can be compiled as
ctx->si (if vara is actually in a register)
or bpf_probe_read(ctx->bp + magic_offset_from_debug_info)
(if vara is on the stack)
or it can also be done via store_trace_args(), but that will be slower
and requires hacking the kernel, whereas ctx->...
style is pure userspace.
Lots of things to brainstorm. So please share your progress soon.

> And I want to discuss with you and others about:
>
> 1. How to make eBPF output its tracing and aggregation results to perf?

Well, the output of a bpf program is data stored in maps.
Each program needs a corresponding user space reader/printer/sorter
of this data. For example, tracex2 prints this data as a histogram and
tracex3 prints it as a heatmap. We can standardize a few things like this,
but ideally we keep it up to the user, so that the user can write a single
file that consists of functions that are loaded as bpf into the kernel
and other functions that are executed in user space.
llvm can jit the first set to bpf and the second set to x86.
That's the distant future though.
So far the samples/bpf/ style of kern.c+user.c has worked quite well.