Message-ID: <559584FF.2000500@plumgrid.com>
Date: Thu, 02 Jul 2015 11:37:51 -0700
From: Alexei Starovoitov <ast@plumgrid.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: "Wangnan (F)" <wangnan0@huawei.com>, He Kuang <hekuang@huawei.com>,
        Peter Zijlstra <peterz@infradead.org>
CC: rostedt@goodmis.org, masami.hiramatsu.pt@hitachi.com, mingo@redhat.com,
        acme@redhat.com, jolsa@kernel.org, namhyung@kernel.org,
        linux-kernel@vger.kernel.org, pi3orama <pi3orama@163.com>
Subject: Re: [RFC PATCH 0/5] Make eBPF programs output data to perf event
References: <1435719455-91155-1-git-send-email-hekuang@huawei.com> <20150701054458.GN19282@twins.programming.kicks-ass.net> <559386D7.1020208@huawei.com> <20150701115825.GV19282@twins.programming.kicks-ass.net> <5594A679.7010108@plumgrid.com> <5594B22F.8090500@huawei.com> <5594B569.60103@plumgrid.com> <55950356.3050507@huawei.com>
In-Reply-To: <55950356.3050507@huawei.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1882
Lines: 37

On 7/2/15 2:24 AM, Wangnan (F) wrote:
> Yes, by using perf_trace_buf_prepare() + perf_trace_buf_submit() in a
> helper function and letting the bpf program always return 0, we can
> make data collected by BPF programs show up in samples, provided the
> following problems are solved:
>
>   1. In bpf program there's no way to get 'struct perf_event' or 'struct
>      ftrace_event_call'. We have to deduce them through pt_regs:
>
>      pt_regs -> ip -> kprobe -> struct trace_kprobe -> struct
>       ftrace_event_call -> hlist_entry -> struct perf_event

yeah, going through the hash table via get_kprobe() is not pretty.
How about doing this_cpu_write(current_perf_event, ...) before the
program runs and reading it back from the helper? bpf progs are
non-preemptible and non-reentrant, so the per-cpu value stays valid
for the whole run.
Also I think this helper would be more flexible if we allow passing
sample_type into it.
Ideally, from the program one could do something like:
  bpf_event_output(buf, sizeof(buf), PERF_SAMPLE_CALLCHAIN);
which would prepare a sample with the raw buf and the callstack.
This way the program can decide when and how to send events to user space.
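
To make the idea concrete, here is a rough userspace model of it (not
kernel code). Everything named model_* or MODEL_* is hypothetical: the
thread-local pointer stands in for a per-cpu current_perf_event, and
model_event_output() stands in for the proposed bpf_event_output()
helper. It only shows the flow: the caller publishes the event, the
program calls the helper, and the helper finds the event without walking
pt_regs -> kprobe -> ...

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for PERF_SAMPLE_CALLCHAIN; the bit value is arbitrary. */
#define MODEL_SAMPLE_CALLCHAIN (1u << 0)

/* Hypothetical, heavily simplified stand-in for 'struct perf_event'. */
struct model_event {
	uint8_t  raw[64];      /* last raw payload submitted */
	size_t   raw_len;
	uint32_t sample_type;  /* what the program asked to attach */
	int      nr_samples;
};

/* Models the per-cpu variable written with this_cpu_write(). */
static _Thread_local struct model_event *current_event;

/* Models the proposed helper: no event argument, it uses the published one. */
static int model_event_output(const void *buf, size_t len, uint32_t sample_type)
{
	struct model_event *ev = current_event;

	if (!ev || len > sizeof(ev->raw))
		return -1;
	memcpy(ev->raw, buf, len);
	ev->raw_len = len;
	ev->sample_type = sample_type;
	ev->nr_samples++;
	return 0;
}

typedef int (*model_prog_t)(void);

/* Models the call site: publish the event, run the program, clear it.
 * This is only safe because the program cannot be preempted or re-entered. */
static int model_run_prog(struct model_event *ev, model_prog_t prog)
{
	int ret;

	current_event = ev;
	ret = prog();
	current_event = NULL;
	return ret;
}

/* Models a bpf program that emits one sample and returns 0. */
static int example_prog(void)
{
	const char buf[] = "hello";

	model_event_output(buf, sizeof(buf), MODEL_SAMPLE_CALLCHAIN);
	return 0;
}
```

Calling model_run_prog(&ev, example_prog) on a zeroed struct model_event
leaves one sample recorded in ev, and calling the helper outside of a
program run fails because nothing is published.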

>   2. Even if we finally get 'struct perf_event', I'm not sure whether
>      the user really cares about it. If we really care about all the
>      information output through perf_trace_buf_submit(), like the
>      callstack and registers, why not make the bpf program return
>      non-zero instead? But then we have to consider how to connect the
>      two samples together.

see my suggestion above. hard-coding sample_type at event creation is
a useful case on its own, but if we can let the program provide the
type dynamically, it will open up a whole new set of possibilities.
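
One possibility, purely as an illustration: a program could emit a cheap
raw-only sample most of the time and ask for the callstack only when
something looks interesting. The sketch below is plain C, not a bpf
program; the MODEL_* flags, choose_sample_type() and the latency
threshold are all made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for PERF_SAMPLE_RAW / PERF_SAMPLE_CALLCHAIN;
 * the bit values are arbitrary. */
#define MODEL_SAMPLE_RAW       (1u << 0)
#define MODEL_SAMPLE_CALLCHAIN (1u << 1)

/* Pick the sample type per event instead of fixing it at event creation:
 * always include the raw buffer, add the callchain only for slow cases. */
static uint32_t choose_sample_type(uint64_t latency_ns, uint64_t slow_threshold_ns)
{
	uint32_t type = MODEL_SAMPLE_RAW;

	if (latency_ns > slow_threshold_ns)
		type |= MODEL_SAMPLE_CALLCHAIN;
	return type;
}
```

The returned value is what the program would pass as the sample_type
argument of the helper.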
