Message-ID: <55E108BC.4050107@huawei.com>
Date: Sat, 29 Aug 2015 09:19:56 +0800
From: "Wangnan (F)" <wangnan0@huawei.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
MIME-Version: 1.0
To: Alexei Starovoitov <ast@plumgrid.com>, <acme@redhat.com>
CC: <brendan.d.gregg@gmail.com>, <daniel@iogearbox.net>, <dsahern@gmail.com>,
        <hekuang@huawei.com>, <jolsa@kernel.org>, <xiakaixu@huawei.com>,
        <masami.hiramatsu.pt@hitachi.com>, <namhyung@kernel.org>,
        <a.p.zijlstra@chello.nl>, <lizefan@huawei.com>, <pi3orama@163.com>,
        <linux-kernel@vger.kernel.org>,
        Arnaldo Carvalho de Melo <acme@kernel.org>,
        Ingo Molnar <mingo@redhat.com>, Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH 32/32] bpf: Introduce function for outputing data to perf
 event
References: <1440745570-150857-1-git-send-email-wangnan0@huawei.com> <1440745570-150857-33-git-send-email-wangnan0@huawei.com> <55E10091.6020107@plumgrid.com>
In-Reply-To: <55E10091.6020107@plumgrid.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2587
Lines: 66


On 2015/8/29 8:45, Alexei Starovoitov wrote:
> On 8/28/15 12:06 AM, Wang Nan wrote:
>> This patch adds a new trace event to establish infrastructure for bpf to
>> output data to perf. Userspace perf tools can detect and use this event
>> as using the existing tracepoint events.
>>
>> New bpf trace event entry in debugfs:
>>
>>       /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>>
>> Userspace perf tools detect the new tracepoint event as:
>>
>>       bpf:bpf_output_data                          [Tracepoint event]
>>
>> Data in ring-buffer of perf events added to this event will be polled
>> out, sample types and other attributes can be adjusted to those events
>> directly without touching the original kprobe events.
>
> Wang,
> I have 2nd thoughts on this.
> I've played with it, but global bpf:bpf_output_data event is limiting.
> I'd like to use this bpf_output_trace_data() helper for tcp estats
> gathering, but global collector will prevent other similar bpf programs
> running in parallel.

So the current model works for you, but the problem is that all output goes
into one place, which prevents similar BPF programs from running in parallel
because the receiver cannot tell which program generated which message. What
you actually want is a publish-and-subscribe model, where a subscriber gets
messages only from the publishers it is interested in. Do I understand your
problem correctly?

> So as a concept I think it's very useful, but we need a way to select
> which ring-buffer to output data to.
> proposal A:
> Can we use ftrace:instances concept and make bpf_output_trace_data()
> into that particular trace_pipe ?
> proposal B:
> bpf_perf_event_read() model is using nice concept of an array of
> perf_events. Can we perf_event_open a 'new' event that can be mmaped
> in user space and bpf_output_trace_data(idx, buf, buf_size) into it.
> Where 'idx' will be an index of FD from perf_event_open of such
> new event?
>

I've also been thinking about adding an extra id parameter to
bpf_output_trace_data(), but it is for encoding the type of the output data,
which is totally different from what you want.
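
To make the difference between the two kinds of parameter concrete, here is
a rough userspace sketch. It is not kernel code and not a proposed
implementation: struct sink, struct rec_hdr, output_tagged(),
output_indexed(), NR_SINKS and SINK_SIZE are all names made up for
illustration. output_tagged() models a type id that only labels records in
one global buffer; output_indexed() models an 'idx' that selects which
buffer receives the data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_SINKS  4	/* hypothetical: number of per-subscriber buffers */
#define SINK_SIZE 256	/* hypothetical: bytes available in each buffer */

/* A trivial append-only buffer standing in for a ring buffer. */
struct sink {
	uint8_t data[SINK_SIZE];
	size_t used;
};

/* Header written before each payload in the type-tagged scheme. */
struct rec_hdr {
	uint32_t type;	/* describes the payload format, not the destination */
	uint32_t size;	/* payload size in bytes */
};

static struct sink global_sink;		/* the single global collector */
static struct sink sinks[NR_SINKS];	/* one buffer per subscriber */

/* Append 'size' bytes to 's'; return -1 if they do not fit. */
static int sink_write(struct sink *s, const void *buf, size_t size)
{
	if (size > SINK_SIZE - s->used)
		return -1;
	memcpy(s->data + s->used, buf, size);
	s->used += size;
	return 0;
}

/* Type id model: every record lands in the same global buffer. */
static int output_tagged(uint32_t type, const void *buf, uint32_t size)
{
	struct rec_hdr hdr = { .type = type, .size = size };
	size_t room = SINK_SIZE - global_sink.used;

	/* Reject the record unless header and payload both fit. */
	if (room < sizeof(hdr) || size > room - sizeof(hdr))
		return -1;
	sink_write(&global_sink, &hdr, sizeof(hdr));
	return sink_write(&global_sink, buf, size);
}

/* Index model: 'idx' picks the destination buffer. */
static int output_indexed(uint32_t idx, const void *buf, uint32_t size)
{
	if (idx >= NR_SINKS)
		return -1;
	return sink_write(&sinks[idx], buf, size);
}
```

With the type id model a reader still has to walk one shared stream and
skip records it does not care about; with the index model each reader only
ever sees what was written to its own buffer.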

For my part, I use bpf_output_trace_data() to output information such as PMU
counter values. Perf is the only receiver, so a global collector is perfect.
Could you please describe your use case in more detail?

Thank you for using that feature!

> Thanks!
>

