Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752882AbbGBDwU (ORCPT ); Wed, 1 Jul 2015 23:52:20 -0400 Received: from mail-pa0-f52.google.com ([209.85.220.52]:35264 "EHLO mail-pa0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751378AbbGBDwL (ORCPT ); Wed, 1 Jul 2015 23:52:11 -0400 Message-ID: <5594B569.60103@plumgrid.com> Date: Wed, 01 Jul 2015 20:52:09 -0700 From: Alexei Starovoitov User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: He Kuang , Peter Zijlstra , "Wangnan (F)" CC: rostedt@goodmis.org, masami.hiramatsu.pt@hitachi.com, mingo@redhat.com, acme@redhat.com, jolsa@kernel.org, namhyung@kernel.org, linux-kernel@vger.kernel.org, pi3orama Subject: Re: [RFC PATCH 0/5] Make eBPF programs output data to perf event References: <1435719455-91155-1-git-send-email-hekuang@huawei.com> <20150701054458.GN19282@twins.programming.kicks-ass.net> <559386D7.1020208@huawei.com> <20150701115825.GV19282@twins.programming.kicks-ass.net> <5594A679.7010108@plumgrid.com> <5594B22F.8090500@huawei.com> In-Reply-To: <5594B22F.8090500@huawei.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2316 Lines: 64 On 7/1/15 8:38 PM, He Kuang wrote: > > > On 2015/7/2 10:48, Alexei Starovoitov wrote: >> On 7/1/15 4:58 AM, Peter Zijlstra wrote: >>> >>> But why create a separate trace buffer, it should go into the regular >>> perf buffer. >> >> +1 >> >> I think >> +static char __percpu *perf_extra_trace_buf[PERF_NR_CONTEXTS]; >> is redundant. >> It adds quite a bit of unnecessary complexity to the whole patch set. >> >> Also the call to bpf_output_sample() is not effective unless program >> returns 1. It's a confusing user interface. >> >> Also you cannot ever do: >> BPF_FUNC_probe_read, >> + BPF_FUNC_output_sample, >> BPF_FUNC_ktime_get_ns, >> new functions must be added to the end. >> >> Why not just do: >> perf_trace_buf_prepare() + perf_trace_buf_submit() from the helper? >> No changes to current code. >> No need to call __get_data_size() and other overhead. >> The helper can be called multiple times from the same program. >> imo much cleaner. >> > > Invoke perf_trace_buf_submit() will generate a second perf > event (header->type = PERF_RECORD_SAMPLE) entry which is > different from the event entry outputed by the orignial > kprobe. So the final result of the example in 00/00 patch may > like this: > > sample entry 1(from bpf_prog): > comm timestamp1 generic_perform_write pmu_value=0x1234 > sample entry 2(from original kprobe): > comm timestamp2 generic_perform_write: (ffffffff81140b60) > Compared with current implementation: > combined sample entry: > comm timestamp generic_perform_write: (ffffffff81140b60) > pmu_value=0x1234 > > The former two entries may be discontinuous as there are multiple > threads and kprobes to be recorded, and there's a chance that one > entry is missed but the other is recorded. What we need is the > pmu_value read when 'generic_perform_write' enters, the two > entries result is not intuitive enough and userspace tools have > to do the work to find and combine those two sample entries to > get the result. Just change your example to return 0 and user space will see one sample. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/