Message-ID: <5594D1AA.9040803@huawei.com>
Date: Thu, 2 Jul 2015 13:52:42 +0800
From: "Wangnan (F)"
To: Alexei Starovoitov, He Kuang
Subject: Re: [RFC PATCH 1/5] bpf: Put perf_events check ahead of bpf prog
References: <1435719455-91155-1-git-send-email-hekuang@huawei.com> <1435719455-91155-2-git-send-email-hekuang@huawei.com> <5594B50A.3010705@plumgrid.com>
In-Reply-To: <5594B50A.3010705@plumgrid.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2015/7/2 11:50, Alexei Starovoitov wrote:
> On 6/30/15 7:57 PM, He Kuang wrote:
>> When we add a kprobe point and record events by perf, the execution path
>> of all threads on each cpu will enter this point, but perf may only
>> record events on a particular thread or cpu at this kprobe point, a
>> check on call->perf_events list filters out the threads which perf is
>> not recording.
>
> I think there is a better way to do that. You're adding artificial
> per_cpu filtering whereas you really need per_pid filtering.

I think the difference between you and He Kuang is the order of
filtering.

In He Kuang's view, perf's original filtering mechanisms (implicit or
explicit) should take precedence over the BPF filter, because what the
user wants is to filter events with *an additional* BPF filter. So the
filters should run in the following order:

 event -> X -> Y -> Z -> BPF filter +-> perf.data
                                    |
                                    `-> dropped

(In the above diagram, X represents limitations which prevent an event
from being triggered, for example kprobe re-entry. Y represents implicit
filters, like the check of call->perf_events, which filters out events
from other CPUs (per-pid perf event filtering is also done by it). Z
represents the explicit filter set by the user with
PERF_EVENT_IOC_SET_FILTER.)

So only those events which perf would collect without a BPF filter
should be passed to the BPF program.

The above is our understanding of an ideal BPF filter. Therefore, to
create an ideal BPF filter, it would be better to put BPF filters into
perf_tp_filter_match().

In the current implementation, BPF filters take effect in the middle of
kprobe event processing:

 event -> X -> BPF filter -> Y -> Z +-> perf.data
                                    |
                                    `-> dropped

And this patch changes the ordering to:

 event -> X -> Y -> BPF filter -> Z +-> perf.data
                                    |
                                    `-> dropped

Neither is ideal, but He Kuang's patch moves the BPF filter in the right
direction. It uses a relatively low-cost operation (the check of
call->perf_events) to reduce the number of calls to BPF filters.

I'd like to discuss with you whether our understanding is correct. Do
you have any strong reason to put BPF filters at such an early stage?

Thank you.
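
As a rough illustration of the three orderings discussed in this message, here is a small userspace model. It is only a sketch, not kernel code: `struct event`, `enum ordering`, `run_bpf()` and `process()` are hypothetical names made up for this example, and each stage (X, Y, Z, BPF) is reduced to a precomputed boolean verdict. It counts how many events are recorded and how often the BPF program would be invoked under each ordering.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one event; none of these names are kernel APIs. */
struct event {
    bool passes_x;   /* X: event may trigger at all (e.g. no kprobe re-entry) */
    bool passes_y;   /* Y: implicit filter (e.g. the call->perf_events check) */
    bool passes_z;   /* Z: explicit filter (PERF_EVENT_IOC_SET_FILTER) */
    bool passes_bpf; /* verdict the BPF program would return */
};

enum ordering {
    ORDER_CURRENT, /* event -> X -> BPF -> Y -> Z */
    ORDER_PATCHED, /* event -> X -> Y -> BPF -> Z */
    ORDER_IDEAL,   /* event -> X -> Y -> Z -> BPF */
};

struct result {
    size_t recorded; /* events that would reach perf.data */
    size_t bpf_runs; /* number of times the BPF program was invoked */
};

/* Count the invocation, then return the modelled BPF verdict. */
static bool run_bpf(const struct event *ev, struct result *res)
{
    res->bpf_runs++;
    return ev->passes_bpf;
}

/* Run every event through the stages in the given order.
 * Short-circuit && means a later stage only runs if all earlier ones passed. */
static struct result process(const struct event *evs, size_t n,
                             enum ordering order)
{
    struct result res = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        const struct event *ev = &evs[i];
        bool keep = ev->passes_x;

        switch (order) {
        case ORDER_CURRENT:
            keep = keep && run_bpf(ev, &res) && ev->passes_y && ev->passes_z;
            break;
        case ORDER_PATCHED:
            keep = keep && ev->passes_y && run_bpf(ev, &res) && ev->passes_z;
            break;
        case ORDER_IDEAL:
            keep = keep && ev->passes_y && ev->passes_z && run_bpf(ev, &res);
            break;
        }
        if (keep)
            res.recorded++;
    }
    return res;
}
```

In this model all three orderings record the same set of events, since every stage must pass either way. What changes is which events the BPF program gets to see: the later the BPF stage, the fewer invocations, and the closer the input is to "events perf would have collected anyway".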