Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754942AbcC3Cjf (ORCPT ); Tue, 29 Mar 2016 22:39:35 -0400 Received: from szxga02-in.huawei.com ([119.145.14.65]:45076 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754502AbcC3Cje (ORCPT ); Tue, 29 Mar 2016 22:39:34 -0400 Subject: Re: [PATCH 4/4] perf core: Add backward attribute to perf event To: Peter Zijlstra References: <1459147292-239310-1-git-send-email-wangnan0@huawei.com> <1459147292-239310-5-git-send-email-wangnan0@huawei.com> <20160329140439.GK3408@twins.programming.kicks-ass.net> <56FB39E9.7020006@huawei.com> CC: Alexei Starovoitov , Arnaldo Carvalho de Melo , , Brendan Gregg , He Kuang , Jiri Olsa , Masami Hiramatsu , Namhyung Kim , , Zefan Li From: "Wangnan (F)" Message-ID: <56FB3C3D.3050400@huawei.com> Date: Wed, 30 Mar 2016 10:38:53 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0 MIME-Version: 1.0 In-Reply-To: <56FB39E9.7020006@huawei.com> Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [10.111.66.109] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020205.56FB3C49.00EB,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0, ip=0.0.0.0, so=2013-06-18 04:22:30, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: da25edd64f6d47e9d11e4303bb1192de Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2611 Lines: 75 On 2016/3/30 10:28, Wangnan (F) wrote: > > > On 2016/3/29 22:04, Peter Zijlstra wrote: >> On Mon, Mar 28, 2016 at 06:41:32AM +0000, Wang Nan wrote: >> >> Could you maybe write a perf/tests thingy for this so that _some_ >> userspace exists that exercises this new code? >> >> >>> int perf_output_begin(struct perf_output_handle *handle, >>> struct perf_event *event, unsigned int size) >>> { >>> + if (unlikely(is_write_backward(event))) >>> + return __perf_output_begin(handle, event, size, true); >>> return __perf_output_begin(handle, event, size, false); >>> } >> Would something like: >> >> int perf_output_begin(...) >> { >> if (unlikely(is_write_backward(event)) >> return perf_output_begin_backward(...); >> return perf_output_begin_forward(...); >> } >> >> make sense; I'm not sure how much is still using this, but it seems >> somewhat excessive to inline two copies of that thing into a single >> function. > > perf_output_begin is useful: > > $ grep perf_output_begin ./kernel -r > ./kernel/events/ring_buffer.c: * See perf_output_begin(). > ./kernel/events/ring_buffer.c:int perf_output_begin(struct > perf_output_handle *handle, > ./kernel/events/ring_buffer.c: * perf_output_begin() only checks > rb->paused, therefore > ./kernel/events/core.c: if (perf_output_begin(&handle, event, > header.size)) > ./kernel/events/core.c: ret = perf_output_begin(&handle, event, > read_event.header.size); > ./kernel/events/core.c: ret = perf_output_begin(&handle, event, > ./kernel/events/core.c: ret = perf_output_begin(&handle, event, > ./kernel/events/core.c: ret = perf_output_begin(&handle, event, > ./kernel/events/core.c: ret = perf_output_begin(&handle, event, > rec.header.size); > ./kernel/events/core.c: ret = perf_output_begin(&handle, event, > ./kernel/events/core.c: ret = perf_output_begin(&handle, event, > se->event_id.header.size); > ./kernel/events/core.c: ret = perf_output_begin(&handle, event, > ./kernel/events/core.c: ret = perf_output_begin(&handle, event, > rec.header.size); > > Events like PERF_RECORD_MMAP2 uses this function, so we still need to > consider its overhead. > > So I will use your first suggestion. > Sorry. Your second suggestion seems also good: My implementation makes a big perf_output_begin(), but introduces only one load and one branch. Your first suggestion introduces one load, one branch and one function call. Your second suggestion introduces one load, and at least one (and at most three) branches. I need some benchmarking result. Thank you.