Date: Mon, 28 Mar 2016 17:25:43 -0700
From: Alexei Starovoitov
To: Wang Nan
Cc: Alexei Starovoitov, Arnaldo Carvalho de Melo, Peter Zijlstra,
    linux-kernel@vger.kernel.org, Brendan Gregg, He Kuang, Jiri Olsa,
    Masami Hiramatsu, Namhyung Kim, pi3orama@163.com, Zefan Li
Subject: Re: [PATCH 3/4] perf core: Prepare writing into ring buffer from end
Message-ID: <20160329002541.GA31198@ast-mbp.thefacebook.com>
References: <1459147292-239310-1-git-send-email-wangnan0@huawei.com>
    <1459147292-239310-4-git-send-email-wangnan0@huawei.com>
In-Reply-To: <1459147292-239310-4-git-send-email-wangnan0@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 28, 2016 at 06:41:31AM +0000, Wang Nan wrote:
> Convert perf_output_begin to __perf_output_begin and make the latter
> function able to write records from the end of the ring buffer.
> Following commits will utilize the 'backward' flag.
>
> This is the core patch to support writing ring buffer backward, which
> would be introduced by following patch to support reading from
> overwritable ring buffer.
>
> In theory, this patch should not introduce any extra performance
> overhead since we use always_inline.
>
> When CONFIG_OPTIMIZE_INLINING is disabled, the output object is nearly
> identical to original one. See [1].
>
> When CONFIG_OPTIMIZE_INLINING is enabled, the resulting object file becomes
> smaller:
>
>  $ size kernel/events/ring_buffer.o*
>     text    data     bss     dec     hex filename
>     4545       4       8    4557    11cd kernel/events/ring_buffer.o.new
>     4641       4       8    4653    122d kernel/events/ring_buffer.o.old
>
> Performance result:
>
> Calling 3000000 times of 'close(-1)', use gettimeofday() to check
> duration. Use 'perf record -o /dev/null -e raw_syscalls:*' to capture
> system calls. In ns.
>
> Testing environment:
>
>  CPU    : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
>  Kernel : v4.5.0
>
>                MEAN         STDVAR
>  BASE     800214.950      2853.083
>  PRE     2253846.700      9997.014
>  POST    2257495.540      8516.293
>
> Where 'BASE' is pure performance without capturing. 'PRE' is test
> result of pure 'v4.5.0' kernel. 'POST' is test result after this
> patch. See [4] for detail experimental setup.
>
> Considering the stdvar, this patch doesn't hurt performance.
>
> For the detail of testing method, please refer to [2].
>
> [1] http://lkml.kernel.org/g/56F52E83.70409@huawei.com
> [2] http://lkml.kernel.org/g/56F89DCD.1040202@huawei.com
>
> Signed-off-by: Wang Nan
> Cc: He Kuang
> Cc: Alexei Starovoitov
> Cc: Arnaldo Carvalho de Melo
> Cc: Brendan Gregg
> Cc: Jiri Olsa
> Cc: Masami Hiramatsu
> Cc: Namhyung Kim
> Cc: Peter Zijlstra
> Cc: Zefan Li
> Cc: pi3orama@163.com
> ---
>  kernel/events/ring_buffer.c | 42 ++++++++++++++++++++++++++++++++++++------
>  1 file changed, 36 insertions(+), 6 deletions(-)

Acked-by: Alexei Starovoitov
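
Note on the quoted numbers: the claim "considering the stdvar, this patch doesn't hurt performance" can be checked with a little arithmetic. The sketch below is not part of the patch or of the original test setup; it only re-computes figures copied from the quoted tables. It assumes that STDVAR denotes a spread (standard deviation) in the same unit as MEAN, which the commit message does not state explicitly, and the variable names are illustrative.

```python
# Figures copied from the quoted commit message (units as reported there).
pre_mean, pre_spread = 2253846.700, 9997.014     # 'PRE': plain v4.5.0
post_mean, post_spread = 2257495.540, 8516.293   # 'POST': with this patch

# Change in the mean duration, absolute and relative to PRE.
delta = post_mean - pre_mean
relative = delta / pre_mean

# Assumption: STDVAR is a standard-deviation-like spread, so a delta
# smaller than either spread is within run-to-run noise.
within_noise = delta < min(pre_spread, post_spread)

# Text-section size from the quoted `size` output
# (CONFIG_OPTIMIZE_INLINING enabled).
text_old, text_new = 4641, 4545
text_saved = text_old - text_new

print(f"mean delta: {delta:.3f} ({relative:.2%} of PRE)")
print(f"delta within reported spread: {within_noise}")
print(f"text bytes saved: {text_saved}")
```

Under that assumption the difference between PRE and POST is smaller than the spread of either run, which is consistent with the conclusion drawn in the commit message.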