From: Brendan Gregg
Date: Tue, 26 Apr 2016 13:02:34 -0700
Subject: Re: [PATCH/RFC v3] perf core: Allow setting up max frame stack depth via sysctl
To: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov, David Ahern, Peter Zijlstra, Frederic Weisbecker,
    Ingo Molnar, Adrian Hunter, Alexander Shishkin, Alexei Starovoitov,
    He Kuang, Jiri Olsa, Masami Hiramatsu, Milian Wolff, Namhyung Kim,
    Stephane Eranian, Thomas Gleixner, Vince Weaver, Wang Nan, Zefan Li,
    Linux Kernel Mailing List

On Mon, Apr 25, 2016 at 5:49 PM, Brendan Gregg wrote:
> On Mon, Apr 25, 2016 at 5:47 PM, Arnaldo Carvalho de Melo
> wrote:
>> Em Mon, Apr 25, 2016 at 05:44:00PM -0700, Alexei Starovoitov escreveu:
>>> On Mon, Apr 25, 2016 at 09:29:28PM -0300, Arnaldo Carvalho de Melo wrote:
>>> > Em Mon, Apr 25, 2016 at 05:07:26PM -0700, Alexei Starovoitov escreveu:
>>> > > > +       {
>>> > > > +               .procname       = "perf_event_max_stack",
>>> > > > +               .data           = NULL, /* filled in by handler */
>>> > > > +               .maxlen         = sizeof(sysctl_perf_event_max_stack),
>>> > > > +               .mode           = 0644,
>>> > > > +               .proc_handler   = perf_event_max_stack_handler,
>>> > > > +               .extra1         = &zero,
>>> > > > +       },
>>> >
>>> > > you need to define a max value otherwise perf_callchain_entry__sizeof
>>> > > will overflow. Sure it's root only facility, but still not nice.
>>> > > 1M? Anything above 1M stack frames would be insane anyway.
>>> > > The rest looks good. Thanks!
>>> >
>>> > Something else? ;-)
>>>
>>> all looks good to me. Thanks a bunch!
>>
>> Thanks for checking!
>>
>>> > Because we only allocate the callchain percpu data structures when there
>>> > is a user, which allows for changing the max easily, its just a matter
>>> > of having no callchain users at that point.
>>> >
>>> > Reported-and-Tested-by: Brendan Gregg
>>> > Acked-by: Alexei Starovoitov
>>>
>>> yep :)
>>> hopefully Brendan can give it another spin.
>>
>> Agreed, and I'm calling it a day anyway, Brendan, please consider
>> retesting, thanks,
>>
>
> Will do, thanks!
>

Looks good.

I started with max depth = 512, and even that was still truncated, and
had to profile again at 1024 to capture the full stacks. Seems to
generally match the flame graph I generated with V1, which made me
want to check that I'm running the new patch, and am:

# grep six_hundred_forty_kb /proc/kallsyms
ffffffff81c431e0 d six_hundred_forty_kb

I was mucking around and was able to get "corrupted callchain.
skipping..." errors, but these look to be expected -- that was
profiling a binary (bash) that doesn't have frame pointers. Some perf
script -D output:

16 3204735442777 0x18f0d8 [0x2030]: PERF_RECORD_SAMPLE(IP, 0x1): 18134/18134: 0xffffffff8118b6a4 period: 1001001 addr: 0
... FP chain: nr:1023
.....  0: ffffffffffffff80
.....  1: ffffffff8118b6a4
.....  2: ffffffff8118bc47
.....  3: ffffffff811d8c85
.....  4: ffffffff811b18f8
.....  5: ffffffff811b2a55
.....  6: ffffffff811b5ea0
.....  7: ffffffff810663c0
.....  8: ffffffff810666e0
.....  9: ffffffff817b9d28
..... 10: fffffffffffffe00
..... 11: 00000000004b45e2
..... 12: 000000000000610f
..... 13: 0000000000006110
..... 14: 0000000000006111
..... 15: 0000000000006112
..... 16: 0000000000006113
..... 17: 0000000000006114
..... 18: 0000000000006115
..... 19: 0000000000006116
..... 20: 0000000000006117
[...]
..... 1021: 000000000000650b
..... 1022: 000000000000650c
 ... thread: bash:18134
 ...... dso: /lib/modules/4.6.0-rc5-virtual/build/vmlinux
bash 18134 [016]  3204.735442:    1001001 cpu-clock:

Brendan