From: David Ahern
Date: Fri, 12 Jun 2015 22:07:48 -0600
To: "Liang, Kan", Andi Kleen
CC: Arnaldo Carvalho de Melo, "linux-kernel@vger.kernel.org", "Huang, Ying"
Subject: Re: [PATCH 1/1] perf,tools: add time out to force stop endless mmap processing

On 6/12/15 4:41 PM, Liang, Kan wrote:
>>
>> On 6/12/15 2:39 PM, Liang, Kan wrote:
>>> Here are the test results.
>>> Please note that I get "synthesized threads took..." after the test case
>>> exit.
>>> It means both way have the same issue.
>>
>> Got it. So what you really mean is launching perf on an already running
>> process perf never finishes initializing. There are several types of problems
>> like this. For example on a sparc system with a 1024 cpus if I launch perf
>> (top or record) after starting a kernel build with make -j 1024
>> the build finishes before perf starts collecting samples. ie., it never
>> finishes walking /proc until the build is complete. task_diag does not solve
>> that problem either and in general the procps tools can't handle it either
>> (ps or top for example).
>>
>
> We should not stop using system wide perf top/record just because there
> are some threads which have huge/growing maps.

I have not said anything to that effect. I am trying to understand the
fundamental points here for a test app you can't / won't distribute.
And, I am also pointing out similar problems that perf and other tools
can't handle.

> The maps information is not critical for sampling.

But is for correlating the addresses in those samples.

>
> If task_diag does not solve this problem, I think we still need a time out
> to force stop endless mmap processing. It's the simplest working
> solution so far.

I disagree with the timeout. For example an overloaded system where perf
is not getting scheduled could trigger the same. Also, in the spirit of
perf if you are going to drop information you need to generate an event
that says information was lost and have the analysis tools show a
message that information was lost. You can't simply bail out and have
"[unknown]" shown for symbols / dsos. I get tons of user comments about
perf showing callchains properly; the proposed patch just adds to that
confusion.

David