Date: Fri, 25 Nov 2016 12:42:33 -0800
From: Andi Kleen
To: "Liang, Kan"
Cc: Andi Kleen, "peterz@infradead.org", "mingo@redhat.com", "acme@kernel.org",
	"linux-kernel@vger.kernel.org", "alexander.shishkin@linux.intel.com",
	"tglx@linutronix.de", "namhyung@kernel.org", "jolsa@kernel.org",
	"Hunter, Adrian", "wangnan0@huawei.com", "mark.rutland@arm.com"
Subject: Re: [PATCH 13/14] perf tools: warn on high overhead
Message-ID: <20161125204233.GG26852@two.firstfloor.org>
References: <1479894292-16277-1-git-send-email-kan.liang@intel.com>
	<1479894292-16277-14-git-send-email-kan.liang@intel.com>
	<87d1hmx7tn.fsf@tassilo.jf.intel.com>
	<37D7C6CF3E00A74B8858931C1DB2F07750CA27D8@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F07750CA27D8@SHSMSX103.ccr.corp.intel.com>

On Wed, Nov 23, 2016 at 10:03:24PM +0000, Liang, Kan wrote:
> > Perhaps we need two separate metrics here:
> >
> > - cost of perf record on its CPU (or, later on, if it gets more
> >   multi-threaded, on multiple CPUs). Warn if this is >50% or so.
>
> What's the formula for the cost of perf record on its CPU?
> Does the cost include only user space overhead, or all overhead?
> What is the divisor?

It would be all the overhead in the process. Accounting for overhead in
kernel threads or in interrupts caused by IO is difficult; we can leave
that out for now.

Sum of:

  for each perf thread: thread cpu time / monotonic wall time

I guess the sum is better than the average here because the perf
threads are likely running (or at least could be) on the same CPU.
If perf record were changed to flush buffers more aggressively on the
local CPUs this would need to change, but I presume it's good enough
for now.

> > - average perf collection overhead on a CPU. The 10% threshold here
> >   seems appropriate.
>
> For the average, do you mean adding all the overheads among the CPUs
> together and dividing by the number of CPUs?

Right. Possibly also the max over all of them.

> To calculate the rate, the divisor is wall clock time, right?

Monotonic wall clock time, yes.

-Andi
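
P.S.: To make the two metrics concrete, here is a minimal sketch of the
arithmetic. This is not the actual perf tool code: perf_record_cost(),
collection_overhead() and all the sample numbers are made up for
illustration; only the clocks (CLOCK_MONOTONIC for the divisor,
CLOCK_THREAD_CPUTIME_ID for per-thread CPU time) are the real APIs
being assumed.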
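/*
 * Sketch only: names and numbers below are invented for illustration,
 * not taken from the perf sources.
 */
#include <stdio.h>
#include <time.h>

/* The divisor discussed above: monotonic wall clock, in seconds. */
static double monotonic_seconds(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

/*
 * Cost of perf record on its CPU: sum over all perf threads of
 * thread cpu time / monotonic wall time.  Each thread is assumed to
 * have sampled its own CPU time (e.g. via CLOCK_THREAD_CPUTIME_ID).
 */
static double perf_record_cost(const double *thread_cpu, int nr_threads,
			       double wall)
{
	double sum = 0.0;
	int i;

	for (i = 0; i < nr_threads; i++)
		sum += thread_cpu[i] / wall;
	return sum;
}

/*
 * Per-CPU collection overhead: divide each CPU's overhead time by the
 * monotonic wall time, average the ratios over the number of CPUs,
 * and also track the max across CPUs.
 */
static void collection_overhead(const double *cpu_time, int nr_cpus,
				double wall, double *avg, double *max)
{
	double sum = 0.0, m = 0.0;
	int i;

	for (i = 0; i < nr_cpus; i++) {
		double ratio = cpu_time[i] / wall;

		sum += ratio;
		if (ratio > m)
			m = ratio;
	}
	*avg = sum / nr_cpus;
	*max = m;
}

int main(void)
{
	double start = monotonic_seconds();
	/* Made-up sample data purely to exercise the formulas. */
	double thread_cpu[] = { 1.2, 0.9 };	    /* two perf threads */
	double cpu_time[] = { 0.5, 0.7, 0.3, 0.6 }; /* four CPUs */
	double wall = 4.0; /* real code: monotonic_seconds() - start */
	double cost, avg, max;

	(void)start;	/* stamp only shown to illustrate the clock */

	cost = perf_record_cost(thread_cpu, 2, wall);
	collection_overhead(cpu_time, 4, wall, &avg, &max);

	if (cost > 0.5)
		printf("warning: perf record cost %.0f%% of a CPU (>50%%)\n",
		       cost * 100);
	if (avg > 0.1)
		printf("warning: average collection overhead %.0f%% (>10%%), max %.0f%%\n",
		       avg * 100, max * 100);
	return 0;
}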
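With those numbers the cost comes out to (1.2 + 0.9) / 4.0 = 52.5% and
the average per-CPU overhead to 13.1% (max 17.5%), so both warnings
fire. The sum in perf_record_cost() matches the point above: threads
sharing a CPU add up on that CPU. Where the per-thread and per-CPU
times actually come from (thread self-sampling at exit, whatever the
kernel side of this series exposes) is left open here.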