Date: Fri, 18 Dec 2015 22:38:55 +0100
From: Andi Kleen
To: Stephane Eranian
Cc: Andi Kleen, Arnaldo Carvalho de Melo, Peter Zijlstra, Jiri Olsa, Ingo Molnar, LKML, Namhyung Kim
Subject: Re: Add top down metrics to perf stat v2
Message-ID: <20151218213854.GD15533@two.firstfloor.org>
References: <1450227266-2501-1-git-send-email-andi@firstfloor.org> <20151217140123.GA15533@two.firstfloor.org> <20151218015538.GC15533@two.firstfloor.org>

On Fri, Dec 18, 2015 at 01:31:18AM -0800, Stephane Eranian wrote:
> >> Why force --per-core when HT is on. I know you need to aggregate
> >> per core, but
> >> you could still display globally. And then if user requests
> >> --per-core, then display per core.
> >
> > Global TopDown doesn't make much sense. Suppose you have two programs
> > running on different cores, one frontend bound and one backend bound.
> > What would the union of the two mean? And you may well end up
> > with sums of ratios which are >100%.
> >
> How could that be if you consider that the machine is N-wide and not just 4-wide
> anymore?
>
> How is what you are describing here different when HT is off?

I was talking about cores, not CPU threads.

With global aggregation we would aggregate data from different cores,
which is highly dubious for TopDown.

CPU threads on a core are of course aggregated; that is why the patchkit
forces --per-core with HT on.
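To make the aggregation concern concrete, here is a minimal sketch with made-up numbers. The per-core slot counts, the core labels, and the `breakdown` helper are all hypothetical and only for illustration; this is not how the patchkit computes anything. It compares a per-core breakdown with a pooled (global) one and with a naive sum of per-core ratios.

```python
# Hypothetical slot counts per core (illustrative only, not measured data).
# Each core's four categories add up to that core's total issue slots.
cores = {
    "S0-C0": {"frontend": 700, "backend": 100, "retiring": 150, "bad_spec": 50},
    "S0-C1": {"frontend": 100, "backend": 700, "retiring": 150, "bad_spec": 50},
}


def breakdown(counts):
    """Return each category as a percentage of the total slots in counts."""
    total = sum(counts.values())
    return {name: 100.0 * value / total for name, value in counts.items()}


# Per-core view: each core has one clearly dominant category.
per_core = {core: breakdown(counts) for core, counts in cores.items()}

# Global view: pool the raw slot counts of all cores before dividing.
pooled_counts = {}
for counts in cores.values():
    for name, value in counts.items():
        pooled_counts[name] = pooled_counts.get(name, 0) + value
pooled = breakdown(pooled_counts)

# Naive view: add up the per-core percentages directly.
summed = {
    name: sum(per_core[core][name] for core in per_core)
    for name in pooled_counts
}

for core, result in per_core.items():
    print(core, result)
print("pooled", pooled)
print("summed", summed)
```

With these numbers one core is 70% frontend bound and the other 70% backend bound. The pooled view reports 40% for each, so neither bottleneck stands out, and the naive sum of ratios adds up to 200%.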
> If you force --per-core with HT-on, then you need to force it too when
> HT is off so that you get a similar per core breakdown. In the HT on
> case, each Sx-Cy represents 2 threads, compared to 1 in the non HT
> case. Right now, you have non-HT reporting global, HT reporting per-core.
> That does not make much sense to me.

Ok. I guess I can force --per-core in this case too. This would simplify
things because we can get rid of the agg-per-core attribute.

> >> but it would be clearer and simpler to interpret to users.
> >
> > Same problem as above.
> >
> >>
> >> One bug I found when testing is that if you do with HT-on:
> >>
> >> $ perf stat -a --topdown -I 1000 --metric-only sleep 100
> >> Then you get data for frontend and backend but nothing for retiring or
> >> bad speculation.
> >
> > You see all the columns, but no data in some?
> >
> yes, and I don't like that. It is confusing especially when you do not
> know the threshold.
> Why are you suppressing the 'retiring' data when it is at 25% (1/4 of
> the maximum possible)
> when I am running a simple noploop? 25% is a sign of underutilization,
> that could be useful too.

It's what the TopDown specification uses and the paper describes.

The thresholds are needed when you have more than one level, because
the lower levels become meaningless if their parents didn't cross the
threshold. Otherwise you may report something that looks like a
bottleneck, but isn't.

Granted, there is currently only level 1 in the patchkit, but if we ever
add more levels we absolutely need thresholds. So it's better to have
them from Day 1.

Utilization should be reported separately. TopDown cannot give
utilization because it doesn't know about idle time.

I can report "-" for empty fields if it helps you. It's not clear to me
why empty fields in CSV are a problem.

I don't think colors are useful here; they would have the problem
described above.

> >
> > That's intended: the percentage is only printed when it crosses a
> > threshold.
> > That's part of the top down specification.
> >
> I don't like that. I would rather see all the percentages.
> My remark applies to non topdown metrics as well, such as IPC.
> Clearly the IPC is awkward to use. You need to know you need to
> measure cycles, instructions to get ipc with --metric-only. Again,

Well, it's the default (perf stat --metric-only), or with -d*, and it
works fine with --transaction too.

If you think there should be more predefined sets of metrics that's fine
for me too, but it would be a separate patch.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.