Date: Fri, 18 Dec 2015 22:38:55 +0100
From: Andi Kleen <andi@firstfloor.org>
To: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <andi@firstfloor.org>,
        Arnaldo Carvalho de Melo <acme@kernel.org>,
        Peter Zijlstra <peterz@infradead.org>, Jiri Olsa <jolsa@kernel.org>,
        Ingo Molnar <mingo@kernel.org>, LKML <linux-kernel@vger.kernel.org>,
        Namhyung Kim <namhyung@kernel.org>
Subject: Re: Add top down metrics to perf stat v2
Message-ID: <20151218213854.GD15533@two.firstfloor.org>
References: <1450227266-2501-1-git-send-email-andi@firstfloor.org>
 <CABPqkBS5X+cBK0SrmRP6Wn+_H-f0MJh2JZj_X3FRuV26z3adXQ@mail.gmail.com>
 <20151217140123.GA15533@two.firstfloor.org>
 <CABPqkBTP4JpL+tsXtAaXnETVu4iGTti=zw2ZVt90BSfsF+en1Q@mail.gmail.com>
 <20151218015538.GC15533@two.firstfloor.org>
 <CABPqkBRftsHEAEwgCn3i3=mfk9fjh5r4MycdjHKRka5voTj9JA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CABPqkBRftsHEAEwgCn3i3=mfk9fjh5r4MycdjHKRka5voTj9JA@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3778
Lines: 98

On Fri, Dec 18, 2015 at 01:31:18AM -0800, Stephane Eranian wrote:
> >> Why force --per-core when HT is on? I know you need to aggregate
> >> per core, but
> >> you could still display globally. And then if user requests
> >> --per-core, then display per core.
> >
> > Global TopDown doesn't make much sense. Suppose you have two programs
> > running on different cores, one frontend bound and one backend bound.
> > What would the union of the two mean? And you may well end up
> > with sums of ratios which are >100%.
> >
> How could that be if you consider that the machine is N-wide and not just 4-wide
> anymore?
> 
> How is what you are describing here different when HT is off?

I was talking about cores, not CPU threads.

With global aggregation we would aggregate data from different cores,
which is highly dubious for TopDown.

CPU threads on a core are of course aggregated, that is why the patchkit
forces --per-core with HT on.
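
To make the aggregation argument concrete, here is a minimal sketch in
Python (not the patchkit's code). It sums the counters of the threads
that share a core and then computes the level 1 fractions per core. The
counter names, the core labels and all numbers are made up for
illustration, and it assumes that a core's counters are simply the sum
of its threads' counters. The formulas follow the commonly published
level 1 definitions, with backend bound taken as the remainder.

```python
from collections import defaultdict

def level1(c):
    """Level 1 TopDown fractions from one core's summed counters."""
    slots = c["total_slots"]
    frontend = c["fetch_bubbles"] / slots
    retiring = c["slots_retired"] / slots
    bad_spec = (c["slots_issued"] - c["slots_retired"]
                + c["recovery_bubbles"]) / slots
    return {
        "frontend_bound": frontend,
        "bad_speculation": bad_spec,
        "retiring": retiring,
        # Backend bound is whatever the other three categories leave over.
        "backend_bound": 1.0 - frontend - bad_spec - retiring,
    }

def per_core(samples):
    """samples: (core_id, counters) pairs, one pair per hardware thread."""
    cores = defaultdict(lambda: defaultdict(int))
    for core_id, counters in samples:
        for name, value in counters.items():
            cores[core_id][name] += value  # aggregate sibling threads
    return {core: level1(c) for core, c in cores.items()}

# Hypothetical counters: one frontend bound and one backend bound program.
fe_thread = {"total_slots": 400, "fetch_bubbles": 240, "slots_issued": 120,
             "slots_retired": 100, "recovery_bubbles": 0}
be_thread = {"total_slots": 400, "fetch_bubbles": 20, "slots_issued": 110,
             "slots_retired": 100, "recovery_bubbles": 10}
samples = [("S0-C0", fe_thread), ("S0-C0", fe_thread),
           ("S0-C1", be_thread), ("S0-C1", be_thread)]

for core, fractions in sorted(per_core(samples).items()):
    print(core, fractions)

# Pooling every thread into one bucket mimics global aggregation.
pooled = per_core([("all", c) for _, c in samples])["all"]
print("all", pooled)
```

With these made-up numbers the per-core rows point at two different
bottlenecks, while the pooled row lands between them and describes
neither program.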

> If you force --per-core with HT-on, then you need to force it too when
> HT is off so that you get a similar per core breakdown. In the HT on
> case, each Sx-Cy represents 2 threads, compared to 1 in the non HT
> case. Right now, you have non-HT reporting global, HT reporting per-core.
> That does not make much sense to me.

Ok. I guess I can force --per-core in this case too. This would simplify
things because we can get rid of the agg-per-core attribute.

> >> but it would be clearer and simpler to interpret to users.
> >
> > Same problem as above.
> >
> >>
> >> One bug I found when testing is that if you do with HT-on:
> >>
> >> $ perf stat -a --topdown -I 1000 --metric-only sleep 100
> >> Then you get data for frontend and backend but nothing for retiring or
> >> bad speculation.
> >
> > You see all the columns, but no data in some?
> >
> yes, and I don't like that. It is confusing especially when you do not
> know the threshold.
> Why are you suppressing the 'retiring' data when it is at 25% (1/4 of
> the maximum possible)
> when I am running a simple noploop? 25% is a sign of underutilization,
> that could be useful too.

It's what the TopDown specification uses and the paper describes. 

The thresholds are needed when you have more than one level because
the lower levels become meaningless if their parents didn't cross the
threshold. Otherwise you may report something that looks like a
bottleneck, but isn't.

Granted, there is currently only level 1 in the patchkit, but if we ever
add more levels we absolutely need thresholds. So it's better to have
them from day 1.
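
As a rough illustration of why the thresholds matter once there is more
than one level, here is a minimal Python sketch (not the patchkit's
code). A node is reported only if it crosses its threshold, and its
children are looked at only in that case. The level 2 names come from
the TopDown hierarchy; the values and thresholds are hypothetical and
not taken from the specification.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    value: float      # measured fraction, 0.0..1.0 (made-up numbers below)
    threshold: float  # hypothetical cut-off, not from the specification
    children: list = field(default_factory=list)

def significant(node, depth=0):
    """Yield (depth, name, value) for nodes that cross their threshold,
    descending only below parents that crossed theirs."""
    if node.value <= node.threshold:
        return  # parent not significant: its children are not reported
    yield depth, node.name, node.value
    for child in node.children:
        yield from significant(child, depth + 1)

tree = [
    Node("Frontend Bound", 0.05, 0.20, [
        # Would cross its own threshold, but the parent did not.
        Node("Fetch Latency", 0.04, 0.03),
    ]),
    Node("Backend Bound", 0.65, 0.20, [
        Node("Memory Bound", 0.50, 0.20),
        Node("Core Bound", 0.15, 0.20),
    ]),
]

reported = [row for root in tree for row in significant(root)]
for depth, name, value in reported:
    print("  " * depth + f"{name}: {value:.0%}")
```

Without the parent check, Fetch Latency would show up in the output
even though the frontend is not a bottleneck in this example.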

Utilization should be reported separately. TopDown cannot give
utilization because it doesn't know about idle time.

I can report "-" for empty fields if it helps you. It's not clear
to me why empty fields in CSV are a problem.

I don't think colors are useful here; they would have the same problem
as described above.

> 
> > That's intended: the percentage is only printed when it crosses a
> > threshold. That's part of the top down specification.
> >
> I don't like that. I would rather see all the percentages.
> My remark applies to non topdown metrics as well, such as IPC.
> Clearly the IPC is awkward to use. You need to know you need to
> measure cycles, instructions to get ipc with --metric-only. Again,

Well it's the default (perf stat --metric-only), or with -d*, and it works fine
with --transaction too.

If you think there should be more predefined sets of metrics,
that's fine with me too, but it would be a separate
patch.


-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.