From: Kan Liang <[email protected]>
print_aggr fails to print per-core/per-socket statistics after commit
b7f0c203586b ("perf evlist: Propagate cpu maps to evsels in an evlist"),
if events have differnt cpus. Because in print_aggr, aggr_get_id needs
index (not cpu id) to find core/pkg id.
This patch introduced perf_evsel__get_cpumap_index to get the index by
cpu id for a given event. The index can be used to find correct cpu id
for print_aggr.
Here is an example.
Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
cpumask 0,18)
"perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core
sleep 2"
Without this patch, it failes to get CPU 18 result.
Performance counter stats for 'CPU(s) 0,18':
S0-C0 1 7526851 cycles
S0-C0 1 1.05 MiB uncore_imc_0/cas_count_read/
S1-C0 0 <not counted> cycles
S1-C0 0 <not counted> MiB uncore_imc_0/cas_count_read/
With this patch, it can get both CPU0 and CPU18 result.
Performance counter stats for 'CPU(s) 0,18':
S0-C0 1 6327768 cycles
S0-C0 1 0.47 MiB uncore_imc_0/cas_count_read/
S1-C0 1 330228 cycles
S1-C0 1 0.29 MiB uncore_imc_0/cas_count_read/
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/builtin-stat.c | 1 +
tools/perf/util/cpumap.c | 4 ++--
tools/perf/util/evsel.c | 14 ++++++++++++++
tools/perf/util/evsel.h | 2 ++
4 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 37e301a..a3ea735 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -708,6 +708,7 @@ static void print_aggr(char *prefix)
nr = 0;
for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
cpu2 = perf_evsel__cpus(counter)->map[cpu];
+ cpu2 = perf_evsel__get_cpumap_index(cpu2, evsel_list->cpus);
s2 = aggr_get_id(evsel_list->cpus, cpu2);
if (s2 != id)
continue;
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 3667e21..34843e5 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -232,7 +232,7 @@ int cpu_map__get_socket(struct cpu_map *map, int idx)
char path[PATH_MAX];
int cpu, ret;
- if (idx > map->nr)
+ if ((idx > map->nr) || (idx < 0))
return -1;
cpu = map->map[idx];
@@ -296,7 +296,7 @@ int cpu_map__get_core(struct cpu_map *map, int idx)
char path[PATH_MAX];
int cpu, ret, s;
- if (idx > map->nr)
+ if ((idx > map->nr) || (idx < 0))
return -1;
cpu = map->map[idx];
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2936b30..32094d3 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -974,6 +974,20 @@ int __perf_evsel__read_on_cpu(struct perf_evsel *evsel,
return 0;
}
+int perf_evsel__get_cpumap_index(int cpu, struct cpu_map *evsel_cpus)
+{
+ int i;
+
+ if (evsel_cpus == NULL)
+ return -1;
+
+ for (i = 0; i < evsel_cpus->nr; i++) {
+ if (cpu == evsel_cpus->map[i])
+ return i;
+ }
+ return -1;
+}
+
static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
{
struct perf_evsel *leader = evsel->leader;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 4a7ed56..05e68a0 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -355,4 +355,6 @@ typedef int (*attr__fprintf_f)(FILE *, const char *, const char *, void *);
int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
attr__fprintf_f attr__fprintf, void *priv);
+int perf_evsel__get_cpumap_index(int cpu, struct cpu_map *evsel_cpus);
+
#endif /* __PERF_EVSEL_H */
--
1.8.3.1
From: Kan Liang <[email protected]>
Some PMU events have cpumask, e.g uncore events. The cpu list set by
user may be incompatible with event's cpumask.
This patch will check the user defined cpu list. If the incompatible cpu
is found, it will warn the user and discard the incompatible cpu. Only
available cpu can be stored in evsel->cpus->map. If there is no cpu from
cpu list compatible with event's cpumask. It will error out.
Here is an example.
According to cpumask, uncore should only available on CPU0 and CPU18.
So the S0-C1 for uncore should not count.
Without this patch
$ sudo ./perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18
--per-core sleep 2
Performance counter stats for 'CPU(s) 0,1,18':
S0-C0 1 6749638 cycles
S0-C0 1 0.83 MiB uncore_imc_0/cas_count_read/
(100.00%)
S0-C1 1 232421 cycles
S0-C1 1 0.83 MiB uncore_imc_0/cas_count_read/
S1-C0 1 236997 cycles
S1-C0 1 0.35 MiB uncore_imc_0/cas_count_read/
2.001094019 seconds time elapsed
With this patch
$ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18 --per-core
sleep 2
event uncore_imc_0/cas_count_read/ can only be monitored on CPU 0 18.
Other CPUs will be discard.
Performance counter stats for 'CPU(s) 0,1,18':
S0-C0 1 5557406 cycles
S0-C0 1 0.21 MiB uncore_imc_0/cas_count_read/
S0-C1 1 1012534 cycles
S0-C1 0 <not counted> MiB uncore_imc_0/cas_count_read/
S1-C0 1 916130 cycles
S1-C0 1 0.08 MiB uncore_imc_0/cas_count_read/
2.001110843 seconds time elapsed
Signed-off-by: Kan Liang <[email protected]>
---
tools/perf/util/evlist.c | 77 ++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 72 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 6cfdee6..f179379 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1101,6 +1101,71 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
}
+static int cmp_ids(const void *a, const void *b)
+{
+ return *(int *)a - *(int *)b;
+}
+
+static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist, struct perf_evsel *evsel)
+{
+ const struct cpu_map *cpus = evlist->cpus;
+ const int ncpus = cpu_map__nr(evlist->cpus);
+ int j = 0, cpu_nr = 0, tmp = 0;
+ int i;
+
+ /* ensure we process id in increasing order */
+ qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
+
+ /* find the common cpus between evsel and evlist. */
+ for (i = 0; i < cpu_map__nr(evsel->cpus);) {
+
+ if (j >= ncpus) {
+ evsel->cpus->map[i++] = -1;
+ continue;
+ }
+ for (; j < ncpus; j++) {
+ if (cpus->map[j] < evsel->cpus->map[i])
+ continue;
+ if (cpus->map[j] == evsel->cpus->map[i]) {
+ cpu_nr++;
+ j++;
+ i++;
+ } else
+ evsel->cpus->map[i++] = -1;
+ break;
+ }
+ }
+
+ if (cpu_nr == 0) {
+ pr_warning("event %s cannot be monitored on the given cpus."
+ "Please check cpumask\n", evsel->name);
+ return -1;
+ }
+
+ if (ncpus > cpu_nr)
+ pr_warning("event %s can only be monitored on CPU", evsel->name);
+
+ /* order evsel cpus */
+ for (i = 0; i < cpu_nr; i++) {
+ if (evsel->cpus->map[i] == -1) {
+ while (evsel->cpus->map[tmp] == -1) {
+ tmp++;
+ BUG_ON(tmp >= cpu_map__nr(evsel->cpus));
+ }
+ evsel->cpus->map[i] = evsel->cpus->map[tmp];
+ evsel->cpus->map[tmp] = -1;
+ }
+ if (ncpus > cpu_nr)
+ pr_warning(" %d", evsel->cpus->map[i]);
+ tmp++;
+ }
+ evsel->cpus->nr = cpu_nr;
+ if (ncpus > cpu_nr)
+ pr_warning(". Other CPUs will be discard.\n");
+
+ return 0;
+}
+
static int perf_evlist__propagate_maps(struct perf_evlist *evlist,
struct target *target)
{
@@ -1108,13 +1173,15 @@ static int perf_evlist__propagate_maps(struct perf_evlist *evlist,
evlist__for_each(evlist, evsel) {
/*
- * We already have cpus for evsel (via PMU sysfs) so
- * keep it, if there's no target cpu list defined.
+ * We already have cpus for evsel (via PMU sysfs)
+ * and target cpu list defined, check if they are
+ * compatible. If not, discard incompatible cpus.
*/
- if (evsel->cpus && target->cpu_list)
- cpu_map__put(evsel->cpus);
+ if (evsel->cpus && target->cpu_list &&
+ perf_evlist__check_evsel_cpus(evlist, evsel))
+ return -EINVAL;
- if (!evsel->cpus || target->cpu_list)
+ if (!evsel->cpus)
evsel->cpus = cpu_map__get(evlist->cpus);
evsel->threads = thread_map__get(evlist->threads);
--
1.8.3.1
On Mon, Jun 29, 2015 at 03:55:34PM -0400, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> print_aggr fails to print per-core/per-socket statistics after commit
> b7f0c203586b ("perf evlist: Propagate cpu maps to evsels in an evlist"),
> if events have differnt cpus. Because in print_aggr, aggr_get_id needs
> index (not cpu id) to find core/pkg id.
> This patch introduced perf_evsel__get_cpumap_index to get the index by
> cpu id for a given event. The index can be used to find correct cpu id
> for print_aggr.
>
SNIP
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 37e301a..a3ea735 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -708,6 +708,7 @@ static void print_aggr(char *prefix)
> nr = 0;
> for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
> cpu2 = perf_evsel__cpus(counter)->map[cpu];
> + cpu2 = perf_evsel__get_cpumap_index(cpu2, evsel_list->cpus);
> s2 = aggr_get_id(evsel_list->cpus, cpu2);
> if (s2 != id)
> continue;
hum, looks like passing the actual cpu number was introduced in:
582ec0829b3d perf stat: Fix per-socket output bug for uncore events
also, we already have the index into counter's cpus, why not use those
for getting aggregated id (core,socket).. what do I miss?
thanks,
jirka
---
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 37e301a32f43..47c3c1ffea45 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -694,7 +694,7 @@ static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
static void print_aggr(char *prefix)
{
struct perf_evsel *counter;
- int cpu, cpu2, s, s2, id, nr;
+ int cpu, s, s2, id, nr;
double uval;
u64 ena, run, val;
@@ -707,8 +707,7 @@ static void print_aggr(char *prefix)
val = ena = run = 0;
nr = 0;
for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
- cpu2 = perf_evsel__cpus(counter)->map[cpu];
- s2 = aggr_get_id(evsel_list->cpus, cpu2);
+ s2 = aggr_get_id(perf_evsel__cpus(counter), cpu);
if (s2 != id)
continue;
val += perf_counts(counter->counts, cpu, 0)->val;
On Mon, Jun 29, 2015 at 03:55:35PM -0400, [email protected] wrote:
SNIP
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 6cfdee6..f179379 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1101,6 +1101,71 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
> return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
> }
>
> +static int cmp_ids(const void *a, const void *b)
> +{
> + return *(int *)a - *(int *)b;
> +}
> +
> +static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist, struct perf_evsel *evsel)
> +{
> + const struct cpu_map *cpus = evlist->cpus;
> + const int ncpus = cpu_map__nr(evlist->cpus);
> + int j = 0, cpu_nr = 0, tmp = 0;
> + int i;
> +
> + /* ensure we process id in increasing order */
> + qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
wouldn't sorting maps affect some other code?
> +
> + /* find the common cpus between evsel and evlist. */
> + for (i = 0; i < cpu_map__nr(evsel->cpus);) {
> +
> + if (j >= ncpus) {
> + evsel->cpus->map[i++] = -1;
> + continue;
> + }
> + for (; j < ncpus; j++) {
> + if (cpus->map[j] < evsel->cpus->map[i])
> + continue;
> + if (cpus->map[j] == evsel->cpus->map[i]) {
> + cpu_nr++;
> + j++;
hum, do you skip 1 item in cpus by j++ here and in for loop header?
also some easy it'd be easier to read if you identify different cpu
maps somehow.. like evlist_cpus and evsel_cpus or such
I have no way of testing this.. could you please add automated test for this one?
thanks,
jirka
> + i++;
> + } else
> + evsel->cpus->map[i++] = -1;
> + break;
> + }
> + }
> +
SNIP
> On Mon, Jun 29, 2015 at 03:55:34PM -0400, [email protected] wrote:
> > From: Kan Liang <[email protected]>
> >
> > print_aggr fails to print per-core/per-socket statistics after commit
> > b7f0c203586b ("perf evlist: Propagate cpu maps to evsels in an
> > evlist"), if events have differnt cpus. Because in print_aggr,
> > aggr_get_id needs index (not cpu id) to find core/pkg id.
> > This patch introduced perf_evsel__get_cpumap_index to get the index
> by
> > cpu id for a given event. The index can be used to find correct cpu id
> > for print_aggr.
> >
>
> SNIP
>
> >
> > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > index 37e301a..a3ea735 100644
> > --- a/tools/perf/builtin-stat.c
> > +++ b/tools/perf/builtin-stat.c
> > @@ -708,6 +708,7 @@ static void print_aggr(char *prefix)
> > nr = 0;
> > for (cpu = 0; cpu < perf_evsel__nr_cpus(counter);
> cpu++) {
> > cpu2 = perf_evsel__cpus(counter)-
> >map[cpu];
> > + cpu2 =
> perf_evsel__get_cpumap_index(cpu2, evsel_list->cpus);
> > s2 = aggr_get_id(evsel_list->cpus, cpu2);
> > if (s2 != id)
> > continue;
>
> hum, looks like passing the actual cpu number was introduced in:
> 582ec0829b3d perf stat: Fix per-socket output bug for uncore events
>
Right, I will change it.
> also, we already have the index into counter's cpus, why not use those for
> getting aggregated id (core,socket).. what do I miss?
Yes, the index is already there. But we still pass evlist's cpu map to aggr_get_id.
So once evsel and evlist have different cpu maps (e.g. -a with uncore event),
the wrong aggregated id will be return.
I think the patch as below will fix all these issues.
Thanks,
Kan
---
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 37e301a..47c3c1f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -694,7 +694,7 @@ static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
static void print_aggr(char *prefix)
{
struct perf_evsel *counter;
- int cpu, cpu2, s, s2, id, nr;
+ int cpu, s, s2, id, nr;
double uval;
u64 ena, run, val;
@@ -707,8 +707,7 @@ static void print_aggr(char *prefix)
val = ena = run = 0;
nr = 0;
for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
- cpu2 = perf_evsel__cpus(counter)->map[cpu];
- s2 = aggr_get_id(evsel_list->cpus, cpu2);
+ s2 = aggr_get_id(perf_evsel__cpus(counter), cpu);
if (s2 != id)
continue;
val += perf_counts(counter->counts, cpu, 0)->val;
> > diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index
> > 6cfdee6..f179379 100644
> > --- a/tools/perf/util/evlist.c
> > +++ b/tools/perf/util/evlist.c
> > @@ -1101,6 +1101,71 @@ int perf_evlist__mmap(struct perf_evlist
> *evlist, unsigned int pages,
> > return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false); }
> >
> > +static int cmp_ids(const void *a, const void *b) {
> > + return *(int *)a - *(int *)b;
> > +}
> > +
> > +static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist,
> > +struct perf_evsel *evsel) {
> > + const struct cpu_map *cpus = evlist->cpus;
> > + const int ncpus = cpu_map__nr(evlist->cpus);
> > + int j = 0, cpu_nr = 0, tmp = 0;
> > + int i;
> > +
> > + /* ensure we process id in increasing order */
> > + qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
>
> wouldn't sorting maps affect some other code?
>
I didn't find any bad effect after sorting the maps.
Any codes I need to check?
> > +
> > + /* find the common cpus between evsel and evlist. */
> > + for (i = 0; i < cpu_map__nr(evsel->cpus);) {
> > +
> > + if (j >= ncpus) {
> > + evsel->cpus->map[i++] = -1;
> > + continue;
> > + }
> > + for (; j < ncpus; j++) {
> > + if (cpus->map[j] < evsel->cpus->map[i])
> > + continue;
> > + if (cpus->map[j] == evsel->cpus->map[i]) {
> > + cpu_nr++;
> > + j++;
>
> hum, do you skip 1 item in cpus by j++ here and in for loop header?
>
If matching, it will break the j loop then. There is no chance to
j++ in the loop header. So I do j++ here, and it will not skip any items.
> also some easy it'd be easier to read if you identify different cpu maps
> somehow.. like evlist_cpus and evsel_cpus or such
>
OK. I will change the name.
> I have no way of testing this.. could you please add automated test for this
> one?
OK
Thanks,
Kan
>
> thanks,
> jirka
>
> > + i++;
> > + } else
> > + evsel->cpus->map[i++] = -1;
> > + break;
> > + }
> > + }
> > +
>
> SNIP
On Tue, Jun 30, 2015 at 01:42:49PM +0000, Liang, Kan wrote:
>
>
> > > diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index
> > > 6cfdee6..f179379 100644
> > > --- a/tools/perf/util/evlist.c
> > > +++ b/tools/perf/util/evlist.c
> > > @@ -1101,6 +1101,71 @@ int perf_evlist__mmap(struct perf_evlist
> > *evlist, unsigned int pages,
> > > return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false); }
> > >
> > > +static int cmp_ids(const void *a, const void *b) {
> > > + return *(int *)a - *(int *)b;
> > > +}
> > > +
> > > +static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist,
> > > +struct perf_evsel *evsel) {
> > > + const struct cpu_map *cpus = evlist->cpus;
> > > + const int ncpus = cpu_map__nr(evlist->cpus);
> > > + int j = 0, cpu_nr = 0, tmp = 0;
> > > + int i;
> > > +
> > > + /* ensure we process id in increasing order */
> > > + qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
> >
> > wouldn't sorting maps affect some other code?
> >
>
> I didn't find any bad effect after sorting the maps.
> Any codes I need to check?
I dont think so, but I'm not sure either.. thats why I asked ;-)
I guess any code dealing with cpu maps.. it might affect
perf stat output.. but it looks sorted now anyway ;-)
I dont think it's an issue
jirka
Em Tue, Jun 30, 2015 at 03:54:49PM +0200, Jiri Olsa escreveu:
> On Tue, Jun 30, 2015 at 01:42:49PM +0000, Liang, Kan wrote:
> > > > +static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist,
> > > > +struct perf_evsel *evsel) {
> > > > + const struct cpu_map *cpus = evlist->cpus;
> > > > + const int ncpus = cpu_map__nr(evlist->cpus);
> > > > + int j = 0, cpu_nr = 0, tmp = 0;
> > > > + int i;
> > > > +
> > > > + /* ensure we process id in increasing order */
> > > > + qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
> > >
> > > wouldn't sorting maps affect some other code?
> > >
> >
> > I didn't find any bad effect after sorting the maps.
> > Any codes I need to check?
>
> I dont think so, but I'm not sure either.. thats why I asked ;-)
>
> I guess any code dealing with cpu maps.. it might affect
> perf stat output.. but it looks sorted now anyway ;-)
> I dont think it's an issue
Is this being done at cpu_map__new time, i.e. when we first parse it?
That way any assumption about repositioning gets out of the way.
- Arnaldo
On Mon, Jun 29, 2015 at 12:55 PM, <[email protected]> wrote:
>
> From: Kan Liang <[email protected]>
>
> Some PMU events have cpumask, e.g uncore events. The cpu list set by
> user may be incompatible with event's cpumask.
> This patch will check the user defined cpu list. If the incompatible cpu
> is found, it will warn the user and discard the incompatible cpu. Only
> available cpu can be stored in evsel->cpus->map. If there is no cpu from
> cpu list compatible with event's cpumask. It will error out.
>
> Here is an example.
> According to cpumask, uncore should only available on CPU0 and CPU18.
> So the S0-C1 for uncore should not count.
>
I don't think this is correct. The cpumask is a default set of CPUs to
be used by
perf. The cpumask does not indicate the ONLY CPUs on which to monitor. It
is just a default. You can monitor an uncore event using a CPU not listed in
the cpumask, unless the kernel code has changed recently. If you are not
on the default CPUs, the kernel will IPI.
>
> Without this patch
> $ sudo ./perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18
> --per-core sleep 2
>
> Performance counter stats for 'CPU(s) 0,1,18':
>
> S0-C0 1 6749638 cycles
> S0-C0 1 0.83 MiB uncore_imc_0/cas_count_read/
> (100.00%)
> S0-C1 1 232421 cycles
> S0-C1 1 0.83 MiB uncore_imc_0/cas_count_read/
> S1-C0 1 236997 cycles
> S1-C0 1 0.35 MiB uncore_imc_0/cas_count_read/
>
> 2.001094019 seconds time elapsed
>
> With this patch
> $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18 --per-core
> sleep 2
> event uncore_imc_0/cas_count_read/ can only be monitored on CPU 0 18.
> Other CPUs will be discard.
>
> Performance counter stats for 'CPU(s) 0,1,18':
>
> S0-C0 1 5557406 cycles
> S0-C0 1 0.21 MiB uncore_imc_0/cas_count_read/
> S0-C1 1 1012534 cycles
> S0-C1 0 <not counted> MiB uncore_imc_0/cas_count_read/
> S1-C0 1 916130 cycles
> S1-C0 1 0.08 MiB uncore_imc_0/cas_count_read/
>
> 2.001110843 seconds time elapsed
>
> Signed-off-by: Kan Liang <[email protected]>
> ---
> tools/perf/util/evlist.c | 77 ++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 72 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 6cfdee6..f179379 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -1101,6 +1101,71 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
> return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
> }
>
> +static int cmp_ids(const void *a, const void *b)
> +{
> + return *(int *)a - *(int *)b;
> +}
> +
> +static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist, struct perf_evsel *evsel)
> +{
> + const struct cpu_map *cpus = evlist->cpus;
> + const int ncpus = cpu_map__nr(evlist->cpus);
> + int j = 0, cpu_nr = 0, tmp = 0;
> + int i;
> +
> + /* ensure we process id in increasing order */
> + qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
> +
> + /* find the common cpus between evsel and evlist. */
> + for (i = 0; i < cpu_map__nr(evsel->cpus);) {
> +
> + if (j >= ncpus) {
> + evsel->cpus->map[i++] = -1;
> + continue;
> + }
> + for (; j < ncpus; j++) {
> + if (cpus->map[j] < evsel->cpus->map[i])
> + continue;
> + if (cpus->map[j] == evsel->cpus->map[i]) {
> + cpu_nr++;
> + j++;
> + i++;
> + } else
> + evsel->cpus->map[i++] = -1;
> + break;
> + }
> + }
> +
> + if (cpu_nr == 0) {
> + pr_warning("event %s cannot be monitored on the given cpus."
> + "Please check cpumask\n", evsel->name);
> + return -1;
> + }
> +
> + if (ncpus > cpu_nr)
> + pr_warning("event %s can only be monitored on CPU", evsel->name);
> +
> + /* order evsel cpus */
> + for (i = 0; i < cpu_nr; i++) {
> + if (evsel->cpus->map[i] == -1) {
> + while (evsel->cpus->map[tmp] == -1) {
> + tmp++;
> + BUG_ON(tmp >= cpu_map__nr(evsel->cpus));
> + }
> + evsel->cpus->map[i] = evsel->cpus->map[tmp];
> + evsel->cpus->map[tmp] = -1;
> + }
> + if (ncpus > cpu_nr)
> + pr_warning(" %d", evsel->cpus->map[i]);
> + tmp++;
> + }
> + evsel->cpus->nr = cpu_nr;
> + if (ncpus > cpu_nr)
> + pr_warning(". Other CPUs will be discard.\n");
> +
> + return 0;
> +}
> +
> static int perf_evlist__propagate_maps(struct perf_evlist *evlist,
> struct target *target)
> {
> @@ -1108,13 +1173,15 @@ static int perf_evlist__propagate_maps(struct perf_evlist *evlist,
>
> evlist__for_each(evlist, evsel) {
> /*
> - * We already have cpus for evsel (via PMU sysfs) so
> - * keep it, if there's no target cpu list defined.
> + * We already have cpus for evsel (via PMU sysfs)
> + * and target cpu list defined, check if they are
> + * compatible. If not, discard incompatible cpus.
> */
> - if (evsel->cpus && target->cpu_list)
> - cpu_map__put(evsel->cpus);
> + if (evsel->cpus && target->cpu_list &&
> + perf_evlist__check_evsel_cpus(evlist, evsel))
> + return -EINVAL;
>
> - if (!evsel->cpus || target->cpu_list)
> + if (!evsel->cpus)
> evsel->cpus = cpu_map__get(evlist->cpus);
>
> evsel->threads = thread_map__get(evlist->threads);
> --
> 1.8.3.1
>
>
> On Mon, Jun 29, 2015 at 12:55 PM, <[email protected]> wrote:
> >
> > From: Kan Liang <[email protected]>
> >
> > Some PMU events have cpumask, e.g uncore events. The cpu list set by
> > user may be incompatible with event's cpumask.
> > This patch will check the user defined cpu list. If the incompatible
> > cpu is found, it will warn the user and discard the incompatible cpu.
> > Only available cpu can be stored in evsel->cpus->map. If there is no
> > cpu from cpu list compatible with event's cpumask. It will error out.
> >
> > Here is an example.
> > According to cpumask, uncore should only available on CPU0 and CPU18.
> > So the S0-C1 for uncore should not count.
> >
> I don't think this is correct. The cpumask is a default set of CPUs to be used
> by perf. The cpumask does not indicate the ONLY CPUs on which to
> monitor. It is just a default. You can monitor an uncore event using a CPU
> not listed in the cpumask, unless the kernel code has changed recently. If
> you are not on the default CPUs, the kernel will IPI.
>
Here I mean that the S0-C1 for uncore should show "<not counted>",
as we showed the same thing on "perf stat -a --per-core".
Yes, in theory, user can use a CPU not listed in the cpumask. Because
uncore is per-socket event.
However, it brings error and confusion.
- As the below example, if we run -C0,1, we get two results for socket 0.
I think it's incorrect. Per-socket event should only have one result
per socket.
- Since the cpumask has already been exported to user space, I think users
should follow it. Otherwise, why we export cpumask to user space?
Implicitly changing the monitored CPU in kernel is not a good way I think.
Thanks,
Kan
> >
> > Without this patch
> > $ sudo ./perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18
> > --per-core sleep 2
> >
> > Performance counter stats for 'CPU(s) 0,1,18':
> >
> > S0-C0 1 6749638 cycles
> > S0-C0 1 0.83 MiB uncore_imc_0/cas_count_read/
> > (100.00%)
> > S0-C1 1 232421 cycles
> > S0-C1 1 0.83 MiB uncore_imc_0/cas_count_read/
> > S1-C0 1 236997 cycles
> > S1-C0 1 0.35 MiB uncore_imc_0/cas_count_read/
> >
> > 2.001094019 seconds time elapsed
> >
> > With this patch
> > $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,1,18
> > --per-core sleep 2 event uncore_imc_0/cas_count_read/ can only be
> > monitored on CPU 0 18.
> > Other CPUs will be discard.
> >
> > Performance counter stats for 'CPU(s) 0,1,18':
> >
> > S0-C0 1 5557406 cycles
> > S0-C0 1 0.21 MiB uncore_imc_0/cas_count_read/
> > S0-C1 1 1012534 cycles
> > S0-C1 0 <not counted> MiB uncore_imc_0/cas_count_read/
> > S1-C0 1 916130 cycles
> > S1-C0 1 0.08 MiB uncore_imc_0/cas_count_read/
> >
> > 2.001110843 seconds time elapsed
> >
> > Signed-off-by: Kan Liang <[email protected]>
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m????????????I?
On Tue, Jun 30, 2015 at 11:42:25AM -0300, [email protected] wrote:
> Em Tue, Jun 30, 2015 at 03:54:49PM +0200, Jiri Olsa escreveu:
> > On Tue, Jun 30, 2015 at 01:42:49PM +0000, Liang, Kan wrote:
> > > > > +static int perf_evlist__check_evsel_cpus(struct perf_evlist *evlist,
> > > > > +struct perf_evsel *evsel) {
> > > > > + const struct cpu_map *cpus = evlist->cpus;
> > > > > + const int ncpus = cpu_map__nr(evlist->cpus);
> > > > > + int j = 0, cpu_nr = 0, tmp = 0;
> > > > > + int i;
> > > > > +
> > > > > + /* ensure we process id in increasing order */
> > > > > + qsort(evlist->cpus->map, evlist->cpus->nr, sizeof(int), cmp_ids);
> > > >
> > > > wouldn't sorting maps affect some other code?
> > > >
> > >
> > > I didn't find any bad effect after sorting the maps.
> > > Any codes I need to check?
> >
> > I dont think so, but I'm not sure either.. thats why I asked ;-)
> >
> > I guess any code dealing with cpu maps.. it might affect
> > perf stat output.. but it looks sorted now anyway ;-)
> > I dont think it's an issue
>
> Is this being done at cpu_map__new time, i.e. when we first parse it?
it's done within perf_evlist__create_maps.. checking the v2
jirka
> That way any assumption about repositioning gets out of the way.
>
> - Arnaldo
On Tue, Jun 30, 2015 at 04:45:18PM +0000, Liang, Kan wrote:
>
> >
> > On Mon, Jun 29, 2015 at 12:55 PM, <[email protected]> wrote:
> > >
> > > From: Kan Liang <[email protected]>
> > >
> > > Some PMU events have cpumask, e.g uncore events. The cpu list set by
> > > user may be incompatible with event's cpumask.
> > > This patch will check the user defined cpu list. If the incompatible
> > > cpu is found, it will warn the user and discard the incompatible cpu.
> > > Only available cpu can be stored in evsel->cpus->map. If there is no
> > > cpu from cpu list compatible with event's cpumask. It will error out.
> > >
> > > Here is an example.
> > > According to cpumask, uncore should only available on CPU0 and CPU18.
> > > So the S0-C1 for uncore should not count.
> > >
> > I don't think this is correct. The cpumask is a default set of CPUs to be used
> > by perf. The cpumask does not indicate the ONLY CPUs on which to
> > monitor. It is just a default. You can monitor an uncore event using a CPU
> > not listed in the cpumask, unless the kernel code has changed recently. If
> > you are not on the default CPUs, the kernel will IPI.
> >
>
> Here I mean that the S0-C1 for uncore should show "<not counted>",
> as we showed the same thing on "perf stat -a --per-core".
>
> Yes, in theory, user can use a CPU not listed in the cpumask. Because
> uncore is per-socket event.
> However, it brings error and confusion.
> - As the below example, if we run -C0,1, we get two results for socket 0.
> I think it's incorrect. Per-socket event should only have one result
> per socket.
> - Since the cpumask has already been exported to user space, I think users
> should follow it. Otherwise, why we export cpumask to user space?
> Implicitly changing the monitored CPU in kernel is not a good way I think.
I dont follow uncore stuff much :-\ Stephane, does this make sense to you ^^^ ?
please check v2
thanks,
jirka