2024-02-13 07:53:20

by Namhyung Kim

[permalink] [raw]
Subject: [PATCHSET 0/4] perf report: Omit dummy events in the output (v1)

Hello,

This work is to make the output compact by removing dummy events in
the evlist. The dummy events are used to save side-band information
like task creation or memory address space change using mmap(2). But
after collecting these, it's not used because it won't have any
samples.

Sometimes users want to run perf report --group to show all recorded
events together but they are not interested in those dummy events.
This just wastes the precious screen space so we want to get rid of
them after use.

perf report already has --skip-empty option to skip 0 result in the
stat output. I think we can extend it to skip empty events that have
no samples.

Example output:

Before)
#
# Samples: 232 of events 'cpu/mem-loads,ldlat=30/P, cpu/mem-stores/P, dummy:u'
# Event count (approx.): 3089861
#
# Overhead Command Shared Object Symbol
# ........................ ........... ................. .....................................
#
9.29% 0.00% 0.00% swapper [kernel.kallsyms] [k] update_blocked_averages
5.26% 0.15% 0.00% swapper [kernel.kallsyms] [k] __update_load_avg_se
4.15% 0.00% 0.00% perf-exec [kernel.kallsyms] [k] slab_update_freelist.isra.0
3.87% 0.00% 0.00% perf-exec [kernel.kallsyms] [k] memcg_slab_post_alloc_hook
3.79% 0.17% 0.00% swapper [kernel.kallsyms] [k] enqueue_task_fair
3.63% 0.00% 0.00% sleep [kernel.kallsyms] [k] next_uptodate_page
2.86% 0.00% 0.00% swapper [kernel.kallsyms] [k] __update_load_avg_cfs_rq
2.78% 0.00% 0.00% swapper [kernel.kallsyms] [k] __schedule
2.34% 0.00% 0.00% swapper [kernel.kallsyms] [k] intel_idle
2.32% 0.97% 0.00% swapper [kernel.kallsyms] [k] psi_group_change

After)
#
# Samples: 232 of events 'cpu/mem-loads,ldlat=30/P, cpu/mem-stores/P'
# Event count (approx.): 3089861
#
# Overhead Command Shared Object Symbol
# ................ ........... ................. .....................................
#
9.29% 0.00% swapper [kernel.kallsyms] [k] update_blocked_averages
5.26% 0.15% swapper [kernel.kallsyms] [k] __update_load_avg_se
4.15% 0.00% perf-exec [kernel.kallsyms] [k] slab_update_freelist.isra.0
3.87% 0.00% perf-exec [kernel.kallsyms] [k] memcg_slab_post_alloc_hook
3.79% 0.17% swapper [kernel.kallsyms] [k] enqueue_task_fair
3.63% 0.00% sleep [kernel.kallsyms] [k] next_uptodate_page
2.86% 0.00% swapper [kernel.kallsyms] [k] __update_load_avg_cfs_rq
2.78% 0.00% swapper [kernel.kallsyms] [k] __schedule
2.34% 0.00% swapper [kernel.kallsyms] [k] intel_idle
2.32% 0.97% swapper [kernel.kallsyms] [k] psi_group_change

Now 'Overhead' column only has two values for mem-loads and mem-stores.

Thanks,
Namhyung


Namhyung Kim (4):
libperf evlist: Update group info in perf_evlist__remove()
perf hist: Simplify hist printing logic for group events
perf hist: Do not use event index in hpp__fmt()
perf report: Do not show dummy events with --skip-empty

tools/lib/perf/evlist.c | 21 +++++++
tools/perf/Documentation/perf-report.txt | 3 +-
tools/perf/builtin-report.c | 14 ++++-
tools/perf/ui/hist.c | 77 +++++++++++++-----------
4 files changed, 79 insertions(+), 36 deletions(-)

--
2.43.0.687.g38aa6559b0-goog



2024-02-13 07:53:29

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 1/4] libperf evlist: Update group info in perf_evlist__remove()

When an event in a group is removed, it should update the group status
including the pointer to leader and number of member events.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/lib/perf/evlist.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 058e3ff10f9b..befdb062fa1d 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -102,8 +102,29 @@ void perf_evlist__add(struct perf_evlist *evlist,
void perf_evlist__remove(struct perf_evlist *evlist,
struct perf_evsel *evsel)
{
+ struct perf_evsel *leader = evsel->leader;
+
list_del_init(&evsel->node);
evlist->nr_entries -= 1;
+
+ /* return stand-alone event */
+ if (leader == evsel && leader->nr_members < 2)
+ return;
+
+ if (leader == evsel) {
+ struct perf_evsel *member;
+
+ /* select the next event as a new leader */
+ leader = member = perf_evlist__next(evlist, evsel);
+
+ /* update members to see the new leader */
+ while (member && member->leader == evsel) {
+ member->leader = leader;
+ member = perf_evlist__next(evlist, member);
+ }
+ }
+
+ leader->nr_members = evsel->leader->nr_members - 1;
}

struct perf_evlist *perf_evlist__new(void)
--
2.43.0.687.g38aa6559b0-goog


2024-02-13 07:53:35

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 2/4] perf hist: Simplify hist printing logic for group events

It can uses an array to save the period (or percent) values for member
events and print them together in a loop. This simplify the logic to
handle missing samples in members with zero values.

No functional change intended.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/hist.c | 55 +++++++++++++++++---------------------------
1 file changed, 21 insertions(+), 34 deletions(-)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 2bf959d08354..5f4c110d840f 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -33,6 +33,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
char *buf = hpp->buf;
size_t size = hpp->size;

+ /* print stand-alone or group leader events separately */
if (fmt_percent) {
double percent = 0.0;
u64 total = hists__total_period(hists);
@@ -45,12 +46,19 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
ret = hpp__call_print_fn(hpp, print_fn, fmt, len, get_field(he));

if (evsel__is_group_event(evsel)) {
- int prev_idx, idx_delta;
+ int idx;
struct hist_entry *pair;
int nr_members = evsel->core.nr_members;
+ union {
+ u64 period;
+ double percent;
+ } *val;

- prev_idx = evsel__group_idx(evsel);
+ val = calloc(nr_members, sizeof(*val));
+ if (val == NULL)
+ return 0;

+ /* collect values in the group members */
list_for_each_entry(pair, &he->pairs.head, pairs.node) {
u64 period = get_field(pair);
u64 total = hists__total_period(pair->hists);
@@ -59,47 +67,26 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
continue;

evsel = hists_to_evsel(pair->hists);
- idx_delta = evsel__group_idx(evsel) - prev_idx - 1;
-
- while (idx_delta--) {
- /*
- * zero-fill group members in the middle which
- * have no sample
- */
- if (fmt_percent) {
- ret += hpp__call_print_fn(hpp, print_fn,
- fmt, len, 0.0);
- } else {
- ret += hpp__call_print_fn(hpp, print_fn,
- fmt, len, 0ULL);
- }
- }
-
- if (fmt_percent) {
- ret += hpp__call_print_fn(hpp, print_fn, fmt, len,
- 100.0 * period / total);
- } else {
- ret += hpp__call_print_fn(hpp, print_fn, fmt,
- len, period);
- }
+ idx = evsel__group_idx(evsel);

- prev_idx = evsel__group_idx(evsel);
+ if (fmt_percent)
+ val[idx].percent = 100.0 * period / total;
+ else
+ val[idx].period = period;
}

- idx_delta = nr_members - prev_idx - 1;
-
- while (idx_delta--) {
- /*
- * zero-fill group members at last which have no sample
- */
+ /* idx starts from 1 to skip the leader event */
+ for (idx = 1; idx < nr_members; idx++) {
if (fmt_percent) {
ret += hpp__call_print_fn(hpp, print_fn,
- fmt, len, 0.0);
+ fmt, len, val[idx].percent);
} else {
ret += hpp__call_print_fn(hpp, print_fn,
- fmt, len, 0ULL);
+ fmt, len, val[idx].period);
}
}
+
+ free(val);
}

/*
--
2.43.0.687.g38aa6559b0-goog


2024-02-13 07:53:49

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 3/4] perf hist: Do not use event index in hpp__fmt()

The __hpp__fmt() is to print period values in a hist entry. It handles
event groups using linked pair entries. Until now, it used event index
to print values of group members. But we want to make it more robust
and support groups even if some members in the group were removed.

Let's use an index table from evsel to value array so that we can skip
dummy events in the output later.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/hist.c | 34 ++++++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 5f4c110d840f..9c4c738edde1 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -48,15 +48,30 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
if (evsel__is_group_event(evsel)) {
int idx;
struct hist_entry *pair;
- int nr_members = evsel->core.nr_members;
+ int nr_members = evsel->core.nr_members - 1;
union {
u64 period;
double percent;
} *val;
+ struct evsel *member;
+ struct evsel **idx_table;

val = calloc(nr_members, sizeof(*val));
if (val == NULL)
- return 0;
+ goto out;
+
+ idx_table = calloc(nr_members, sizeof(*idx_table));
+ if (idx_table == NULL)
+ goto out;
+
+ /*
+ * Build an index table for each evsel to the val array.
+ * It cannot use evsel->core.idx because removed events might
+ * create a hole so the index is not consecutive anymore.
+ */
+ idx = 0;
+ for_each_group_member(member, evsel)
+ idx_table[idx++] = member;

/* collect values in the group members */
list_for_each_entry(pair, &he->pairs.head, pairs.node) {
@@ -66,8 +81,15 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
if (!total)
continue;

- evsel = hists_to_evsel(pair->hists);
- idx = evsel__group_idx(evsel);
+ member = hists_to_evsel(pair->hists);
+ for (idx = 0; idx < nr_members; idx++) {
+ if (idx_table[idx] == member)
+ break;
+ }
+
+ /* this should not happen */
+ if (idx == nr_members)
+ continue;

if (fmt_percent)
val[idx].percent = 100.0 * period / total;
@@ -75,8 +97,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
val[idx].period = period;
}

- /* idx starts from 1 to skip the leader event */
- for (idx = 1; idx < nr_members; idx++) {
+ for (idx = 0; idx < nr_members; idx++) {
if (fmt_percent) {
ret += hpp__call_print_fn(hpp, print_fn,
fmt, len, val[idx].percent);
@@ -89,6 +110,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
free(val);
}

+out:
/*
* Restore original buf and size as it's where caller expects
* the result will be saved.
--
2.43.0.687.g38aa6559b0-goog


2024-02-13 07:53:51

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 4/4] perf report: Do not show dummy events with --skip-empty

In system-wide mode, a dummy event was added during the perf record to
save various side-band events like FORK, MMAP and so on. But this dummy
event doesn't have samples for itself so it just wastes the output with
all zero values.

When --skip-empty option is on (by default), it can skip those (dummy)
events without any samples by removing them from the evlist. Users can
disable it using --no-skip-empty command line option or setting the
report.skip-empty config option to false.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 3 ++-
tools/perf/builtin-report.c | 14 +++++++++++++-
2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index d8b863e01fe0..546af27d209c 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -609,7 +609,8 @@ include::itrace.txt[]
'Avg Cycles' - block average sampled cycles

--skip-empty::
- Do not print 0 results in the --stat output.
+ Do not print 0 results in the output. This means skipping dummy events
+ in the event list and the --stat output.

include::callchain-overhead-calculation.txt[]

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8e16fa261e6f..60e30f13c8f8 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -752,10 +752,22 @@ static int hists__resort_cb(struct hist_entry *he, void *arg)
static void report__output_resort(struct report *rep)
{
struct ui_progress prog;
- struct evsel *pos;
+ struct evsel *pos, *tmp;

ui_progress__init(&prog, rep->nr_entries, "Sorting events for output...");

+ if (rep->skip_empty) {
+ evlist__for_each_entry_safe(rep->session->evlist, tmp, pos) {
+ struct hists *hists = evsel__hists(pos);
+
+ if (hists->nr_entries != 0)
+ continue;
+
+ evlist__remove(rep->session->evlist, pos);
+ evsel__delete(pos);
+ }
+ }
+
evlist__for_each_entry(rep->session->evlist, pos) {
evsel__output_resort_cb(pos, &prog, hists__resort_cb, rep);
}
--
2.43.0.687.g38aa6559b0-goog


2024-02-14 23:51:40

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH 1/4] libperf evlist: Update group info in perf_evlist__remove()

On Mon, Feb 12, 2024 at 11:52 PM Namhyung Kim <[email protected]> wrote:
>
> When an event in a group is removed, it should update the group status
> including the pointer to leader and number of member events.
>
> Signed-off-by: Namhyung Kim <[email protected]>

Should we worry about evlist's all_cpus that could also be stale now?

Thanks,
Ian

> ---
> tools/lib/perf/evlist.c | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 058e3ff10f9b..befdb062fa1d 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -102,8 +102,29 @@ void perf_evlist__add(struct perf_evlist *evlist,
> void perf_evlist__remove(struct perf_evlist *evlist,
> struct perf_evsel *evsel)
> {
> + struct perf_evsel *leader = evsel->leader;
> +
> list_del_init(&evsel->node);
> evlist->nr_entries -= 1;
> +
> + /* return stand-alone event */
> + if (leader == evsel && leader->nr_members < 2)
> + return;
> +
> + if (leader == evsel) {
> + struct perf_evsel *member;
> +
> + /* select the next event as a new leader */
> + leader = member = perf_evlist__next(evlist, evsel);
> +
> + /* update members to see the new leader */
> + while (member && member->leader == evsel) {
> + member->leader = leader;
> + member = perf_evlist__next(evlist, member);
> + }
> + }
> +
> + leader->nr_members = evsel->leader->nr_members - 1;
> }
>
> struct perf_evlist *perf_evlist__new(void)
> --
> 2.43.0.687.g38aa6559b0-goog
>

2024-02-15 00:04:27

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH 2/4] perf hist: Simplify hist printing logic for group events

On Mon, Feb 12, 2024 at 11:53 PM Namhyung Kim <[email protected]> wrote:
>
> It can uses an array to save the period (or percent) values for member
> events and print them together in a loop. This simplify the logic to
> handle missing samples in members with zero values.
>
> No functional change intended.
>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/ui/hist.c | 55 +++++++++++++++++---------------------------
> 1 file changed, 21 insertions(+), 34 deletions(-)
>
> diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
> index 2bf959d08354..5f4c110d840f 100644
> --- a/tools/perf/ui/hist.c
> +++ b/tools/perf/ui/hist.c
> @@ -33,6 +33,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> char *buf = hpp->buf;
> size_t size = hpp->size;
>
> + /* print stand-alone or group leader events separately */
> if (fmt_percent) {
> double percent = 0.0;
> u64 total = hists__total_period(hists);
> @@ -45,12 +46,19 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> ret = hpp__call_print_fn(hpp, print_fn, fmt, len, get_field(he));
>
> if (evsel__is_group_event(evsel)) {
> - int prev_idx, idx_delta;
> + int idx;
> struct hist_entry *pair;
> int nr_members = evsel->core.nr_members;
> + union {
> + u64 period;
> + double percent;
> + } *val;
>
> - prev_idx = evsel__group_idx(evsel);
> + val = calloc(nr_members, sizeof(*val));
> + if (val == NULL)
> + return 0;

Looks good, but should this return value be negative to indicate an
error? Snprintf gives a negative result on error, its result is
sometimes returned from hpp__fmt_acc, as is this result.

Thanks,
Ian

>
> + /* collect values in the group members */
> list_for_each_entry(pair, &he->pairs.head, pairs.node) {
> u64 period = get_field(pair);
> u64 total = hists__total_period(pair->hists);
> @@ -59,47 +67,26 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> continue;
>
> evsel = hists_to_evsel(pair->hists);
> - idx_delta = evsel__group_idx(evsel) - prev_idx - 1;
> -
> - while (idx_delta--) {
> - /*
> - * zero-fill group members in the middle which
> - * have no sample
> - */
> - if (fmt_percent) {
> - ret += hpp__call_print_fn(hpp, print_fn,
> - fmt, len, 0.0);
> - } else {
> - ret += hpp__call_print_fn(hpp, print_fn,
> - fmt, len, 0ULL);
> - }
> - }
> -
> - if (fmt_percent) {
> - ret += hpp__call_print_fn(hpp, print_fn, fmt, len,
> - 100.0 * period / total);
> - } else {
> - ret += hpp__call_print_fn(hpp, print_fn, fmt,
> - len, period);
> - }
> + idx = evsel__group_idx(evsel);
>
> - prev_idx = evsel__group_idx(evsel);
> + if (fmt_percent)
> + val[idx].percent = 100.0 * period / total;
> + else
> + val[idx].period = period;
> }
>
> - idx_delta = nr_members - prev_idx - 1;
> -
> - while (idx_delta--) {
> - /*
> - * zero-fill group members at last which have no sample
> - */
> + /* idx starts from 1 to skip the leader event */
> + for (idx = 1; idx < nr_members; idx++) {
> if (fmt_percent) {
> ret += hpp__call_print_fn(hpp, print_fn,
> - fmt, len, 0.0);
> + fmt, len, val[idx].percent);
> } else {
> ret += hpp__call_print_fn(hpp, print_fn,
> - fmt, len, 0ULL);
> + fmt, len, val[idx].period);
> }
> }
> +
> + free(val);
> }
>
> /*
> --
> 2.43.0.687.g38aa6559b0-goog
>

2024-02-15 00:08:44

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH 3/4] perf hist: Do not use event index in hpp__fmt()

On Mon, Feb 12, 2024 at 11:52 PM Namhyung Kim <[email protected]> wrote:
>
> The __hpp__fmt() is to print period values in a hist entry. It handles
> event groups using linked pair entries. Until now, it used event index
> to print values of group members. But we want to make it more robust
> and support groups even if some members in the group were removed.

I'm unclear how it breaks currently. The evsel idx is set the evlist
nr_entries on creation and not updated by a remove. A remove may
change a groups leader, should the remove also make the next entry's
index idx that of the previous group leader?

> Let's use an index table from evsel to value array so that we can skip
> dummy events in the output later.
>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/ui/hist.c | 34 ++++++++++++++++++++++++++++------
> 1 file changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
> index 5f4c110d840f..9c4c738edde1 100644
> --- a/tools/perf/ui/hist.c
> +++ b/tools/perf/ui/hist.c
> @@ -48,15 +48,30 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> if (evsel__is_group_event(evsel)) {
> int idx;
> struct hist_entry *pair;
> - int nr_members = evsel->core.nr_members;
> + int nr_members = evsel->core.nr_members - 1;

A comment on the -1 would be useful.

Thanks,
Ian


> union {
> u64 period;
> double percent;
> } *val;
> + struct evsel *member;
> + struct evsel **idx_table;
>
> val = calloc(nr_members, sizeof(*val));
> if (val == NULL)
> - return 0;
> + goto out;
> +
> + idx_table = calloc(nr_members, sizeof(*idx_table));
> + if (idx_table == NULL)
> + goto out;
> +
> + /*
> + * Build an index table for each evsel to the val array.
> + * It cannot use evsel->core.idx because removed events might
> + * create a hole so the index is not consecutive anymore.
> + */
> + idx = 0;
> + for_each_group_member(member, evsel)
> + idx_table[idx++] = member;
>
> /* collect values in the group members */
> list_for_each_entry(pair, &he->pairs.head, pairs.node) {
> @@ -66,8 +81,15 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> if (!total)
> continue;
>
> - evsel = hists_to_evsel(pair->hists);
> - idx = evsel__group_idx(evsel);
> + member = hists_to_evsel(pair->hists);
> + for (idx = 0; idx < nr_members; idx++) {
> + if (idx_table[idx] == member)
> + break;
> + }
> +
> + /* this should not happen */
> + if (idx == nr_members)
> + continue;
>
> if (fmt_percent)
> val[idx].percent = 100.0 * period / total;
> @@ -75,8 +97,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> val[idx].period = period;
> }
>
> - /* idx starts from 1 to skip the leader event */
> - for (idx = 1; idx < nr_members; idx++) {
> + for (idx = 0; idx < nr_members; idx++) {
> if (fmt_percent) {
> ret += hpp__call_print_fn(hpp, print_fn,
> fmt, len, val[idx].percent);
> @@ -89,6 +110,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> free(val);
> }
>
> +out:
> /*
> * Restore original buf and size as it's where caller expects
> * the result will be saved.
> --
> 2.43.0.687.g38aa6559b0-goog
>

2024-02-15 05:17:02

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 1/4] libperf evlist: Update group info in perf_evlist__remove()

On Wed, Feb 14, 2024 at 3:50 PM Ian Rogers <[email protected]> wrote:
>
> On Mon, Feb 12, 2024 at 11:52 PM Namhyung Kim <[email protected]> wrote:
> >
> > When an event in a group is removed, it should update the group status
> > including the pointer to leader and number of member events.
> >
> > Signed-off-by: Namhyung Kim <[email protected]>
>
> Should we worry about evlist's all_cpus that could also be stale now?

Hmm.. right. I will refresh the cpu list after removal.

Thanks,
Namhyung

2024-02-15 05:28:04

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 2/4] perf hist: Simplify hist printing logic for group events

On Wed, Feb 14, 2024 at 3:57 PM Ian Rogers <[email protected]> wrote:
>
> On Mon, Feb 12, 2024 at 11:53 PM Namhyung Kim <[email protected]> wrote:
> >
> > It can uses an array to save the period (or percent) values for member
> > events and print them together in a loop. This simplify the logic to
> > handle missing samples in members with zero values.
> >
> > No functional change intended.
> >
> > Signed-off-by: Namhyung Kim <[email protected]>
> > ---
> > tools/perf/ui/hist.c | 55 +++++++++++++++++---------------------------
> > 1 file changed, 21 insertions(+), 34 deletions(-)
> >
> > diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
> > index 2bf959d08354..5f4c110d840f 100644
> > --- a/tools/perf/ui/hist.c
> > +++ b/tools/perf/ui/hist.c
> > @@ -33,6 +33,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > char *buf = hpp->buf;
> > size_t size = hpp->size;
> >
> > + /* print stand-alone or group leader events separately */
> > if (fmt_percent) {
> > double percent = 0.0;
> > u64 total = hists__total_period(hists);
> > @@ -45,12 +46,19 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > ret = hpp__call_print_fn(hpp, print_fn, fmt, len, get_field(he));
> >
> > if (evsel__is_group_event(evsel)) {
> > - int prev_idx, idx_delta;
> > + int idx;
> > struct hist_entry *pair;
> > int nr_members = evsel->core.nr_members;
> > + union {
> > + u64 period;
> > + double percent;
> > + } *val;
> >
> > - prev_idx = evsel__group_idx(evsel);
> > + val = calloc(nr_members, sizeof(*val));
> > + if (val == NULL)
> > + return 0;
>
> Looks good, but should this return value be negative to indicate an
> error? Snprintf gives a negative result on error, its result is
> sometimes returned from hpp__fmt_acc, as is this result.

Sounds good, will change.

Thanks,
Namhyung

>
> >
> > + /* collect values in the group members */
> > list_for_each_entry(pair, &he->pairs.head, pairs.node) {
> > u64 period = get_field(pair);
> > u64 total = hists__total_period(pair->hists);
> > @@ -59,47 +67,26 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > continue;
> >
> > evsel = hists_to_evsel(pair->hists);
> > - idx_delta = evsel__group_idx(evsel) - prev_idx - 1;
> > -
> > - while (idx_delta--) {
> > - /*
> > - * zero-fill group members in the middle which
> > - * have no sample
> > - */
> > - if (fmt_percent) {
> > - ret += hpp__call_print_fn(hpp, print_fn,
> > - fmt, len, 0.0);
> > - } else {
> > - ret += hpp__call_print_fn(hpp, print_fn,
> > - fmt, len, 0ULL);
> > - }
> > - }
> > -
> > - if (fmt_percent) {
> > - ret += hpp__call_print_fn(hpp, print_fn, fmt, len,
> > - 100.0 * period / total);
> > - } else {
> > - ret += hpp__call_print_fn(hpp, print_fn, fmt,
> > - len, period);
> > - }
> > + idx = evsel__group_idx(evsel);
> >
> > - prev_idx = evsel__group_idx(evsel);
> > + if (fmt_percent)
> > + val[idx].percent = 100.0 * period / total;
> > + else
> > + val[idx].period = period;
> > }
> >
> > - idx_delta = nr_members - prev_idx - 1;
> > -
> > - while (idx_delta--) {
> > - /*
> > - * zero-fill group members at last which have no sample
> > - */
> > + /* idx starts from 1 to skip the leader event */
> > + for (idx = 1; idx < nr_members; idx++) {
> > if (fmt_percent) {
> > ret += hpp__call_print_fn(hpp, print_fn,
> > - fmt, len, 0.0);
> > + fmt, len, val[idx].percent);
> > } else {
> > ret += hpp__call_print_fn(hpp, print_fn,
> > - fmt, len, 0ULL);
> > + fmt, len, val[idx].period);
> > }
> > }
> > +
> > + free(val);
> > }
> >
> > /*
> > --
> > 2.43.0.687.g38aa6559b0-goog
> >

2024-02-15 05:38:18

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 3/4] perf hist: Do not use event index in hpp__fmt()

On Wed, Feb 14, 2024 at 4:08 PM Ian Rogers <[email protected]> wrote:
>
> On Mon, Feb 12, 2024 at 11:52 PM Namhyung Kim <[email protected]> wrote:
> >
> > The __hpp__fmt() is to print period values in a hist entry. It handles
> > event groups using linked pair entries. Until now, it used event index
> > to print values of group members. But we want to make it more robust
> > and support groups even if some members in the group were removed.
>
> I'm unclear how it breaks currently. The evsel idx is set the evlist
> nr_entries on creation and not updated by a remove. A remove may
> change a groups leader, should the remove also make the next entry's
> index idx that of the previous group leader?

The evsel__group_idx() returns evsel->idx - leader->idx.
If it has a group event {A, B, C} then the index would be 0, 1, 2.
If it removes B, the group would be {A, C} with index 0 and 2.
The nr_members is 2 now so it cannot use index 2 for C.

Note that we cannot change the index of C because some information
like annotation histogram relies on the index.

>
> > Let's use an index table from evsel to value array so that we can skip
> > dummy events in the output later.
> >
> > Signed-off-by: Namhyung Kim <[email protected]>
> > ---
> > tools/perf/ui/hist.c | 34 ++++++++++++++++++++++++++++------
> > 1 file changed, 28 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
> > index 5f4c110d840f..9c4c738edde1 100644
> > --- a/tools/perf/ui/hist.c
> > +++ b/tools/perf/ui/hist.c
> > @@ -48,15 +48,30 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > if (evsel__is_group_event(evsel)) {
> > int idx;
> > struct hist_entry *pair;
> > - int nr_members = evsel->core.nr_members;
> > + int nr_members = evsel->core.nr_members - 1;
>
> A comment on the -1 would be useful.

The 'nr_members' includes the leader which is already printed.
In this code, we want to print member events only, hence -1.
I'll add it to the comment.

Thanks,
Namhyung

>
> Thanks,
> Ian
>
>
> > union {
> > u64 period;
> > double percent;
> > } *val;
> > + struct evsel *member;
> > + struct evsel **idx_table;
> >
> > val = calloc(nr_members, sizeof(*val));
> > if (val == NULL)
> > - return 0;
> > + goto out;
> > +
> > + idx_table = calloc(nr_members, sizeof(*idx_table));
> > + if (idx_table == NULL)
> > + goto out;
> > +
> > + /*
> > + * Build an index table for each evsel to the val array.
> > + * It cannot use evsel->core.idx because removed events might
> > + * create a hole so the index is not consecutive anymore.
> > + */
> > + idx = 0;
> > + for_each_group_member(member, evsel)
> > + idx_table[idx++] = member;
> >
> > /* collect values in the group members */
> > list_for_each_entry(pair, &he->pairs.head, pairs.node) {
> > @@ -66,8 +81,15 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > if (!total)
> > continue;
> >
> > - evsel = hists_to_evsel(pair->hists);
> > - idx = evsel__group_idx(evsel);
> > + member = hists_to_evsel(pair->hists);
> > + for (idx = 0; idx < nr_members; idx++) {
> > + if (idx_table[idx] == member)
> > + break;
> > + }
> > +
> > + /* this should not happen */
> > + if (idx == nr_members)
> > + continue;
> >
> > if (fmt_percent)
> > val[idx].percent = 100.0 * period / total;
> > @@ -75,8 +97,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > val[idx].period = period;
> > }
> >
> > - /* idx starts from 1 to skip the leader event */
> > - for (idx = 1; idx < nr_members; idx++) {
> > + for (idx = 0; idx < nr_members; idx++) {
> > if (fmt_percent) {
> > ret += hpp__call_print_fn(hpp, print_fn,
> > fmt, len, val[idx].percent);
> > @@ -89,6 +110,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > free(val);
> > }
> >
> > +out:
> > /*
> > * Restore original buf and size as it's where caller expects
> > * the result will be saved.
> > --
> > 2.43.0.687.g38aa6559b0-goog
> >

2024-02-15 07:54:39

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH 3/4] perf hist: Do not use event index in hpp__fmt()

On Wed, Feb 14, 2024 at 9:26 PM Namhyung Kim <[email protected]> wrote:
>
> On Wed, Feb 14, 2024 at 4:08 PM Ian Rogers <[email protected]> wrote:
> >
> > On Mon, Feb 12, 2024 at 11:52 PM Namhyung Kim <[email protected]> wrote:
> > >
> > > The __hpp__fmt() is to print period values in a hist entry. It handles
> > > event groups using linked pair entries. Until now, it used event index
> > > to print values of group members. But we want to make it more robust
> > > and support groups even if some members in the group were removed.
> >
> > I'm unclear how it breaks currently. The evsel idx is set the evlist
> > nr_entries on creation and not updated by a remove. A remove may
> > change a groups leader, should the remove also make the next entry's
> > index idx that of the previous group leader?
>
> The evsel__group_idx() returns evsel->idx - leader->idx.
> If it has a group event {A, B, C} then the index would be 0, 1, 2.
> If it removes B, the group would be {A, C} with index 0 and 2.
> The nr_members is 2 now so it cannot use index 2 for C.
>
> Note that we cannot change the index of C because some information
> like annotation histogram relies on the index.

Ugh, the whole index thing is just messy - perhaps these days we could
have a hashmap with the evsel as a key instead. I remember that I also
forced the idx here:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n2049
If it were invariant that the idx were always the position of an event
in the evlist then I think life would be easier, but that won't help
for the arrays of counters that need the index to be constant. I guess
this is why the previous work to do this skipped evsels rather than
removed them.

Thanks,
Ian


> >
> > > Let's use an index table from evsel to value array so that we can skip
> > > dummy events in the output later.
> > >
> > > Signed-off-by: Namhyung Kim <[email protected]>
> > > ---
> > > tools/perf/ui/hist.c | 34 ++++++++++++++++++++++++++++------
> > > 1 file changed, 28 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
> > > index 5f4c110d840f..9c4c738edde1 100644
> > > --- a/tools/perf/ui/hist.c
> > > +++ b/tools/perf/ui/hist.c
> > > @@ -48,15 +48,30 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > > if (evsel__is_group_event(evsel)) {
> > > int idx;
> > > struct hist_entry *pair;
> > > - int nr_members = evsel->core.nr_members;
> > > + int nr_members = evsel->core.nr_members - 1;
> >
> > A comment on the -1 would be useful.
>
> The 'nr_members' includes the leader which is already printed.
> In this code, we want to print member events only, hence -1.
> I'll add it to the comment.
>
> Thanks,
> Namhyung
>
> >
> > Thanks,
> > Ian
> >
> >
> > > union {
> > > u64 period;
> > > double percent;
> > > } *val;
> > > + struct evsel *member;
> > > + struct evsel **idx_table;
> > >
> > > val = calloc(nr_members, sizeof(*val));
> > > if (val == NULL)
> > > - return 0;
> > > + goto out;
> > > +
> > > + idx_table = calloc(nr_members, sizeof(*idx_table));
> > > + if (idx_table == NULL)
> > > + goto out;
> > > +
> > > + /*
> > > + * Build an index table for each evsel to the val array.
> > > + * It cannot use evsel->core.idx because removed events might
> > > + * create a hole so the index is not consecutive anymore.
> > > + */
> > > + idx = 0;
> > > + for_each_group_member(member, evsel)
> > > + idx_table[idx++] = member;
> > >
> > > /* collect values in the group members */
> > > list_for_each_entry(pair, &he->pairs.head, pairs.node) {
> > > @@ -66,8 +81,15 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > > if (!total)
> > > continue;
> > >
> > > - evsel = hists_to_evsel(pair->hists);
> > > - idx = evsel__group_idx(evsel);
> > > + member = hists_to_evsel(pair->hists);
> > > + for (idx = 0; idx < nr_members; idx++) {
> > > + if (idx_table[idx] == member)
> > > + break;
> > > + }
> > > +
> > > + /* this should not happen */
> > > + if (idx == nr_members)
> > > + continue;
> > >
> > > if (fmt_percent)
> > > val[idx].percent = 100.0 * period / total;
> > > @@ -75,8 +97,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > > val[idx].period = period;
> > > }
> > >
> > > - /* idx starts from 1 to skip the leader event */
> > > - for (idx = 1; idx < nr_members; idx++) {
> > > + for (idx = 0; idx < nr_members; idx++) {
> > > if (fmt_percent) {
> > > ret += hpp__call_print_fn(hpp, print_fn,
> > > fmt, len, val[idx].percent);
> > > @@ -89,6 +110,7 @@ static int __hpp__fmt(struct perf_hpp *hpp, struct hist_entry *he,
> > > free(val);
> > > }
> > >
> > > +out:
> > > /*
> > > * Restore original buf and size as it's where caller expects
> > > * the result will be saved.
> > > --
> > > 2.43.0.687.g38aa6559b0-goog
> > >

2024-02-16 20:09:15

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 3/4] perf hist: Do not use event index in hpp__fmt()

On Wed, Feb 14, 2024 at 11:54 PM Ian Rogers <[email protected]> wrote:
>
> On Wed, Feb 14, 2024 at 9:26 PM Namhyung Kim <[email protected]> wrote:
> >
> > On Wed, Feb 14, 2024 at 4:08 PM Ian Rogers <[email protected]> wrote:
> > >
> > > On Mon, Feb 12, 2024 at 11:52 PM Namhyung Kim <[email protected]> wrote:
> > > >
> > > > The __hpp__fmt() is to print period values in a hist entry. It handles
> > > > event groups using linked pair entries. Until now, it used event index
> > > > to print values of group members. But we want to make it more robust
> > > > and support groups even if some members in the group were removed.
> > >
> > > I'm unclear how it breaks currently. The evsel idx is set the evlist
> > > nr_entries on creation and not updated by a remove. A remove may
> > > change a groups leader, should the remove also make the next entry's
> > > index idx that of the previous group leader?
> >
> > The evsel__group_idx() returns evsel->idx - leader->idx.
> > If it has a group event {A, B, C} then the index would be 0, 1, 2.
> > If it removes B, the group would be {A, C} with index 0 and 2.
> > The nr_members is 2 now so it cannot use index 2 for C.
> >
> > Note that we cannot change the index of C because some information
> > like annotation histogram relies on the index.
>
> Ugh, the whole index thing is just messy - perhaps these days we could
> have a hashmap with the evsel as a key instead. I remember that I also
> forced the idx here:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n2049
> If it were invariant that the idx were always the position of an event
> in the evlist then I think life would be easier, but that won't help
> for the arrays of counters that need the index to be constant. I guess
> this is why the previous work to do this skipped evsels rather than
> removed them.

Actually I have a patch series to convert the annotation histogram
to a hash map. It'd reduce memory usage as well. Will post.

I think removing evsel is not a common operation and should be
done with care. In this patchset, it removed (dummy) events after
processing all samples. I can make the code to skip those event
when printing the result but it'd be much easier if it can remove the
unnecessary events.

Thanks,
Namhyung