From: Ian Rogers
Date: Thu, 2 Mar 2023 17:39:03 -0800
Subject: Re: [PATCH v2 08/10] perf parse-events: Sort and group parsed events
To: Namhyung Kim
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Kan Liang, Zhengjun Xing, Ravi Bangoria,
 Adrian Hunter, "Steinar H. Gunderson", Kim Phillips, Florian Fischer,
 James Clark, Suzuki Poulouse, Sean Christopherson, Leo Yan, John Garry,
 Kajol Jain, linux-perf-users@vger.kernel.org,
 linux-kernel@vger.kernel.org, Stephane Eranian
References: <20230302212531.1043318-1-irogers@google.com> <20230302212531.1043318-9-irogers@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 2, 2023 at 4:37 PM Namhyung Kim wrote:
>
> On Thu, Mar 2, 2023 at 1:26 PM Ian Rogers wrote:
> >
> > This change is intended to be a no-op for most current cases, the
> > default sort order is the order the events were parsed. Where it
> > varies is in how groups are handled.
> > Previously an uncore and core
> > event that are grouped would most often cause the group to be removed:
> >
> > ```
> > $ perf stat -e '{instructions,uncore_imc_free_running_0/data_total/}' -a sleep 1
> > WARNING: grouped events cpus do not match, disabling group:
> >   anon group { instructions, uncore_imc_free_running_0/data_total/ }
> > ...
> > ```
> >
> > However, when wildcards are used the events should be re-sorted and
> > re-grouped in parse_events__set_leader, but this currently fails for
> > simple examples:
> >
> > ```
> > $ perf stat -e '{uncore_imc_free_running/data_read/,uncore_imc_free_running/data_write/}' -a sleep 1
> >
> >  Performance counter stats for 'system wide':
> >
> >      MiB  uncore_imc_free_running/data_read/
> >      MiB  uncore_imc_free_running/data_write/
> >
> >        1.000996992 seconds time elapsed
> > ```
> >
> > A further failure mode, fixed in this patch, is to force topdown events
> > into a group.
> >
> > This change moves sorting the evsels in the evlist after parsing. It
> > requires parsing to set up groups. First the evsels are sorted
> > respecting the existing groupings and parse order, but also reordering
> > to ensure evsels of the same PMU and group appear together. So that
> > software and aux events respect groups, their pmu_name is taken from
> > the group leader. The sorting is done with list_sort, removing a memory
> > allocation.
> >
> > After sorting, a pass is done to correct the group leaders and, for
> > topdown events, to ensure they have a group leader.
> >
> > This fixes the problems seen before:
> >
> > ```
> > $ perf stat -e '{uncore_imc_free_running/data_read/,uncore_imc_free_running/data_write/}' -a sleep 1
> >
> >  Performance counter stats for 'system wide':
> >
> >             727.42 MiB  uncore_imc_free_running/data_read/
> >              81.84 MiB  uncore_imc_free_running/data_write/
> >
> >        1.000948615 seconds time elapsed
> > ```
> >
> > As well as making groups not fail for cases like:
> >
> > ```
> > $ perf stat -e '{imc_free_running_0/data_total/,imc_free_running_1/data_total/}' -a sleep 1
> >
> >  Performance counter stats for 'system wide':
> >
> >             256.47 MiB  imc_free_running_0/data_total/
> >             256.48 MiB  imc_free_running_1/data_total/
>
> I didn't expect we could group events from different PMUs.
> Not sure if it can handle multiplexing properly.

You are right, this example now works: the sorting and regrouping
break the events into two groups. The rules around grouping are
complex, and Arnaldo mentioned that maybe cases like this should be
warned about. The problem then is that wildcard and metric expansion
may naturally produce these cases, and we don't want the warning. It
is something of a shame that the grouping information in the perf
stat output isn't clearer.
Thanks,
Ian

> Thanks,
> Namhyung
>
> >
> >
> >        1.001165442 seconds time elapsed
> > ```
> >
> > Signed-off-by: Ian Rogers
> > ---
> >  tools/perf/arch/x86/util/evlist.c |  39 ++---
> >  tools/perf/util/evlist.h          |   2 +-
> >  tools/perf/util/parse-events.c    | 240 +++++++++++++++---------------
> >  tools/perf/util/parse-events.h    |   3 +-
> >  tools/perf/util/parse-events.y    |   4 +-
> >  5 files changed, 136 insertions(+), 152 deletions(-)
> >
> > diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
> > index 8a7ae4162563..d4193479a364 100644
> > --- a/tools/perf/arch/x86/util/evlist.c
> > +++ b/tools/perf/arch/x86/util/evlist.c
> > @@ -65,29 +65,22 @@ int arch_evlist__add_default_attrs(struct evlist *evlist,
> >         return ___evlist__add_default_attrs(evlist, attrs, nr_attrs);
> >  }
> >
> > -struct evsel *arch_evlist__leader(struct list_head *list)
> > +int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs)
> >  {
> > -       struct evsel *evsel, *first, *slots = NULL;
> > -       bool has_topdown = false;
> > -
> > -       first = list_first_entry(list, struct evsel, core.node);
> > -
> > -       if (!topdown_sys_has_perf_metrics())
> > -               return first;
> > -
> > -       /* If there is a slots event and a topdown event then the slots event comes first. */
> > -       __evlist__for_each_entry(list, evsel) {
> > -               if (evsel->pmu_name && !strncmp(evsel->pmu_name, "cpu", 3) && evsel->name) {
> > -                       if (strcasestr(evsel->name, "slots")) {
> > -                               slots = evsel;
> > -                               if (slots == first)
> > -                                       return first;
> > -                       }
> > -                       if (strcasestr(evsel->name, "topdown"))
> > -                               has_topdown = true;
> > -                       if (slots && has_topdown)
> > -                               return slots;
> > -               }
> > +       if (topdown_sys_has_perf_metrics() &&
> > +           (!lhs->pmu_name || !strncmp(lhs->pmu_name, "cpu", 3))) {
> > +               /* Ensure the topdown slots comes first. */
> > +               if (strcasestr(lhs->name, "slots"))
> > +                       return -1;
> > +               if (strcasestr(rhs->name, "slots"))
> > +                       return 1;
> > +               /* Followed by topdown events. */
> > +               if (strcasestr(lhs->name, "topdown") && !strcasestr(rhs->name, "topdown"))
> > +                       return -1;
> > +               if (!strcasestr(lhs->name, "topdown") && strcasestr(rhs->name, "topdown"))
> > +                       return 1;
> >         }
> > -       return first;
> > +
> > +       /* Default ordering by insertion index. */
> > +       return lhs->core.idx - rhs->core.idx;
> >  }
> > diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> > index 01fa9d592c5a..d89d8f92802b 100644
> > --- a/tools/perf/util/evlist.h
> > +++ b/tools/perf/util/evlist.h
> > @@ -119,7 +119,7 @@ int arch_evlist__add_default_attrs(struct evlist *evlist,
> >  #define evlist__add_default_attrs(evlist, array) \
> >         arch_evlist__add_default_attrs(evlist, array, ARRAY_SIZE(array))
> >
> > -struct evsel *arch_evlist__leader(struct list_head *list);
> > +int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs);
> >
> >  int evlist__add_dummy(struct evlist *evlist);
> >  struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide);
> > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> > index 1be454697d57..d6eb0a54dd2d 100644
> > --- a/tools/perf/util/parse-events.c
> > +++ b/tools/perf/util/parse-events.c
> > @@ -1,6 +1,7 @@
> >  // SPDX-License-Identifier: GPL-2.0
> >  #include
> >  #include
> > +#include <linux/list_sort.h>
> >  #include
> >  #include
> >  #include
> > @@ -1655,125 +1656,7 @@ int parse_events__modifier_group(struct list_head *list,
> >         return parse_events__modifier_event(list, event_mod, true);
> >  }
> >
> > -/*
> > - * Check if the two uncore PMUs are from the same uncore block
> > - * The format of the uncore PMU name is uncore_#blockname_#pmuidx
> > - */
> > -static bool is_same_uncore_block(const char *pmu_name_a, const char *pmu_name_b)
> > -{
> > -       char *end_a, *end_b;
> > -
> > -       end_a = strrchr(pmu_name_a, '_');
> > -       end_b = strrchr(pmu_name_b, '_');
> > -
> > -       if (!end_a || !end_b)
> > -               return false;
> > -
> > -       if ((end_a - pmu_name_a) != (end_b - pmu_name_b))
> > -               return false;
> > -
> > -       return (strncmp(pmu_name_a, pmu_name_b, end_a - pmu_name_a) == 0);
> > -}
> > -
> > -static int
> > -parse_events__set_leader_for_uncore_aliase(char *name, struct list_head *list,
> > -                                          struct parse_events_state *parse_state)
> > -{
> > -       struct evsel *evsel, *leader;
> > -       uintptr_t *leaders;
> > -       bool is_leader = true;
> > -       int i, nr_pmu = 0, total_members, ret = 0;
> > -
> > -       leader = list_first_entry(list, struct evsel, core.node);
> > -       evsel = list_last_entry(list, struct evsel, core.node);
> > -       total_members = evsel->core.idx - leader->core.idx + 1;
> > -
> > -       leaders = calloc(total_members, sizeof(uintptr_t));
> > -       if (WARN_ON(!leaders))
> > -               return 0;
> > -
> > -       /*
> > -        * Going through the whole group and doing sanity check.
> > -        * All members must use alias, and be from the same uncore block.
> > -        * Also, storing the leader events in an array.
> > -        */
> > -       __evlist__for_each_entry(list, evsel) {
> > -
> > -               /* Only split the uncore group which members use alias */
> > -               if (!evsel->use_uncore_alias)
> > -                       goto out;
> > -
> > -               /* The events must be from the same uncore block */
> > -               if (!is_same_uncore_block(leader->pmu_name, evsel->pmu_name))
> > -                       goto out;
> > -
> > -               if (!is_leader)
> > -                       continue;
> > -               /*
> > -                * If the event's PMU name starts to repeat, it must be a new
> > -                * event. That can be used to distinguish the leader from
> > -                * other members, even they have the same event name.
> > -                */
> > -               if ((leader != evsel) &&
> > -                   !strcmp(leader->pmu_name, evsel->pmu_name)) {
> > -                       is_leader = false;
> > -                       continue;
> > -               }
> > -
> > -               /* Store the leader event for each PMU */
> > -               leaders[nr_pmu++] = (uintptr_t) evsel;
> > -       }
> > -
> > -       /* only one event alias */
> > -       if (nr_pmu == total_members) {
> > -               parse_state->nr_groups--;
> > -               goto handled;
> > -       }
> > -
> > -       /*
> > -        * An uncore event alias is a joint name which means the same event
> > -        * runs on all PMUs of a block.
> > -        * Perf doesn't support mixed events from different PMUs in the same
> > -        * group. The big group has to be split into multiple small groups
> > -        * which only include the events from the same PMU.
> > -        *
> > -        * Here the uncore event aliases must be from the same uncore block.
> > -        * The number of PMUs must be same for each alias. The number of new
> > -        * small groups equals to the number of PMUs.
> > -        * Setting the leader event for corresponding members in each group.
> > -        */
> > -       i = 0;
> > -       __evlist__for_each_entry(list, evsel) {
> > -               if (i >= nr_pmu)
> > -                       i = 0;
> > -               evsel__set_leader(evsel, (struct evsel *) leaders[i++]);
> > -       }
> > -
> > -       /* The number of members and group name are same for each group */
> > -       for (i = 0; i < nr_pmu; i++) {
> > -               evsel = (struct evsel *) leaders[i];
> > -               evsel->core.nr_members = total_members / nr_pmu;
> > -               evsel->group_name = name ? strdup(name) : NULL;
> > -       }
> > -
> > -       /* Take the new small groups into account */
> > -       parse_state->nr_groups += nr_pmu - 1;
> > -
> > -handled:
> > -       ret = 1;
> > -       free(name);
> > -out:
> > -       free(leaders);
> > -       return ret;
> > -}
> > -
> > -__weak struct evsel *arch_evlist__leader(struct list_head *list)
> > -{
> > -       return list_first_entry(list, struct evsel, core.node);
> > -}
> > -
> > -void parse_events__set_leader(char *name, struct list_head *list,
> > -                             struct parse_events_state *parse_state)
> > +void parse_events__set_leader(char *name, struct list_head *list)
> >  {
> >         struct evsel *leader;
> >
> > @@ -1782,13 +1665,9 @@ void parse_events__set_leader(char *name, struct list_head *list,
> >                 return;
> >         }
> >
> > -       if (parse_events__set_leader_for_uncore_aliase(name, list, parse_state))
> > -               return;
> > -
> > -       leader = arch_evlist__leader(list);
> > +       leader = list_first_entry(list, struct evsel, core.node);
> >         __perf_evlist__set_leader(list, &leader->core);
> >         leader->group_name = name;
> > -       list_move(&leader->core.node, list);
> >  }
> >
> >  /* list_event is assumed to point to malloc'ed memory */
> > @@ -2245,6 +2124,117 @@ static int parse_events__with_hybrid_pmu(struct parse_events_state *parse_state,
> >         return ret;
> >  }
> >
> > +__weak int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs)
> > +{
> > +       /* Order by insertion index. */
> > +       return lhs->core.idx - rhs->core.idx;
> > +}
> > +
> > +static int evlist__cmp(void *state, const struct list_head *l, const struct list_head *r)
> > +{
> > +       const struct perf_evsel *lhs_core = container_of(l, struct perf_evsel, node);
> > +       const struct evsel *lhs = container_of(lhs_core, struct evsel, core);
> > +       const struct perf_evsel *rhs_core = container_of(r, struct perf_evsel, node);
> > +       const struct evsel *rhs = container_of(rhs_core, struct evsel, core);
> > +       int *leader_idx = state;
> > +       int lhs_leader_idx = *leader_idx, rhs_leader_idx = *leader_idx, ret;
> > +       const char *lhs_pmu_name, *rhs_pmu_name;
> > +
> > +       /*
> > +        * First sort by grouping/leader. Read the leader idx only if the evsel
> > +        * is part of a group, as -1 indicates no group.
> > +        */
> > +       if (lhs_core->leader != lhs_core || lhs_core->nr_members > 1)
> > +               lhs_leader_idx = lhs_core->leader->idx;
> > +       if (rhs_core->leader != rhs_core || rhs_core->nr_members > 1)
> > +               rhs_leader_idx = rhs_core->leader->idx;
> > +
> > +       if (lhs_leader_idx != rhs_leader_idx)
> > +               return lhs_leader_idx - rhs_leader_idx;
> > +
> > +       /* Group by PMU. Groups can't span PMUs. */
> > +       lhs_pmu_name = evsel__pmu_name(lhs);
> > +       rhs_pmu_name = evsel__pmu_name(rhs);
> > +       ret = strcmp(lhs_pmu_name, rhs_pmu_name);
> > +       if (ret)
> > +               return ret;
> > +
> > +       /* Architecture specific sorting. */
> > +       return arch_evlist__cmp(lhs, rhs);
> > +}
> > +
> > +static void parse_events__sort_events_and_fix_groups(struct list_head *list)
> > +{
> > +       int idx = -1;
> > +       struct evsel *pos, *cur_leader = NULL;
> > +       struct perf_evsel *cur_leaders_grp = NULL;
> > +
> > +       /*
> > +        * Compute index to insert ungrouped events at. Place them where the
> > +        * first ungrouped event appears.
> > +        */
> > +       list_for_each_entry(pos, list, core.node) {
> > +               const struct evsel *pos_leader = evsel__leader(pos);
> > +
> > +               if (pos != pos_leader || pos->core.nr_members > 1)
> > +                       continue;
> > +
> > +               idx = pos->core.idx;
> > +               break;
> > +       }
> > +
> > +       /* Sort events. */
> > +       list_sort(&idx, list, evlist__cmp);
> > +
> > +       /*
> > +        * Recompute groups, splitting for PMUs and adding groups for events
> > +        * that require them.
> > +        */
> > +       idx = 0;
> > +       list_for_each_entry(pos, list, core.node) {
> > +               const struct evsel *pos_leader = evsel__leader(pos);
> > +               const char *pos_pmu_name = evsel__pmu_name(pos);
> > +               const char *cur_leader_pmu_name, *pos_leader_pmu_name;
> > +               bool force_grouped = arch_evsel__must_be_in_group(pos);
> > +
> > +               /* Reset index and nr_members. */
> > +               pos->core.idx = idx++;
> > +               pos->core.nr_members = 0;
> > +
> > +               /*
> > +                * Set the group leader respecting the given groupings and that
> > +                * groups can't span PMUs.
> > +                */
> > +               if (!cur_leader)
> > +                       cur_leader = pos;
> > +
> > +               cur_leader_pmu_name = evsel__pmu_name(cur_leader);
> > +               if ((cur_leaders_grp != pos->core.leader && !force_grouped) ||
> > +                   strcmp(cur_leader_pmu_name, pos_pmu_name)) {
> > +                       /* Event is for a different group/PMU than last. */
> > +                       cur_leader = pos;
> > +                       /*
> > +                        * Remember the leader's group before it is overwritten,
> > +                        * so that later events match as being in the same
> > +                        * group.
> > +                        */
> > +                       cur_leaders_grp = pos->core.leader;
> > +               }
> > +               pos_leader_pmu_name = evsel__pmu_name(pos_leader);
> > +               if (strcmp(pos_leader_pmu_name, pos_pmu_name) || force_grouped) {
> > +                       /*
> > +                        * Event's PMU differs from its leader's. Groups can't
> > +                        * span PMUs, so update leader from the group/PMU
> > +                        * tracker.
> > +                        */
> > +                       evsel__set_leader(pos, cur_leader);
> > +               }
> > +       }
> > +       list_for_each_entry(pos, list, core.node) {
> > +               pos->core.leader->nr_members++;
> > +       }
> > +}
> > +
> >  int __parse_events(struct evlist *evlist, const char *str,
> >                    struct parse_events_error *err, struct perf_pmu *fake_pmu)
> >  {
> > @@ -2266,6 +2256,8 @@ int __parse_events(struct evlist *evlist, const char *str,
> >                 return -1;
> >         }
> >
> > +       parse_events__sort_events_and_fix_groups(&parse_state.list);
> > +
> >         /*
> >          * Add list to the evlist even with errors to allow callers to clean up.
> >          */
> > diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
> > index 428e72eaafcc..22fc11b0bd59 100644
> > --- a/tools/perf/util/parse-events.h
> > +++ b/tools/perf/util/parse-events.h
> > @@ -200,8 +200,7 @@ int parse_events_copy_term_list(struct list_head *old,
> >
> >  enum perf_pmu_event_symbol_type
> >  perf_pmu__parse_check(const char *name);
> > -void parse_events__set_leader(char *name, struct list_head *list,
> > -                             struct parse_events_state *parse_state);
> > +void parse_events__set_leader(char *name, struct list_head *list);
> >  void parse_events_update_lists(struct list_head *list_event,
> >                                struct list_head *list_all);
> >  void parse_events_evlist_error(struct parse_events_state *parse_state,
> > diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
> > index 541b8dde2063..90d12f2bc8be 100644
> > --- a/tools/perf/util/parse-events.y
> > +++ b/tools/perf/util/parse-events.y
> > @@ -203,7 +203,7 @@ PE_NAME '{' events '}'
> >
> >         inc_group_count(list, _parse_state);
> >         /* Takes ownership of $1. */
> > -       parse_events__set_leader($1, list, _parse_state);
> > +       parse_events__set_leader($1, list);
> >         $$ = list;
> >  }
> >  |
> > @@ -212,7 +212,7 @@ PE_NAME '{' events '}'
> >  {
> >         struct list_head *list = $2;
> >
> >         inc_group_count(list, _parse_state);
> > -       parse_events__set_leader(NULL, list, _parse_state);
> > +       parse_events__set_leader(NULL, list);
> >         $$ = list;
> >  }
> >
> > --
> > 2.40.0.rc0.216.gc4246ad0f0-goog
> >