2021-04-07 16:10:20

by Song Liu

Subject: [PATCH v2 0/3] perf util: bpf perf improvements

This patch set improves bpf_perf (perf-stat --bpf-counters) by
1) exposing key definitions to a libperf header;
2) adding a compatibility check for perf_attr_map;
3) introducing config stat.bpf-counter-events.

Changes v1 => v2:
1. Separate 2/3 from 1/3. (Jiri)
2. Rename bperf.h to bpf_perf.h. (Jiri)
3. Other small fixes/optimizations. (Jiri)

Song Liu (3):
perf util: move bpf_perf definitions to a libperf header
perf bpf: check perf_attr_map is compatible with the perf binary
perf-stat: introduce config stat.bpf-counter-events

tools/lib/perf/include/perf/bpf_perf.h | 31 ++++++++++++++
tools/perf/Documentation/perf-stat.txt | 2 +
tools/perf/builtin-stat.c | 43 ++++++++++++-------
tools/perf/util/bpf_counter.c | 57 +++++++++++++++-----------
tools/perf/util/config.c | 32 +++++++++++++++
tools/perf/util/evsel.c | 2 +
tools/perf/util/evsel.h | 6 +++
tools/perf/util/target.h | 5 ---
8 files changed, 134 insertions(+), 44 deletions(-)
create mode 100644 tools/lib/perf/include/perf/bpf_perf.h

--
2.30.2


2021-04-07 16:10:25

by Song Liu

Subject: [PATCH v2 2/3] perf bpf: check perf_attr_map is compatible with the perf binary

perf_attr_map could be shared among different versions of the perf binary.
Add bperf_attr_map_compatible() to check whether the existing attr_map is
compatible with the current perf binary.

Signed-off-by: Song Liu <[email protected]>
---
tools/perf/util/bpf_counter.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index be484ddbbd5be..5de991ab46af9 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -312,6 +312,20 @@ static __u32 bpf_map_get_id(int fd)
return map_info.id;
}

+static bool bperf_attr_map_compatible(int attr_map_fd)
+{
+ struct bpf_map_info map_info = {0};
+ __u32 map_info_len = sizeof(map_info);
+ int err;
+
+ err = bpf_obj_get_info_by_fd(attr_map_fd, &map_info, &map_info_len);
+
+ if (err)
+ return false;
+ return (map_info.key_size == sizeof(struct perf_event_attr)) &&
+ (map_info.value_size == sizeof(struct perf_event_attr_map_entry));
+}
+
static int bperf_lock_attr_map(struct target *target)
{
char path[PATH_MAX];
@@ -346,6 +360,11 @@ static int bperf_lock_attr_map(struct target *target)
return -1;
}

+ if (!bperf_attr_map_compatible(map_fd)) {
+ close(map_fd);
+ return -1;
+
+ }
err = flock(map_fd, LOCK_EX);
if (err) {
close(map_fd);
--
2.30.2
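
The size comparison at the heart of bperf_attr_map_compatible() can be tried outside perf. The following is only a minimal sketch under stated assumptions: the demo_* structs are hypothetical stand-ins for struct perf_event_attr, struct perf_event_attr_map_entry and the two fields read from struct bpf_map_info, so that the example builds without libbpf.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the key and value types stored in the map. */
struct demo_attr_key {
	uint64_t config;
	uint32_t type;
	uint32_t size;
};

struct demo_attr_value {
	uint32_t link_id;
	uint32_t diff_map_id;
};

/* Hypothetical subset of what the kernel reports about an existing map. */
struct demo_map_info {
	uint32_t key_size;
	uint32_t value_size;
};

/*
 * A pinned map created by another binary is usable only if both the key
 * size and the value size match the layouts this binary was built with.
 */
static bool demo_attr_map_compatible(const struct demo_map_info *info)
{
	return info->key_size == sizeof(struct demo_attr_key) &&
	       info->value_size == sizeof(struct demo_attr_value);
}
```

A size match is a necessary condition only; it would not catch two layouts that happen to have the same size but different field meanings.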

2021-04-07 16:10:25

by Song Liu

Subject: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events

Currently, to use BPF to aggregate perf event counters, the user uses
--bpf-counters option. Enable "use bpf by default" events with a config
option, stat.bpf-counter-events. This is limited to hardware events in
evsel__hw_names.

This also enables mixing BPF events and regular events in the same session.
For example:

perf config stat.bpf-counter-events=instructions
perf stat -e instructions,cs

The second command will use BPF for "instructions" but not "cs".

Signed-off-by: Song Liu <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 2 ++
tools/perf/builtin-stat.c | 43 +++++++++++++++++---------
tools/perf/util/bpf_counter.c | 11 +++++++
tools/perf/util/config.c | 32 +++++++++++++++++++
tools/perf/util/evsel.c | 2 ++
tools/perf/util/evsel.h | 6 ++++
tools/perf/util/target.h | 5 ---
7 files changed, 81 insertions(+), 20 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 744211fa8c186..6d4733eaac170 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -97,6 +97,8 @@ report::
Use BPF programs to aggregate readings from perf_events. This
allows multiple perf-stat sessions that are counting the same metric (cycles,
instructions, etc.) to share hardware counters.
+ To use BPF programs on common hardware events by default, use
+ "perf config stat.bpf-counter-events=<list_of_events>".

--bpf-attr-map::
With option "--bpf-counters", different perf-stat sessions share
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 4bb48c6b66980..7c26e627db0ef 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -413,6 +413,8 @@ static int read_affinity_counters(struct timespec *rs)
evlist__for_each_entry(evsel_list, counter) {
if (evsel__cpu_iter_skip(counter, cpu))
continue;
+ if (evsel__is_bpf(counter))
+ continue;
if (!counter->err) {
counter->err = read_counter_cpu(counter, rs,
counter->cpu_iter - 1);
@@ -423,17 +425,28 @@ static int read_affinity_counters(struct timespec *rs)
return 0;
}

+/*
+ * Returns:
+ * 0 if all events use BPF;
+ * 1 if some events do NOT use BPF;
+ * < 0 on errors;
+ */
static int read_bpf_map_counters(void)
{
+ bool has_none_bpf_events = false;
struct evsel *counter;
int err;

evlist__for_each_entry(evsel_list, counter) {
+ if (!evsel__is_bpf(counter)) {
+ has_none_bpf_events = true;
+ continue;
+ }
err = bpf_counter__read(counter);
if (err)
return err;
}
- return 0;
+ return has_none_bpf_events ? 1 : 0;
}

static void read_counters(struct timespec *rs)
@@ -442,9 +455,10 @@ static void read_counters(struct timespec *rs)
int err;

if (!stat_config.stop_read_counter) {
- if (target__has_bpf(&target))
- err = read_bpf_map_counters();
- else
+ err = read_bpf_map_counters();
+ if (err < 0)
+ return;
+ if (err)
err = read_affinity_counters(rs);
if (err < 0)
return;
@@ -535,12 +549,13 @@ static int enable_counters(void)
struct evsel *evsel;
int err;

- if (target__has_bpf(&target)) {
- evlist__for_each_entry(evsel_list, evsel) {
- err = bpf_counter__enable(evsel);
- if (err)
- return err;
- }
+ evlist__for_each_entry(evsel_list, evsel) {
+ if (!evsel__is_bpf(evsel))
+ continue;
+
+ err = bpf_counter__enable(evsel);
+ if (err)
+ return err;
}

if (stat_config.initial_delay < 0) {
@@ -784,11 +799,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
if (affinity__setup(&affinity) < 0)
return -1;

- if (target__has_bpf(&target)) {
- evlist__for_each_entry(evsel_list, counter) {
- if (bpf_counter__load(counter, &target))
- return -1;
- }
+ evlist__for_each_entry(evsel_list, counter) {
+ if (bpf_counter__load(counter, &target))
+ return -1;
}

evlist__for_each_cpu (evsel_list, i, cpu) {
diff --git a/tools/perf/util/bpf_counter.c b/tools/perf/util/bpf_counter.c
index 5de991ab46af9..0f3b3f90526d7 100644
--- a/tools/perf/util/bpf_counter.c
+++ b/tools/perf/util/bpf_counter.c
@@ -792,6 +792,17 @@ int bpf_counter__load(struct evsel *evsel, struct target *target)
evsel->bpf_counter_ops = &bpf_program_profiler_ops;
else if (target->use_bpf)
evsel->bpf_counter_ops = &bperf_ops;
+ else {
+ int i;
+
+ for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
+ if (!strcmp(evsel->name, evsel__hw_names[i])) {
+ if (evsel__use_bpf_counters[i])
+ evsel->bpf_counter_ops = &bperf_ops;
+ break;
+ }
+ }
+ }

if (evsel->bpf_counter_ops)
return evsel->bpf_counter_ops->load(evsel, target);
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 2daeaa9a4a241..fe2ec56258735 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -18,6 +18,7 @@
#include "util/hist.h" /* perf_hist_config */
#include "util/llvm-utils.h" /* perf_llvm_config */
#include "util/stat.h" /* perf_stat__set_big_num */
+#include "util/evsel.h" /* evsel__hw_names, evsel__use_bpf_counters */
#include "build-id.h"
#include "debug.h"
#include "config.h"
@@ -433,6 +434,29 @@ static int perf_buildid_config(const char *var, const char *value)
return 0;
}

+static int perf_stat_config_parse_bpf_counter_event(const char *value)
+{
+ char *event_str, *event_str_, *tok, *saveptr = NULL;
+ int i;
+
+ event_str_ = event_str = strdup(value);
+ if (!event_str)
+ return -1;
+
+ while ((tok = strtok_r(event_str, ",", &saveptr)) != NULL) {
+ for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
+ if (!strcmp(tok, evsel__hw_names[i])) {
+ evsel__use_bpf_counters[i] = true;
+ break;
+ }
+ }
+ event_str = NULL;
+ }
+
+ free(event_str_);
+ return 0;
+}
+
static int perf_default_core_config(const char *var __maybe_unused,
const char *value __maybe_unused)
{
@@ -454,9 +478,17 @@ static int perf_ui_config(const char *var, const char *value)

static int perf_stat_config(const char *var, const char *value)
{
+ int err = 0;
+
if (!strcmp(var, "stat.big-num"))
perf_stat__set_big_num(perf_config_bool(var, value));

+ if (!strcmp(var, "stat.bpf-counter-events")) {
+ err = perf_stat_config_parse_bpf_counter_event(value);
+ if (err)
+ return err;
+ }
+
/* Add other config variables here. */
return 0;
}
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2d2614eeaa20e..592d93bcccd04 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -492,6 +492,8 @@ const char *evsel__hw_names[PERF_COUNT_HW_MAX] = {
"ref-cycles",
};

+bool evsel__use_bpf_counters[PERF_COUNT_HW_MAX] = {false};
+
static const char *__evsel__hw_name(u64 config)
{
if (config < PERF_COUNT_HW_MAX && evsel__hw_names[config])
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index dd4f56f9cfdf5..ca52581f1b179 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -240,6 +240,11 @@ void evsel__calc_id_pos(struct evsel *evsel);

bool evsel__is_cache_op_valid(u8 type, u8 op);

+static inline bool evsel__is_bpf(struct evsel *evsel)
+{
+ return evsel->bpf_counter_ops != NULL;
+}
+
#define EVSEL__MAX_ALIASES 8

extern const char *evsel__hw_cache[PERF_COUNT_HW_CACHE_MAX][EVSEL__MAX_ALIASES];
@@ -247,6 +252,7 @@ extern const char *evsel__hw_cache_op[PERF_COUNT_HW_CACHE_OP_MAX][EVSEL__MAX_ALI
extern const char *evsel__hw_cache_result[PERF_COUNT_HW_CACHE_RESULT_MAX][EVSEL__MAX_ALIASES];
extern const char *evsel__hw_names[PERF_COUNT_HW_MAX];
extern const char *evsel__sw_names[PERF_COUNT_SW_MAX];
+extern bool evsel__use_bpf_counters[PERF_COUNT_HW_MAX];
int __evsel__hw_cache_type_op_res_name(u8 type, u8 op, u8 result, char *bf, size_t size);
const char *evsel__name(struct evsel *evsel);

diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 1bce3eb28ef25..4ff56217f2a65 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -66,11 +66,6 @@ static inline bool target__has_cpu(struct target *target)
return target->system_wide || target->cpu_list;
}

-static inline bool target__has_bpf(struct target *target)
-{
- return target->bpf_str || target->use_bpf;
-}
-
static inline bool target__none(struct target *target)
{
return !target__has_task(target) && !target__has_cpu(target);
--
2.30.2
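
The effect of "perf config stat.bpf-counter-events=instructions" followed by "perf stat -e instructions,cs" can be illustrated with a small standalone sketch. It is not the perf code: the demo_* names and the three-entry name table are assumptions made for illustration, and the list is walked with strcspn() rather than the strdup()/strtok_r() pair used in the patch, so no copy of the value is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_HW_MAX 3

/* Hypothetical, shortened stand-in for evsel__hw_names. */
static const char *demo_hw_names[DEMO_HW_MAX] = {
	"cycles", "instructions", "cache-misses",
};

/* Stand-in for evsel__use_bpf_counters: one flag per known event name. */
static bool demo_use_bpf[DEMO_HW_MAX];

/* Mark every known event named in a comma-separated list; ignore the rest. */
static void demo_parse_bpf_counter_events(const char *value)
{
	while (*value) {
		size_t len = strcspn(value, ",");	/* length of this token */
		int i;

		for (i = 0; i < DEMO_HW_MAX; i++) {
			if (strlen(demo_hw_names[i]) == len &&
			    !strncmp(value, demo_hw_names[i], len)) {
				demo_use_bpf[i] = true;
				break;
			}
		}
		value += len;
		if (*value == ',')
			value++;			/* skip the separator */
	}
}

/* True if the named event was selected for BPF counting by the config. */
static bool demo_event_uses_bpf(const char *name)
{
	int i;

	for (i = 0; i < DEMO_HW_MAX; i++) {
		if (!strcmp(name, demo_hw_names[i]))
			return demo_use_bpf[i];
	}
	return false;
}
```

With the list "instructions", only "instructions" is flagged; "cs" is not in the table at all, so it falls through to regular counting, which matches the mixed session described in the commit message.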

2021-04-08 11:50:07

by Jiri Olsa

Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events

On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
> Currently, to use BPF to aggregate perf event counters, the user uses
> --bpf-counters option. Enable "use bpf by default" events with a config
> option, stat.bpf-counter-events. This is limited to hardware events in
> evsel__hw_names.
>
> This also enables mixing BPF events and regular events in the same session.
> For example:
>
> perf config stat.bpf-counter-events=instructions
> perf stat -e instructions,cs
>

so if we are mixing events now, how about using a modifier for bpf counters,
instead of configuring .perfconfig list we could use:

perf stat -e instructions:b,cs

thoughts?

the change below adds 'b' modifier and sets 'evsel::bpf_counter',
feel free to use it

jirka


---
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index ca52581f1b17..c55e4e58d1dc 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -82,6 +82,7 @@ struct evsel {
bool auto_merge_stats;
bool collect_stat;
bool weak_group;
+ bool bpf_counter;
int bpf_fd;
struct bpf_object *bpf_obj;
};
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 9ecb45bea948..b5850f1ea90b 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1801,6 +1801,7 @@ struct event_modifier {
int pinned;
int weak;
int exclusive;
+ int bpf_counter;
};

static int get_event_modifier(struct event_modifier *mod, char *str,
@@ -1821,6 +1822,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
int exclude = eu | ek | eh;
int exclude_GH = evsel ? evsel->exclude_GH : 0;
int weak = 0;
+ int bpf_counter = 0;

memset(mod, 0, sizeof(*mod));

@@ -1864,6 +1866,8 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
exclusive = 1;
} else if (*str == 'W') {
weak = 1;
+ } else if (*str == 'b') {
+ bpf_counter = 1;
} else
break;

@@ -1895,6 +1899,7 @@ static int get_event_modifier(struct event_modifier *mod, char *str,
mod->sample_read = sample_read;
mod->pinned = pinned;
mod->weak = weak;
+ mod->bpf_counter = bpf_counter;
mod->exclusive = exclusive;

return 0;
@@ -1909,7 +1914,7 @@ static int check_modifier(char *str)
char *p = str;

/* The sizeof includes 0 byte as well. */
- if (strlen(str) > (sizeof("ukhGHpppPSDIWe") - 1))
+ if (strlen(str) > (sizeof("ukhGHpppPSDIWeb") - 1))
return -1;

while (*p) {
@@ -1950,6 +1955,7 @@ int parse_events__modifier_event(struct list_head *list, char *str, bool add)
evsel->sample_read = mod.sample_read;
evsel->precise_max = mod.precise_max;
evsel->weak_group = mod.weak;
+ evsel->bpf_counter = mod.bpf_counter;

if (evsel__is_group_leader(evsel)) {
evsel->core.attr.pinned = mod.pinned;
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 0b36285a9435..fb8646cc3e83 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -210,7 +210,7 @@ name_tag [\'][a-zA-Z_*?\[\]][a-zA-Z0-9_*?\-,\.\[\]:=]*[\']
name_minus [a-zA-Z_*?][a-zA-Z0-9\-_*?.:]*
drv_cfg_term [a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
/* If you add a modifier you need to update check_modifier() */
-modifier_event [ukhpPGHSDIWe]+
+modifier_event [ukhpPGHSDIWeb]+
modifier_bp [rwx]{1,3}

%%
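
The modifier handling in the diff above can be sketched in isolation. This is a simplified illustration, not the perf parser: struct demo_modifier and demo_parse_modifier() are hypothetical names, only four modifier characters are handled, and the flags merely record that a character was seen rather than reproducing perf's exclude logic.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced modifier set; the real parser tracks many more. */
struct demo_modifier {
	bool user;		/* 'u' seen */
	bool kernel;		/* 'k' seen */
	bool weak;		/* 'W' seen */
	bool bpf_counter;	/* 'b' seen: count this event through BPF */
};

/* Returns 0 on success, -1 if an unknown modifier character is found. */
static int demo_parse_modifier(const char *str, struct demo_modifier *mod)
{
	memset(mod, 0, sizeof(*mod));

	for (; *str; str++) {
		switch (*str) {
		case 'u':
			mod->user = true;
			break;
		case 'k':
			mod->kernel = true;
			break;
		case 'W':
			mod->weak = true;
			break;
		case 'b':
			mod->bpf_counter = true;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

For an event spec such as "instructions:b", the string after the colon would be passed in, and only bpf_counter ends up set.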

2021-04-08 16:43:43

by Song Liu

Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events



> On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
>
> On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
>> Currently, to use BPF to aggregate perf event counters, the user uses
>> --bpf-counters option. Enable "use bpf by default" events with a config
>> option, stat.bpf-counter-events. This is limited to hardware events in
>> evsel__hw_names.
>>
>> This also enables mixing BPF events and regular events in the same session.
>> For example:
>>
>> perf config stat.bpf-counter-events=instructions
>> perf stat -e instructions,cs
>>
>
> so if we are mixing events now, how about using a modifier for bpf counters,
> instead of configuring .perfconfig list we could use:
>
> perf stat -e instructions:b,cs
>
> thoughts?
>
> the change below adds 'b' modifier and sets 'evsel::bpf_counter',
> feel free to use it

I think we will need both 'b' modifier and .perfconfig configuration.
For systems with BPF-managed perf events running in the background,
.perfconfig makes sure perf-stat sessions will share PMCs with these
background monitoring tools. 'b' modifier, on the other hand, is useful
when the user knows there is opportunity to share the PMCs.

Does this make sense?

Thanks,
Song

2021-04-08 17:21:09

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events

On Thu, Apr 08, 2021 at 04:39:33PM +0000, Song Liu wrote:
> > On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
> > On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
> >> Currently, to use BPF to aggregate perf event counters, the user uses
> >> --bpf-counters option. Enable "use bpf by default" events with a config
> >> option, stat.bpf-counter-events. This is limited to hardware events in
> >> evsel__hw_names.
> >>
> >> This also enables mixing BPF events and regular events in the same session.
> >> For example:
> >>
> >> perf config stat.bpf-counter-events=instructions
> >> perf stat -e instructions,cs
> >>
> >
> > so if we are mixing events now, how about using a modifier for bpf counters,
> > instead of configuring .perfconfig list we could use:
> >
> > perf stat -e instructions:b,cs
> >
> > thoughts?
> >
> > the change below adds 'b' modifier and sets 'evsel::bpf_counter',
> > feel free to use it
>
> I think we will need both 'b' modifier and .perfconfig configuration.

Agreed, maximum flexibility.

> For systems with BPF-managed perf events running in the background,
> .perfconfig makes sure perf-stat sessions will share PMCs with these
> background monitoring tools. 'b' modifier, on the other hand, is useful
> when the user knows there is opportunity to share the PMCs.
>
> Does this make sense?

I think so.

- Arnaldo

--

- Arnaldo

2021-04-08 17:21:35

by Jiri Olsa

Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events

On Thu, Apr 08, 2021 at 04:39:33PM +0000, Song Liu wrote:
>
>
> > On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
> >
> > On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
> >> Currently, to use BPF to aggregate perf event counters, the user uses
> >> --bpf-counters option. Enable "use bpf by default" events with a config
> >> option, stat.bpf-counter-events. This is limited to hardware events in
> >> evsel__hw_names.
> >>
> >> This also enables mixing BPF events and regular events in the same session.
> >> For example:
> >>
> >> perf config stat.bpf-counter-events=instructions
> >> perf stat -e instructions,cs
> >>
> >
> > so if we are mixing events now, how about using a modifier for bpf counters,
> > instead of configuring .perfconfig list we could use:
> >
> > perf stat -e instructions:b,cs
> >
> > thoughts?
> >
> > the change below adds 'b' modifier and sets 'evsel::bpf_counter',
> > feel free to use it
>
> I think we will need both 'b' modifier and .perfconfig configuration.
> For systems with BPF-managed perf events running in the background,

hum, I'm not sure I understand what that means.. you mean there
are tools that run perf stat so you don't want to change them?

> .perfconfig makes sure perf-stat sessions will share PMCs with these
> background monitoring tools. 'b' modifier, on the other hand, is useful
> when the user knows there is opportunity to share the PMCs.
>
> Does this make sense?

if there's reason for that then sure.. but let's not limit that just
on HARDWARE events only.. there are RAW events with the same demand
for this feature.. why don't we let user define any event for this?

jirka

2021-04-08 17:30:16

by Song Liu

Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events



> On Apr 8, 2021, at 10:20 AM, Jiri Olsa <[email protected]> wrote:
>
> On Thu, Apr 08, 2021 at 04:39:33PM +0000, Song Liu wrote:
>>
>>
>>> On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
>>>
>>> On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
>>>> Currently, to use BPF to aggregate perf event counters, the user uses
>>>> --bpf-counters option. Enable "use bpf by default" events with a config
>>>> option, stat.bpf-counter-events. This is limited to hardware events in
>>>> evsel__hw_names.
>>>>
>>>> This also enables mixing BPF events and regular events in the same session.
>>>> For example:
>>>>
>>>> perf config stat.bpf-counter-events=instructions
>>>> perf stat -e instructions,cs
>>>>
>>>
>>> so if we are mixing events now, how about using a modifier for bpf counters,
>>> instead of configuring .perfconfig list we could use:
>>>
>>> perf stat -e instructions:b,cs
>>>
>>> thoughts?
>>>
>>> the change below adds 'b' modifier and sets 'evsel::bpf_counter',
>>> feel free to use it
>>
>> I think we will need both 'b' modifier and .perfconfig configuration.
>> For systems with BPF-managed perf events running in the background,
>
> hum, I'm not sure I understand what that means.. you mean there
> are tools that run perf stat so you don't want to change them?

We have tools that do perf_event_open(). I will change them to use
BPF managed perf events for "cycles" and "instructions". Since these
tools are running 24/7, perf-stat on the system should use BPF managed
"cycles" and "instructions" by default.

>
>> .perfconfig makes sure perf-stat sessions will share PMCs with these
>> background monitoring tools. 'b' modifier, on the other hand, is useful
>> when the user knows there is opportunity to share the PMCs.
>>
>> Does this make sense?
>
> if there's reason for that then sure.. but let's not limit that just
> on HARDWARE events only.. there are RAW events with the same demand
> for this feature.. why don't we let user define any event for this?

I haven't found a good way to config RAW events. I guess RAW events
could use 'b' modifier?

Thanks,
Song

2021-04-08 17:49:09

by Jiri Olsa

Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events

On Thu, Apr 08, 2021 at 05:28:10PM +0000, Song Liu wrote:
>
>
> > On Apr 8, 2021, at 10:20 AM, Jiri Olsa <[email protected]> wrote:
> >
> > On Thu, Apr 08, 2021 at 04:39:33PM +0000, Song Liu wrote:
> >>
> >>
> >>> On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
> >>>
> >>> On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
> >>>> Currently, to use BPF to aggregate perf event counters, the user uses
> >>>> --bpf-counters option. Enable "use bpf by default" events with a config
> >>>> option, stat.bpf-counter-events. This is limited to hardware events in
> >>>> evsel__hw_names.
> >>>>
> >>>> This also enables mixing BPF events and regular events in the same session.
> >>>> For example:
> >>>>
> >>>> perf config stat.bpf-counter-events=instructions
> >>>> perf stat -e instructions,cs
> >>>>
> >>>
> >>> so if we are mixing events now, how about using a modifier for bpf counters,
> >>> instead of configuring .perfconfig list we could use:
> >>>
> >>> perf stat -e instructions:b,cs
> >>>
> >>> thoughts?
> >>>
> >>> the change below adds 'b' modifier and sets 'evsel::bpf_counter',
> >>> feel free to use it
> >>
> >> I think we will need both 'b' modifier and .perfconfig configuration.
> >> For systems with BPF-managed perf events running in the background,
> >
> > hum, I'm not sure I understand what that means.. you mean there
> > are tools that run perf stat so you don't want to change them?
>
> We have tools that do perf_event_open(). I will change them to use
> BPF managed perf events for "cycles" and "instructions". Since these
> tools are running 24/7, perf-stat on the system should use BPF managed
> "cycles" and "instructions" by default.

well if you are already changing the tools why not change them to add
modifier.. but I don't mind adding that .perfconfig stuff if you need
that

>
> >
> >> .perfconfig makes sure perf-stat sessions will share PMCs with these
> >> background monitoring tools. 'b' modifier, on the other hand, is useful
> >> when the user knows there is opportunity to share the PMCs.
> >>
> >> Does this make sense?
> >
> > if there's reason for that then sure.. but let's not limit that just
> > on HARDWARE events only.. there are RAW events with the same demand
> > for this feature.. why don't we let user define any event for this?
>
> I haven't found a good way to config RAW events. I guess RAW events
> could use 'b' modifier?

any event using the pmu notation like cpu/instructions/

we can allow any event to be BPF-managed, right? IIUC we don't care,
the code will work with any event

jirka

2021-04-08 18:12:03

by Song Liu

Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events



> On Apr 8, 2021, at 10:45 AM, Jiri Olsa <[email protected]> wrote:
>
> On Thu, Apr 08, 2021 at 05:28:10PM +0000, Song Liu wrote:
>>
>>
>>> On Apr 8, 2021, at 10:20 AM, Jiri Olsa <[email protected]> wrote:
>>>
>>> On Thu, Apr 08, 2021 at 04:39:33PM +0000, Song Liu wrote:
>>>>
>>>>
>>>>> On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
>>>>>
>>>>> On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
>>>>>> Currently, to use BPF to aggregate perf event counters, the user uses
>>>>>> --bpf-counters option. Enable "use bpf by default" events with a config
>>>>>> option, stat.bpf-counter-events. This is limited to hardware events in
>>>>>> evsel__hw_names.
>>>>>>
>>>>>> This also enables mixing BPF events and regular events in the same session.
>>>>>> For example:
>>>>>>
>>>>>> perf config stat.bpf-counter-events=instructions
>>>>>> perf stat -e instructions,cs
>>>>>>
>>>>>
>>>>> so if we are mixing events now, how about using a modifier for bpf counters,
>>>>> instead of configuring .perfconfig list we could use:
>>>>>
>>>>> perf stat -e instructions:b,cs
>>>>>
>>>>> thoughts?
>>>>>
>>>>> the change below adds 'b' modifier and sets 'evsel::bpf_counter',
>>>>> feel free to use it
>>>>
>>>> I think we will need both 'b' modifier and .perfconfig configuration.
>>>> For systems with BPF-managed perf events running in the background,
>>>
>>> hum, I'm not sure I understand what that means.. you mean there
>>> are tools that run perf stat so you don't want to change them?
>>
>> We have tools that do perf_event_open(). I will change them to use
>> BPF managed perf events for "cycles" and "instructions". Since these
>> tools are running 24/7, perf-stat on the system should use BPF managed
>> "cycles" and "instructions" by default.
>
> well if you are already changing the tools why not change them to add
> modifier.. but I don't mind adding that .perfconfig stuff if you need
> that

The tools I mentioned here don't use perf-stat; they just call
perf_event_open() and read the perf event fds. We want a config that makes
"cycles" use BPF by default, so that when the user (not these tools)
runs perf-stat, it will share PMCs with those events by default.

>
>>
>>>
>>>> .perfconfig makes sure perf-stat sessions will share PMCs with these
>>>> background monitoring tools. 'b' modifier, on the other hand, is useful
>>>> when the user knows there is opportunity to share the PMCs.
>>>>
>>>> Does this make sense?
>>>
>>> if there's reason for that then sure.. but let's not limit that just
>>> on HARDWARE events only.. there are RAW events with the same demand
>>> for this feature.. why don't we let user define any event for this?
>>
>> I haven't found a good way to config RAW events. I guess RAW events
>> could use 'b' modifier?
> any event using the pmu notation like cpu/instructions/

Can we do something like "perf config stat.bpf-counter-events=cpu/*", meaning
that all "cpu/xx" events use BPF by default?

Thanks,
Song

>
> we can allow any event to be BPF-managed, right? IIUC we don't care,
> the code will work with any event

Yes, the code works with any event.

2021-04-08 18:26:33

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events

On Thu, Apr 08, 2021 at 06:08:20PM +0000, Song Liu wrote:
>
>
> > On Apr 8, 2021, at 10:45 AM, Jiri Olsa <[email protected]> wrote:
> >
> > On Thu, Apr 08, 2021 at 05:28:10PM +0000, Song Liu wrote:
> >>
> >>
> >>> On Apr 8, 2021, at 10:20 AM, Jiri Olsa <[email protected]> wrote:
> >>>
> >>> On Thu, Apr 08, 2021 at 04:39:33PM +0000, Song Liu wrote:
> >>>>
> >>>>
> >>>>> On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
> >>>>>
> >>>>> On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
> >>>>>> Currently, to use BPF to aggregate perf event counters, the user uses
> >>>>>> --bpf-counters option. Enable "use bpf by default" events with a config
> >>>>>> option, stat.bpf-counter-events. This is limited to hardware events in
> >>>>>> evsel__hw_names.
> >>>>>>
> >>>>>> This also enables mixed BPF event and regular event in the same session.
> >>>>>> For example:
> >>>>>>
> >>>>>> perf config stat.bpf-counter-events=instructions
> >>>>>> perf stat -e instructions,cs
> >>>>>>
> >>>>>
> >>>>> so if we are mixing events now, how about using a modifier for bpf counters,
> >>>>> instead of configuring .perfconfig list we could use:
> >>>>>
> >>>>> perf stat -e instructions:b,cs
> >>>>>
> >>>>> thoughts?
> >>>>>
> >>>>> the change below adds 'b' modifier and sets 'evsel::bpf_counter',
> >>>>> feel free to use it
> >>>>
> >>>> I think we will need both 'b' modifier and .perfconfig configuration.
> >>>> For systems with BPF-managed perf events running in the background,
> >>>
> >>> hum, I'm not sure I understand what that means.. you mean there
> >>> are tools that run perf stat so you don't want to change them?
> >>
> >> We have tools that do perf_event_open(). I will change them to use
> >> BPF managed perf events for "cycles" and "instructions". Since these
> >> tools are running 24/7, perf-stat on the system should use BPF managed
> >> "cycles" and "instructions" by default.
> >
> > well if you are already changing the tools why not change them to add
> > modifier.. but I don't mind adding that .perfconfig stuff if you need
> > that
>
> The tools I mentioned here don't use perf-stat, they just use
> perf_event_open() and read the perf events fds. We want a config to make

just curious, how do those tools use perf_event_open?

> "cycles" to use BPF by default, so that when the user (not these tools)
> runs perf-stat, it will share PMCs with those events by default.

I'm sorry but I still don't see the use case.. if you need to change both tools,
you can change them to use bpf-managed events, why bother with the list?

> >
> >>
> >>>
> >>>> .perfconfig makes sure perf-stat sessions will share PMCs with these
> >>>> background monitoring tools. 'b' modifier, on the other hand, is useful
> >>>> when the user knows there is opportunity to share the PMCs.
> >>>>
> >>>> Does this make sense?
> >>>
> >>> if there's reason for that then sure.. but let's not limit that just
> >>> on HARDWARE events only.. there are RAW events with the same demand
> >>> for this feature.. why don't we let user define any event for this?
> >>
> >> I haven't found a good way to config RAW events. I guess RAW events
> >> could use 'b' modifier?
> > > any event using the pmu notation like cpu/instructions/
>
> Can we do something like "perf config stat.bpf-counter-events=cpu/*" means
> all "cpu/xx" events use BPF by default?

I think there's a misunderstanding; all I'm saying is that, IIUC, you check
the events in stat.bpf-counter-events to be of HARDWARE type, which I don't
think is necessary, and we can allow any event in there

jirka

2021-04-08 18:51:34

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events

Em Thu, Apr 08, 2021 at 08:24:47PM +0200, Jiri Olsa escreveu:
> On Thu, Apr 08, 2021 at 06:08:20PM +0000, Song Liu wrote:
> >
> >
> > > On Apr 8, 2021, at 10:45 AM, Jiri Olsa <[email protected]> wrote:
> > >
> > > On Thu, Apr 08, 2021 at 05:28:10PM +0000, Song Liu wrote:
> > >>
> > >>
> > >>> On Apr 8, 2021, at 10:20 AM, Jiri Olsa <[email protected]> wrote:
> > >>>
> > >>> On Thu, Apr 08, 2021 at 04:39:33PM +0000, Song Liu wrote:
> > >>>>
> > >>>>
> > >>>>> On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
> > >>>>>
> > >>>>> On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
> > >>>>>> Currently, to use BPF to aggregate perf event counters, the user uses
> > >>>>>> --bpf-counters option. Enable "use bpf by default" events with a config
> > >>>>>> option, stat.bpf-counter-events. This is limited to hardware events in
> > >>>>>> evsel__hw_names.
> > >>>>>>
> > >>>>>> This also enables mixed BPF event and regular event in the same session.
> > >>>>>> For example:
> > >>>>>>
> > >>>>>> perf config stat.bpf-counter-events=instructions
> > >>>>>> perf stat -e instructions,cs
> > >>>>>>
> > >>>>>
> > >>>>> so if we are mixing events now, how about using a modifier for bpf counters,
> > >>>>> instead of configuring .perfconfig list we could use:
> > >>>>>
> > >>>>> perf stat -e instructions:b,cs
> > >>>>>
> > >>>>> thoughts?
> > >>>>>
> > >>>>> the change below adds 'b' modifier and sets 'evsel::bpf_counter',
> > >>>>> feel free to use it
> > >>>>
> > >>>> I think we will need both 'b' modifier and .perfconfig configuration.
> > >>>> For systems with BPF-managed perf events running in the background,
> > >>>
> > >>> hum, I'm not sure I understand what that means.. you mean there
> > >>> are tools that run perf stat so you don't want to change them?
> > >>
> > >> We have tools that do perf_event_open(). I will change them to use
> > >> BPF managed perf events for "cycles" and "instructions". Since these
> > >> tools are running 24/7, perf-stat on the system should use BPF managed
> > >> "cycles" and "instructions" by default.
> > >
> > > well if you are already changing the tools why not change them to add
> > > modifier.. but I don't mind adding that .perfconfig stuff if you need
> > > that
> >
> > The tools I mentioned here don't use perf-stat, they just use
> > perf_event_open() and read the perf events fds. We want a config to make
>
> just curious, how those tools use perf_event_open?

I.e. do they use tools/lib/perf/? :-)

I guess they will use it now to get that "struct perf_event_attr_map_entry" and
the map name define.

> > "cycles" to use BPF by default, so that when the user (not these tools)
> > runs perf-stat, it will share PMCs with those events by default.

> I'm sorry but I still don't see the usecase.. if you need to change both tools,
> you can change them to use bpf-managed event, why bother with the list?

He wants users not to have to care whether they are using bpf based counters; this
will happen automagically after they set their ~/.perfconfig with some command line
Song provides.

Then they will be using bpf counters that won't get exclusive access to those
scarce counters, the tooling they are using will use bpf-counters and all will
be well.

Right Song?

- Arnaldo

2021-04-08 19:43:06

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events



> On Apr 8, 2021, at 11:50 AM, Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> Em Thu, Apr 08, 2021 at 08:24:47PM +0200, Jiri Olsa escreveu:
>> On Thu, Apr 08, 2021 at 06:08:20PM +0000, Song Liu wrote:
>>>
>>>
>>>> On Apr 8, 2021, at 10:45 AM, Jiri Olsa <[email protected]> wrote:
>>>>
>>>> On Thu, Apr 08, 2021 at 05:28:10PM +0000, Song Liu wrote:
>>>>>
>>>>>
>>>>>> On Apr 8, 2021, at 10:20 AM, Jiri Olsa <[email protected]> wrote:
>>>>>>
>>>>>> On Thu, Apr 08, 2021 at 04:39:33PM +0000, Song Liu wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
>>>>>>>>
>>>>>>>> On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
>>>>>>>>> Currently, to use BPF to aggregate perf event counters, the user uses
>>>>>>>>> --bpf-counters option. Enable "use bpf by default" events with a config
>>>>>>>>> option, stat.bpf-counter-events. This is limited to hardware events in
>>>>>>>>> evsel__hw_names.
>>>>>>>>>
>>>>>>>>> This also enables mixed BPF event and regular event in the same session.
>>>>>>>>> For example:
>>>>>>>>>
>>>>>>>>> perf config stat.bpf-counter-events=instructions
>>>>>>>>> perf stat -e instructions,cs
>>>>>>>>>
>>>>>>>>
>>>>>>>> so if we are mixing events now, how about using a modifier for bpf counters,
>>>>>>>> instead of configuring .perfconfig list we could use:
>>>>>>>>
>>>>>>>> perf stat -e instructions:b,cs
>>>>>>>>
>>>>>>>> thoughts?
>>>>>>>>
>>>>>>>> the change below adds 'b' modifier and sets 'evsel::bpf_counter',
>>>>>>>> feel free to use it
>>>>>>>
>>>>>>> I think we will need both 'b' modifier and .perfconfig configuration.
>>>>>>> For systems with BPF-managed perf events running in the background,
>>>>>>
>>>>>> hum, I'm not sure I understand what that means.. you mean there
>>>>>> are tools that run perf stat so you don't want to change them?
>>>>>
>>>>> We have tools that do perf_event_open(). I will change them to use
>>>>> BPF managed perf events for "cycles" and "instructions". Since these
>>>>> tools are running 24/7, perf-stat on the system should use BPF managed
>>>>> "cycles" and "instructions" by default.
>>>>
>>>> well if you are already changing the tools why not change them to add
>>>> modifier.. but I don't mind adding that .perfconfig stuff if you need
>>>> that
>>>
>>> The tools I mentioned here don't use perf-stat, they just use
>>> perf_event_open() and read the perf events fds. We want a config to make
>>
>> just curious, how those tools use perf_event_open?
>
> I.e. do they use tools/lib/perf/? :-)

Not right now. I do hope we can eventually let them use libperf. But I
haven't figured out the best path forward.

>
> I guess they will use it now for getting that "struct perf_event_attr_map_entry" and
> the map name define.
>
>>> "cycles" to use BPF by default, so that when the user (not these tools)
>>> runs perf-stat, it will share PMCs with those events by default.
>
>> I'm sorry but I still don't see the usecase.. if you need to change both tools,
>> you can change them to use bpf-managed event, why bother with the list?
>
> He wants users not to bother if they are using bpf based counters, this will happen
> automagically after they set their ~/.perfconfig with some command line Song provides.
>
> Then they will be using bpf counters that won't get exclusive access to those
> scarce counters, the tooling they are using will use bpf-counters and all will
> be well.
>
> Right Song?

Yes, exactly. The config automatically switches ad-hoc perf-stat runs (for debug,
performance tuning, etc.) to bpf managed counters.

Thanks,
Song

2021-04-08 19:50:38

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] perf-stat: introduce config stat.bpf-counter-events



> On Apr 8, 2021, at 11:24 AM, Jiri Olsa <[email protected]> wrote:
>
> On Thu, Apr 08, 2021 at 06:08:20PM +0000, Song Liu wrote:
>>
>>
>>> On Apr 8, 2021, at 10:45 AM, Jiri Olsa <[email protected]> wrote:
>>>
>>> On Thu, Apr 08, 2021 at 05:28:10PM +0000, Song Liu wrote:
>>>>
>>>>
>>>>> On Apr 8, 2021, at 10:20 AM, Jiri Olsa <[email protected]> wrote:
>>>>>
>>>>> On Thu, Apr 08, 2021 at 04:39:33PM +0000, Song Liu wrote:
>>>>>>
>>>>>>
>>>>>>> On Apr 8, 2021, at 4:47 AM, Jiri Olsa <[email protected]> wrote:
>>>>>>>
>>>>>>> On Tue, Apr 06, 2021 at 05:36:01PM -0700, Song Liu wrote:
>>>>>>>> Currently, to use BPF to aggregate perf event counters, the user uses
>>>>>>>> --bpf-counters option. Enable "use bpf by default" events with a config
>>>>>>>> option, stat.bpf-counter-events. This is limited to hardware events in
>>>>>>>> evsel__hw_names.
>>>>>>>>
>>>>>>>> This also enables mixed BPF event and regular event in the same session.
>>>>>>>> For example:
>>>>>>>>
>>>>>>>> perf config stat.bpf-counter-events=instructions
>>>>>>>> perf stat -e instructions,cs
>>>>>>>>
>>>>>>>
>>>>>>> so if we are mixing events now, how about using a modifier for bpf counters,
>>>>>>> instead of configuring .perfconfig list we could use:
>>>>>>>
>>>>>>> perf stat -e instructions:b,cs
>>>>>>>
>>>>>>> thoughts?
>>>>>>>
>>>>>>> the change below adds 'b' modifier and sets 'evsel::bpf_counter',
>>>>>>> feel free to use it
>>>>>>
>>>>>> I think we will need both 'b' modifier and .perfconfig configuration.
>>>>>> For systems with BPF-managed perf events running in the background,
>>>>>
>>>>> hum, I'm not sure I understand what that means.. you mean there
>>>>> are tools that run perf stat so you don't want to change them?
>>>>
>>>> We have tools that do perf_event_open(). I will change them to use
>>>> BPF managed perf events for "cycles" and "instructions". Since these
>>>> tools are running 24/7, perf-stat on the system should use BPF managed
>>>> "cycles" and "instructions" by default.
>>>
>>> well if you are already changing the tools why not change them to add
>>> modifier.. but I don't mind adding that .perfconfig stuff if you need
>>> that
>>
>> The tools I mentioned here don't use perf-stat, they just use
>> perf_event_open() and read the perf events fds. We want a config to make
>
> just curious, how those tools use perf_event_open?
>
>> "cycles" to use BPF by default, so that when the user (not these tools)
>> runs perf-stat, it will share PMCs with those events by default.
>
> I'm sorry but I still don't see the usecase.. if you need to change both tools,
> you can change them to use bpf-managed event, why bother with the list?
>
>>>
>>>>
>>>>>
>>>>>> .perfconfig makes sure perf-stat sessions will share PMCs with these
>>>>>> background monitoring tools. 'b' modifier, on the other hand, is useful
>>>>>> when the user knows there is opportunity to share the PMCs.
>>>>>>
>>>>>> Does this make sense?
>>>>>
>>>>> if there's reason for that then sure.. but let's not limit that just
>>>>> on HARDWARE events only.. there are RAW events with the same demand
>>>>> for this feature.. why don't we let user define any event for this?
>>>>
>>>> I haven't found a good way to config RAW events. I guess RAW events
>>>> could use 'b' modifier?
>>> any event using the pmu notation like cpu/instructions/
>>
>> Can we do something like "perf config stat.bpf-counter-events=cpu/*" means
>> all "cpu/xx" events use BPF by default?
>
> I think there's a misunderstanding; all I'm saying is that, IIUC, you check
> the events in stat.bpf-counter-events to be of HARDWARE type, which I don't
> think is necessary, and we can allow any event in there

From what I see, most of the opportunity for sharing comes from a few common
events, like cycles and instructions. The second reason is that the config
implementation is easy and straightforward. We sure can extend the config
to other events. Until then, the 'b' modifier should be good enough for those
use cases.
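
To make the combination of the two mechanisms concrete, here is a rough sketch
of the per-event decision discussed in this thread: an event is BPF-managed if
it carries the 'b' modifier or if its name appears in the
stat.bpf-counter-events list. This is not the code from the patch; struct
event_sketch, event_in_list() and use_bpf_counter() are hypothetical names used
only for illustration. Note that splitting on ',' is naive: event specs in pmu
notation can contain commas themselves, so a config that accepts any event
would need real event parsing.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-event state; the flag discussed above is evsel::bpf_counter. */
struct event_sketch {
	const char *name;	/* event name as typed, without modifiers */
	bool b_modifier;	/* true if the user wrote e.g. "instructions:b" */
};

/* Return true if @name is an exact entry of the comma-separated @list. */
static bool event_in_list(const char *list, const char *name)
{
	size_t name_len;
	const char *p = list;

	assert(list && name);
	name_len = strlen(name);
	if (name_len == 0)
		return false;

	while (*p) {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		/* Compare whole entries only, so "cycle" does not match "cycles". */
		if (len == name_len && strncmp(p, name, len) == 0)
			return true;
		if (!end)
			break;
		p = end + 1;
	}
	return false;
}

/* An event uses a BPF-managed counter if either mechanism asks for it. */
static bool use_bpf_counter(const struct event_sketch *ev, const char *config_list)
{
	return ev->b_modifier || event_in_list(config_list, ev->name);
}
```

With the config list set to "instructions" and the events "instructions" and
"cs" (neither with the 'b' modifier), this sketch marks only "instructions" as
BPF-managed, which corresponds to the mixed session from the commit message.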

Thanks,
Song