changes in v4:
- Add enum support to tracepoint arguments
- For tracing syscalls, move trace__load_vmlinux_btf() to
syscall__set_arg_fmts() to make it more elegant
changes in v3:
- Fixed another awkward formatting issue in trace__load_vmlinux_btf()
changes in v2:
- Fix formatting issues
- Pass a &use_btf to syscall_arg_fmt__init_array(), instead of
traversing all the arguments again.
- Add a trace__load_vmlinux_btf() function to load vmlinux BTF
- Add member 'btf_entry' in 'struct syscall_arg_fmt' to save the entry to
the corresponding 'struct btf_member' object, without having to do
btf__find_by_name(), btf__type_by_id(), btf_enum(), and btf_vlen()
everytime a syscall enters.
===
v4 notes
before:
perf $ ./perf trace -e timer:hrtimer_start --max-events=1
0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466c25f18, function: 0xffffffff89da5be0, expires: 377432432256753, softexpires: 377432432256753, mode: 10)
after:
perf $ ./perf trace -e timer:hrtimer_start --max-events=1
0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466c25f18, function: 0xffffffff89da5be0, expires: 377732805560049, softexpires: 377732805560049, mode: HRTIMER_MODE_ABS_PINNED_HARD)
HRTIMER_MODE_ABS_PINNED_HARD is:
perf $ pahole hrtimer_mode
enum hrtimer_mode {
HRTIMER_MODE_ABS = 0,
HRTIMER_MODE_REL = 1,
HRTIMER_MODE_PINNED = 2,
HRTIMER_MODE_SOFT = 4,
HRTIMER_MODE_HARD = 8,
HRTIMER_MODE_ABS_PINNED = 2,
HRTIMER_MODE_REL_PINNED = 3,
HRTIMER_MODE_ABS_SOFT = 4,
HRTIMER_MODE_REL_SOFT = 5,
HRTIMER_MODE_ABS_PINNED_SOFT = 6,
HRTIMER_MODE_REL_PINNED_SOFT = 7,
HRTIMER_MODE_ABS_HARD = 8,
HRTIMER_MODE_REL_HARD = 9,
HRTIMER_MODE_ABS_PINNED_HARD = 10,
HRTIMER_MODE_REL_PINNED_HARD = 11,
};
this: HRTIMER_MODE_ABS_PINNED_HARD = 10,
Can also be tested by
./perf trace -e pagemap:mm_lru_insertion,timer:hrtimer_start,timer:hrtimer_init,skb:kfree_skb --max-events=10
(Chose these 4 tracepoints because they happen quite frequently.)
By using the command above, I got:
perf $ ./perf trace -e pagemap:mm_lru_insertion,timer:hrtimer_start,timer:hrtimer_init,skb:kfree_skb --max-events=10
0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466c25f18, function: 0xffffffff89da5be0, expires: 376189979047665, softexpires: 376189979047665, mode: HRTIMER_MODE_ABS_PINNED_HARD)
0.053 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466ca5f18, function: 0xffffffff89da5be0, expires: 376230939043569, softexpires: 376230939043569, mode: HRTIMER_MODE_ABS_PINNED_HARD)
0.099 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466ca5f18, function: 0xffffffff89da5be0, expires: 376111885722141, softexpires: 376111885722141, mode: HRTIMER_MODE_ABS_PINNED_HARD)
0.242 tmux: server/296019 timer:hrtimer_init(hrtimer: 0xffffb872caa1f628, clockid: 1, mode: HRTIMER_MODE_ABS)
0.244 tmux: server/296019 timer:hrtimer_start(hrtimer: 0xffffb872caa1f628, function: 0xffffffff89d90420, expires: 376111894496606, softexpires: 376111894446606, mode: HRTIMER_MODE_ABS)
0.349 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466ca5f18, function: 0xffffffff89da5be0, expires: 376230939043569, softexpires: 376230939043569, mode: HRTIMER_MODE_ABS_PINNED_HARD)
0.486 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466ca5f18, function: 0xffffffff89da5be0, expires: 376111885722141, softexpires: 376111885722141, mode: HRTIMER_MODE_ABS_PINNED_HARD)
0.499 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466ca5f18, function: 0xffffffff89da5be0, expires: 376230939043569, softexpires: 376230939043569, mode: HRTIMER_MODE_ABS_PINNED_HARD)
0.501 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466ca5f18, function: 0xffffffff89da5be0, expires: 376111885722141, softexpires: 376111885722141, mode: HRTIMER_MODE_ABS_PINNED_HARD)
0.506 :0/0 timer:hrtimer_start(hrtimer: 0xffff974466ca5f18, function: 0xffffffff89da5be0, expires: 376230939043569, softexpires: 376230939043569, mode: HRTIMER_MODE_ABS_PINNED_HARD)
If you want to test all tracepoints that has enum arguments, and
that enum argument is retrievable in vmlinux:
./perf trace -e power:dev_pm_qos_add_request,power:dev_pm_qos_remove_request,power:dev_pm_qos_update_request,error_report:error_report_end,error_report:error_report_end,timer:hrtimer_init,timer:hrtimer_start,i2c_slave:i2c_slave,cfg80211:cfg80211_get_bss,pagemap:mm_lru_insertion,migrate:mm_migrate_pages,migrate:mm_migrate_pages_start,cfg80211:rdev_auth,cfg80211:rdev_connect,cfg80211:rdev_start_ap,cfg80211:cfg80211_chandef_dfs_required,cfg80211:cfg80211_ch_switch_notify,cfg80211:cfg80211_ch_switch_started_notify,cfg80211:cfg80211_get_bss,cfg80211:cfg80211_ibss_joined,cfg80211:cfg80211_inform_bss_frame,cfg80211:cfg80211_radar_event,cfg80211:cfg80211_ready_on_channel_expired,cfg80211:cfg80211_ready_on_channel,cfg80211:cfg80211_reg_can_beacon,cfg80211:cfg80211_return_bss,cfg80211:cfg80211_tx_mgmt_expired,cfg80211:rdev_channel_switch,cfg80211:rdev_inform_bss,cfg80211:rdev_libertas_set_mesh_channel,cfg80211:rdev_mgmt_tx,cfg80211:rdev_remain_on_channel,cfg80211:rdev_return_chandef,cfg80211:rdev_return_int_survey_info,cfg80211:rdev_set_ap_chanwidth,cfg80211:rdev_set_monitor_channel,cfg80211:rdev_set_radar_background,cfg80211:rdev_start_ap,cfg80211:rdev_start_radar_detection,cfg80211:rdev_tdls_channel_switch,cfg80211:cfg80211_reg_can_beacon,cfg80211:rdev_add_virtual_intf,cfg80211:rdev_change_virtual_intf,sched:sched_skip_vma_numa,power:pm_qos_update_flags,power:pm_qos_update_target,pwm:pwm_apply,pwm:pwm_get,skb:kfree_skb,thermal:thermal_zone_trip,xen:xen_mc_batch,xen:xen_mc_issue,xen:xen_mc_extend_args,xen:xen_mc_extend_args,xen:xen_mc_flush_reason,xen:xen_mc_flush_reason,compaction:mm_compaction_defer_compaction,compaction:mm_compaction_deferred,compaction:mm_compaction_defer_reset,compaction:mm_compaction_finished,compaction:mm_compaction_kcompactd_wake,compaction:mm_compaction_suitable,compaction:mm_compaction_wakeup_kcompactd --max-events=20
But whether this enum is in vmlinux may vary on different versions.
If you want to find out which enum type is in vmlinux, please use:
vmlinux_dir $ bpftool btf dump file /sys/kernel/btf/vmlinux > vmlinux
vmlinux_dir $ while read l; do grep "ENUM '$l'" vmlinux; done < <(grep field:enum /sys/kernel/tracing/events/*/*/format | awk '{print $3}' | sort | uniq) | awk '{print $3}' | sed "s/'\(.*\)'/\1/g"
dev_pm_qos_req_type
error_detector
hrtimer_mode
i2c_slave_event
ieee80211_bss_type
lru_list
migrate_mode
nl80211_auth_type
nl80211_band
nl80211_iftype
numa_vmaskip_reason
pm_qos_req_action
pwm_polarity
skb_drop_reason
thermal_trip_type
xen_lazy_mode
xen_mc_extend_args
xen_mc_flush_reason
zone_type
And what tracepoints have these enum types as their arguments:
vmlinux_dir $ while read l; do grep "ENUM '$l'" vmlinux; done < <(grep field:enum /sys/kernel/tracing/events/*/*/format | awk '{print $3}' | sort | uniq) | awk '{print $3}' | sed "s/'\(.*\)'/\1/g" > good_enums
vmlinux_dir $ cat good_enums
dev_pm_qos_req_type
error_detector
hrtimer_mode
i2c_slave_event
ieee80211_bss_type
lru_list
migrate_mode
nl80211_auth_type
nl80211_band
nl80211_iftype
numa_vmaskip_reason
pm_qos_req_action
pwm_polarity
skb_drop_reason
thermal_trip_type
xen_lazy_mode
xen_mc_extend_args
xen_mc_flush_reason
zone_type
vmlinux_dir $ grep -f good_enums -l /sys/kernel/tracing/events/*/*/format
/sys/kernel/tracing/events/cfg80211/cfg80211_chandef_dfs_required/format
/sys/kernel/tracing/events/cfg80211/cfg80211_ch_switch_notify/format
/sys/kernel/tracing/events/cfg80211/cfg80211_ch_switch_started_notify/format
/sys/kernel/tracing/events/cfg80211/cfg80211_get_bss/format
/sys/kernel/tracing/events/cfg80211/cfg80211_ibss_joined/format
/sys/kernel/tracing/events/cfg80211/cfg80211_inform_bss_frame/format
/sys/kernel/tracing/events/cfg80211/cfg80211_radar_event/format
/sys/kernel/tracing/events/cfg80211/cfg80211_ready_on_channel_expired/format
/sys/kernel/tracing/events/cfg80211/cfg80211_ready_on_channel/format
/sys/kernel/tracing/events/cfg80211/cfg80211_reg_can_beacon/format
/sys/kernel/tracing/events/cfg80211/cfg80211_return_bss/format
/sys/kernel/tracing/events/cfg80211/cfg80211_tx_mgmt_expired/format
/sys/kernel/tracing/events/cfg80211/rdev_add_virtual_intf/format
/sys/kernel/tracing/events/cfg80211/rdev_auth/format
/sys/kernel/tracing/events/cfg80211/rdev_change_virtual_intf/format
/sys/kernel/tracing/events/cfg80211/rdev_channel_switch/format
/sys/kernel/tracing/events/cfg80211/rdev_connect/format
/sys/kernel/tracing/events/cfg80211/rdev_inform_bss/format
/sys/kernel/tracing/events/cfg80211/rdev_libertas_set_mesh_channel/format
/sys/kernel/tracing/events/cfg80211/rdev_mgmt_tx/format
/sys/kernel/tracing/events/cfg80211/rdev_remain_on_channel/format
/sys/kernel/tracing/events/cfg80211/rdev_return_chandef/format
/sys/kernel/tracing/events/cfg80211/rdev_return_int_survey_info/format
/sys/kernel/tracing/events/cfg80211/rdev_set_ap_chanwidth/format
/sys/kernel/tracing/events/cfg80211/rdev_set_monitor_channel/format
/sys/kernel/tracing/events/cfg80211/rdev_set_radar_background/format
/sys/kernel/tracing/events/cfg80211/rdev_start_ap/format
/sys/kernel/tracing/events/cfg80211/rdev_start_radar_detection/format
/sys/kernel/tracing/events/cfg80211/rdev_tdls_channel_switch/format
/sys/kernel/tracing/events/compaction/mm_compaction_defer_compaction/format
/sys/kernel/tracing/events/compaction/mm_compaction_deferred/format
/sys/kernel/tracing/events/compaction/mm_compaction_defer_reset/format
/sys/kernel/tracing/events/compaction/mm_compaction_finished/format
/sys/kernel/tracing/events/compaction/mm_compaction_kcompactd_wake/format
/sys/kernel/tracing/events/compaction/mm_compaction_suitable/format
/sys/kernel/tracing/events/compaction/mm_compaction_wakeup_kcompactd/format
/sys/kernel/tracing/events/error_report/error_report_end/format
/sys/kernel/tracing/events/i2c_slave/i2c_slave/format
/sys/kernel/tracing/events/migrate/mm_migrate_pages/format
/sys/kernel/tracing/events/migrate/mm_migrate_pages_start/format
/sys/kernel/tracing/events/pagemap/mm_lru_insertion/format
/sys/kernel/tracing/events/power/dev_pm_qos_add_request/format
/sys/kernel/tracing/events/power/dev_pm_qos_remove_request/format
/sys/kernel/tracing/events/power/dev_pm_qos_update_request/format
/sys/kernel/tracing/events/power/pm_qos_update_flags/format
/sys/kernel/tracing/events/power/pm_qos_update_target/format
/sys/kernel/tracing/events/pwm/pwm_apply/format
/sys/kernel/tracing/events/pwm/pwm_get/format
/sys/kernel/tracing/events/sched/sched_skip_vma_numa/format
/sys/kernel/tracing/events/skb/kfree_skb/format
/sys/kernel/tracing/events/thermal/thermal_zone_trip/format
/sys/kernel/tracing/events/timer/hrtimer_init/format
/sys/kernel/tracing/events/timer/hrtimer_start/format
/sys/kernel/tracing/events/xen/xen_mc_batch/format
/sys/kernel/tracing/events/xen/xen_mc_extend_args/format
/sys/kernel/tracing/events/xen/xen_mc_flush_reason/format
/sys/kernel/tracing/events/xen/xen_mc_issue/format
This is how I got the crazy long list of tracepoints above.
From there, please use the command below to get the events, and use it as ./perf trace -e's
argument
vmlinux_dir $ grep -f good_enums -l /sys/kernel/tracing/events/*/*/format | sed "s/.*events\/\(.*\)\/\(.*\)\/.*/\1:\2/g"
cfg80211:cfg80211_chandef_dfs_required
cfg80211:cfg80211_ch_switch_notify
cfg80211:cfg80211_ch_switch_started_notify
cfg80211:cfg80211_get_bss
cfg80211:cfg80211_ibss_joined
cfg80211:cfg80211_inform_bss_frame
cfg80211:cfg80211_radar_event
cfg80211:cfg80211_ready_on_channel_expired
cfg80211:cfg80211_ready_on_channel
cfg80211:cfg80211_reg_can_beacon
cfg80211:cfg80211_return_bss
cfg80211:cfg80211_tx_mgmt_expired
cfg80211:rdev_add_virtual_intf
cfg80211:rdev_auth
cfg80211:rdev_change_virtual_intf
cfg80211:rdev_channel_switch
cfg80211:rdev_connect
cfg80211:rdev_inform_bss
cfg80211:rdev_libertas_set_mesh_channel
cfg80211:rdev_mgmt_tx
cfg80211:rdev_remain_on_channel
cfg80211:rdev_return_chandef
cfg80211:rdev_return_int_survey_info
cfg80211:rdev_set_ap_chanwidth
cfg80211:rdev_set_monitor_channel
cfg80211:rdev_set_radar_background
cfg80211:rdev_start_ap
cfg80211:rdev_start_radar_detection
cfg80211:rdev_tdls_channel_switch
compaction:mm_compaction_defer_compaction
compaction:mm_compaction_deferred
compaction:mm_compaction_defer_reset
compaction:mm_compaction_finished
compaction:mm_compaction_kcompactd_wake
compaction:mm_compaction_suitable
compaction:mm_compaction_wakeup_kcompactd
error_report:error_report_end
i2c_slave:i2c_slave
migrate:mm_migrate_pages
migrate:mm_migrate_pages_start
pagemap:mm_lru_insertion
power:dev_pm_qos_add_request
power:dev_pm_qos_remove_request
power:dev_pm_qos_update_request
power:pm_qos_update_flags
power:pm_qos_update_target
pwm:pwm_apply
pwm:pwm_get
sched:sched_skip_vma_numa
skb:kfree_skb
thermal:thermal_zone_trip
timer:hrtimer_init
timer:hrtimer_start
xen:xen_mc_batch
xen:xen_mc_extend_args
xen:xen_mc_flush_reason
xen:xen_mc_issue
===
v2 notes
In 'struct syscall_arg_fmt':
```
struct {
void *entry;
u16 nr_entries;
} btf_entry;
```
This is the new member btf_entry. 'struct btf_member' object, so that
we don't have to do btf__find_by_name(), btf__type_by_id(), btf_enum(),
and btf_vlen() everytime a landlock_add_rule() syscall entered.
Note that entry is of type 'void *', because this btf_entry can also be
applied to 'struct btf_member *' for 'BTF_KIND_STRUCT', hopefully in the
future.
===
v1 notes
This is a feature implemented on the basis of the previous bug fix
https://lore.kernel.org/linux-perf-users/[email protected]/T/#t
In this patch, BTF is used to turn enum value to the corresponding
name. There is only one system call that uses enum value as its
argument, that is `landlock_add_rule()`.
The vmlinux btf is loaded lazily, when user decided to trace the
`landlock_add_rule` syscall. But if one decide to run `perf trace`
without any arguments, the behaviour is to trace `landlock_add_rule`,
so vmlinux btf will be loaded by default.
The laziest behaviour is to load vmlinux btf when a
`landlock_add_rule` syscall hits. But I think you could lose some
samples when loading vmlinux btf at run time, for it can delay the
handling of other samples. I might need your precious opinions on
this...
before:
```
perf $ ./perf trace -e landlock_add_rule
0.000 ( 0.008 ms): ldlck-test/438194 landlock_add_rule(rule_type: 2) = -1 EBADFD (File descriptor in bad state)
0.010 ( 0.001 ms): ldlck-test/438194 landlock_add_rule(rule_type: 1) = -1 EBADFD (File descriptor in bad state)
```
after:
```
perf $ ./perf trace -e landlock_add_rule
0.000 ( 0.029 ms): ldlck-test/438194 landlock_add_rule(rule_type: LANDLOCK_RULE_NET_PORT) = -1 EBADFD (File descriptor in bad state)
0.036 ( 0.004 ms): ldlck-test/438194 landlock_add_rule(rule_type: LANDLOCK_RULE_PATH_BENEATH) = -1 EBADFD (File descriptor in bad state)
```
Signed-off-by: Howard Chu <[email protected]>
---
tools/perf/builtin-trace.c | 121 ++++++++++++++++++++++++++++++++++---
1 file changed, 111 insertions(+), 10 deletions(-)
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 5cbe1748911d..0a168cb9b0c2 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -19,6 +19,7 @@
#ifdef HAVE_LIBBPF_SUPPORT
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
+#include <bpf/btf.h>
#ifdef HAVE_BPF_SKEL
#include "bpf_skel/augmented_raw_syscalls.skel.h"
#endif
@@ -110,6 +111,11 @@ struct syscall_arg_fmt {
const char *name;
u16 nr_entries; // for arrays
bool show_zero;
+ bool is_enum;
+ struct {
+ void *entry;
+ u16 nr_entries;
+ } btf_entry;
};
struct syscall_fmt {
@@ -140,6 +146,7 @@ struct trace {
#ifdef HAVE_BPF_SKEL
struct augmented_raw_syscalls_bpf *skel;
#endif
+ struct btf *btf;
struct record_opts opts;
struct evlist *evlist;
struct machine *host;
@@ -887,6 +894,56 @@ static size_t syscall_arg__scnprintf_getrandom_flags(char *bf, size_t size,
#define SCA_GETRANDOM_FLAGS syscall_arg__scnprintf_getrandom_flags
+static int btf_enum_find_entry(struct btf *btf, char *type, struct syscall_arg_fmt *arg_fmt)
+{
+ const struct btf_type *bt;
+ char enum_prefix[][16] = { "enum", "const enum" }, *ep;
+ int id;
+ size_t i;
+
+ for (i = 0; i < ARRAY_SIZE(enum_prefix); i++) {
+ ep = enum_prefix[i];
+ if (strlen(type) > strlen(ep) + 1 && strstarts(type, ep))
+ type += strlen(ep) + 1;
+ }
+
+ id = btf__find_by_name(btf, type);
+ if (id < 0)
+ return -1;
+
+ bt = btf__type_by_id(btf, id);
+ if (bt == NULL)
+ return -1;
+
+ arg_fmt->btf_entry.entry = btf_enum(bt);
+ arg_fmt->btf_entry.nr_entries = btf_vlen(bt);
+
+ return 0;
+}
+
+static size_t btf_enum_scnprintf(char *bf, size_t size, int val, struct btf *btf, char *type,
+ struct syscall_arg_fmt *arg_fmt)
+{
+ struct btf_enum *be;
+ int i;
+
+ /* if btf_entry is NULL, find and save it to arg_fmt */
+ if (arg_fmt->btf_entry.entry == NULL)
+ if (btf_enum_find_entry(btf, type, arg_fmt))
+ return 0;
+
+ be = (struct btf_enum *)arg_fmt->btf_entry.entry;
+
+ for (i = 0; i < arg_fmt->btf_entry.nr_entries; ++i, ++be) {
+ if (be->val == val) {
+ return scnprintf(bf, size, "%s",
+ btf__name_by_offset(btf, be->name_off));
+ }
+ }
+
+ return 0;
+}
+
#define STRARRAY(name, array) \
{ .scnprintf = SCA_STRARRAY, \
.strtoul = STUL_STRARRAY, \
@@ -1238,6 +1295,7 @@ struct syscall {
bool is_exit;
bool is_open;
bool nonexistent;
+ bool use_btf;
struct tep_format_field *args;
const char *name;
const struct syscall_fmt *fmt;
@@ -1699,6 +1757,15 @@ static void trace__symbols__exit(struct trace *trace)
symbol__exit();
}
+static void trace__load_vmlinux_btf(struct trace *trace)
+{
+ trace->btf = btf__load_vmlinux_btf();
+ if (verbose > 0) {
+ fprintf(trace->output, trace->btf ? "vmlinux BTF loaded\n" :
+ "Failed to load vmlinux BTF\n");
+ }
+}
+
static int syscall__alloc_arg_fmts(struct syscall *sc, int nr_args)
{
int idx;
@@ -1744,7 +1811,8 @@ static const struct syscall_arg_fmt *syscall_arg_fmt__find_by_name(const char *n
}
static struct tep_format_field *
-syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field *field)
+syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field *field,
+ bool *use_btf)
{
struct tep_format_field *last_field = NULL;
int len;
@@ -1756,6 +1824,7 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
continue;
len = strlen(field->name);
+ arg->is_enum = false;
if (strcmp(field->type, "const char *") == 0 &&
((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
@@ -1782,6 +1851,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
* 7 unsigned long
*/
arg->scnprintf = SCA_FD;
+ } else if (strstr(field->type, "enum") && use_btf != NULL) {
+ *use_btf = arg->is_enum = true;
} else {
const struct syscall_arg_fmt *fmt =
syscall_arg_fmt__find_by_name(field->name);
@@ -1796,9 +1867,14 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
return last_field;
}
-static int syscall__set_arg_fmts(struct syscall *sc)
+static int syscall__set_arg_fmts(struct trace *trace, struct syscall *sc)
{
- struct tep_format_field *last_field = syscall_arg_fmt__init_array(sc->arg_fmt, sc->args);
+ bool use_btf;
+ struct tep_format_field *last_field = syscall_arg_fmt__init_array(sc->arg_fmt, sc->args,
+ &use_btf);
+
+ if (use_btf && trace->btf == NULL)
+ trace__load_vmlinux_btf(trace);
if (last_field)
sc->args_size = last_field->offset + last_field->size;
@@ -1883,15 +1959,20 @@ static int trace__read_syscall_info(struct trace *trace, int id)
sc->is_exit = !strcmp(name, "exit_group") || !strcmp(name, "exit");
sc->is_open = !strcmp(name, "open") || !strcmp(name, "openat");
- return syscall__set_arg_fmts(sc);
+ return syscall__set_arg_fmts(trace, sc);
}
-static int evsel__init_tp_arg_scnprintf(struct evsel *evsel)
+static int evsel__init_tp_arg_scnprintf(struct trace *trace, struct evsel *evsel)
{
struct syscall_arg_fmt *fmt = evsel__syscall_arg_fmt(evsel);
+ bool use_btf;
if (fmt != NULL) {
- syscall_arg_fmt__init_array(fmt, evsel->tp_format->format.fields);
+ syscall_arg_fmt__init_array(fmt, evsel->tp_format->format.fields, &use_btf);
+
+ if (use_btf && trace->btf == NULL)
+ trace__load_vmlinux_btf(trace);
+
return 0;
}
@@ -2103,6 +2184,16 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
if (trace->show_arg_names)
printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
+ if (sc->arg_fmt[arg.idx].is_enum && trace->btf) {
+ size_t p = btf_enum_scnprintf(bf + printed, size - printed, val,
+ trace->btf, field->type,
+ &sc->arg_fmt[arg.idx]);
+ if (p) {
+ printed += p;
+ continue;
+ }
+ }
+
printed += syscall_arg_fmt__scnprintf_val(&sc->arg_fmt[arg.idx],
bf + printed, size - printed, &arg, val);
}
@@ -2791,7 +2882,7 @@ static size_t trace__fprintf_tp_fields(struct trace *trace, struct evsel *evsel,
val = syscall_arg_fmt__mask_val(arg, &syscall_arg, val);
/* Suppress this argument if its value is zero and show_zero property isn't set. */
- if (val == 0 && !trace->show_zeros && !arg->show_zero)
+ if (val == 0 && !trace->show_zeros && !arg->show_zero && !arg->is_enum)
continue;
printed += scnprintf(bf + printed, size - printed, "%s", printed ? ", " : "");
@@ -2799,6 +2890,15 @@ static size_t trace__fprintf_tp_fields(struct trace *trace, struct evsel *evsel,
if (trace->show_arg_names)
printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
+ if (arg->is_enum && trace->btf) {
+ size_t p = btf_enum_scnprintf(bf + printed, size - printed, val, trace->btf,
+ field->type, arg);
+ if (p) {
+ printed += p;
+ continue;
+ }
+ }
+
printed += syscall_arg_fmt__scnprintf_val(arg, bf + printed, size - printed, &syscall_arg, val);
}
@@ -4461,8 +4561,9 @@ static void evsel__set_syscall_arg_fmt(struct evsel *evsel, const char *name)
}
}
-static int evlist__set_syscall_tp_fields(struct evlist *evlist)
+static int evlist__set_syscall_tp_fields(struct trace *trace)
{
+ struct evlist *evlist = trace->evlist;
struct evsel *evsel;
evlist__for_each_entry(evlist, evsel) {
@@ -4470,7 +4571,7 @@ static int evlist__set_syscall_tp_fields(struct evlist *evlist)
continue;
if (strcmp(evsel->tp_format->system, "syscalls")) {
- evsel__init_tp_arg_scnprintf(evsel);
+ evsel__init_tp_arg_scnprintf(trace, evsel);
continue;
}
@@ -4949,7 +5050,7 @@ int cmd_trace(int argc, const char **argv)
if (trace.evlist->core.nr_entries > 0) {
evlist__set_default_evsel_handler(trace.evlist, trace__event_handler);
- if (evlist__set_syscall_tp_fields(trace.evlist)) {
+ if (evlist__set_syscall_tp_fields(&trace)) {
perror("failed to set syscalls:* tracepoint fields");
goto out;
}
--
2.45.2
On Thu, Jun 13, 2024 at 12:27:47PM +0800, Howard Chu wrote:
> changes in v4:
>
> - Add enum support to tracepoint arguments
That is cool, but see below the comment as having this as a separate
patch.
Also please, on the patch that introduces ! syscall tracepoint enum args
BTF augmentation include examples of tracepoints being augmented. I'll
try here while testing the patch as-is.
More comments below.
> - For tracing syscalls, move trace__load_vmlinux_btf() to
> syscall__set_arg_fmts() to make it more elegant
>
> changes in v3:
>
> - Fixed another awkward formatting issue in trace__load_vmlinux_btf()
>
> changes in v2:
>
> - Fix formatting issues
<SNIP>
> before:
>
> ```
> perf $ ./perf trace -e landlock_add_rule
> 0.000 ( 0.008 ms): ldlck-test/438194 landlock_add_rule(rule_type: 2) = -1 EBADFD (File descriptor in bad state)
> 0.010 ( 0.001 ms): ldlck-test/438194 landlock_add_rule(rule_type: 1) = -1 EBADFD (File descriptor in bad state)
> ```
>
> after:
>
> ```
> perf $ ./perf trace -e landlock_add_rule
> 0.000 ( 0.029 ms): ldlck-test/438194 landlock_add_rule(rule_type: LANDLOCK_RULE_NET_PORT) = -1 EBADFD (File descriptor in bad state)
> 0.036 ( 0.004 ms): ldlck-test/438194 landlock_add_rule(rule_type: LANDLOCK_RULE_PATH_BENEATH) = -1 EBADFD (File descriptor in bad state)
> ```
>
> Signed-off-by: Howard Chu <[email protected]>
> ---
> tools/perf/builtin-trace.c | 121 ++++++++++++++++++++++++++++++++++---
> 1 file changed, 111 insertions(+), 10 deletions(-)
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 5cbe1748911d..0a168cb9b0c2 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -19,6 +19,7 @@
> #ifdef HAVE_LIBBPF_SUPPORT
> #include <bpf/bpf.h>
> #include <bpf/libbpf.h>
> +#include <bpf/btf.h>
> #ifdef HAVE_BPF_SKEL
> #include "bpf_skel/augmented_raw_syscalls.skel.h"
> #endif
> @@ -110,6 +111,11 @@ struct syscall_arg_fmt {
> const char *name;
> u16 nr_entries; // for arrays
> bool show_zero;
> + bool is_enum;
> + struct {
> + void *entry;
> + u16 nr_entries;
> + } btf_entry;
> };
>
> struct syscall_fmt {
> @@ -140,6 +146,7 @@ struct trace {
> #ifdef HAVE_BPF_SKEL
> struct augmented_raw_syscalls_bpf *skel;
> #endif
> + struct btf *btf;
> struct record_opts opts;
> struct evlist *evlist;
> struct machine *host;
> @@ -887,6 +894,56 @@ static size_t syscall_arg__scnprintf_getrandom_flags(char *bf, size_t size,
>
> #define SCA_GETRANDOM_FLAGS syscall_arg__scnprintf_getrandom_flags
>
> +static int btf_enum_find_entry(struct btf *btf, char *type, struct syscall_arg_fmt *arg_fmt)
> +{
> + const struct btf_type *bt;
> + char enum_prefix[][16] = { "enum", "const enum" }, *ep;
> + int id;
> + size_t i;
> +
> + for (i = 0; i < ARRAY_SIZE(enum_prefix); i++) {
> + ep = enum_prefix[i];
> + if (strlen(type) > strlen(ep) + 1 && strstarts(type, ep))
> + type += strlen(ep) + 1;
> + }
> +
> + id = btf__find_by_name(btf, type);
> + if (id < 0)
> + return -1;
> +
> + bt = btf__type_by_id(btf, id);
> + if (bt == NULL)
> + return -1;
> +
> + arg_fmt->btf_entry.entry = btf_enum(bt);
> + arg_fmt->btf_entry.nr_entries = btf_vlen(bt);
> +
> + return 0;
> +}
> +
> +static size_t btf_enum_scnprintf(char *bf, size_t size, int val, struct btf *btf, char *type,
> + struct syscall_arg_fmt *arg_fmt)
> +{
> + struct btf_enum *be;
> + int i;
> +
> + /* if btf_entry is NULL, find and save it to arg_fmt */
> + if (arg_fmt->btf_entry.entry == NULL)
> + if (btf_enum_find_entry(btf, type, arg_fmt))
> + return 0;
> +
> + be = (struct btf_enum *)arg_fmt->btf_entry.entry;
> +
> + for (i = 0; i < arg_fmt->btf_entry.nr_entries; ++i, ++be) {
> + if (be->val == val) {
> + return scnprintf(bf, size, "%s",
> + btf__name_by_offset(btf, be->name_off));
> + }
> + }
> +
> + return 0;
> +}
> +
> #define STRARRAY(name, array) \
> { .scnprintf = SCA_STRARRAY, \
> .strtoul = STUL_STRARRAY, \
> @@ -1238,6 +1295,7 @@ struct syscall {
> bool is_exit;
> bool is_open;
> bool nonexistent;
> + bool use_btf;
> struct tep_format_field *args;
> const char *name;
> const struct syscall_fmt *fmt;
> @@ -1699,6 +1757,15 @@ static void trace__symbols__exit(struct trace *trace)
> symbol__exit();
> }
>
> +static void trace__load_vmlinux_btf(struct trace *trace)
> +{
> + trace->btf = btf__load_vmlinux_btf();
> + if (verbose > 0) {
> + fprintf(trace->output, trace->btf ? "vmlinux BTF loaded\n" :
> + "Failed to load vmlinux BTF\n");
> + }
> +}
> +
> static int syscall__alloc_arg_fmts(struct syscall *sc, int nr_args)
> {
> int idx;
> @@ -1744,7 +1811,8 @@ static const struct syscall_arg_fmt *syscall_arg_fmt__find_by_name(const char *n
> }
>
> static struct tep_format_field *
> -syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field *field)
> +syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field *field,
> + bool *use_btf)
> {
> struct tep_format_field *last_field = NULL;
> int len;
> @@ -1756,6 +1824,7 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> continue;
>
> len = strlen(field->name);
> + arg->is_enum = false;
>
> if (strcmp(field->type, "const char *") == 0 &&
> ((len >= 4 && strcmp(field->name + len - 4, "name") == 0) ||
> @@ -1782,6 +1851,8 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> * 7 unsigned long
> */
> arg->scnprintf = SCA_FD;
> + } else if (strstr(field->type, "enum") && use_btf != NULL) {
> + *use_btf = arg->is_enum = true;
> } else {
> const struct syscall_arg_fmt *fmt =
> syscall_arg_fmt__find_by_name(field->name);
> @@ -1796,9 +1867,14 @@ syscall_arg_fmt__init_array(struct syscall_arg_fmt *arg, struct tep_format_field
> return last_field;
> }
>
> -static int syscall__set_arg_fmts(struct syscall *sc)
> +static int syscall__set_arg_fmts(struct trace *trace, struct syscall *sc)
See the comment about evsel__init_tp_arg_scnprintf() below. Also please
do patches on top of previous work, i.e. the v3 patch should be a
separate patch and this v4 should add the extra functionality, i.e. the
support for !syscall tracepoint enum BTF augmentation.
> {
> - struct tep_format_field *last_field = syscall_arg_fmt__init_array(sc->arg_fmt, sc->args);
> + bool use_btf;
> + struct tep_format_field *last_field = syscall_arg_fmt__init_array(sc->arg_fmt, sc->args,
> + &use_btf);
> +
> + if (use_btf && trace->btf == NULL)
> + trace__load_vmlinux_btf(trace);
>
> if (last_field)
> sc->args_size = last_field->offset + last_field->size;
> @@ -1883,15 +1959,20 @@ static int trace__read_syscall_info(struct trace *trace, int id)
> sc->is_exit = !strcmp(name, "exit_group") || !strcmp(name, "exit");
> sc->is_open = !strcmp(name, "open") || !strcmp(name, "openat");
>
> - return syscall__set_arg_fmts(sc);
> + return syscall__set_arg_fmts(trace, sc);
> }
>
> -static int evsel__init_tp_arg_scnprintf(struct evsel *evsel)
> +static int evsel__init_tp_arg_scnprintf(struct trace *trace, struct evsel *evsel)
The convention here is that evsel__ is the "class" name, so the first
arg is a 'struct evsel *', if you really were transforming this into a
'struct trace' specific "method" you would change the name of the C
function to 'trace__init_tp_arg_scnprintf'.
But in this case instead of passing the 'struct trace' pointer all the
way down we should instead pass a 'bool *use_btf' argument, making it:
static int evsel__init_tp_arg_scnprintf(struct evsel *evsel, bool *use_btf)
Then, when evlist__set_syscall_tp_fields(evlist, &use_btf) returns,
check that use_btf to check if we need to call
trace__load_vmlinux_btf(trace).
I'll test the patch as is now.
- Arnaldo
> {
> struct syscall_arg_fmt *fmt = evsel__syscall_arg_fmt(evsel);
> + bool use_btf;
>
> if (fmt != NULL) {
> - syscall_arg_fmt__init_array(fmt, evsel->tp_format->format.fields);
> + syscall_arg_fmt__init_array(fmt, evsel->tp_format->format.fields, &use_btf);
> +
> + if (use_btf && trace->btf == NULL)
> + trace__load_vmlinux_btf(trace);
> +
> return 0;
> }
>
> @@ -2103,6 +2184,16 @@ static size_t syscall__scnprintf_args(struct syscall *sc, char *bf, size_t size,
> if (trace->show_arg_names)
> printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
>
> + if (sc->arg_fmt[arg.idx].is_enum && trace->btf) {
> + size_t p = btf_enum_scnprintf(bf + printed, size - printed, val,
> + trace->btf, field->type,
> + &sc->arg_fmt[arg.idx]);
> + if (p) {
> + printed += p;
> + continue;
> + }
> + }
> +
> printed += syscall_arg_fmt__scnprintf_val(&sc->arg_fmt[arg.idx],
> bf + printed, size - printed, &arg, val);
> }
> @@ -2791,7 +2882,7 @@ static size_t trace__fprintf_tp_fields(struct trace *trace, struct evsel *evsel,
> val = syscall_arg_fmt__mask_val(arg, &syscall_arg, val);
>
> /* Suppress this argument if its value is zero and show_zero property isn't set. */
> - if (val == 0 && !trace->show_zeros && !arg->show_zero)
> + if (val == 0 && !trace->show_zeros && !arg->show_zero && !arg->is_enum)
> continue;
>
> printed += scnprintf(bf + printed, size - printed, "%s", printed ? ", " : "");
> @@ -2799,6 +2890,15 @@ static size_t trace__fprintf_tp_fields(struct trace *trace, struct evsel *evsel,
> if (trace->show_arg_names)
> printed += scnprintf(bf + printed, size - printed, "%s: ", field->name);
>
> + if (arg->is_enum && trace->btf) {
> + size_t p = btf_enum_scnprintf(bf + printed, size - printed, val, trace->btf,
> + field->type, arg);
> + if (p) {
> + printed += p;
> + continue;
> + }
> + }
> +
> printed += syscall_arg_fmt__scnprintf_val(arg, bf + printed, size - printed, &syscall_arg, val);
> }
>
> @@ -4461,8 +4561,9 @@ static void evsel__set_syscall_arg_fmt(struct evsel *evsel, const char *name)
> }
> }
>
> -static int evlist__set_syscall_tp_fields(struct evlist *evlist)
> +static int evlist__set_syscall_tp_fields(struct trace *trace)
> {
> + struct evlist *evlist = trace->evlist;
> struct evsel *evsel;
>
> evlist__for_each_entry(evlist, evsel) {
> @@ -4470,7 +4571,7 @@ static int evlist__set_syscall_tp_fields(struct evlist *evlist)
> continue;
>
> if (strcmp(evsel->tp_format->system, "syscalls")) {
> - evsel__init_tp_arg_scnprintf(evsel);
> + evsel__init_tp_arg_scnprintf(trace, evsel);
> continue;
> }
>
> @@ -4949,7 +5050,7 @@ int cmd_trace(int argc, const char **argv)
>
> if (trace.evlist->core.nr_entries > 0) {
> evlist__set_default_evsel_handler(trace.evlist, trace__event_handler);
> - if (evlist__set_syscall_tp_fields(trace.evlist)) {
> + if (evlist__set_syscall_tp_fields(&trace)) {
> perror("failed to set syscalls:* tracepoint fields");
> goto out;
> }
> --
> 2.45.2
>
On Thu, Jun 13, 2024 at 09:47:02AM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Jun 13, 2024 at 12:27:47PM +0800, Howard Chu wrote:
> > changes in v4:
> > - Add enum support to tracepoint arguments
> That is cool, but see below the comment as having this as a separate
> patch.
>
> Also please, on the patch that introduces ! syscall tracepoint enum args
> BTF augmentation include examples of tracepoints being augmented. I'll
You did it as a notes for v4, great, I missed that.
> try here while testing the patch as-is.
The landlock_add_rule continues to work, using the same test program I
posted when testing your v1 patch:
root@x1:~# perf trace -e landlock_add_rule
0.000 ( 0.016 ms): landlock_add_r/475518 landlock_add_rule(ruleset_fd: 1, rule_type: LANDLOCK_RULE_PATH_BENEATH, rule_attr: 0x7ffd790ff690) = -1 EBADFD (File descriptor in bad state)
0.115 ( 0.003 ms): landlock_add_r/475518 landlock_add_rule(ruleset_fd: 2, rule_type: LANDLOCK_RULE_NET_PORT, rule_attr: 0x7ffd790ff690) = -1 EBADFD (File descriptor in bad state)
Now lets try with some of the !syscalls tracepoints with enum args:
root@x1:~# perf trace -e timer:hrtimer_start --max-events=5
0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff225050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
18446744073709.551 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff2a5050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
0.007 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff325050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
0.007 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff3a5050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
18446744073709.543 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff425050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
root@x1:~#
Cool, it works!
Now lets try and use it with filters, to get something other than HRTIMER_MODE_ABS:
root@x1:~# perf trace -e timer:hrtimer_start --filter='mode!=HRTIMER_MODE_ABS' --max-events=5
No resolver (strtoul) for "mode" in "timer:hrtimer_start", can't set filter "(mode!=HRTIMER_MODE_ABS) && (common_pid != 475859 && common_pid != 4041)"
root@x1:~#
oops, that is the next step then :-)
If I do:
root@x1:~# pahole --contains_enumerator=HRTIMER_MODE_ABS
enum hrtimer_mode {
HRTIMER_MODE_ABS = 0,
HRTIMER_MODE_REL = 1,
HRTIMER_MODE_PINNED = 2,
HRTIMER_MODE_SOFT = 4,
HRTIMER_MODE_HARD = 8,
HRTIMER_MODE_ABS_PINNED = 2,
HRTIMER_MODE_REL_PINNED = 3,
HRTIMER_MODE_ABS_SOFT = 4,
HRTIMER_MODE_REL_SOFT = 5,
HRTIMER_MODE_ABS_PINNED_SOFT = 6,
HRTIMER_MODE_REL_PINNED_SOFT = 7,
HRTIMER_MODE_ABS_HARD = 8,
HRTIMER_MODE_REL_HARD = 9,
HRTIMER_MODE_ABS_PINNED_HARD = 10,
HRTIMER_MODE_REL_PINNED_HARD = 11,
}
root@x1:~#
And then use the value for HRTIMER_MODE_ABS instead:
root@x1:~# perf trace -e timer:hrtimer_start --filter='mode!=0' --max-events=1
0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff225050, function: 0xffffffff9e22ddd0, expires: 210759990000000, softexpires: 210759990000000, mode: HRTIMER_MODE_ABS_PINNED_HARD)
root@x1:~#
Now also filtering HRTIMER_MODE_ABS_PINNED_HARD:
root@x1:~# perf trace -e timer:hrtimer_start --filter='mode!=0 && mode != 10' --max-events=2
0.000 podman/178137 timer:hrtimer_start(hrtimer: 0xffffa2024468fda8, function: 0xffffffff9e2170c0, expires: 210886679225214, softexpires: 210886679175214, mode: HRTIMER_MODE_REL)
32.935 podman/5046 timer:hrtimer_start(hrtimer: 0xffffa20244fabc40, function: 0xffffffff9e2170c0, expires: 210886712159707, softexpires: 210886712109707, mode: HRTIMER_MODE_REL)
root@x1:~#
But this then should be a _third_ patch :-)
We're making progress!
- Arnaldo
On Thu, Jun 13, 2024 at 09:47:02AM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Jun 13, 2024 at 12:27:47PM +0800, Howard Chu wrote:
> > perf $ ./perf trace -e landlock_add_rule
> > 0.000 ( 0.029 ms): ldlck-test/438194 landlock_add_rule(rule_type: LANDLOCK_RULE_NET_PORT) = -1 EBADFD (File descriptor in bad state)
> > 0.036 ( 0.004 ms): ldlck-test/438194 landlock_add_rule(rule_type: LANDLOCK_RULE_PATH_BENEATH) = -1 EBADFD (File descriptor in bad state)
> > ```
> > Signed-off-by: Howard Chu <[email protected]>
And when someone suggests you do something and you implement it, a
Suggested-by: tag is as documented in:
⬢[acme@toolbox perf-tools-next]$ grep -A5 Suggested-by Documentation/process/submitting-patches.rst
Using Reported-by:, Tested-by:, Reviewed-by:, Suggested-by: and Fixes:
----------------------------------------------------------------------
The Reported-by tag gives credit to people who find bugs and report them and it
hopefully inspires them to help us again in the future. The tag is intended for
bugs; please do not use it to credit feature requests. The tag should be
--
A Suggested-by: tag indicates that the patch idea is suggested by the person
named and ensures credit to the person for the idea. Please note that this
tag should not be added without the reporter's permission, especially if the
idea was not posted in a public forum. That said, if we diligently credit our
idea reporters, they will, hopefully, be inspired to help us again in the
future.
⬢[acme@toolbox perf-tools-next]$
- Arnaldo
Hello Arnaldo,
Thanks for testing and reviewing this patch, and your precious suggestions.
On Thu, Jun 13, 2024 at 8:59 PM Arnaldo Carvalho de Melo
<[email protected]> wrote:
>
> On Thu, Jun 13, 2024 at 09:47:02AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Thu, Jun 13, 2024 at 12:27:47PM +0800, Howard Chu wrote:
> > > changes in v4:
>
> > > - Add enum support to tracepoint arguments
>
> > That is cool, but see below the comment as having this as a separate
> > patch.
> >
> > Also please, on the patch that introduces ! syscall tracepoint enum args
> > BTF augmentation include examples of tracepoints being augmented. I'll
>
> You did it as a notes for v4, great, I missed that.
>
> > try here while testing the patch as-is.
>
> The landlock_add_rule continues to work, using the same test program I
> posted when testing your v1 patch:
>
> root@x1:~# perf trace -e landlock_add_rule
> 0.000 ( 0.016 ms): landlock_add_r/475518 landlock_add_rule(ruleset_fd: 1, rule_type: LANDLOCK_RULE_PATH_BENEATH, rule_attr: 0x7ffd790ff690) = -1 EBADFD (File descriptor in bad state)
> 0.115 ( 0.003 ms): landlock_add_r/475518 landlock_add_rule(ruleset_fd: 2, rule_type: LANDLOCK_RULE_NET_PORT, rule_attr: 0x7ffd790ff690) = -1 EBADFD (File descriptor in bad state)
>
> Now lets try with some of the !syscalls tracepoints with enum args:
>
> root@x1:~# perf trace -e timer:hrtimer_start --max-events=5
> 0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff225050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> 18446744073709.551 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff2a5050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> 0.007 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff325050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> 0.007 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff3a5050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> 18446744073709.543 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff425050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> root@x1:~#
>
> Cool, it works!
>
> Now lets try and use it with filters, to get something other than HRTIMER_MODE_ABS:
>
> root@x1:~# perf trace -e timer:hrtimer_start --filter='mode!=HRTIMER_MODE_ABS' --max-events=5
> No resolver (strtoul) for "mode" in "timer:hrtimer_start", can't set filter "(mode!=HRTIMER_MODE_ABS) && (common_pid != 475859 && common_pid != 4041)"
> root@x1:~#
>
>
> oops, that is the next step then :-)
Sure, I will add support for enum filtering(enum string -> int).
>
> If I do:
>
> root@x1:~# pahole --contains_enumerator=HRTIMER_MODE_ABS
> enum hrtimer_mode {
> HRTIMER_MODE_ABS = 0,
> HRTIMER_MODE_REL = 1,
> HRTIMER_MODE_PINNED = 2,
> HRTIMER_MODE_SOFT = 4,
> HRTIMER_MODE_HARD = 8,
> HRTIMER_MODE_ABS_PINNED = 2,
> HRTIMER_MODE_REL_PINNED = 3,
> HRTIMER_MODE_ABS_SOFT = 4,
> HRTIMER_MODE_REL_SOFT = 5,
> HRTIMER_MODE_ABS_PINNED_SOFT = 6,
> HRTIMER_MODE_REL_PINNED_SOFT = 7,
> HRTIMER_MODE_ABS_HARD = 8,
> HRTIMER_MODE_REL_HARD = 9,
> HRTIMER_MODE_ABS_PINNED_HARD = 10,
> HRTIMER_MODE_REL_PINNED_HARD = 11,
> }
> root@x1:~#
>
> And then use the value for HRTIMER_MODE_ABS instead:
>
> root@x1:~# perf trace -e timer:hrtimer_start --filter='mode!=0' --max-events=1
> 0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff225050, function: 0xffffffff9e22ddd0, expires: 210759990000000, softexpires: 210759990000000, mode: HRTIMER_MODE_ABS_PINNED_HARD)
> root@x1:~#
>
> Now also filtering HRTIMER_MODE_ABS_PINNED_HARD:
>
> root@x1:~# perf trace -e timer:hrtimer_start --filter='mode!=0 && mode != 10' --max-events=2
> 0.000 podman/178137 timer:hrtimer_start(hrtimer: 0xffffa2024468fda8, function: 0xffffffff9e2170c0, expires: 210886679225214, softexpires: 210886679175214, mode: HRTIMER_MODE_REL)
> 32.935 podman/5046 timer:hrtimer_start(hrtimer: 0xffffa20244fabc40, function: 0xffffffff9e2170c0, expires: 210886712159707, softexpires: 210886712109707, mode: HRTIMER_MODE_REL)
> root@x1:~#
>
> But this then should be a _third_ patch :-)
Sure.
>
> We're making progress!
>
> - Arnaldo
> See the comment about evsel__init_tp_arg_scnprintf() below. Also please
> do patches on top of previous work, i.e. the v3 patch should be a
> separate patch and this v4 should add the extra functionality, i.e. the
> support for !syscall tracepoint enum BTF augmentation.
Thank you for suggesting this. May I ask if this is saying that v3 and
v4 should all be separated?
> The convention here is that evsel__ is the "class" name, so the first
> arg is a 'struct evsel *', if you really were transforming this into a
> 'struct trace' specific "method" you would change the name of the C
> function to 'trace__init_tp_arg_scnprintf'.
Oops, my bad. Thanks for pointing it out.
>
> But in this case instead of passing the 'struct trace' pointer all the
> way down we should instead pass a 'bool *use_btf' argument, making it:
>
>
> static int evsel__init_tp_arg_scnprintf(struct evsel *evsel, bool *use_btf)
You are right, we should do that. Thanks for pointing out this silly
implementation. I think we should do the same for
syscall__set_arg_fmts(struct trace *trace, struct syscall *sc) as
well. Also, I forgot to delete the unused 'bool use_btf' in struct
syscall, I will delete it.
>
> Then, when evlist__set_syscall_tp_fields(evlist, &use_btf) returns,
> check that use_btf to check if we need to call
> trace__load_vmlinux_btf(trace).
> And when someone suggests you do something and you implement it, a
> Suggested-by: tag is as documented in:
>
> ⬢[acme@toolbox perf-tools-next]$ grep -A5 Suggested-by Documentation/process/submitting-patches.rst
> Using Reported-by:, Tested-by:, Reviewed-by:, Suggested-by: and Fixes:
> A Suggested-by: tag indicates that the patch idea is suggested by the person
> named and ensures credit to the person for the idea. Please note that this
> tag should not be added without the reporter's permission
May I ask if you want a Suggested-by? Hats off to you sir.
Also, do you want me to do the fixes on evsel__init_tp_arg_scnprintf()
for tracepoint enum, and send it as v5, or just send a separate patch
for tracepoint enum, so we get a patch for syscall enum, and another
patch for tracepoint enum? Sorry to bother you on these trivial
things.
Thanks again for this detailed review, and valuable suggestions.
Thanks,
Howard
On Thu, Jun 13, 2024 at 11:50:59PM +0800, Howard Chu wrote:
> Thanks for testing and reviewing this patch, and your precious suggestions.
You're welcome
> On Thu, Jun 13, 2024 at 8:59 PM Arnaldo Carvalho de Melo <[email protected]> wrote:
> > > On Thu, Jun 13, 2024 at 12:27:47PM +0800, Howard Chu wrote:
> > > > changes in v4:
> > > > - Add enum support to tracepoint arguments
> > > That is cool, but see below the comment as having this as a separate
> > > patch.
> > > Also please, on the patch that introduces ! syscall tracepoint enum args
> > > BTF augmentation include examples of tracepoints being augmented. I'll
> > You did it as a notes for v4, great, I missed that.
> > > try here while testing the patch as-is.
> > The landlock_add_rule continues to work, using the same test program I
> > posted when testing your v1 patch:
> > root@x1:~# perf trace -e landlock_add_rule
> > 0.000 ( 0.016 ms): landlock_add_r/475518 landlock_add_rule(ruleset_fd: 1, rule_type: LANDLOCK_RULE_PATH_BENEATH, rule_attr: 0x7ffd790ff690) = -1 EBADFD (File descriptor in bad state)
> > 0.115 ( 0.003 ms): landlock_add_r/475518 landlock_add_rule(ruleset_fd: 2, rule_type: LANDLOCK_RULE_NET_PORT, rule_attr: 0x7ffd790ff690) = -1 EBADFD (File descriptor in bad state)
> > Now lets try with some of the !syscalls tracepoints with enum args:
> > root@x1:~# perf trace -e timer:hrtimer_start --max-events=5
> > 0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff225050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> > 18446744073709.551 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff2a5050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> > 0.007 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff325050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> > 0.007 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff3a5050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> > 18446744073709.543 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff425050, function: 0xffffffff9e22ddd0, expires: 210588861000000, softexpires: 210588861000000, mode: HRTIMER_MODE_ABS)
> > root@x1:~#
> > Cool, it works!
> > Now lets try and use it with filters, to get something other than HRTIMER_MODE_ABS:
> > root@x1:~# perf trace -e timer:hrtimer_start --filter='mode!=HRTIMER_MODE_ABS' --max-events=5
> > No resolver (strtoul) for "mode" in "timer:hrtimer_start", can't set filter "(mode!=HRTIMER_MODE_ABS) && (common_pid != 475859 && common_pid != 4041)"
> > root@x1:~#
> > oops, that is the next step then :-)
> Sure, I will add support for enum filtering(enum string -> int).
Cool
> > If I do:
> > root@x1:~# pahole --contains_enumerator=HRTIMER_MODE_ABS
> > enum hrtimer_mode {
> > HRTIMER_MODE_ABS = 0,
> > HRTIMER_MODE_REL = 1,
> > HRTIMER_MODE_PINNED = 2,
> > HRTIMER_MODE_SOFT = 4,
> > HRTIMER_MODE_HARD = 8,
> > HRTIMER_MODE_ABS_PINNED = 2,
> > HRTIMER_MODE_REL_PINNED = 3,
> > HRTIMER_MODE_ABS_SOFT = 4,
> > HRTIMER_MODE_REL_SOFT = 5,
> > HRTIMER_MODE_ABS_PINNED_SOFT = 6,
> > HRTIMER_MODE_REL_PINNED_SOFT = 7,
> > HRTIMER_MODE_ABS_HARD = 8,
> > HRTIMER_MODE_REL_HARD = 9,
> > HRTIMER_MODE_ABS_PINNED_HARD = 10,
> > HRTIMER_MODE_REL_PINNED_HARD = 11,
> > }
> > root@x1:~#
> > And then use the value for HRTIMER_MODE_ABS instead:
> > root@x1:~# perf trace -e timer:hrtimer_start --filter='mode!=0' --max-events=1
> > 0.000 :0/0 timer:hrtimer_start(hrtimer: 0xffff8d4eff225050, function: 0xffffffff9e22ddd0, expires: 210759990000000, softexpires: 210759990000000, mode: HRTIMER_MODE_ABS_PINNED_HARD)
> > root@x1:~#
> > Now also filtering HRTIMER_MODE_ABS_PINNED_HARD:
> > root@x1:~# perf trace -e timer:hrtimer_start --filter='mode!=0 && mode != 10' --max-events=2
> > 0.000 podman/178137 timer:hrtimer_start(hrtimer: 0xffffa2024468fda8, function: 0xffffffff9e2170c0, expires: 210886679225214, softexpires: 210886679175214, mode: HRTIMER_MODE_REL)
> > 32.935 podman/5046 timer:hrtimer_start(hrtimer: 0xffffa20244fabc40, function: 0xffffffff9e2170c0, expires: 210886712159707, softexpires: 210886712109707, mode: HRTIMER_MODE_REL)
> > root@x1:~#
> > But this then should be a _third_ patch :-)
>
> Sure.
> > We're making progress!
> > See the comment about evsel__init_tp_arg_scnprintf() below. Also please
> > do patches on top of previous work, i.e. the v3 patch should be a
> > separate patch and this v4 should add the extra functionality, i.e. the
> > support for !syscall tracepoint enum BTF augmentation.
> Thank you for suggesting this. May I ask if this is saying that v3 and
> v4 should all be separated?
Yes, I suggest you extract from v4 the updated contents of v3 and have
it as a "perf trace: Augment enum syscall arguments with BTF", have the
examples of such syscalls before and after.
Then have another patch, that assumes that first patch with the fix and
the "perf trace: Augment enum syscall arguments with BTF" are applied
that will add support for augmenting non-syscall tracepoints with enum
arguments with BTF.
> > The convention here is that evsel__ is the "class" name, so the first
> > arg is a 'struct evsel *', if you really were transforming this into a
> > 'struct trace' specific "method" you would change the name of the C
> > function to 'trace__init_tp_arg_scnprintf'.
> Oops, my bad. Thanks for pointing it out.
> > But in this case instead of passing the 'struct trace' pointer all the
> > way down we should instead pass a 'bool *use_btf' argument, making it:
> > static int evsel__init_tp_arg_scnprintf(struct evsel *evsel, bool *use_btf)
> You are right, we should do that. Thanks for pointing out this silly
> implementation. I think we should do the same for
> syscall__set_arg_fmts(struct trace *trace, struct syscall *sc) as
> well. Also, I forgot to delete the unused 'bool use_btf' in struct
> syscall, I will delete it.
>
> > Then, when evlist__set_syscall_tp_fields(evlist, &use_btf) returns,
> > check that use_btf to check if we need to call
> > trace__load_vmlinux_btf(trace).
> > And when someone suggests you do something and you implement it, a
> > Suggested-by: tag is as documented in:
> > ⬢[acme@toolbox perf-tools-next]$ grep -A5 Suggested-by Documentation/process/submitting-patches.rst
> > Using Reported-by:, Tested-by:, Reviewed-by:, Suggested-by: and Fixes:
> > A Suggested-by: tag indicates that the patch idea is suggested by the person
> > named and ensures credit to the person for the idea. Please note that this
> > tag should not be added without the reporter's permission
> May I ask if you want a Suggested-by? Hats off to you sir.
yes, it is appropriate in this case.
> Also, do you want me to do the fixes on evsel__init_tp_arg_scnprintf()
If its separate from what you are doing, yes, you do the fix then
continue with the new features.
> for tracepoint enum, and send it as v5, or just send a separate patch
> for tracepoint enum, so we get a patch for syscall enum, and another
> patch for tracepoint enum? Sorry to bother you on these trivial
> things.
> Thanks again for this detailed review, and valuable suggestions.
- Arnaldo