This set catches symbols for all bpf programs loaded/unloaded
before/during/after a perf-record run, using PERF_RECORD_KSYMBOL and
PERF_RECORD_BPF_EVENT.
PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT include key information
about bpf program load and unload. They are sent through the perf ring
buffer and stored in perf.data. PERF_RECORD_KSYMBOL includes the basic
information needed for simple profiling. It is ON by default.
PERF_RECORD_BPF_EVENT is used to gather more information about the bpf
program. It is necessary for perf-annotate of bpf programs.
Before this set, perf-report cannot recover symbols of bpf programs
once the programs are unloaded.
This follows up on Alexei's earlier effort [2] to show bpf programs via
mmap events.
Thanks,
Song
Changes v5 -> PATCH v6:
1. Reduce len in PERF_RECORD_KSYMBOL from u64 to u32. Use the 32 free bits
for ksym_type (u16) and flags (u16).
Changes v4 -> PATCH v5:
1. Fixed build error reported by kbuild test bot.
Changes v3 -> PATCH v4:
1. Split information about bpf program into PERF_RECORD_KSYMBOL (with
name, addr, len); and PERF_RECORD_BPF_EVENT (with id, tag);
2. Split the implementation in kernel and user space.
Changes v2 -> PATCH v3:
1. Rebase on bpf-next tree, and on top of BPF sub program tag patches [1]
for latest information in bpf_prog_info.
2. Complete handling and synthesizing PERF_RECORD_BPF_EVENT in perf.
Changes v1 -> PATCH v2:
1. Only 3 of the 5 patches in v1, to focus on ABI first;
2. Generate PERF_RECORD_BPF_EVENT per bpf sub program instead of per prog;
3. Modify PERF_RECORD_BPF_EVENT with more details (addr, len, name),
so that it can be used for basic profiling without calling sys_bpf.
Changes RFC -> PATCH v1:
1. In perf-record, poll vip events in a separate thread;
2. Add tag to bpf prog name;
3. Small refactorings.
[1] https://patchwork.ozlabs.org/project/netdev/list/?series=81037
[2] https://www.spinics.net/lists/netdev/msg524232.html
Song Liu (7):
perf, bpf: Introduce PERF_RECORD_KSYMBOL
sync tools/include/uapi/linux/perf_event.h
perf, bpf: introduce PERF_RECORD_BPF_EVENT
sync tools/include/uapi/linux/perf_event.h
perf util: handle PERF_RECORD_KSYMBOL
perf util: handle PERF_RECORD_BPF_EVENT
perf tools: synthesize PERF_RECORD_* for loaded BPF programs
include/linux/filter.h | 7 +
include/linux/perf_event.h | 19 +++
include/uapi/linux/perf_event.h | 53 ++++++-
kernel/bpf/core.c | 2 +-
kernel/bpf/syscall.c | 2 +
kernel/events/core.c | 218 ++++++++++++++++++++++++-
tools/include/uapi/linux/perf_event.h | 53 ++++++-
tools/perf/builtin-record.c | 7 +
tools/perf/perf.h | 1 +
tools/perf/util/Build | 2 +
tools/perf/util/bpf-event.c | 220 ++++++++++++++++++++++++++
tools/perf/util/bpf-event.h | 16 ++
tools/perf/util/event.c | 41 +++++
tools/perf/util/event.h | 36 +++++
tools/perf/util/evsel.c | 19 +++
tools/perf/util/evsel.h | 2 +
tools/perf/util/machine.c | 60 +++++++
tools/perf/util/machine.h | 3 +
tools/perf/util/session.c | 8 +
tools/perf/util/tool.h | 5 +-
20 files changed, 769 insertions(+), 5 deletions(-)
create mode 100644 tools/perf/util/bpf-event.c
create mode 100644 tools/perf/util/bpf-event.h
--
2.17.1
This patch handles PERF_RECORD_KSYMBOL in perf record/report.
Specifically, a map and a symbol are created on ksymbol register, and
the map is removed on ksymbol unregister.
This patch also sets perf_event_attr.ksymbol properly. The flag is
ON by default.
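The register/unregister handling described above can be pictured with a
small stand-alone sketch. It is not the perf implementation: the table
type, the helper names (ksym_find, ksym_process) and the fixed capacity
are hypothetical, and an existing range is simply overwritten on
register, whereas the patch inserts a new symbol into the existing map's
dso.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER from the uapi header. */
#define KSYM_FLAG_UNREGISTER (1 << 0)
/* Hypothetical capacity, chosen only for this sketch. */
#define KSYM_MAX_RANGES 64

struct ksym_range {
	uint64_t start;
	uint64_t end;		/* exclusive */
	char name[64];
	int used;
};

struct ksym_table {
	struct ksym_range r[KSYM_MAX_RANGES];
};

/* Return the range that contains addr, or NULL if none is registered. */
static struct ksym_range *ksym_find(struct ksym_table *t, uint64_t addr)
{
	assert(t);
	for (int i = 0; i < KSYM_MAX_RANGES; i++) {
		struct ksym_range *r = &t->r[i];

		if (r->used && addr >= r->start && addr < r->end)
			return r;
	}
	return NULL;
}

/*
 * Apply one ksymbol record: drop the range on unregister, otherwise
 * create (or overwrite) the range [addr, addr + len).
 * Returns 0 on success, -1 if the table is full.
 */
static int ksym_process(struct ksym_table *t, uint64_t addr, uint32_t len,
			uint16_t flags, const char *name)
{
	struct ksym_range *r = ksym_find(t, addr);

	if (flags & KSYM_FLAG_UNREGISTER) {
		if (r)
			r->used = 0;
		return 0;
	}

	if (!r) {
		for (int i = 0; i < KSYM_MAX_RANGES && !r; i++)
			if (!t->r[i].used)
				r = &t->r[i];
		if (!r)
			return -1;
	}

	r->start = addr;
	r->end = addr + len;
	strncpy(r->name, name, sizeof(r->name) - 1);
	r->name[sizeof(r->name) - 1] = '\0';
	r->used = 1;
	return 0;
}
```

A sample address would then be resolved with ksym_find() for as long as
the range is registered; after an unregister record the lookup returns
NULL again.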
Signed-off-by: Song Liu <[email protected]>
---
tools/perf/util/event.c | 21 +++++++++++++++
tools/perf/util/event.h | 20 ++++++++++++++
tools/perf/util/evsel.c | 9 +++++++
tools/perf/util/evsel.h | 1 +
tools/perf/util/machine.c | 57 +++++++++++++++++++++++++++++++++++++++
tools/perf/util/machine.h | 3 +++
tools/perf/util/session.c | 4 +++
tools/perf/util/tool.h | 4 ++-
8 files changed, 118 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 937a5a4f71cc..3c8a6a8dd260 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -24,6 +24,7 @@
#include "symbol/kallsyms.h"
#include "asm/bug.h"
#include "stat.h"
+#include "session.h"
#define DEFAULT_PROC_MAP_PARSE_TIMEOUT 500
@@ -45,6 +46,7 @@ static const char *perf_event__names[] = {
[PERF_RECORD_SWITCH] = "SWITCH",
[PERF_RECORD_SWITCH_CPU_WIDE] = "SWITCH_CPU_WIDE",
[PERF_RECORD_NAMESPACES] = "NAMESPACES",
+ [PERF_RECORD_KSYMBOL] = "KSYMBOL",
[PERF_RECORD_HEADER_ATTR] = "ATTR",
[PERF_RECORD_HEADER_EVENT_TYPE] = "EVENT_TYPE",
[PERF_RECORD_HEADER_TRACING_DATA] = "TRACING_DATA",
@@ -1329,6 +1331,14 @@ int perf_event__process_switch(struct perf_tool *tool __maybe_unused,
return machine__process_switch_event(machine, event);
}
+int perf_event__process_ksymbol(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused,
+ struct machine *machine)
+{
+ return machine__process_ksymbol(machine, event, sample);
+}
+
size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp)
{
return fprintf(fp, " %d/%d: [%#" PRIx64 "(%#" PRIx64 ") @ %#" PRIx64 "]: %c %s\n",
@@ -1461,6 +1471,14 @@ static size_t perf_event__fprintf_lost(union perf_event *event, FILE *fp)
return fprintf(fp, " lost %" PRIu64 "\n", event->lost.lost);
}
+size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp)
+{
+ return fprintf(fp, " ksymbol event with addr %" PRIx64 " len %u type %u flags 0x%x name %s\n",
+ event->ksymbol_event.addr, event->ksymbol_event.len,
+ event->ksymbol_event.ksym_type,
+ event->ksymbol_event.flags, event->ksymbol_event.name);
+}
+
size_t perf_event__fprintf(union perf_event *event, FILE *fp)
{
size_t ret = fprintf(fp, "PERF_RECORD_%s",
@@ -1496,6 +1514,9 @@ size_t perf_event__fprintf(union perf_event *event, FILE *fp)
case PERF_RECORD_LOST:
ret += perf_event__fprintf_lost(event, fp);
break;
+ case PERF_RECORD_KSYMBOL:
+ ret += perf_event__fprintf_ksymbol(event, fp);
+ break;
default:
ret += fprintf(fp, "\n");
}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index eb95f3384958..018322f2a13e 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -5,6 +5,7 @@
#include <limits.h>
#include <stdio.h>
#include <linux/kernel.h>
+#include <linux/bpf.h>
#include "../perf.h"
#include "build-id.h"
@@ -84,6 +85,19 @@ struct throttle_event {
u64 stream_id;
};
+#ifndef KSYM_NAME_LEN
+#define KSYM_NAME_LEN 256
+#endif
+
+struct ksymbol_event {
+ struct perf_event_header header;
+ u64 addr;
+ u32 len;
+ u16 ksym_type;
+ u16 flags;
+ char name[KSYM_NAME_LEN];
+};
+
#define PERF_SAMPLE_MASK \
(PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
@@ -651,6 +665,7 @@ union perf_event {
struct stat_round_event stat_round;
struct time_conv_event time_conv;
struct feature_event feat;
+ struct ksymbol_event ksymbol_event;
};
void perf_event__print_totals(void);
@@ -748,6 +763,10 @@ int perf_event__process_exit(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine);
+int perf_event__process_ksymbol(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine);
int perf_tool__process_synth_event(struct perf_tool *tool,
union perf_event *event,
struct machine *machine,
@@ -811,6 +830,7 @@ size_t perf_event__fprintf_switch(union perf_event *event, FILE *fp);
size_t perf_event__fprintf_thread_map(union perf_event *event, FILE *fp);
size_t perf_event__fprintf_cpu_map(union perf_event *event, FILE *fp);
size_t perf_event__fprintf_namespaces(union perf_event *event, FILE *fp);
+size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp);
size_t perf_event__fprintf(union perf_event *event, FILE *fp);
int kallsyms__get_function_start(const char *kallsyms_filename,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dbc0466db368..de34ce875648 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1035,6 +1035,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
attr->mmap = track;
attr->mmap2 = track && !perf_missing_features.mmap2;
attr->comm = track;
+ attr->ksymbol = track && !perf_missing_features.ksymbol;
if (opts->record_namespaces)
attr->namespaces = track;
@@ -1652,6 +1653,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
PRINT_ATTRf(context_switch, p_unsigned);
PRINT_ATTRf(write_backward, p_unsigned);
PRINT_ATTRf(namespaces, p_unsigned);
+ PRINT_ATTRf(ksymbol, p_unsigned);
PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned);
PRINT_ATTRf(bp_type, p_unsigned);
@@ -1811,6 +1813,8 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
PERF_SAMPLE_BRANCH_NO_CYCLES);
if (perf_missing_features.group_read && evsel->attr.inherit)
evsel->attr.read_format &= ~(PERF_FORMAT_GROUP|PERF_FORMAT_ID);
+ if (perf_missing_features.ksymbol)
+ evsel->attr.ksymbol = 0;
retry_sample_id:
if (perf_missing_features.sample_id_all)
evsel->attr.sample_id_all = 0;
@@ -1955,6 +1959,11 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
perf_missing_features.exclude_guest = true;
pr_debug2("switching off exclude_guest, exclude_host\n");
goto fallback_missing_features;
+ } else if (!perf_missing_features.ksymbol &&
+ evsel->attr.ksymbol) {
+ perf_missing_features.ksymbol = true;
+ pr_debug2("switching off ksymbol\n");
+ goto fallback_missing_features;
} else if (!perf_missing_features.sample_id_all) {
perf_missing_features.sample_id_all = true;
pr_debug2("switching off sample_id_all\n");
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 82a289ce8b0c..4a8c3e7f4808 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -168,6 +168,7 @@ struct perf_missing_features {
bool lbr_flags;
bool write_backward;
bool group_read;
+ bool ksymbol;
};
extern struct perf_missing_features perf_missing_features;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 6fcb3bce0442..1734ca027661 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -681,6 +681,61 @@ int machine__process_switch_event(struct machine *machine __maybe_unused,
return 0;
}
+static int machine__process_ksymbol_register(
+ struct machine *machine,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused)
+{
+ struct symbol *sym;
+ struct map *map;
+
+ map = map_groups__find(&machine->kmaps, event->ksymbol_event.addr);
+ if (!map) {
+ map = dso__new_map("bpf_prog");
+ if (!map)
+ return -ENOMEM;
+
+ map->start = event->ksymbol_event.addr;
+ map->pgoff = map->start;
+ map->end = map->start + event->ksymbol_event.len;
+ map_groups__insert(&machine->kmaps, map);
+ }
+
+ sym = symbol__new(event->ksymbol_event.addr, event->ksymbol_event.len,
+ 0, 0, event->ksymbol_event.name);
+ if (!sym)
+ return -ENOMEM;
+ dso__insert_symbol(map->dso, sym);
+ return 0;
+}
+
+static int machine__process_ksymbol_unregister(
+ struct machine *machine,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused)
+{
+ struct map *map;
+
+ map = map_groups__find(&machine->kmaps, event->ksymbol_event.addr);
+ if (map)
+ map_groups__remove(&machine->kmaps, map);
+
+ return 0;
+}
+
+int machine__process_ksymbol(struct machine *machine __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample)
+{
+ if (dump_trace)
+ perf_event__fprintf_ksymbol(event, stderr);
+
+ if (event->ksymbol_event.flags & PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER)
+ return machine__process_ksymbol_unregister(machine, event,
+ sample);
+ return machine__process_ksymbol_register(machine, event, sample);
+}
+
static void dso__adjust_kmod_long_name(struct dso *dso, const char *filename)
{
const char *dup_filename;
@@ -1812,6 +1867,8 @@ int machine__process_event(struct machine *machine, union perf_event *event,
case PERF_RECORD_SWITCH:
case PERF_RECORD_SWITCH_CPU_WIDE:
ret = machine__process_switch_event(machine, event); break;
+ case PERF_RECORD_KSYMBOL:
+ ret = machine__process_ksymbol(machine, event, sample); break;
default:
ret = -1;
break;
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index a5d1da60f751..4ecd380ce1b4 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -130,6 +130,9 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
struct perf_sample *sample);
int machine__process_mmap2_event(struct machine *machine, union perf_event *event,
struct perf_sample *sample);
+int machine__process_ksymbol(struct machine *machine,
+ union perf_event *event,
+ struct perf_sample *sample);
int machine__process_event(struct machine *machine, union perf_event *event,
struct perf_sample *sample);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 78a067777144..a9c98c3914ed 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -376,6 +376,8 @@ void perf_tool__fill_defaults(struct perf_tool *tool)
tool->itrace_start = perf_event__process_itrace_start;
if (tool->context_switch == NULL)
tool->context_switch = perf_event__process_switch;
+ if (tool->ksymbol == NULL)
+ tool->ksymbol = perf_event__process_ksymbol;
if (tool->read == NULL)
tool->read = process_event_sample_stub;
if (tool->throttle == NULL)
@@ -1305,6 +1307,8 @@ static int machines__deliver_event(struct machines *machines,
case PERF_RECORD_SWITCH:
case PERF_RECORD_SWITCH_CPU_WIDE:
return tool->context_switch(tool, event, sample, machine);
+ case PERF_RECORD_KSYMBOL:
+ return tool->ksymbol(tool, event, sample, machine);
default:
++evlist->stats.nr_unknown_events;
return -1;
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index 56e4ca54020a..9c81ca2f3cf7 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -53,7 +53,9 @@ struct perf_tool {
itrace_start,
context_switch,
throttle,
- unthrottle;
+ unthrottle,
+ ksymbol;
+
event_attr_op attr;
event_attr_op event_update;
event_op2 tracing_data;
--
2.17.1
For better performance analysis of BPF programs, this patch introduces
PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program
load/unload information to user space.
Each BPF program may contain up to BPF_MAX_SUBPROGS (256) sub programs.
The following example shows kernel symbols for a BPF program with 7
sub programs:
ffffffffa0257cf9 t bpf_prog_b07ccb89267cf242_F
ffffffffa02592e1 t bpf_prog_2dcecc18072623fc_F
ffffffffa025b0e9 t bpf_prog_bb7a405ebaec5d5c_F
ffffffffa025dd2c t bpf_prog_a7540d4a39ec1fc7_F
ffffffffa025fcca t bpf_prog_05762d4ade0e3737_F
ffffffffa026108f t bpf_prog_db4bd11e35df90d4_F
ffffffffa0263f00 t bpf_prog_89d64e4abf0f0126_F
ffffffffa0257cf9 t bpf_prog_ae31629322c4b018__dummy_tracepoi
When a bpf program is loaded, PERF_RECORD_KSYMBOL is generated for
each of these sub programs. Therefore, PERF_RECORD_BPF_EVENT is not
needed for simple profiling.
For annotation, user space needs to listen to PERF_RECORD_BPF_EVENT
and gather more information about these (sub) programs via sys_bpf.
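As an illustration of how the tag carried in PERF_RECORD_BPF_EVENT
relates to the kernel symbols listed above, here is a minimal user-space
sketch. The struct is a hypothetical mirror of the record payload
(without header and sample_id), and bpf_event_symbol_name is an
illustrative helper, not part of this patch. It assumes the
bpf_prog_<tag>_<name> naming seen in the example symbols.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BPF_TAG_SIZE 8

/* Same values as enum perf_bpf_event_type in the uapi header. */
enum bpf_event_type {
	BPF_EVENT_UNKNOWN = 0,
	BPF_EVENT_PROG_LOAD = 1,
	BPF_EVENT_PROG_UNLOAD = 2,
};

/* Hypothetical mirror of the record payload described in this patch. */
struct bpf_event_payload {
	uint16_t type;
	uint16_t flags;
	uint32_t id;
	uint8_t tag[BPF_TAG_SIZE];
};

/*
 * Build "bpf_prog_<tag in hex>[_<prog_name>]" into out.
 * Returns 0 on success, -1 if out is too small.
 */
static int bpf_event_symbol_name(const struct bpf_event_payload *ev,
				 const char *prog_name,
				 char *out, size_t out_len)
{
	size_t name_len;
	size_t need;
	char *p = out;

	assert(ev && prog_name && out);
	name_len = strlen(prog_name);
	/* prefix + hex tag + optional "_name" + terminating NUL */
	need = strlen("bpf_prog_") + 2 * BPF_TAG_SIZE +
	       (name_len ? 1 + name_len : 0) + 1;
	if (out_len < need)
		return -1;

	p += sprintf(p, "bpf_prog_");
	for (int i = 0; i < BPF_TAG_SIZE; i++)
		p += sprintf(p, "%02x", (unsigned int)ev->tag[i]);
	if (name_len)
		sprintf(p, "_%s", prog_name);
	return 0;
}
```

With the tag bytes b0 7c cb 89 26 7c f2 42 and the name "F", the helper
produces the first symbol of the example listing. The id field is what a
tool would pass to sys_bpf to fetch the remaining program details.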
Signed-off-by: Song Liu <[email protected]>
---
include/linux/filter.h | 7 ++
include/linux/perf_event.h | 6 ++
include/uapi/linux/perf_event.h | 29 +++++++-
kernel/bpf/core.c | 2 +-
kernel/bpf/syscall.c | 2 +
kernel/events/core.c | 120 ++++++++++++++++++++++++++++++++
6 files changed, 164 insertions(+), 2 deletions(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index ad106d845b22..d531d4250bff 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -951,6 +951,7 @@ bpf_address_lookup(unsigned long addr, unsigned long *size,
void bpf_prog_kallsyms_add(struct bpf_prog *fp);
void bpf_prog_kallsyms_del(struct bpf_prog *fp);
+void bpf_get_prog_name(const struct bpf_prog *prog, char *sym);
#else /* CONFIG_BPF_JIT */
@@ -1006,6 +1007,12 @@ static inline void bpf_prog_kallsyms_add(struct bpf_prog *fp)
static inline void bpf_prog_kallsyms_del(struct bpf_prog *fp)
{
}
+
+static inline void bpf_get_prog_name(const struct bpf_prog *prog, char *sym)
+{
+ sym[0] = '\0';
+}
+
#endif /* CONFIG_BPF_JIT */
void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp);
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 6b5f08db5ef3..10c560fcc7a4 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1119,6 +1119,9 @@ typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
extern void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
bool unregister,
perf_ksymbol_get_name_f get_name, void *data);
+extern void perf_event_bpf_event(struct bpf_prog *prog,
+ enum perf_bpf_event_type type,
+ u16 flags);
extern struct perf_guest_info_callbacks *perf_guest_cbs;
extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
@@ -1346,6 +1349,9 @@ static inline void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
bool unregister,
perf_ksymbol_get_name_f get_name,
void *data) { }
+static inline void perf_event_bpf_event(struct bpf_prog *prog,
+ enum perf_bpf_event_type type,
+ u16 flags) { }
static inline void perf_event_exec(void) { }
static inline void perf_event_comm(struct task_struct *tsk, bool exec) { }
static inline void perf_event_namespaces(struct task_struct *tsk) { }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 68c4da0227c5..8bd78a34e396 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -373,7 +373,8 @@ struct perf_event_attr {
write_backward : 1, /* Write ring buffer from end to beginning */
namespaces : 1, /* include namespaces data */
ksymbol : 1, /* include ksymbol events */
- __reserved_1 : 34;
+ bpf_event : 1, /* include bpf events */
+ __reserved_1 : 33;
union {
__u32 wakeup_events; /* wakeup every n events */
@@ -981,6 +982,25 @@ enum perf_event_type {
*/
PERF_RECORD_KSYMBOL = 17,
+ /*
+ * Record bpf events:
+ * enum perf_bpf_event_type {
+ * PERF_BPF_EVENT_UNKNOWN = 0,
+ * PERF_BPF_EVENT_PROG_LOAD = 1,
+ * PERF_BPF_EVENT_PROG_UNLOAD = 2,
+ * };
+ *
+ * struct {
+ * struct perf_event_header header;
+ * u16 type;
+ * u16 flags;
+ * u32 id;
+ * u8 tag[BPF_TAG_SIZE];
+ * struct sample_id sample_id;
+ * };
+ */
+ PERF_RECORD_BPF_EVENT = 18,
+
PERF_RECORD_MAX, /* non-ABI */
};
@@ -992,6 +1012,13 @@ enum perf_record_ksymbol_type {
#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0)
+enum perf_bpf_event_type {
+ PERF_BPF_EVENT_UNKNOWN = 0,
+ PERF_BPF_EVENT_PROG_LOAD = 1,
+ PERF_BPF_EVENT_PROG_UNLOAD = 2,
+ PERF_BPF_EVENT_MAX, /* non-ABI */
+};
+
#define PERF_MAX_STACK_DEPTH 127
#define PERF_MAX_CONTEXTS_PER_STACK 8
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index f908b9356025..19c49313c709 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -495,7 +495,7 @@ bpf_get_prog_addr_region(const struct bpf_prog *prog,
*symbol_end = addr + hdr->pages * PAGE_SIZE;
}
-static void bpf_get_prog_name(const struct bpf_prog *prog, char *sym)
+void bpf_get_prog_name(const struct bpf_prog *prog, char *sym)
{
const char *end = sym + KSYM_NAME_LEN;
const struct btf_type *type;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b155cd17c1bd..30ebd085790b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1211,6 +1211,7 @@ static void __bpf_prog_put_rcu(struct rcu_head *rcu)
static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
{
if (atomic_dec_and_test(&prog->aux->refcnt)) {
+ perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0);
/* bpf_prog_free_id() must be called first */
bpf_prog_free_id(prog, do_idr_lock);
bpf_prog_kallsyms_del_all(prog);
@@ -1554,6 +1555,7 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
}
bpf_prog_kallsyms_add(prog);
+ perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_LOAD, 0);
return err;
free_used_maps:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index ef27f2776999..2f238a8ddaab 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -386,6 +386,7 @@ static atomic_t nr_task_events __read_mostly;
static atomic_t nr_freq_events __read_mostly;
static atomic_t nr_switch_events __read_mostly;
static atomic_t nr_ksymbol_events __read_mostly;
+static atomic_t nr_bpf_events __read_mostly;
static LIST_HEAD(pmus);
static DEFINE_MUTEX(pmus_lock);
@@ -4308,6 +4309,8 @@ static void unaccount_event(struct perf_event *event)
dec = true;
if (event->attr.ksymbol)
atomic_dec(&nr_ksymbol_events);
+ if (event->attr.bpf_event)
+ atomic_dec(&nr_bpf_events);
if (dec) {
if (!atomic_add_unless(&perf_sched_count, -1, 1))
@@ -7744,6 +7747,121 @@ void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len, bool unregister,
WARN_ONCE(1, "%s: Invalid KSYMBOL type 0x%x\n", __func__, ksym_type);
}
+/*
+ * bpf program load/unload tracking
+ */
+
+struct perf_bpf_event {
+ struct bpf_prog *prog;
+ struct {
+ struct perf_event_header header;
+ u16 type;
+ u16 flags;
+ u32 id;
+ u8 tag[BPF_TAG_SIZE];
+ } event_id;
+};
+
+static int perf_event_bpf_match(struct perf_event *event)
+{
+ return event->attr.bpf_event;
+}
+
+static void perf_event_bpf_output(struct perf_event *event, void *data)
+{
+ struct perf_bpf_event *bpf_event = data;
+ struct perf_output_handle handle;
+ struct perf_sample_data sample;
+ int ret;
+
+ if (!perf_event_bpf_match(event))
+ return;
+
+ perf_event_header__init_id(&bpf_event->event_id.header,
+ &sample, event);
+ ret = perf_output_begin(&handle, event,
+ bpf_event->event_id.header.size);
+ if (ret)
+ return;
+
+ perf_output_put(&handle, bpf_event->event_id);
+ perf_event__output_id_sample(event, &handle, &sample);
+
+ perf_output_end(&handle);
+}
+
+static int perf_event_bpf_get_name(char *name, int len, void *data)
+{
+ struct bpf_prog *prog = data;
+
+ bpf_get_prog_name(prog, name);
+ return 0;
+}
+
+static void perf_event_bpf_emit_ksymbols(struct bpf_prog *prog,
+ enum perf_bpf_event_type type)
+{
+ bool unregister = type == PERF_BPF_EVENT_PROG_UNLOAD;
+ int i;
+
+ if (prog->aux->func_cnt == 0) {
+ perf_event_ksymbol(PERF_RECORD_KSYMBOL_TYPE_BPF,
+ (u64)(unsigned long)prog->bpf_func,
+ prog->jited_len, unregister,
+ perf_event_bpf_get_name, prog);
+ } else {
+ for (i = 0; i < prog->aux->func_cnt; i++) {
+ struct bpf_prog *subprog = prog->aux->func[i];
+
+ perf_event_ksymbol(
+ PERF_RECORD_KSYMBOL_TYPE_BPF,
+ (u64)(unsigned long)subprog->bpf_func,
+ subprog->jited_len, unregister,
+ perf_event_bpf_get_name, subprog);
+ }
+ }
+}
+
+void perf_event_bpf_event(struct bpf_prog *prog,
+ enum perf_bpf_event_type type,
+ u16 flags)
+{
+ struct perf_bpf_event bpf_event;
+
+ if (type <= PERF_BPF_EVENT_UNKNOWN ||
+ type >= PERF_BPF_EVENT_MAX)
+ return;
+
+ switch (type) {
+ case PERF_BPF_EVENT_PROG_LOAD:
+ case PERF_BPF_EVENT_PROG_UNLOAD:
+ if (atomic_read(&nr_ksymbol_events))
+ perf_event_bpf_emit_ksymbols(prog, type);
+ break;
+ default:
+ break;
+ }
+
+ if (!atomic_read(&nr_bpf_events))
+ return;
+
+ bpf_event = (struct perf_bpf_event){
+ .prog = prog,
+ .event_id = {
+ .header = {
+ .type = PERF_RECORD_BPF_EVENT,
+ .size = sizeof(bpf_event.event_id),
+ },
+ .type = type,
+ .flags = flags,
+ .id = prog->aux->id,
+ },
+ };
+
+ memcpy(bpf_event.event_id.tag, prog->tag, BPF_TAG_SIZE);
+ perf_iterate_sb(perf_event_bpf_output, &bpf_event, NULL);
+}
+
void perf_event_itrace_started(struct perf_event *event)
{
event->attach_state |= PERF_ATTACH_ITRACE;
@@ -9996,6 +10114,8 @@ static void account_event(struct perf_event *event)
inc = true;
if (event->attr.ksymbol)
atomic_inc(&nr_ksymbol_events);
+ if (event->attr.bpf_event)
+ atomic_inc(&nr_bpf_events);
if (inc) {
/*
--
2.17.1
sync for PERF_RECORD_BPF_EVENT
Signed-off-by: Song Liu <[email protected]>
---
tools/include/uapi/linux/perf_event.h | 29 ++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 68c4da0227c5..8bd78a34e396 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -373,7 +373,8 @@ struct perf_event_attr {
write_backward : 1, /* Write ring buffer from end to beginning */
namespaces : 1, /* include namespaces data */
ksymbol : 1, /* include ksymbol events */
- __reserved_1 : 34;
+ bpf_event : 1, /* include bpf events */
+ __reserved_1 : 33;
union {
__u32 wakeup_events; /* wakeup every n events */
@@ -981,6 +982,25 @@ enum perf_event_type {
*/
PERF_RECORD_KSYMBOL = 17,
+ /*
+ * Record bpf events:
+ * enum perf_bpf_event_type {
+ * PERF_BPF_EVENT_UNKNOWN = 0,
+ * PERF_BPF_EVENT_PROG_LOAD = 1,
+ * PERF_BPF_EVENT_PROG_UNLOAD = 2,
+ * };
+ *
+ * struct {
+ * struct perf_event_header header;
+ * u16 type;
+ * u16 flags;
+ * u32 id;
+ * u8 tag[BPF_TAG_SIZE];
+ * struct sample_id sample_id;
+ * };
+ */
+ PERF_RECORD_BPF_EVENT = 18,
+
PERF_RECORD_MAX, /* non-ABI */
};
@@ -992,6 +1012,13 @@ enum perf_record_ksymbol_type {
#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0)
+enum perf_bpf_event_type {
+ PERF_BPF_EVENT_UNKNOWN = 0,
+ PERF_BPF_EVENT_PROG_LOAD = 1,
+ PERF_BPF_EVENT_PROG_UNLOAD = 2,
+ PERF_BPF_EVENT_MAX, /* non-ABI */
+};
+
#define PERF_MAX_STACK_DEPTH 127
#define PERF_MAX_CONTEXTS_PER_STACK 8
--
2.17.1
sync changes for PERF_RECORD_KSYMBOL
Signed-off-by: Song Liu <[email protected]>
---
tools/include/uapi/linux/perf_event.h | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 9de8780ac8d9..68c4da0227c5 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -372,7 +372,8 @@ struct perf_event_attr {
context_switch : 1, /* context switch data */
write_backward : 1, /* Write ring buffer from end to beginning */
namespaces : 1, /* include namespaces data */
- __reserved_1 : 35;
+ ksymbol : 1, /* include ksymbol events */
+ __reserved_1 : 34;
union {
__u32 wakeup_events; /* wakeup every n events */
@@ -965,9 +966,32 @@ enum perf_event_type {
*/
PERF_RECORD_NAMESPACES = 16,
+ /*
+ * Record ksymbol register/unregister events:
+ *
+ * struct {
+ * struct perf_event_header header;
+ * u64 addr;
+ * u32 len;
+ * u16 ksym_type;
+ * u16 flags;
+ * char name[];
+ * struct sample_id sample_id;
+ * };
+ */
+ PERF_RECORD_KSYMBOL = 17,
+
PERF_RECORD_MAX, /* non-ABI */
};
+enum perf_record_ksymbol_type {
+ PERF_RECORD_KSYMBOL_TYPE_UNKNOWN = 0,
+ PERF_RECORD_KSYMBOL_TYPE_BPF = 1,
+ PERF_RECORD_KSYMBOL_TYPE_MAX /* non-ABI */
+};
+
+#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0)
+
#define PERF_MAX_STACK_DEPTH 127
#define PERF_MAX_CONTEXTS_PER_STACK 8
--
2.17.1
For better performance analysis of dynamically JITed and loaded kernel
functions, such as BPF programs, this patch introduces
PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
register/unregister information to user space.
The following data structure is used for PERF_RECORD_KSYMBOL.
/*
* struct {
* struct perf_event_header header;
* u64 addr;
* u32 len;
* u16 ksym_type;
* u16 flags;
* char name[];
* struct sample_id sample_id;
* };
*/
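Because name[] is variable length, the record size depends on the
symbol name. The sketch below shows one way to compute it in user space,
following the padding rule used in perf_event_ksymbol() (NUL-pad the
name to a multiple of 8 bytes). The struct and helper names are
hypothetical, the header is flattened into three plain fields, and the
trailing sample_id is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KSYM_NAME_LEN 256

/* Hypothetical mirror of the fixed part of the record (no sample_id). */
struct ksymbol_fixed {
	uint32_t hdr_type;	/* stands in for perf_event_header.type */
	uint16_t hdr_misc;	/* stands in for perf_event_header.misc */
	uint16_t hdr_size;	/* stands in for perf_event_header.size */
	uint64_t addr;
	uint32_t len;
	uint16_t ksym_type;
	uint16_t flags;
};

/*
 * Copy name into buf (capacity KSYM_NAME_LEN) and NUL-pad it to a
 * multiple of sizeof(u64). Returns the padded length, or 0 if the name
 * (including its terminating NUL) does not fit.
 */
static size_t ksymbol_pad_name(char *buf, const char *name)
{
	size_t n = strlen(name) + 1;

	if (n > KSYM_NAME_LEN)
		return 0;
	memcpy(buf, name, n);
	while (n % sizeof(uint64_t))
		buf[n++] = '\0';
	/* KSYM_NAME_LEN is a multiple of 8, so padding cannot overflow. */
	assert(n <= KSYM_NAME_LEN);
	return n;
}

/* Size of the record up to, but not including, sample_id. */
static size_t ksymbol_record_size(size_t padded_name_len)
{
	return sizeof(struct ksymbol_fixed) + padded_name_len;
}
```

For a 27-character name such as bpf_prog_b07ccb89267cf242_F, the padded
name takes 32 bytes, so the record is 24 + 32 = 56 bytes before
sample_id is appended.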
Signed-off-by: Song Liu <[email protected]>
---
include/linux/perf_event.h | 13 +++++
include/uapi/linux/perf_event.h | 26 ++++++++-
kernel/events/core.c | 98 ++++++++++++++++++++++++++++++++-
3 files changed, 135 insertions(+), 2 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 1d5c551a5add..6b5f08db5ef3 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1113,6 +1113,13 @@ static inline void perf_event_task_sched_out(struct task_struct *prev,
}
extern void perf_event_mmap(struct vm_area_struct *vma);
+
+/* callback function to generate ksymbol name */
+typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
+extern void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
+ bool unregister,
+ perf_ksymbol_get_name_f get_name, void *data);
+
extern struct perf_guest_info_callbacks *perf_guest_cbs;
extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
@@ -1333,6 +1340,12 @@ static inline int perf_unregister_guest_info_callbacks
(struct perf_guest_info_callbacks *callbacks) { return 0; }
static inline void perf_event_mmap(struct vm_area_struct *vma) { }
+
+typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
+static inline void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
+ bool unregister,
+ perf_ksymbol_get_name_f get_name,
+ void *data) { }
static inline void perf_event_exec(void) { }
static inline void perf_event_comm(struct task_struct *tsk, bool exec) { }
static inline void perf_event_namespaces(struct task_struct *tsk) { }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 9de8780ac8d9..68c4da0227c5 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -372,7 +372,8 @@ struct perf_event_attr {
context_switch : 1, /* context switch data */
write_backward : 1, /* Write ring buffer from end to beginning */
namespaces : 1, /* include namespaces data */
- __reserved_1 : 35;
+ ksymbol : 1, /* include ksymbol events */
+ __reserved_1 : 34;
union {
__u32 wakeup_events; /* wakeup every n events */
@@ -965,9 +966,32 @@ enum perf_event_type {
*/
PERF_RECORD_NAMESPACES = 16,
+ /*
+ * Record ksymbol register/unregister events:
+ *
+ * struct {
+ * struct perf_event_header header;
+ * u64 addr;
+ * u32 len;
+ * u16 ksym_type;
+ * u16 flags;
+ * char name[];
+ * struct sample_id sample_id;
+ * };
+ */
+ PERF_RECORD_KSYMBOL = 17,
+
PERF_RECORD_MAX, /* non-ABI */
};
+enum perf_record_ksymbol_type {
+ PERF_RECORD_KSYMBOL_TYPE_UNKNOWN = 0,
+ PERF_RECORD_KSYMBOL_TYPE_BPF = 1,
+ PERF_RECORD_KSYMBOL_TYPE_MAX /* non-ABI */
+};
+
+#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0)
+
#define PERF_MAX_STACK_DEPTH 127
#define PERF_MAX_CONTEXTS_PER_STACK 8
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3cd13a30f732..ef27f2776999 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -385,6 +385,7 @@ static atomic_t nr_namespaces_events __read_mostly;
static atomic_t nr_task_events __read_mostly;
static atomic_t nr_freq_events __read_mostly;
static atomic_t nr_switch_events __read_mostly;
+static atomic_t nr_ksymbol_events __read_mostly;
static LIST_HEAD(pmus);
static DEFINE_MUTEX(pmus_lock);
@@ -4235,7 +4236,7 @@ static bool is_sb_event(struct perf_event *event)
if (attr->mmap || attr->mmap_data || attr->mmap2 ||
attr->comm || attr->comm_exec ||
- attr->task ||
+ attr->task || attr->ksymbol ||
attr->context_switch)
return true;
return false;
@@ -4305,6 +4306,8 @@ static void unaccount_event(struct perf_event *event)
dec = true;
if (has_branch_stack(event))
dec = true;
+ if (event->attr.ksymbol)
+ atomic_dec(&nr_ksymbol_events);
if (dec) {
if (!atomic_add_unless(&perf_sched_count, -1, 1))
@@ -7650,6 +7653,97 @@ static void perf_log_throttle(struct perf_event *event, int enable)
perf_output_end(&handle);
}
+/*
+ * ksymbol register/unregister tracking
+ */
+
+struct perf_ksymbol_event {
+ const char *name;
+ int name_len;
+ struct {
+ struct perf_event_header header;
+ u64 addr;
+ u32 len;
+ u16 ksym_type;
+ u16 flags;
+ } event_id;
+};
+
+static int perf_event_ksymbol_match(struct perf_event *event)
+{
+ return event->attr.ksymbol;
+}
+
+static void perf_event_ksymbol_output(struct perf_event *event, void *data)
+{
+ struct perf_ksymbol_event *ksymbol_event = data;
+ struct perf_output_handle handle;
+ struct perf_sample_data sample;
+ int ret;
+
+ if (!perf_event_ksymbol_match(event))
+ return;
+
+ perf_event_header__init_id(&ksymbol_event->event_id.header,
+ &sample, event);
+ ret = perf_output_begin(&handle, event,
+ ksymbol_event->event_id.header.size);
+ if (ret)
+ return;
+
+ perf_output_put(&handle, ksymbol_event->event_id);
+ __output_copy(&handle, ksymbol_event->name, ksymbol_event->name_len);
+ perf_event__output_id_sample(event, &handle, &sample);
+
+ perf_output_end(&handle);
+}
+
+void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len, bool unregister,
+ perf_ksymbol_get_name_f get_name, void *data)
+{
+ struct perf_ksymbol_event ksymbol_event;
+ char name[KSYM_NAME_LEN];
+ u16 flags = 0;
+ int name_len;
+
+ if (!atomic_read(&nr_ksymbol_events))
+ return;
+
+ if (ksym_type >= PERF_RECORD_KSYMBOL_TYPE_MAX ||
+ ksym_type == PERF_RECORD_KSYMBOL_TYPE_UNKNOWN)
+ goto err;
+
+ get_name(name, KSYM_NAME_LEN, data);
+ name_len = strlen(name) + 1;
+ while (!IS_ALIGNED(name_len, sizeof(u64)))
+ name[name_len++] = '\0';
+ BUILD_BUG_ON(KSYM_NAME_LEN % sizeof(u64));
+
+ if (unregister)
+ flags |= PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER;
+
+ ksymbol_event = (struct perf_ksymbol_event){
+ .name = name,
+ .name_len = name_len,
+ .event_id = {
+ .header = {
+ .type = PERF_RECORD_KSYMBOL,
+ .size = sizeof(ksymbol_event.event_id) +
+ name_len,
+ },
+ .addr = addr,
+ .len = len,
+ .ksym_type = ksym_type,
+ .flags = flags,
+ },
+ };
+
+ perf_iterate_sb(perf_event_ksymbol_output, &ksymbol_event, NULL);
+ return;
+err:
+ WARN_ONCE(1, "%s: Invalid KSYMBOL type 0x%x\n", __func__, ksym_type);
+}
+
void perf_event_itrace_started(struct perf_event *event)
{
event->attach_state |= PERF_ATTACH_ITRACE;
@@ -9900,6 +9994,8 @@ static void account_event(struct perf_event *event)
inc = true;
if (is_cgroup_event(event))
inc = true;
+ if (event->attr.ksymbol)
+ atomic_inc(&nr_ksymbol_events);
if (inc) {
/*
--
2.17.1
This patch synthesizes PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for
BPF programs loaded before perf-record. This is achieved by gathering
information about all BPF programs via sys_bpf.
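
For illustration only, the symbol name carried by each synthesized
PERF_RECORD_KSYMBOL can be sketched in plain C as below. This is a
standalone approximation, not the code in this patch: TAG_SIZE stands in
for BPF_TAG_SIZE, and build_bpf_ksym_name() is a hypothetical helper name.
It assumes the "bpf_prog_<tag in hex>_<short name>" form used in the diff
that follows.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TAG_SIZE 8 /* stand-in for BPF_TAG_SIZE (assumption of this sketch) */

/*
 * Build "bpf_prog_<tag hex>[_<short_name>]" into buf.
 * Returns the length of the resulting string, excluding the NUL.
 * Output is truncated (and still NUL-terminated) if buf is too small.
 */
size_t build_bpf_ksym_name(char *buf, size_t size,
			   const unsigned char tag[TAG_SIZE],
			   const char *short_name)
{
	size_t n, i;
	int r;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	r = snprintf(buf, size, "bpf_prog_");
	if (r < 0 || (size_t)r >= size)
		return strlen(buf);
	n = (size_t)r;

	/* Two hex digits per tag byte; stop early if they would not fit. */
	for (i = 0; i < TAG_SIZE && n + 2 < size; i++)
		n += (size_t)snprintf(buf + n, size - n, "%02x", tag[i]);

	/* The short name is optional, e.g. a program without a name. */
	if (short_name && short_name[0] && n + 1 < size) {
		snprintf(buf + n, size - n, "_%s", short_name);
		n = strlen(buf);
	}
	return n;
}
```

Calling it with the tag bytes b0 7c cb 89 26 7c f2 42 and the short name
"F" gives a name of the same shape as the kallsyms entries for BPF sub
programs.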
Signed-off-by: Song Liu <[email protected]>
---
tools/perf/builtin-record.c | 6 ++
tools/perf/util/bpf-event.c | 205 ++++++++++++++++++++++++++++++++++++
tools/perf/util/bpf-event.h | 5 +
3 files changed, 216 insertions(+)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index deaf9b902094..88ea11d57c6f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -41,6 +41,7 @@
#include "util/perf-hooks.h"
#include "util/time-utils.h"
#include "util/units.h"
+#include "util/bpf-event.h"
#include "asm/bug.h"
#include <errno.h>
@@ -1082,6 +1083,11 @@ static int record__synthesize(struct record *rec, bool tail)
return err;
}
+ err = perf_event__synthesize_bpf_events(tool, process_synthesized_event,
+ machine, opts);
+ if (err < 0)
+ pr_warning("Couldn't synthesize bpf events.\n");
+
err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
process_synthesized_event, opts->sample_address,
1);
diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index f24f75506f51..ee60c8b1e636 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -1,10 +1,24 @@
// SPDX-License-Identifier: GPL-2.0
#include <errno.h>
#include <bpf/bpf.h>
+#include <bpf/btf.h>
+#include <linux/btf.h>
#include "bpf-event.h"
#include "debug.h"
#include "symbol.h"
+#define ptr_to_u64(ptr) ((__u64)(unsigned long)(ptr))
+
+static int snprintf_hex(char *buf, size_t size, unsigned char *data, size_t len)
+{
+ int ret = 0;
+ size_t i;
+
+ for (i = 0; i < len; i++)
+ ret += snprintf(buf + ret, size - ret, "%02x", data[i]);
+ return ret;
+}
+
int machine__process_bpf_event(struct machine *machine __maybe_unused,
union perf_event *event,
struct perf_sample *sample __maybe_unused)
@@ -13,3 +27,194 @@ int machine__process_bpf_event(struct machine *machine __maybe_unused,
perf_event__fprintf_bpf_event(event, stderr);
return 0;
}
+
+/*
+ * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
+ * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
+ * one PERF_RECORD_KSYMBOL is generated for each sub program.
+ */
+static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
+ perf_event__handler_t process,
+ struct machine *machine,
+ int fd,
+ union perf_event *event,
+ struct record_opts *opts)
+{
+ struct ksymbol_event *ksymbol_event = &event->ksymbol_event;
+ struct bpf_event *bpf_event = &event->bpf_event;
+ u32 sub_prog_cnt, i, func_info_rec_size;
+ u8 (*prog_tags)[BPF_TAG_SIZE] = NULL;
+ struct bpf_prog_info info = {};
+ u32 info_len = sizeof(info);
+ void *func_infos = NULL;
+ u64 *prog_addrs = NULL;
+ struct btf *btf = NULL;
+ u32 *prog_lens = NULL;
+ bool has_btf = false;
+ int err = 0;
+
+ /* Call bpf_obj_get_info_by_fd() to get sizes of arrays */
+ err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
+
+ if (err || info_len < 192 /* need field prog_tags */)
+ return -1;
+
+ /* number of ksyms, func_lengths, and tags should match */
+ sub_prog_cnt = info.nr_jited_ksyms;
+ if (sub_prog_cnt != info.nr_prog_tags ||
+ sub_prog_cnt != info.nr_jited_func_lens)
+ return -1;
+
+ /* check BTF func info support */
+ if (info.btf_id && info.nr_func_info && info.func_info_rec_size) {
+ /* btf func info number should be same as sub_prog_cnt */
+ if (sub_prog_cnt != info.nr_func_info)
+ return -1;
+ if (btf__get_from_id(info.btf_id, &btf))
+ return -1;
+ func_info_rec_size = info.func_info_rec_size;
+ func_infos = malloc(sub_prog_cnt * func_info_rec_size);
+ if (!func_infos) {
+ err = -1;
+ goto out;
+ }
+ has_btf = true;
+ }
+
+ /*
+ * We need address, length, and tag for each sub program.
+ * Allocate memory and call bpf_obj_get_info_by_fd() again
+ */
+ prog_addrs = (u64 *)malloc(sizeof(u64) * sub_prog_cnt);
+ prog_lens = (u32 *)malloc(sizeof(u32) * sub_prog_cnt);
+ prog_tags = malloc(sizeof(u8) * BPF_TAG_SIZE * sub_prog_cnt);
+
+ err = !prog_addrs || !prog_lens || !prog_tags;
+ if (err)
+ goto out;
+
+ memset(&info, 0, sizeof(info));
+ info.nr_jited_ksyms = sub_prog_cnt;
+ info.nr_jited_func_lens = sub_prog_cnt;
+ info.nr_prog_tags = sub_prog_cnt;
+ info.jited_ksyms = ptr_to_u64(prog_addrs);
+ info.jited_func_lens = ptr_to_u64(prog_lens);
+ info.prog_tags = ptr_to_u64(prog_tags);
+ info_len = sizeof(info);
+ if (has_btf) {
+ info.nr_func_info = sub_prog_cnt;
+ info.func_info_rec_size = func_info_rec_size;
+ info.func_info = ptr_to_u64(func_infos);
+ }
+
+ err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
+ if (err)
+ goto out;
+
+ /* Synthesize PERF_RECORD_KSYMBOL */
+ for (i = 0; i < sub_prog_cnt; i++) {
+ const struct bpf_func_info *finfo;
+ const char *short_name = NULL;
+ const struct btf_type *t;
+ int name_len;
+
+ *ksymbol_event = (struct ksymbol_event){
+ .header = {
+ .type = PERF_RECORD_KSYMBOL,
+ .size = sizeof(struct ksymbol_event),
+ },
+ .addr = prog_addrs[i],
+ .len = prog_lens[i],
+ .ksym_type = PERF_RECORD_KSYMBOL_TYPE_BPF,
+ .flags = 0,
+ };
+ name_len = snprintf(ksymbol_event->name, KSYM_NAME_LEN,
+ "bpf_prog_");
+ name_len += snprintf_hex(ksymbol_event->name + name_len,
+ KSYM_NAME_LEN - name_len,
+ prog_tags[i], BPF_TAG_SIZE);
+ if (has_btf) {
+ finfo = func_infos + i * info.func_info_rec_size;
+ t = btf__type_by_id(btf, finfo->type_id);
+ short_name = btf__name_by_offset(btf, t->name_off);
+ } else if (i == 0 && sub_prog_cnt == 1) {
+ /* no subprog */
+ if (info.name[0])
+ short_name = info.name;
+ } else
+ short_name = "F";
+ if (short_name)
+ name_len += snprintf(ksymbol_event->name + name_len,
+ KSYM_NAME_LEN - name_len,
+ "_%s", short_name);
+
+ ksymbol_event->header.size += PERF_ALIGN(name_len + 1,
+ sizeof(u64));
+ err = perf_tool__process_synth_event(tool, event,
+ machine, process);
+ }
+
+ /* Synthesize PERF_RECORD_BPF_EVENT */
+ if (opts->bpf_event) {
+ *bpf_event = (struct bpf_event){
+ .header = {
+ .type = PERF_RECORD_BPF_EVENT,
+ .size = sizeof(struct bpf_event),
+ },
+ .type = PERF_BPF_EVENT_PROG_LOAD,
+ .flags = 0,
+ .id = info.id,
+ };
+ memcpy(bpf_event->tag, info.tag, BPF_TAG_SIZE);
+ err = perf_tool__process_synth_event(tool, event,
+ machine, process);
+ }
+
+out:
+ free(prog_tags);
+ free(prog_lens);
+ free(prog_addrs);
+ free(func_infos);
+ btf__free(btf);
+ return err ? -1 : 0;
+}
+
+int perf_event__synthesize_bpf_events(struct perf_tool *tool,
+ perf_event__handler_t process,
+ struct machine *machine,
+ struct record_opts *opts)
+{
+ union perf_event *event;
+ __u32 id = 0;
+ int err;
+ int fd;
+
+ event = malloc(sizeof(event->bpf_event) + KSYM_NAME_LEN);
+ if (!event)
+ return -1;
+ while (true) {
+ err = bpf_prog_get_next_id(id, &id);
+ if (err) {
+ if (errno == ENOENT) {
+ err = 0;
+ break;
+ }
+ pr_err("can't get next program: %s%s\n",
+ strerror(errno),
+ errno == EINVAL ? " -- kernel too old?" : "");
+ err = -1;
+ break;
+ }
+ fd = bpf_prog_get_fd_by_id(id);
+ if (fd < 0) {
+ pr_debug("Failed to get fd for prog_id %u\n", id);
+ continue;
+ }
+
+ err = perf_event__synthesize_one_bpf_prog(
+ tool, process, machine, fd, event, opts);
+ close(fd);
+ if (err)
+ break;
+ }
+ free(event);
+ return err;
+}
diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
index d5ca355dd298..38aee4040f12 100644
--- a/tools/perf/util/bpf-event.h
+++ b/tools/perf/util/bpf-event.h
@@ -8,4 +8,9 @@ int machine__process_bpf_event(struct machine *machine,
union perf_event *event,
struct perf_sample *sample);
+int perf_event__synthesize_bpf_events(struct perf_tool *tool,
+ perf_event__handler_t process,
+ struct machine *machine,
+ struct record_opts *opts);
+
#endif
--
2.17.1
This patch adds basic handling of PERF_RECORD_BPF_EVENT.
Tracking of PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event
is added to turn it on.
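
As a rough sketch of how the new attr bit is gated and dropped on older
kernels, the fallback order added to perf_evsel__open() in the diff below
can be modelled in standalone C. Everything here is hypothetical scaffolding
(the demo_* names, and a pretend "kernel" that rejects unknown bits); only
the order of the checks is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the relevant perf_event_attr bits. */
struct demo_attr {
	bool ksymbol;
	bool bpf_event;
};

/* Hypothetical stand-in for perf_missing_features. */
struct demo_missing {
	bool ksymbol;
	bool bpf_event;
};

/* Pretend kernel: the open fails if any requested bit is unsupported. */
bool demo_open(const struct demo_attr *attr, bool has_ksymbol,
	       bool has_bpf_event)
{
	if (attr->ksymbol && !has_ksymbol)
		return false;
	if (attr->bpf_event && !has_bpf_event)
		return false;
	return true;
}

/*
 * Retry the open, switching off one feature per failure, ksymbol first
 * and bpf_event second. Returns the number of attempts, or -1 when
 * nothing is left to switch off.
 */
int demo_open_with_fallback(struct demo_attr *attr,
			    struct demo_missing *missing,
			    bool has_ksymbol, bool has_bpf_event)
{
	int attempts = 0;

	for (;;) {
		if (missing->ksymbol)
			attr->ksymbol = false;
		if (missing->bpf_event)
			attr->bpf_event = false;

		attempts++;
		if (demo_open(attr, has_ksymbol, has_bpf_event))
			return attempts;

		if (!missing->ksymbol && attr->ksymbol)
			missing->ksymbol = true;
		else if (!missing->bpf_event && attr->bpf_event)
			missing->bpf_event = true;
		else
			return -1;
	}
}
```

On a kernel that knows both bits the first attempt succeeds and both stay
set; on a kernel that knows neither, the sketch needs three attempts and
ends with both bits cleared.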
Signed-off-by: Song Liu <[email protected]>
---
tools/perf/builtin-record.c | 1 +
tools/perf/perf.h | 1 +
tools/perf/util/Build | 2 ++
tools/perf/util/bpf-event.c | 15 +++++++++++++++
tools/perf/util/bpf-event.h | 11 +++++++++++
tools/perf/util/event.c | 20 ++++++++++++++++++++
tools/perf/util/event.h | 16 ++++++++++++++++
tools/perf/util/evsel.c | 10 ++++++++++
tools/perf/util/evsel.h | 1 +
tools/perf/util/machine.c | 3 +++
tools/perf/util/session.c | 4 ++++
tools/perf/util/tool.h | 3 ++-
12 files changed, 86 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/util/bpf-event.c
create mode 100644 tools/perf/util/bpf-event.h
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 882285fb9f64..deaf9b902094 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1839,6 +1839,7 @@ static struct option __record_options[] = {
OPT_BOOLEAN(0, "tail-synthesize", &record.opts.tail_synthesize,
"synthesize non-sample events at the end of output"),
OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
+ OPT_BOOLEAN(0, "bpf-event", &record.opts.bpf_event, "record bpf events"),
OPT_BOOLEAN(0, "strict-freq", &record.opts.strict_freq,
"Fail if the specified frequency can't be used"),
OPT_CALLBACK('F', "freq", &record.opts, "freq or 'max'",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 388c6dd128b8..5941fb6eccfc 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -66,6 +66,7 @@ struct record_opts {
bool ignore_missing_thread;
bool strict_freq;
bool sample_id;
+ bool bpf_event;
unsigned int freq;
unsigned int mmap_pages;
unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index af72be7f5b3b..fa8305390315 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -152,6 +152,8 @@ endif
libperf-y += perf-hooks.o
+libperf-$(CONFIG_LIBBPF) += bpf-event.o
+
libperf-$(CONFIG_CXX) += c++/
CFLAGS_config.o += -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
new file mode 100644
index 000000000000..f24f75506f51
--- /dev/null
+++ b/tools/perf/util/bpf-event.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <errno.h>
+#include <bpf/bpf.h>
+#include "bpf-event.h"
+#include "debug.h"
+#include "symbol.h"
+
+int machine__process_bpf_event(struct machine *machine __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused)
+{
+ if (dump_trace)
+ perf_event__fprintf_bpf_event(event, stderr);
+ return 0;
+}
diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
new file mode 100644
index 000000000000..d5ca355dd298
--- /dev/null
+++ b/tools/perf/util/bpf-event.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_BPF_EVENT_H
+#define __PERF_BPF_EVENT_H
+
+#include "machine.h"
+
+int machine__process_bpf_event(struct machine *machine,
+ union perf_event *event,
+ struct perf_sample *sample);
+
+#endif
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 3c8a6a8dd260..3b646d27374e 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -25,6 +25,7 @@
#include "asm/bug.h"
#include "stat.h"
#include "session.h"
+#include "bpf-event.h"
#define DEFAULT_PROC_MAP_PARSE_TIMEOUT 500
@@ -47,6 +48,7 @@ static const char *perf_event__names[] = {
[PERF_RECORD_SWITCH_CPU_WIDE] = "SWITCH_CPU_WIDE",
[PERF_RECORD_NAMESPACES] = "NAMESPACES",
[PERF_RECORD_KSYMBOL] = "KSYMBOL",
+ [PERF_RECORD_BPF_EVENT] = "BPF_EVENT",
[PERF_RECORD_HEADER_ATTR] = "ATTR",
[PERF_RECORD_HEADER_EVENT_TYPE] = "EVENT_TYPE",
[PERF_RECORD_HEADER_TRACING_DATA] = "TRACING_DATA",
@@ -1339,6 +1341,14 @@ int perf_event__process_ksymbol(struct perf_tool *tool __maybe_unused,
return machine__process_ksymbol(machine, event, sample);
}
+int perf_event__process_bpf_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused,
+ struct machine *machine)
+{
+ return machine__process_bpf_event(machine, event, sample);
+}
+
size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp)
{
return fprintf(fp, " %d/%d: [%#" PRIx64 "(%#" PRIx64 ") @ %#" PRIx64 "]: %c %s\n",
@@ -1479,6 +1489,13 @@ size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp)
event->ksymbol_event.flags, event->ksymbol_event.name);
}
+size_t perf_event__fprintf_bpf_event(union perf_event *event, FILE *fp)
+{
+ return fprintf(fp, " bpf event with type %u, flags %u, id %u\n",
+ event->bpf_event.type, event->bpf_event.flags,
+ event->bpf_event.id);
+}
+
size_t perf_event__fprintf(union perf_event *event, FILE *fp)
{
size_t ret = fprintf(fp, "PERF_RECORD_%s",
@@ -1517,6 +1534,9 @@ size_t perf_event__fprintf(union perf_event *event, FILE *fp)
case PERF_RECORD_KSYMBOL:
ret += perf_event__fprintf_ksymbol(event, fp);
break;
+ case PERF_RECORD_BPF_EVENT:
+ ret += perf_event__fprintf_bpf_event(event, fp);
+ break;
default:
ret += fprintf(fp, "\n");
}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 018322f2a13e..dad32b81fe71 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -98,6 +98,16 @@ struct ksymbol_event {
char name[KSYM_NAME_LEN];
};
+struct bpf_event {
+ struct perf_event_header header;
+ u16 type;
+ u16 flags;
+ u32 id;
+
+ /* for bpf_prog types */
+ u8 tag[BPF_TAG_SIZE]; // prog tag
+};
+
#define PERF_SAMPLE_MASK \
(PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
@@ -666,6 +676,7 @@ union perf_event {
struct time_conv_event time_conv;
struct feature_event feat;
struct ksymbol_event ksymbol_event;
+ struct bpf_event bpf_event;
};
void perf_event__print_totals(void);
@@ -767,6 +778,10 @@ int perf_event__process_ksymbol(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine);
+int perf_event__process_bpf_event(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine);
int perf_tool__process_synth_event(struct perf_tool *tool,
union perf_event *event,
struct machine *machine,
@@ -831,6 +846,7 @@ size_t perf_event__fprintf_thread_map(union perf_event *event, FILE *fp);
size_t perf_event__fprintf_cpu_map(union perf_event *event, FILE *fp);
size_t perf_event__fprintf_namespaces(union perf_event *event, FILE *fp);
size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp);
+size_t perf_event__fprintf_bpf_event(union perf_event *event, FILE *fp);
size_t perf_event__fprintf(union perf_event *event, FILE *fp);
int kallsyms__get_function_start(const char *kallsyms_filename,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index de34ce875648..1e8b7d2897e3 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1036,6 +1036,8 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
attr->mmap2 = track && !perf_missing_features.mmap2;
attr->comm = track;
attr->ksymbol = track && !perf_missing_features.ksymbol;
+ attr->bpf_event = track && opts->bpf_event &&
+ !perf_missing_features.bpf_event;
if (opts->record_namespaces)
attr->namespaces = track;
@@ -1654,6 +1656,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
PRINT_ATTRf(write_backward, p_unsigned);
PRINT_ATTRf(namespaces, p_unsigned);
PRINT_ATTRf(ksymbol, p_unsigned);
+ PRINT_ATTRf(bpf_event, p_unsigned);
PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned);
PRINT_ATTRf(bp_type, p_unsigned);
@@ -1815,6 +1818,8 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
evsel->attr.read_format &= ~(PERF_FORMAT_GROUP|PERF_FORMAT_ID);
if (perf_missing_features.ksymbol)
evsel->attr.ksymbol = 0;
+ if (perf_missing_features.bpf_event)
+ evsel->attr.bpf_event = 0;
retry_sample_id:
if (perf_missing_features.sample_id_all)
evsel->attr.sample_id_all = 0;
@@ -1964,6 +1969,11 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
perf_missing_features.ksymbol = true;
pr_debug2("switching off ksymbol\n");
goto fallback_missing_features;
+ } else if (!perf_missing_features.bpf_event &&
+ evsel->attr.bpf_event) {
+ perf_missing_features.bpf_event = true;
+ pr_debug2("switching off bpf_event\n");
+ goto fallback_missing_features;
} else if (!perf_missing_features.sample_id_all) {
perf_missing_features.sample_id_all = true;
pr_debug2("switching off sample_id_all\n");
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 4a8c3e7f4808..29c5eb68c44b 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -169,6 +169,7 @@ struct perf_missing_features {
bool write_backward;
bool group_read;
bool ksymbol;
+ bool bpf_event;
};
extern struct perf_missing_features perf_missing_features;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 1734ca027661..8c0b16382226 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -21,6 +21,7 @@
#include "unwind.h"
#include "linux/hash.h"
#include "asm/bug.h"
+#include "bpf-event.h"
#include "sane_ctype.h"
#include <symbol/kallsyms.h>
@@ -1869,6 +1870,8 @@ int machine__process_event(struct machine *machine, union perf_event *event,
ret = machine__process_switch_event(machine, event); break;
case PERF_RECORD_KSYMBOL:
ret = machine__process_ksymbol(machine, event, sample); break;
+ case PERF_RECORD_BPF_EVENT:
+ ret = machine__process_bpf_event(machine, event, sample); break;
default:
ret = -1;
break;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index a9c98c3914ed..dac8c9b62036 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -378,6 +378,8 @@ void perf_tool__fill_defaults(struct perf_tool *tool)
tool->context_switch = perf_event__process_switch;
if (tool->ksymbol == NULL)
tool->ksymbol = perf_event__process_ksymbol;
+ if (tool->bpf_event == NULL)
+ tool->bpf_event = perf_event__process_bpf_event;
if (tool->read == NULL)
tool->read = process_event_sample_stub;
if (tool->throttle == NULL)
@@ -1309,6 +1311,8 @@ static int machines__deliver_event(struct machines *machines,
return tool->context_switch(tool, event, sample, machine);
case PERF_RECORD_KSYMBOL:
return tool->ksymbol(tool, event, sample, machine);
+ case PERF_RECORD_BPF_EVENT:
+ return tool->bpf_event(tool, event, sample, machine);
default:
++evlist->stats.nr_unknown_events;
return -1;
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index 9c81ca2f3cf7..250391672f9f 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -54,7 +54,8 @@ struct perf_tool {
context_switch,
throttle,
unthrottle,
- ksymbol;
+ ksymbol,
+ bpf_event;
event_attr_op attr;
event_attr_op event_update;
--
2.17.1
On 1/9/19 11:21 AM, Song Liu wrote:
> For better performance analysis of BPF programs, this patch introduces
> PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program
> load/unload information to user space.
>
> Each BPF program may contain up to BPF_MAX_SUBPROGS (256) sub programs.
> The following example shows kernel symbols for a BPF program with 7
> sub programs:
>
> ffffffffa0257cf9 t bpf_prog_b07ccb89267cf242_F
> ffffffffa02592e1 t bpf_prog_2dcecc18072623fc_F
> ffffffffa025b0e9 t bpf_prog_bb7a405ebaec5d5c_F
> ffffffffa025dd2c t bpf_prog_a7540d4a39ec1fc7_F
> ffffffffa025fcca t bpf_prog_05762d4ade0e3737_F
> ffffffffa026108f t bpf_prog_db4bd11e35df90d4_F
> ffffffffa0263f00 t bpf_prog_89d64e4abf0f0126_F
> ffffffffa0257cf9 t bpf_prog_ae31629322c4b018__dummy_tracepoi
>
> When a bpf program is loaded, PERF_RECORD_KSYMBOL is generated for
> each of these sub programs. Therefore, PERF_RECORD_BPF_EVENT is not
> needed for simple profiling.
>
> For annotation, user space needs to listen to PERF_RECORD_BPF_EVENT
> and gather more information about these (sub) programs via sys_bpf.
>
> Signed-off-by: Song Liu <[email protected]>
> ---
> include/linux/filter.h | 7 ++
> include/linux/perf_event.h | 6 ++
> include/uapi/linux/perf_event.h | 29 +++++++-
> kernel/bpf/core.c | 2 +-
> kernel/bpf/syscall.c | 2 +
> kernel/events/core.c | 120 ++++++++++++++++++++++++++++++++
> 6 files changed, 164 insertions(+), 2 deletions(-)
Acked-by: Alexei Starovoitov <[email protected]>
The bpf bits are small compared to the perf bits, so it's probably
better to get the whole thing via the tip tree in one go.
Splitting the perf patches into the perf tree also seems unnecessary.
Peter, Arnaldo, thoughts?
> On Jan 10, 2019, at 10:24 AM, Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> Em Wed, Jan 09, 2019 at 11:21:05AM -0800, Song Liu escreveu:
>> For better performance analysis of dynamically JITed and loaded kernel
>> functions, such as BPF programs, this patch introduces
>> PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
>> register/unregister information to user space.
>>
>> The following data structure is used for PERF_RECORD_KSYMBOL.
>>
>> /*
>> * struct {
>> * struct perf_event_header header;
>> * u64 addr;
>> * u32 len;
>> * u16 ksym_type;
>> * u16 flags;
>> * char name[];
>> * struct sample_id sample_id;
>> * };
>> */
>
> So, I couldn't find where this gets used, the intention here is just to
> add the interfaces and afterwards is that you will wire this up? I would
> like to test the whole shebang to see it working.
>
> - Arnaldo
I guess you meant PERF_RECORD_BPF_EVENT not being used?
PERF_RECORD_KSYMBOL is used by BPF in 3/7 and 5/7. I tested
PERF_RECORD_BPF_EVENT with dump_trace. As we separate RECORD_KSYMBOL from
RECORD_BPF_EVENT, user space won't use BPF_EVENT until annotation support.
Thanks,
Song
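
To make the PERF_RECORD_KSYMBOL layout quoted above concrete, here is a
small standalone sketch of how the record size follows from the name
length. The struct names are local stand-ins rather than the kernel
definitions, and the trailing sample_id (added separately by the kernel
when sample_id_all is set) is left out on purpose.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for struct perf_event_header (8 bytes). */
struct demo_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Fixed part of the record, in the order given in the quoted layout. */
struct demo_ksymbol_fixed {
	struct demo_header header;
	uint64_t addr;
	uint32_t len;
	uint16_t ksym_type;
	uint16_t flags;
};

/* Round n up to the next multiple of 8 (sizeof(u64)). */
size_t demo_align8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/*
 * Size of the fixed fields plus the NUL-terminated name, with the name
 * padded to a u64 boundary. sample_id is not included.
 */
size_t demo_ksymbol_record_size(const char *name)
{
	return sizeof(struct demo_ksymbol_fixed) +
	       demo_align8(strlen(name) + 1);
}
```

For a 27-character name such as "bpf_prog_b07ccb89267cf242_F", the name
takes 28 bytes with its NUL and is padded to 32, so with the 24-byte fixed
part the sketch yields 56 bytes.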
>> Signed-off-by: Song Liu <[email protected]>
>> ---
>> include/linux/perf_event.h | 13 +++++
>> include/uapi/linux/perf_event.h | 26 ++++++++-
>> kernel/events/core.c | 98 ++++++++++++++++++++++++++++++++-
>> 3 files changed, 135 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 1d5c551a5add..6b5f08db5ef3 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -1113,6 +1113,13 @@ static inline void perf_event_task_sched_out(struct task_struct *prev,
>> }
>>
>> extern void perf_event_mmap(struct vm_area_struct *vma);
>> +
>> +/* callback function to generate ksymbol name */
>> +typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
>> +extern void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
>> + bool unregister,
>> + perf_ksymbol_get_name_f get_name, void *data);
>> +
>> extern struct perf_guest_info_callbacks *perf_guest_cbs;
>> extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
>> extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
>> @@ -1333,6 +1340,12 @@ static inline int perf_unregister_guest_info_callbacks
>> (struct perf_guest_info_callbacks *callbacks) { return 0; }
>>
>> static inline void perf_event_mmap(struct vm_area_struct *vma) { }
>> +
>> +typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
>> +static inline void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
>> + bool unregister,
>> + perf_ksymbol_get_name_f get_name,
>> + void *data) { }
>> static inline void perf_event_exec(void) { }
>> static inline void perf_event_comm(struct task_struct *tsk, bool exec) { }
>> static inline void perf_event_namespaces(struct task_struct *tsk) { }
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 9de8780ac8d9..68c4da0227c5 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -372,7 +372,8 @@ struct perf_event_attr {
>> context_switch : 1, /* context switch data */
>> write_backward : 1, /* Write ring buffer from end to beginning */
>> namespaces : 1, /* include namespaces data */
>> - __reserved_1 : 35;
>> + ksymbol : 1, /* include ksymbol events */
>> + __reserved_1 : 34;
>>
>> union {
>> __u32 wakeup_events; /* wakeup every n events */
>> @@ -965,9 +966,32 @@ enum perf_event_type {
>> */
>> PERF_RECORD_NAMESPACES = 16,
>>
>> + /*
>> + * Record ksymbol register/unregister events:
>> + *
>> + * struct {
>> + * struct perf_event_header header;
>> + * u64 addr;
>> + * u32 len;
>> + * u16 ksym_type;
>> + * u16 flags;
>> + * char name[];
>> + * struct sample_id sample_id;
>> + * };
>> + */
>> + PERF_RECORD_KSYMBOL = 17,
>> +
>> PERF_RECORD_MAX, /* non-ABI */
>> };
>>
>> +enum perf_record_ksymbol_type {
>> + PERF_RECORD_KSYMBOL_TYPE_UNKNOWN = 0,
>> + PERF_RECORD_KSYMBOL_TYPE_BPF = 1,
>> + PERF_RECORD_KSYMBOL_TYPE_MAX /* non-ABI */
>> +};
>> +
>> +#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0)
>> +
>> #define PERF_MAX_STACK_DEPTH 127
>> #define PERF_MAX_CONTEXTS_PER_STACK 8
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 3cd13a30f732..ef27f2776999 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -385,6 +385,7 @@ static atomic_t nr_namespaces_events __read_mostly;
>> static atomic_t nr_task_events __read_mostly;
>> static atomic_t nr_freq_events __read_mostly;
>> static atomic_t nr_switch_events __read_mostly;
>> +static atomic_t nr_ksymbol_events __read_mostly;
>>
>> static LIST_HEAD(pmus);
>> static DEFINE_MUTEX(pmus_lock);
>> @@ -4235,7 +4236,7 @@ static bool is_sb_event(struct perf_event *event)
>>
>> if (attr->mmap || attr->mmap_data || attr->mmap2 ||
>> attr->comm || attr->comm_exec ||
>> - attr->task ||
>> + attr->task || attr->ksymbol ||
>> attr->context_switch)
>> return true;
>> return false;
>> @@ -4305,6 +4306,8 @@ static void unaccount_event(struct perf_event *event)
>> dec = true;
>> if (has_branch_stack(event))
>> dec = true;
>> + if (event->attr.ksymbol)
>> + atomic_dec(&nr_ksymbol_events);
>>
>> if (dec) {
>> if (!atomic_add_unless(&perf_sched_count, -1, 1))
>> @@ -7650,6 +7653,97 @@ static void perf_log_throttle(struct perf_event *event, int enable)
>> perf_output_end(&handle);
>> }
>>
>> +/*
>> + * ksymbol register/unregister tracking
>> + */
>> +
>> +struct perf_ksymbol_event {
>> + const char *name;
>> + int name_len;
>> + struct {
>> + struct perf_event_header header;
>> + u64 addr;
>> + u32 len;
>> + u16 ksym_type;
>> + u16 flags;
>> + } event_id;
>> +};
>> +
>> +static int perf_event_ksymbol_match(struct perf_event *event)
>> +{
>> + return event->attr.ksymbol;
>> +}
>> +
>> +static void perf_event_ksymbol_output(struct perf_event *event, void *data)
>> +{
>> + struct perf_ksymbol_event *ksymbol_event = data;
>> + struct perf_output_handle handle;
>> + struct perf_sample_data sample;
>> + int ret;
>> +
>> + if (!perf_event_ksymbol_match(event))
>> + return;
>> +
>> + perf_event_header__init_id(&ksymbol_event->event_id.header,
>> + &sample, event);
>> + ret = perf_output_begin(&handle, event,
>> + ksymbol_event->event_id.header.size);
>> + if (ret)
>> + return;
>> +
>> + perf_output_put(&handle, ksymbol_event->event_id);
>> + __output_copy(&handle, ksymbol_event->name, ksymbol_event->name_len);
>> + perf_event__output_id_sample(event, &handle, &sample);
>> +
>> + perf_output_end(&handle);
>> +}
>> +
>> +void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len, bool unregister,
>> + perf_ksymbol_get_name_f get_name, void *data)
>> +{
>> + struct perf_ksymbol_event ksymbol_event;
>> + char name[KSYM_NAME_LEN];
>> + u16 flags = 0;
>> + int name_len;
>> +
>> + if (!atomic_read(&nr_ksymbol_events))
>> + return;
>> +
>> + if (ksym_type >= PERF_RECORD_KSYMBOL_TYPE_MAX ||
>> + ksym_type == PERF_RECORD_KSYMBOL_TYPE_UNKNOWN)
>> + goto err;
>> +
>> + get_name(name, KSYM_NAME_LEN, data);
>> + name_len = strlen(name) + 1;
>> + while (!IS_ALIGNED(name_len, sizeof(u64)))
>> + name[name_len++] = '\0';
>> + BUILD_BUG_ON(KSYM_NAME_LEN % sizeof(u64));
>> +
>> + if (unregister)
>> + flags |= PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER;
>> +
>> + ksymbol_event = (struct perf_ksymbol_event){
>> + .name = name,
>> + .name_len = name_len,
>> + .event_id = {
>> + .header = {
>> + .type = PERF_RECORD_KSYMBOL,
>> + .size = sizeof(ksymbol_event.event_id) +
>> + name_len,
>> + },
>> + .addr = addr,
>> + .len = len,
>> + .ksym_type = ksym_type,
>> + .flags = flags,
>> + },
>> + };
>> +
>> + perf_iterate_sb(perf_event_ksymbol_output, &ksymbol_event, NULL);
>> + return;
>> +err:
>> + WARN_ONCE(1, "%s: Invalid KSYMBOL type 0x%x\n", __func__, ksym_type);
>> +}
>> +
>> void perf_event_itrace_started(struct perf_event *event)
>> {
>> event->attach_state |= PERF_ATTACH_ITRACE;
>> @@ -9900,6 +9994,8 @@ static void account_event(struct perf_event *event)
>> inc = true;
>> if (is_cgroup_event(event))
>> inc = true;
>> + if (event->attr.ksymbol)
>> + atomic_inc(&nr_ksymbol_events);
>>
>> if (inc) {
>> /*
>> --
>> 2.17.1
>
> --
>
> - Arnaldo
Em Thu, Jan 10, 2019 at 06:40:37PM +0000, Song Liu escreveu:
>
>
> > On Jan 10, 2019, at 10:24 AM, Arnaldo Carvalho de Melo <[email protected]> wrote:
> >
> > Em Wed, Jan 09, 2019 at 11:21:05AM -0800, Song Liu escreveu:
> >> For better performance analysis of dynamically JITed and loaded kernel
> >> functions, such as BPF programs, this patch introduces
> >> PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
> >> register/unregister information to user space.
> >>
> >> The following data structure is used for PERF_RECORD_KSYMBOL.
> >>
> >> /*
> >> * struct {
> >> * struct perf_event_header header;
> >> * u64 addr;
> >> * u32 len;
> >> * u16 ksym_type;
> >> * u16 flags;
> >> * char name[];
> >> * struct sample_id sample_id;
> >> * };
> >> */
> >
> > So, I couldn't find where this gets used, the intention here is just to
> > add the interfaces and afterwards is that you will wire this up? I would
> > like to test the whole shebang to see it working.
>
> I guess you meant PERF_RECORD_BPF_EVENT not being used?
>
> PERF_RECORD_KSYMBOL is used by BPF in 3/7 and 5/7. I tested
Oops, I didn't look at 3/7, just read its cset summary line and as it
says:
Subject: [PATCH v6 perf, bpf-next 3/7] perf, bpf: introduce PERF_RECORD_BPF_EVENT
I didn't think it was related, perhaps break it down into one that
states that it is wiring up PERF_RECORD_KSYMBOL, and at that point we
could just test it, getting the notifications for new kallsyms related
to BPF?
> PERF_RECORD_BPF_EVENT with dump_trace. As we separate RECORD_KSYMBOL from
> RECORD_BPF_EVENT, user space won't use BPF_EVENT until annotation support.
Right, so why not just introduce PERF_RECORD_KSYMBOL, make it be used by
tooling, etc, then move on to PERF_RECORD_BPF_EVENT?
- Arnaldo
> Thanks,
> Song
>
> >> Signed-off-by: Song Liu <[email protected]>
> >> ---
> >> include/linux/perf_event.h | 13 +++++
> >> include/uapi/linux/perf_event.h | 26 ++++++++-
> >> kernel/events/core.c | 98 ++++++++++++++++++++++++++++++++-
> >> 3 files changed, 135 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> >> index 1d5c551a5add..6b5f08db5ef3 100644
> >> --- a/include/linux/perf_event.h
> >> +++ b/include/linux/perf_event.h
> >> @@ -1113,6 +1113,13 @@ static inline void perf_event_task_sched_out(struct task_struct *prev,
> >> }
> >>
> >> extern void perf_event_mmap(struct vm_area_struct *vma);
> >> +
> >> +/* callback function to generate ksymbol name */
> >> +typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
> >> +extern void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
> >> + bool unregister,
> >> + perf_ksymbol_get_name_f get_name, void *data);
> >> +
> >> extern struct perf_guest_info_callbacks *perf_guest_cbs;
> >> extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
> >> extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
> >> @@ -1333,6 +1340,12 @@ static inline int perf_unregister_guest_info_callbacks
> >> (struct perf_guest_info_callbacks *callbacks) { return 0; }
> >>
> >> static inline void perf_event_mmap(struct vm_area_struct *vma) { }
> >> +
> >> +typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
> >> +static inline void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
> >> + bool unregister,
> >> + perf_ksymbol_get_name_f get_name,
> >> + void *data) { }
> >> static inline void perf_event_exec(void) { }
> >> static inline void perf_event_comm(struct task_struct *tsk, bool exec) { }
> >> static inline void perf_event_namespaces(struct task_struct *tsk) { }
> >> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> >> index 9de8780ac8d9..68c4da0227c5 100644
> >> --- a/include/uapi/linux/perf_event.h
> >> +++ b/include/uapi/linux/perf_event.h
> >> @@ -372,7 +372,8 @@ struct perf_event_attr {
> >> context_switch : 1, /* context switch data */
> >> write_backward : 1, /* Write ring buffer from end to beginning */
> >> namespaces : 1, /* include namespaces data */
> >> - __reserved_1 : 35;
> >> + ksymbol : 1, /* include ksymbol events */
> >> + __reserved_1 : 34;
> >>
> >> union {
> >> __u32 wakeup_events; /* wakeup every n events */
> >> @@ -965,9 +966,32 @@ enum perf_event_type {
> >> */
> >> PERF_RECORD_NAMESPACES = 16,
> >>
> >> + /*
> >> + * Record ksymbol register/unregister events:
> >> + *
> >> + * struct {
> >> + * struct perf_event_header header;
> >> + * u64 addr;
> >> + * u32 len;
> >> + * u16 ksym_type;
> >> + * u16 flags;
> >> + * char name[];
> >> + * struct sample_id sample_id;
> >> + * };
> >> + */
> >> + PERF_RECORD_KSYMBOL = 17,
> >> +
> >> PERF_RECORD_MAX, /* non-ABI */
> >> };
> >>
> >> +enum perf_record_ksymbol_type {
> >> + PERF_RECORD_KSYMBOL_TYPE_UNKNOWN = 0,
> >> + PERF_RECORD_KSYMBOL_TYPE_BPF = 1,
> >> + PERF_RECORD_KSYMBOL_TYPE_MAX /* non-ABI */
> >> +};
> >> +
> >> +#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0)
> >> +
> >> #define PERF_MAX_STACK_DEPTH 127
> >> #define PERF_MAX_CONTEXTS_PER_STACK 8
> >>
> >> diff --git a/kernel/events/core.c b/kernel/events/core.c
> >> index 3cd13a30f732..ef27f2776999 100644
> >> --- a/kernel/events/core.c
> >> +++ b/kernel/events/core.c
> >> @@ -385,6 +385,7 @@ static atomic_t nr_namespaces_events __read_mostly;
> >> static atomic_t nr_task_events __read_mostly;
> >> static atomic_t nr_freq_events __read_mostly;
> >> static atomic_t nr_switch_events __read_mostly;
> >> +static atomic_t nr_ksymbol_events __read_mostly;
> >>
> >> static LIST_HEAD(pmus);
> >> static DEFINE_MUTEX(pmus_lock);
> >> @@ -4235,7 +4236,7 @@ static bool is_sb_event(struct perf_event *event)
> >>
> >> if (attr->mmap || attr->mmap_data || attr->mmap2 ||
> >> attr->comm || attr->comm_exec ||
> >> - attr->task ||
> >> + attr->task || attr->ksymbol ||
> >> attr->context_switch)
> >> return true;
> >> return false;
> >> @@ -4305,6 +4306,8 @@ static void unaccount_event(struct perf_event *event)
> >> dec = true;
> >> if (has_branch_stack(event))
> >> dec = true;
> >> + if (event->attr.ksymbol)
> >> + atomic_dec(&nr_ksymbol_events);
> >>
> >> if (dec) {
> >> if (!atomic_add_unless(&perf_sched_count, -1, 1))
> >> @@ -7650,6 +7653,97 @@ static void perf_log_throttle(struct perf_event *event, int enable)
> >> perf_output_end(&handle);
> >> }
> >>
> >> +/*
> >> + * ksymbol register/unregister tracking
> >> + */
> >> +
> >> +struct perf_ksymbol_event {
> >> + const char *name;
> >> + int name_len;
> >> + struct {
> >> + struct perf_event_header header;
> >> + u64 addr;
> >> + u32 len;
> >> + u16 ksym_type;
> >> + u16 flags;
> >> + } event_id;
> >> +};
> >> +
> >> +static int perf_event_ksymbol_match(struct perf_event *event)
> >> +{
> >> + return event->attr.ksymbol;
> >> +}
> >> +
> >> +static void perf_event_ksymbol_output(struct perf_event *event, void *data)
> >> +{
> >> + struct perf_ksymbol_event *ksymbol_event = data;
> >> + struct perf_output_handle handle;
> >> + struct perf_sample_data sample;
> >> + int ret;
> >> +
> >> + if (!perf_event_ksymbol_match(event))
> >> + return;
> >> +
> >> + perf_event_header__init_id(&ksymbol_event->event_id.header,
> >> + &sample, event);
> >> + ret = perf_output_begin(&handle, event,
> >> + ksymbol_event->event_id.header.size);
> >> + if (ret)
> >> + return;
> >> +
> >> + perf_output_put(&handle, ksymbol_event->event_id);
> >> + __output_copy(&handle, ksymbol_event->name, ksymbol_event->name_len);
> >> + perf_event__output_id_sample(event, &handle, &sample);
> >> +
> >> + perf_output_end(&handle);
> >> +}
> >> +
> >> +void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len, bool unregister,
> >> + perf_ksymbol_get_name_f get_name, void *data)
> >> +{
> >> + struct perf_ksymbol_event ksymbol_event;
> >> + char name[KSYM_NAME_LEN];
> >> + u16 flags = 0;
> >> + int name_len;
> >> +
> >> + if (!atomic_read(&nr_ksymbol_events))
> >> + return;
> >> +
> >> + if (ksym_type >= PERF_RECORD_KSYMBOL_TYPE_MAX ||
> >> + ksym_type == PERF_RECORD_KSYMBOL_TYPE_UNKNOWN)
> >> + goto err;
> >> +
> >> + get_name(name, KSYM_NAME_LEN, data);
> >> + name_len = strlen(name) + 1;
> >> + while (!IS_ALIGNED(name_len, sizeof(u64)))
> >> + name[name_len++] = '\0';
> >> + BUILD_BUG_ON(KSYM_NAME_LEN % sizeof(u64));
> >> +
> >> + if (unregister)
> >> + flags |= PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER;
> >> +
> >> + ksymbol_event = (struct perf_ksymbol_event){
> >> + .name = name,
> >> + .name_len = name_len,
> >> + .event_id = {
> >> + .header = {
> >> + .type = PERF_RECORD_KSYMBOL,
> >> + .size = sizeof(ksymbol_event.event_id) +
> >> + name_len,
> >> + },
> >> + .addr = addr,
> >> + .len = len,
> >> + .ksym_type = ksym_type,
> >> + .flags = flags,
> >> + },
> >> + };
> >> +
> >> + perf_iterate_sb(perf_event_ksymbol_output, &ksymbol_event, NULL);
> >> + return;
> >> +err:
> >> + WARN_ONCE(1, "%s: Invalid KSYMBOL type 0x%x\n", __func__, ksym_type);
> >> +}
> >> +
> >> void perf_event_itrace_started(struct perf_event *event)
> >> {
> >> event->attach_state |= PERF_ATTACH_ITRACE;
> >> @@ -9900,6 +9994,8 @@ static void account_event(struct perf_event *event)
> >> inc = true;
> >> if (is_cgroup_event(event))
> >> inc = true;
> >> + if (event->attr.ksymbol)
> >> + atomic_inc(&nr_ksymbol_events);
> >>
> >> if (inc) {
> >> /*
> >> --
> >> 2.17.1
> >
> > --
> >
> > - Arnaldo
--
- Arnaldo
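A minimal sketch, assuming a user-space consumer that already has one raw record in a byte buffer, of how the PERF_RECORD_KSYMBOL layout quoted above could be decoded. The struct and function names (`hdr`, `ksymbol_fixed`, `decode_ksymbol`) are hypothetical local mirrors of the layout in this patch, not definitions taken from the kernel headers; the record type value 17 and the unregister flag bit are the ones proposed in the diff.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct perf_event_header (type, misc, size). */
struct hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Fixed part of the record; the name bytes follow immediately after. */
struct ksymbol_fixed {
	struct hdr header;
	uint64_t addr;
	uint32_t len;
	uint16_t ksym_type;
	uint16_t flags;
};

#define KSYMBOL_RECORD_TYPE      17        /* PERF_RECORD_KSYMBOL in this patch */
#define KSYMBOL_FLAGS_UNREGISTER (1 << 0)  /* set on unregister events */

/*
 * Hypothetical helper: copies the fixed part into *out and points *name at
 * the NUL-terminated name inside buf. Returns 0 on success, -1 if the
 * record is truncated, has the wrong type, or carries no terminated name.
 * Any trailing sample_id bytes are left untouched.
 */
static int decode_ksymbol(const unsigned char *buf, size_t buflen,
			  struct ksymbol_fixed *out, const char **name)
{
	const char *n;
	size_t max;

	if (buflen < sizeof(*out))
		return -1;
	memcpy(out, buf, sizeof(*out));
	if (out->header.type != KSYMBOL_RECORD_TYPE)
		return -1;
	if (out->header.size > buflen || out->header.size <= sizeof(*out))
		return -1;

	n = (const char *)buf + sizeof(*out);
	max = out->header.size - sizeof(*out);
	if (!memchr(n, '\0', max))
		return -1;
	*name = n;
	return 0;
}
```

A caller would test `out.flags & KSYMBOL_FLAGS_UNREGISTER` to tell a symbol removal from a registration.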
On Wed, Jan 09, 2019 at 11:21:05AM -0800, Song Liu wrote:
> For better performance analysis of dynamically JITed and loaded kernel
> functions, such as BPF programs, this patch introduces
> PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
> register/unregister information to user space.
>
> The following data structure is used for PERF_RECORD_KSYMBOL.
>
> /*
> * struct {
> * struct perf_event_header header;
> * u64 addr;
> * u32 len;
> * u16 ksym_type;
> * u16 flags;
> * char name[];
> * struct sample_id sample_id;
> * };
> */
So, I couldn't find where this gets used. Is the intention here just to
add the interfaces, with the wiring up to follow afterwards? I would
like to test the whole shebang to see it working.
- Arnaldo
> Signed-off-by: Song Liu <[email protected]>
> ---
> include/linux/perf_event.h | 13 +++++
> include/uapi/linux/perf_event.h | 26 ++++++++-
> kernel/events/core.c | 98 ++++++++++++++++++++++++++++++++-
> 3 files changed, 135 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 1d5c551a5add..6b5f08db5ef3 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1113,6 +1113,13 @@ static inline void perf_event_task_sched_out(struct task_struct *prev,
> }
>
> extern void perf_event_mmap(struct vm_area_struct *vma);
> +
> +/* callback function to generate ksymbol name */
> +typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
> +extern void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
> + bool unregister,
> + perf_ksymbol_get_name_f get_name, void *data);
> +
> extern struct perf_guest_info_callbacks *perf_guest_cbs;
> extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
> extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
> @@ -1333,6 +1340,12 @@ static inline int perf_unregister_guest_info_callbacks
> (struct perf_guest_info_callbacks *callbacks) { return 0; }
>
> static inline void perf_event_mmap(struct vm_area_struct *vma) { }
> +
> +typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
> +static inline void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
> + bool unregister,
> + perf_ksymbol_get_name_f get_name,
> + void *data) { }
> static inline void perf_event_exec(void) { }
> static inline void perf_event_comm(struct task_struct *tsk, bool exec) { }
> static inline void perf_event_namespaces(struct task_struct *tsk) { }
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 9de8780ac8d9..68c4da0227c5 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -372,7 +372,8 @@ struct perf_event_attr {
> context_switch : 1, /* context switch data */
> write_backward : 1, /* Write ring buffer from end to beginning */
> namespaces : 1, /* include namespaces data */
> - __reserved_1 : 35;
> + ksymbol : 1, /* include ksymbol events */
> + __reserved_1 : 34;
>
> union {
> __u32 wakeup_events; /* wakeup every n events */
> @@ -965,9 +966,32 @@ enum perf_event_type {
> */
> PERF_RECORD_NAMESPACES = 16,
>
> + /*
> + * Record ksymbol register/unregister events:
> + *
> + * struct {
> + * struct perf_event_header header;
> + * u64 addr;
> + * u32 len;
> + * u16 ksym_type;
> + * u16 flags;
> + * char name[];
> + * struct sample_id sample_id;
> + * };
> + */
> + PERF_RECORD_KSYMBOL = 17,
> +
> PERF_RECORD_MAX, /* non-ABI */
> };
>
> +enum perf_record_ksymbol_type {
> + PERF_RECORD_KSYMBOL_TYPE_UNKNOWN = 0,
> + PERF_RECORD_KSYMBOL_TYPE_BPF = 1,
> + PERF_RECORD_KSYMBOL_TYPE_MAX /* non-ABI */
> +};
> +
> +#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0)
> +
> #define PERF_MAX_STACK_DEPTH 127
> #define PERF_MAX_CONTEXTS_PER_STACK 8
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 3cd13a30f732..ef27f2776999 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -385,6 +385,7 @@ static atomic_t nr_namespaces_events __read_mostly;
> static atomic_t nr_task_events __read_mostly;
> static atomic_t nr_freq_events __read_mostly;
> static atomic_t nr_switch_events __read_mostly;
> +static atomic_t nr_ksymbol_events __read_mostly;
>
> static LIST_HEAD(pmus);
> static DEFINE_MUTEX(pmus_lock);
> @@ -4235,7 +4236,7 @@ static bool is_sb_event(struct perf_event *event)
>
> if (attr->mmap || attr->mmap_data || attr->mmap2 ||
> attr->comm || attr->comm_exec ||
> - attr->task ||
> + attr->task || attr->ksymbol ||
> attr->context_switch)
> return true;
> return false;
> @@ -4305,6 +4306,8 @@ static void unaccount_event(struct perf_event *event)
> dec = true;
> if (has_branch_stack(event))
> dec = true;
> + if (event->attr.ksymbol)
> + atomic_dec(&nr_ksymbol_events);
>
> if (dec) {
> if (!atomic_add_unless(&perf_sched_count, -1, 1))
> @@ -7650,6 +7653,97 @@ static void perf_log_throttle(struct perf_event *event, int enable)
> perf_output_end(&handle);
> }
>
> +/*
> + * ksymbol register/unregister tracking
> + */
> +
> +struct perf_ksymbol_event {
> + const char *name;
> + int name_len;
> + struct {
> + struct perf_event_header header;
> + u64 addr;
> + u32 len;
> + u16 ksym_type;
> + u16 flags;
> + } event_id;
> +};
> +
> +static int perf_event_ksymbol_match(struct perf_event *event)
> +{
> + return event->attr.ksymbol;
> +}
> +
> +static void perf_event_ksymbol_output(struct perf_event *event, void *data)
> +{
> + struct perf_ksymbol_event *ksymbol_event = data;
> + struct perf_output_handle handle;
> + struct perf_sample_data sample;
> + int ret;
> +
> + if (!perf_event_ksymbol_match(event))
> + return;
> +
> + perf_event_header__init_id(&ksymbol_event->event_id.header,
> + &sample, event);
> + ret = perf_output_begin(&handle, event,
> + ksymbol_event->event_id.header.size);
> + if (ret)
> + return;
> +
> + perf_output_put(&handle, ksymbol_event->event_id);
> + __output_copy(&handle, ksymbol_event->name, ksymbol_event->name_len);
> + perf_event__output_id_sample(event, &handle, &sample);
> +
> + perf_output_end(&handle);
> +}
> +
> +void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len, bool unregister,
> + perf_ksymbol_get_name_f get_name, void *data)
> +{
> + struct perf_ksymbol_event ksymbol_event;
> + char name[KSYM_NAME_LEN];
> + u16 flags = 0;
> + int name_len;
> +
> + if (!atomic_read(&nr_ksymbol_events))
> + return;
> +
> + if (ksym_type >= PERF_RECORD_KSYMBOL_TYPE_MAX ||
> + ksym_type == PERF_RECORD_KSYMBOL_TYPE_UNKNOWN)
> + goto err;
> +
> + get_name(name, KSYM_NAME_LEN, data);
> + name_len = strlen(name) + 1;
> + while (!IS_ALIGNED(name_len, sizeof(u64)))
> + name[name_len++] = '\0';
> + BUILD_BUG_ON(KSYM_NAME_LEN % sizeof(u64));
> +
> + if (unregister)
> + flags |= PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER;
> +
> + ksymbol_event = (struct perf_ksymbol_event){
> + .name = name,
> + .name_len = name_len,
> + .event_id = {
> + .header = {
> + .type = PERF_RECORD_KSYMBOL,
> + .size = sizeof(ksymbol_event.event_id) +
> + name_len,
> + },
> + .addr = addr,
> + .len = len,
> + .ksym_type = ksym_type,
> + .flags = flags,
> + },
> + };
> +
> + perf_iterate_sb(perf_event_ksymbol_output, &ksymbol_event, NULL);
> + return;
> +err:
> + WARN_ONCE(1, "%s: Invalid KSYMBOL type 0x%x\n", __func__, ksym_type);
> +}
> +
> void perf_event_itrace_started(struct perf_event *event)
> {
> event->attach_state |= PERF_ATTACH_ITRACE;
> @@ -9900,6 +9994,8 @@ static void account_event(struct perf_event *event)
> inc = true;
> if (is_cgroup_event(event))
> inc = true;
> + if (event->attr.ksymbol)
> + atomic_inc(&nr_ksymbol_events);
>
> if (inc) {
> /*
> --
> 2.17.1
--
- Arnaldo
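The name handling in perf_event_ksymbol() above pads the NUL-terminated name with extra NUL bytes until its length is a multiple of sizeof(u64), then adds that length to the size of the fixed part of the record. A small user-space sketch of the same arithmetic follows; `NAME_BUF_LEN`, `pad_ksym_name` and `ksymbol_record_size` are hypothetical names, and the buffer size of 128 is only an illustrative stand-in for KSYM_NAME_LEN.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for KSYM_NAME_LEN; must be a multiple of 8 so padding fits. */
#define NAME_BUF_LEN 128

/*
 * Pad the name in place with NUL bytes up to the next 8-byte boundary.
 * Returns the padded length, counting the terminating NUL(s).
 */
static size_t pad_ksym_name(char name[NAME_BUF_LEN])
{
	size_t n = strlen(name) + 1;

	while (n % sizeof(uint64_t))
		name[n++] = '\0';
	return n;
}

/*
 * Size of the record before any sample_id is appended:
 * header (8) + addr (8) + len (4) + ksym_type (2) + flags (2) + name.
 */
static size_t ksymbol_record_size(size_t padded_name_len)
{
	return 8 + 8 + 4 + 2 + 2 + padded_name_len;
}
```

Because the buffer length is itself a multiple of 8, the padding loop can never write past the end of the buffer, which is what the BUILD_BUG_ON in the patch guards.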
> On Jan 10, 2019, at 10:55 AM, Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> On Thu, Jan 10, 2019 at 06:40:37PM +0000, Song Liu wrote:
>>
>>
>>> On Jan 10, 2019, at 10:24 AM, Arnaldo Carvalho de Melo <[email protected]> wrote:
>>>
>>> On Wed, Jan 09, 2019 at 11:21:05AM -0800, Song Liu wrote:
>>>> For better performance analysis of dynamically JITed and loaded kernel
>>>> functions, such as BPF programs, this patch introduces
>>>> PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
>>>> register/unregister information to user space.
>>>>
>>>> The following data structure is used for PERF_RECORD_KSYMBOL.
>>>>
>>>> /*
>>>> * struct {
>>>> * struct perf_event_header header;
>>>> * u64 addr;
>>>> * u32 len;
>>>> * u16 ksym_type;
>>>> * u16 flags;
>>>> * char name[];
>>>> * struct sample_id sample_id;
>>>> * };
>>>> */
>>>
>>> So, I couldn't find where this gets used. Is the intention here just to
>>> add the interfaces, with the wiring up to follow afterwards? I would
>>> like to test the whole shebang to see it working.
>>
>> I guess you meant PERF_RECORD_BPF_EVENT not being used?
>>
>> PERF_RECORD_KSYMBOL is used by BPF in 3/7 and 5/7. I tested
>
> Oops, I didn't look at 3/7, just read its cset summary line and as it
> says:
>
> Subject: [PATCH v6 perf, bpf-next 3/7] perf, bpf: introduce PERF_RECORD_BPF_EVENT
>
> I didn't think it was related, perhaps break it down into one that
> states that it is wiring up PERF_RECORD_KSYMBOL, and at that point we
> could just test it, getting the notifications for new kallsyms related
> to BPF?
Good idea! I will split it into two patches as:
[3/8] perf, bpf: generate PERF_RECORD_KSYMBOL for BPF program
[4/8] perf, bpf: introduce PERF_RECORD_BPF_EVENT
>
>> PERF_RECORD_BPF_EVENT with dump_trace. As we separate RECORD_KSYMBOL from
>> RECORD_BPF_EVENT, user space won't use BPF_EVENT until annotation support.
>
> Right, so why not just introduce PERF_RECORD_KSYMBOL, make it be used by
> tooling, etc, then move on to PERF_RECORD_BPF_EVENT?
I'd like to make sure we all agree on the new ABI for RECORD_KSYMBOL and
RECORD_BPF_EVENT. Multiple user space tools depend on RECORD_BPF_EVENT,
for example, bcc and auditing. Finalizing RECORD_BPF_EVENT will unblock the
development of these tools. On the perf side, it will take us quite some time
to finish annotation. Ideally, I don't want to block the development of
other tools for so long.
Thanks,
Song
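As a rough illustration of what a tool consuming these records might do, the sketch below keeps every registered symbol in a table and only marks it as dead on an unregister event, so an address sampled earlier can still be resolved after the program is unloaded. All names here (`ksym_table`, `ksym_table_update`, `ksym_table_lookup`) and the fixed table size are hypothetical, and the sketch ignores event timestamps, which a real tool would need in order to handle address reuse.

```c
#include <stdint.h>
#include <string.h>

#define MAX_SYMS      64   /* illustrative fixed capacity */
#define KSYM_NAME_BUF 128  /* illustrative name buffer size */

struct ksym {
	uint64_t addr;
	uint32_t len;
	int live;                  /* cleared once an unregister event is seen */
	char name[KSYM_NAME_BUF];
};

struct ksym_table {
	struct ksym syms[MAX_SYMS];
	int count;
};

/* Apply one KSYMBOL event; bit 0 of flags means unregister. */
static int ksym_table_update(struct ksym_table *t, uint64_t addr,
			     uint32_t len, uint16_t flags, const char *name)
{
	struct ksym *s;
	int i;

	if (flags & 1) {
		/* Mark the most recent live entry at this address as dead. */
		for (i = t->count - 1; i >= 0; i--) {
			if (t->syms[i].live && t->syms[i].addr == addr) {
				t->syms[i].live = 0;
				return 0;
			}
		}
		return -1;
	}
	if (t->count == MAX_SYMS)
		return -1;
	s = &t->syms[t->count++];
	s->addr = addr;
	s->len = len;
	s->live = 1;
	strncpy(s->name, name, KSYM_NAME_BUF - 1);
	s->name[KSYM_NAME_BUF - 1] = '\0';
	return 0;
}

/* Resolve ip to the newest symbol covering it, live or not. */
static const char *ksym_table_lookup(const struct ksym_table *t, uint64_t ip)
{
	int i;

	for (i = t->count - 1; i >= 0; i--) {
		const struct ksym *s = &t->syms[i];

		if (ip >= s->addr && ip - s->addr < s->len)
			return s->name;
	}
	return NULL;
}
```

Keeping dead entries rather than deleting them is the design choice that lets a report resolve samples taken while the program was still loaded.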
On Thu, Jan 10, 2019 at 07:30:22PM +0000, Song Liu wrote:
>
>
> > On Jan 10, 2019, at 10:55 AM, Arnaldo Carvalho de Melo <[email protected]> wrote:
> >
> > On Thu, Jan 10, 2019 at 06:40:37PM +0000, Song Liu wrote:
> >>
> >>
> >>> On Jan 10, 2019, at 10:24 AM, Arnaldo Carvalho de Melo <[email protected]> wrote:
> >>>
> >>> On Wed, Jan 09, 2019 at 11:21:05AM -0800, Song Liu wrote:
> >>>> For better performance analysis of dynamically JITed and loaded kernel
> >>>> functions, such as BPF programs, this patch introduces
> >>>> PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
> >>>> register/unregister information to user space.
> >>>>
> >>>> The following data structure is used for PERF_RECORD_KSYMBOL.
> >>>>
> >>>> /*
> >>>> * struct {
> >>>> * struct perf_event_header header;
> >>>> * u64 addr;
> >>>> * u32 len;
> >>>> * u16 ksym_type;
> >>>> * u16 flags;
> >>>> * char name[];
> >>>> * struct sample_id sample_id;
> >>>> * };
> >>>> */
> >>>
> >>> So, I couldn't find where this gets used. Is the intention here just to
> >>> add the interfaces, with the wiring up to follow afterwards? I would
> >>> like to test the whole shebang to see it working.
> >>
> >> I guess you meant PERF_RECORD_BPF_EVENT not being used?
> >>
> >> PERF_RECORD_KSYMBOL is used by BPF in 3/7 and 5/7. I tested
> >
> > Oops, I didn't look at 3/7, just read its cset summary line and as it
> > says:
> >
> > Subject: [PATCH v6 perf, bpf-next 3/7] perf, bpf: introduce PERF_RECORD_BPF_EVENT
> >
> > I didn't think it was related, perhaps break it down into one that
> > states that it is wiring up PERF_RECORD_KSYMBOL, and at that point we
> > could just test it, getting the notifications for new kallsyms related
> > to BPF?
>
> Good idea! I will split it into two patches as:
>
> [3/8] perf, bpf: generate PERF_RECORD_KSYMBOL for BPF program
> [4/8] perf, bpf: introduce PERF_RECORD_BPF_EVENT
Thanks! I'm juggling a lot of stuff right now, so I didn't read all
patches in the series, just the first one and when I couldn't find where
perf_event_ksymbol() was being called in that patch nor by looking at
just the Subject for the others, I gave up and got back to pahole day :-)
> >> PERF_RECORD_BPF_EVENT with dump_trace. As we separate RECORD_KSYMBOL from
> >> RECORD_BPF_EVENT, user space won't use BPF_EVENT until annotation support.
> >
> > Right, so why not just introduce PERF_RECORD_KSYMBOL, make it be used by
> > tooling, etc, then move on to PERF_RECORD_BPF_EVENT?
>
> I'd like to make sure we all agree on the new ABI for RECORD_KSYMBOL and
> RECORD_BPF_EVENT. Multiple user space tools depend on RECORD_BPF_EVENT,
> for example, bcc and auditing. Finalizing RECORD_BPF_EVENT will unblock the
> development of these tools. On the perf side, it will take us quite some time
> to finish annotation. Ideally, I don't want to block the development of
> other tools for so long.
With that 3/7 split I guess we can go on with what is in this patchset
if PeterZ is happy with it.
- Arnaldo
> On Jan 10, 2019, at 11:30 AM, Song Liu <[email protected]> wrote:
>
>
>
>> On Jan 10, 2019, at 10:55 AM, Arnaldo Carvalho de Melo <[email protected]> wrote:
>>
>> On Thu, Jan 10, 2019 at 06:40:37PM +0000, Song Liu wrote:
>>>
>>>
>>>> On Jan 10, 2019, at 10:24 AM, Arnaldo Carvalho de Melo <[email protected]> wrote:
>>>>
>>>> On Wed, Jan 09, 2019 at 11:21:05AM -0800, Song Liu wrote:
>>>>> For better performance analysis of dynamically JITed and loaded kernel
>>>>> functions, such as BPF programs, this patch introduces
>>>>> PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
>>>>> register/unregister information to user space.
>>>>>
>>>>> The following data structure is used for PERF_RECORD_KSYMBOL.
>>>>>
>>>>> /*
>>>>> * struct {
>>>>> * struct perf_event_header header;
>>>>> * u64 addr;
>>>>> * u32 len;
>>>>> * u16 ksym_type;
>>>>> * u16 flags;
>>>>> * char name[];
>>>>> * struct sample_id sample_id;
>>>>> * };
>>>>> */
>>>>
>>>> So, I couldn't find where this gets used. Is the intention here just to
>>>> add the interfaces, with the wiring up to follow afterwards? I would
>>>> like to test the whole shebang to see it working.
>>>
>>> I guess you meant PERF_RECORD_BPF_EVENT not being used?
>>>
>>> PERF_RECORD_KSYMBOL is used by BPF in 3/7 and 5/7. I tested
>>
>> Oops, I didn't look at 3/7, just read its cset summary line and as it
>> says:
>>
>> Subject: [PATCH v6 perf, bpf-next 3/7] perf, bpf: introduce PERF_RECORD_BPF_EVENT
>>
>> I didn't think it was related, perhaps break it down into one that
>> states that it is wiring up PERF_RECORD_KSYMBOL, and at that point we
>> could just test it, getting the notifications for new kallsyms related
>> to BPF?
>
> Good idea! I will split it into two patches as:
>
> [3/8] perf, bpf: generate PERF_RECORD_KSYMBOL for BPF program
> [4/8] perf, bpf: introduce PERF_RECORD_BPF_EVENT
>
>>
>>> PERF_RECORD_BPF_EVENT with dump_trace. As we separate RECORD_KSYMBOL from
>>> RECORD_BPF_EVENT, user space won't use BPF_EVENT until annotation support.
>>
>> Right, so why not just introduce PERF_RECORD_KSYMBOL, make it be used by
>> tooling, etc, then move on to PERF_RECORD_BPF_EVENT?
>
> I'd like to make sure we all agree on the new ABI for RECORD_KSYMBOL and
> RECORD_BPF_EVENT. Multiple user space tools depend on RECORD_BPF_EVENT,
> for example, bcc and auditing. Finalizing RECORD_BPF_EVENT will unblock the
> development of these tools. On the perf side, it will take us quite some time
> to finish annotation. Ideally, I don't want to block the development of
> other tools for so long.
>
> Thanks,
> Song
+ DavidA
Hi David,
Could you please share your feedback on PERF_RECORD_BPF_EVENT for auditing
use cases?
Thanks,
Song
On 1/10/19 12:45 PM, Song Liu wrote:
> Could you please share your feedback on PERF_RECORD_BPF_EVENT for auditing
> use cases?
Google shows Daniel was the one looking at audit use cases:
https://www.mail-archive.com/[email protected]/msg250728.html
My comment was that using a PERF_RECORD_BPF_EVENT limits the usability
with combinations of other tracepoints (e.g., scheduling) when tracing
processes.