2023-01-18 06:51:15

by Namhyung Kim

[permalink] [raw]
Subject: [PATCHSET 0/8] perf/core: Prepare sample data for BPF (v3)

Hello,

The perf_prepare_sample() is to fill the perf sample data and update the
header info before sending it to the ring buffer. But we want to use it
for BPF overflow handler so that it can access the sample data to filter
relevant ones.

Changes in v3)
* update data->sample_flags in __perf_event_header__init_id() (PeterZ)
* add Acked-by's from Jiri and Song

Changes in v2)
* the layout change is merged
* reduce branches using __cond_set (PeterZ)
* add helpers to set dynamic sample data (PeterZ)
* introduce perf_prepare_header() (PeterZ)
* call perf_prepare_sample() before bpf_overflow_handler unconditionally

This means the perf_prepare_handler() can be called more than once. To
avoid duplicate work, use the data->sample_flags and save the data size.

I also added a few of helpers to set those information accordingly.
But it looks some fields like REGS_USER, STACK_USER and AUX are saved in
the perf_prepare_sample() so I didn't add the helpers for them.

After than we can just check the filtered_sample_type flags begin zero
to determine if it has more work. In that case, it needs to update the
data->type since it's possible to miss when PMU driver sets all required
sample flags before calling perf_prepare_sample().

The code is also available at 'perf/prepare-sample-v3' branch in

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung

Cc: Song Liu <[email protected]>
Cc: [email protected]

Namhyung Kim (8):
perf/core: Save the dynamic parts of sample data size
perf/core: Add perf_sample_save_callchain() helper
perf/core: Add perf_sample_save_raw_data() helper
perf/core: Add perf_sample_save_brstack() helper
perf/core: Set data->sample_flags in perf_prepare_sample()
perf/core: Do not pass header for sample id init
perf/core: Introduce perf_prepare_header()
perf/core: Call perf_prepare_sample() before running BPF

arch/powerpc/perf/core-book3s.c | 3 +-
arch/s390/kernel/perf_cpum_cf.c | 4 +-
arch/s390/kernel/perf_cpum_sf.c | 3 +-
arch/s390/kernel/perf_pai_crypto.c | 4 +-
arch/s390/kernel/perf_pai_ext.c | 4 +-
arch/x86/events/amd/core.c | 6 +-
arch/x86/events/amd/ibs.c | 9 +-
arch/x86/events/intel/core.c | 6 +-
arch/x86/events/intel/ds.c | 24 ++--
include/linux/perf_event.h | 133 +++++++++++++-----
kernel/events/core.c | 207 ++++++++++++++++-------------
kernel/trace/bpf_trace.c | 6 +-
12 files changed, 236 insertions(+), 173 deletions(-)


base-commit: 9fcad995c6c52cc9791f7ee9f1386a5684055f9c
--
2.39.0.314.g84b9a713c41-goog


2023-01-18 07:09:42

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 1/8] perf/core: Save the dynamic parts of sample data size

The perf sample data can be divided into parts. The event->header_size
and event->id_header_size keep the static part of the sample data which
is determined by the sample_type flags.

But other parts like CALLCHAIN and BRANCH_STACK are changing dynamically
so it needs to see the actual data. In preparation of handling repeated
calls for perf_prepare_sample(), it can save the dynamic size to the
perf sample data to avoid the duplicate work.

Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
include/linux/perf_event.h | 2 ++
kernel/events/core.c | 17 ++++++++++-------
2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 03949d017ac9..16b980014449 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1103,6 +1103,7 @@ struct perf_sample_data {
*/
u64 sample_flags;
u64 period;
+ u64 dyn_size;

/*
* Fields commonly set by __perf_event_header__init_id(),
@@ -1158,6 +1159,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
/* remaining struct members initialized in perf_prepare_sample() */
data->sample_flags = PERF_SAMPLE_PERIOD;
data->period = period;
+ data->dyn_size = 0;

if (addr) {
data->addr = addr;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index eacc3702654d..8c8de26f04ab 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7593,7 +7593,7 @@ void perf_prepare_sample(struct perf_event_header *header,

size += data->callchain->nr;

- header->size += size * sizeof(u64);
+ data->dyn_size += size * sizeof(u64);
}

if (sample_type & PERF_SAMPLE_RAW) {
@@ -7619,7 +7619,7 @@ void perf_prepare_sample(struct perf_event_header *header,
data->raw = NULL;
}

- header->size += size;
+ data->dyn_size += size;
}

if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
@@ -7631,7 +7631,7 @@ void perf_prepare_sample(struct perf_event_header *header,
size += data->br_stack->nr
* sizeof(struct perf_branch_entry);
}
- header->size += size;
+ data->dyn_size += size;
}

if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))
@@ -7646,7 +7646,7 @@ void perf_prepare_sample(struct perf_event_header *header,
size += hweight64(mask) * sizeof(u64);
}

- header->size += size;
+ data->dyn_size += size;
}

if (sample_type & PERF_SAMPLE_STACK_USER) {
@@ -7671,7 +7671,7 @@ void perf_prepare_sample(struct perf_event_header *header,
size += sizeof(u64) + stack_size;

data->stack_user_size = stack_size;
- header->size += size;
+ data->dyn_size += size;
}

if (filtered_sample_type & PERF_SAMPLE_WEIGHT_TYPE)
@@ -7700,7 +7700,7 @@ void perf_prepare_sample(struct perf_event_header *header,
size += hweight64(mask) * sizeof(u64);
}

- header->size += size;
+ data->dyn_size += size;
}

if (sample_type & PERF_SAMPLE_PHYS_ADDR &&
@@ -7745,8 +7745,11 @@ void perf_prepare_sample(struct perf_event_header *header,
size = perf_prepare_sample_aux(event, data, size);

WARN_ON_ONCE(size + header->size > U16_MAX);
- header->size += size;
+ data->dyn_size += size + sizeof(u64); /* size above */
}
+
+ header->size += data->dyn_size;
+
/*
* If you're adding more sample types here, you likely need to do
* something about the overflowing header::size, like repurpose the
--
2.39.0.314.g84b9a713c41-goog

2023-01-18 07:19:52

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 5/8] perf/core: Set data->sample_flags in perf_prepare_sample()

The perf_prepare_sample() sets the perf_sample_data according to the
attr->sample_type before copying it to the ring buffer. But BPF also
wants to access the sample data so it needs to prepare the sample even
before the regular path.

That means the perf_prepare_sample() can be called more than once. Set
the data->sample_flags consistently so that it can indicate which fields
are set already and skip them if sets.

Also update the filtered_sample_type to have the dependent flags to
reduce the number of branches.

Suggested-by: Peter Zijlstra <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/events/core.c | 85 +++++++++++++++++++++++++++++++++-----------
1 file changed, 65 insertions(+), 20 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0218b6ffaf36..1384137a90f7 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7053,12 +7053,21 @@ static void perf_aux_sample_output(struct perf_event *event,
ring_buffer_put(rb);
}

+/*
+ * A set of common sample data types saved even for non-sample records
+ * when event->attr.sample_id_all is set.
+ */
+#define PERF_SAMPLE_ID_ALL (PERF_SAMPLE_TID | PERF_SAMPLE_TIME | \
+ PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
+ PERF_SAMPLE_CPU | PERF_SAMPLE_IDENTIFIER)
+
static void __perf_event_header__init_id(struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event,
u64 sample_type)
{
data->type = event->attr.sample_type;
+ data->sample_flags |= data->type & PERF_SAMPLE_ID_ALL;
header->size += event->id_header_size;

if (sample_type & PERF_SAMPLE_TID) {
@@ -7561,6 +7570,11 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
return callchain ?: &__empty_callchain;
}

+static __always_inline u64 __cond_set(u64 flags, u64 s, u64 d)
+{
+ return d * !!(flags & s);
+}
+
void perf_prepare_sample(struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event,
@@ -7576,14 +7590,24 @@ void perf_prepare_sample(struct perf_event_header *header,
header->misc |= perf_misc_flags(regs);

/*
- * Clear the sample flags that have already been done by the
- * PMU driver.
+ * Add the sample flags that are dependent to others. And clear the
+ * sample flags that have already been done by the PMU driver.
*/
- filtered_sample_type = sample_type & ~data->sample_flags;
+ filtered_sample_type = sample_type;
+ filtered_sample_type |= __cond_set(sample_type, PERF_SAMPLE_CODE_PAGE_SIZE,
+ PERF_SAMPLE_IP);
+ filtered_sample_type |= __cond_set(sample_type, PERF_SAMPLE_DATA_PAGE_SIZE |
+ PERF_SAMPLE_PHYS_ADDR, PERF_SAMPLE_ADDR);
+ filtered_sample_type |= __cond_set(sample_type, PERF_SAMPLE_STACK_USER,
+ PERF_SAMPLE_REGS_USER);
+ filtered_sample_type &= ~data->sample_flags;
+
__perf_event_header__init_id(header, data, event, filtered_sample_type);

- if (sample_type & (PERF_SAMPLE_IP | PERF_SAMPLE_CODE_PAGE_SIZE))
+ if (filtered_sample_type & PERF_SAMPLE_IP) {
data->ip = perf_instruction_pointer(regs);
+ data->sample_flags |= PERF_SAMPLE_IP;
+ }

if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
perf_sample_save_callchain(data, event, regs);
@@ -7600,10 +7624,15 @@ void perf_prepare_sample(struct perf_event_header *header,
data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
}

- if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))
+ if (filtered_sample_type & PERF_SAMPLE_REGS_USER)
perf_sample_regs_user(&data->regs_user, regs);

- if (sample_type & PERF_SAMPLE_REGS_USER) {
+ /*
+ * It cannot use the filtered_sample_type here as REGS_USER can be set
+ * by STACK_USER (using __cond_set() above) and we don't want to update
+ * the dyn_size if it's not requested by users.
+ */
+ if ((sample_type & ~data->sample_flags) & PERF_SAMPLE_REGS_USER) {
/* regs dump ABI info */
int size = sizeof(u64);

@@ -7613,9 +7642,10 @@ void perf_prepare_sample(struct perf_event_header *header,
}

data->dyn_size += size;
+ data->sample_flags |= PERF_SAMPLE_REGS_USER;
}

- if (sample_type & PERF_SAMPLE_STACK_USER) {
+ if (filtered_sample_type & PERF_SAMPLE_STACK_USER) {
/*
* Either we need PERF_SAMPLE_STACK_USER bit to be always
* processed as the last one or have additional check added
@@ -7638,23 +7668,30 @@ void perf_prepare_sample(struct perf_event_header *header,

data->stack_user_size = stack_size;
data->dyn_size += size;
+ data->sample_flags |= PERF_SAMPLE_STACK_USER;
}

- if (filtered_sample_type & PERF_SAMPLE_WEIGHT_TYPE)
+ if (filtered_sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
data->weight.full = 0;
+ data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+ }

- if (filtered_sample_type & PERF_SAMPLE_DATA_SRC)
+ if (filtered_sample_type & PERF_SAMPLE_DATA_SRC) {
data->data_src.val = PERF_MEM_NA;
+ data->sample_flags |= PERF_SAMPLE_DATA_SRC;
+ }

- if (filtered_sample_type & PERF_SAMPLE_TRANSACTION)
+ if (filtered_sample_type & PERF_SAMPLE_TRANSACTION) {
data->txn = 0;
+ data->sample_flags |= PERF_SAMPLE_TRANSACTION;
+ }

- if (sample_type & (PERF_SAMPLE_ADDR | PERF_SAMPLE_PHYS_ADDR | PERF_SAMPLE_DATA_PAGE_SIZE)) {
- if (filtered_sample_type & PERF_SAMPLE_ADDR)
- data->addr = 0;
+ if (filtered_sample_type & PERF_SAMPLE_ADDR) {
+ data->addr = 0;
+ data->sample_flags |= PERF_SAMPLE_ADDR;
}

- if (sample_type & PERF_SAMPLE_REGS_INTR) {
+ if (filtered_sample_type & PERF_SAMPLE_REGS_INTR) {
/* regs dump ABI info */
int size = sizeof(u64);

@@ -7667,19 +7704,22 @@ void perf_prepare_sample(struct perf_event_header *header,
}

data->dyn_size += size;
+ data->sample_flags |= PERF_SAMPLE_REGS_INTR;
}

- if (sample_type & PERF_SAMPLE_PHYS_ADDR &&
- filtered_sample_type & PERF_SAMPLE_PHYS_ADDR)
+ if (filtered_sample_type & PERF_SAMPLE_PHYS_ADDR) {
data->phys_addr = perf_virt_to_phys(data->addr);
+ data->sample_flags |= PERF_SAMPLE_PHYS_ADDR;
+ }

#ifdef CONFIG_CGROUP_PERF
- if (sample_type & PERF_SAMPLE_CGROUP) {
+ if (filtered_sample_type & PERF_SAMPLE_CGROUP) {
struct cgroup *cgrp;

/* protected by RCU */
cgrp = task_css_check(current, perf_event_cgrp_id, 1)->cgroup;
data->cgroup = cgroup_id(cgrp);
+ data->sample_flags |= PERF_SAMPLE_CGROUP;
}
#endif

@@ -7688,13 +7728,17 @@ void perf_prepare_sample(struct perf_event_header *header,
* require PERF_SAMPLE_ADDR, kernel implicitly retrieve the data->addr,
* but the value will not dump to the userspace.
*/
- if (sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)
+ if (filtered_sample_type & PERF_SAMPLE_DATA_PAGE_SIZE) {
data->data_page_size = perf_get_page_size(data->addr);
+ data->sample_flags |= PERF_SAMPLE_DATA_PAGE_SIZE;
+ }

- if (sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)
+ if (filtered_sample_type & PERF_SAMPLE_CODE_PAGE_SIZE) {
data->code_page_size = perf_get_page_size(data->ip);
+ data->sample_flags |= PERF_SAMPLE_CODE_PAGE_SIZE;
+ }

- if (sample_type & PERF_SAMPLE_AUX) {
+ if (filtered_sample_type & PERF_SAMPLE_AUX) {
u64 size;

header->size += sizeof(u64); /* size */
@@ -7712,6 +7756,7 @@ void perf_prepare_sample(struct perf_event_header *header,

WARN_ON_ONCE(size + header->size > U16_MAX);
data->dyn_size += size + sizeof(u64); /* size above */
+ data->sample_flags |= PERF_SAMPLE_AUX;
}

header->size += data->dyn_size;
--
2.39.0.314.g84b9a713c41-goog

2023-01-18 07:21:30

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 6/8] perf/core: Do not pass header for sample id init

The only thing it does for header in __perf_event_header__init_id() is
to update the header size with event->id_header_size. We can do this
outside and get rid of the argument for the later change.

Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/events/core.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1384137a90f7..9cc55122188f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7061,14 +7061,12 @@ static void perf_aux_sample_output(struct perf_event *event,
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
PERF_SAMPLE_CPU | PERF_SAMPLE_IDENTIFIER)

-static void __perf_event_header__init_id(struct perf_event_header *header,
- struct perf_sample_data *data,
+static void __perf_event_header__init_id(struct perf_sample_data *data,
struct perf_event *event,
u64 sample_type)
{
data->type = event->attr.sample_type;
data->sample_flags |= data->type & PERF_SAMPLE_ID_ALL;
- header->size += event->id_header_size;

if (sample_type & PERF_SAMPLE_TID) {
/* namespace issues */
@@ -7095,8 +7093,10 @@ void perf_event_header__init_id(struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event)
{
- if (event->attr.sample_id_all)
- __perf_event_header__init_id(header, data, event, event->attr.sample_type);
+ if (event->attr.sample_id_all) {
+ header->size += event->id_header_size;
+ __perf_event_header__init_id(data, event, event->attr.sample_type);
+ }
}

static void __perf_event__output_id_sample(struct perf_output_handle *handle,
@@ -7584,7 +7584,7 @@ void perf_prepare_sample(struct perf_event_header *header,
u64 filtered_sample_type;

header->type = PERF_RECORD_SAMPLE;
- header->size = sizeof(*header) + event->header_size;
+ header->size = sizeof(*header) + event->header_size + event->id_header_size;

header->misc = 0;
header->misc |= perf_misc_flags(regs);
@@ -7602,7 +7602,7 @@ void perf_prepare_sample(struct perf_event_header *header,
PERF_SAMPLE_REGS_USER);
filtered_sample_type &= ~data->sample_flags;

- __perf_event_header__init_id(header, data, event, filtered_sample_type);
+ __perf_event_header__init_id(data, event, filtered_sample_type);

if (filtered_sample_type & PERF_SAMPLE_IP) {
data->ip = perf_instruction_pointer(regs);
--
2.39.0.314.g84b9a713c41-goog

2023-01-18 07:30:09

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 4/8] perf/core: Add perf_sample_save_brstack() helper

When it saves the branch stack to the perf sample data, it needs to
update the sample flags and the dynamic size. To make sure this,
add the perf_sample_save_brstack() helper and convert all call sites.

Cc: [email protected]
Cc: [email protected]
Suggested-by: Peter Zijlstra <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
arch/powerpc/perf/core-book3s.c | 3 +-
arch/x86/events/amd/core.c | 6 +--
arch/x86/events/intel/core.c | 6 +--
arch/x86/events/intel/ds.c | 9 ++---
include/linux/perf_event.h | 66 ++++++++++++++++++++-------------
kernel/events/core.c | 16 +++-----
6 files changed, 53 insertions(+), 53 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index bf318dd9b709..8c1f7def596e 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2313,8 +2313,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
struct cpu_hw_events *cpuhw;
cpuhw = this_cpu_ptr(&cpu_hw_events);
power_pmu_bhrb_read(event, cpuhw);
- data.br_stack = &cpuhw->bhrb_stack;
- data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+ perf_sample_save_brstack(&data, event, &cpuhw->bhrb_stack);
}

if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index d6f3703e4119..463f3eb8bbd7 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -928,10 +928,8 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
if (!x86_perf_event_set_period(event))
continue;

- if (has_branch_stack(event)) {
- data.br_stack = &cpuc->lbr_stack;
- data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
- }
+ if (has_branch_stack(event))
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);

if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 29d2d0411caf..14f0a746257d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3036,10 +3036,8 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)

perf_sample_data_init(&data, 0, event->hw.last_period);

- if (has_branch_stack(event)) {
- data.br_stack = &cpuc->lbr_stack;
- data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
- }
+ if (has_branch_stack(event))
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);

if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 158cf845fc80..07c8a2cdc3ee 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1720,10 +1720,8 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
data->sample_flags |= PERF_SAMPLE_TIME;
}

- if (has_branch_stack(event)) {
- data->br_stack = &cpuc->lbr_stack;
- data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
- }
+ if (has_branch_stack(event))
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
}

static void adaptive_pebs_save_regs(struct pt_regs *regs,
@@ -1883,8 +1881,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,

if (has_branch_stack(event)) {
intel_pmu_store_pebs_lbrs(lbr);
- data->br_stack = &cpuc->lbr_stack;
- data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
}
}

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 569dfac5887f..7db0e9cc2682 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1102,6 +1102,31 @@ extern u64 perf_event_read_value(struct perf_event *event,

extern struct perf_callchain_entry *perf_callchain(struct perf_event *event, struct pt_regs *regs);

+static inline bool branch_sample_no_flags(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_FLAGS;
+}
+
+static inline bool branch_sample_no_cycles(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_CYCLES;
+}
+
+static inline bool branch_sample_type(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_TYPE_SAVE;
+}
+
+static inline bool branch_sample_hw_index(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
+}
+
+static inline bool branch_sample_priv(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_PRIV_SAVE;
+}
+

struct perf_sample_data {
/*
@@ -1210,6 +1235,21 @@ static inline void perf_sample_save_raw_data(struct perf_sample_data *data,
data->sample_flags |= PERF_SAMPLE_RAW;
}

+static inline void perf_sample_save_brstack(struct perf_sample_data *data,
+ struct perf_event *event,
+ struct perf_branch_stack *brs)
+{
+ int size = sizeof(u64); /* nr */
+
+ if (branch_sample_hw_index(event))
+ size += sizeof(u64);
+ size += brs->nr * sizeof(struct perf_branch_entry);
+
+ data->br_stack = brs;
+ data->dyn_size += size;
+ data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+}
+
/*
* Clear all bitfields in the perf_branch_entry.
* The to and from fields are not cleared because they are
@@ -1827,30 +1867,4 @@ static inline void perf_lopwr_cb(bool mode)
}
#endif

-#ifdef CONFIG_PERF_EVENTS
-static inline bool branch_sample_no_flags(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_FLAGS;
-}
-
-static inline bool branch_sample_no_cycles(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_CYCLES;
-}
-
-static inline bool branch_sample_type(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_TYPE_SAVE;
-}
-
-static inline bool branch_sample_hw_index(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
-}
-
-static inline bool branch_sample_priv(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_PRIV_SAVE;
-}
-#endif /* CONFIG_PERF_EVENTS */
#endif /* _LINUX_PERF_EVENT_H */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 133894ae5e30..0218b6ffaf36 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7317,7 +7317,7 @@ void perf_output_sample(struct perf_output_handle *handle,
}

if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
- if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
+ if (data->br_stack) {
size_t size;

size = data->br_stack->nr
@@ -7594,16 +7594,10 @@ void perf_prepare_sample(struct perf_event_header *header,
data->sample_flags |= PERF_SAMPLE_RAW;
}

- if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
- int size = sizeof(u64); /* nr */
- if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
- if (branch_sample_hw_index(event))
- size += sizeof(u64);
-
- size += data->br_stack->nr
- * sizeof(struct perf_branch_entry);
- }
- data->dyn_size += size;
+ if (filtered_sample_type & PERF_SAMPLE_BRANCH_STACK) {
+ data->br_stack = NULL;
+ data->dyn_size += sizeof(u64);
+ data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
}

if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))
--
2.39.0.314.g84b9a713c41-goog

2023-01-18 07:31:00

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 7/8] perf/core: Introduce perf_prepare_header()

Factor out perf_prepare_header() so that it can call
perf_prepare_sample() without a header if not needed.

Also it checks the filtered_sample_type to avoid duplicate
work when perf_prepare_sample() is called twice (or more).

Cc: [email protected]
Cc: [email protected]
Suggested-by: Peter Zijlstr <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
arch/s390/kernel/perf_cpum_sf.c | 3 ++-
arch/x86/events/intel/ds.c | 3 ++-
include/linux/perf_event.h | 16 +++++++++++++-
kernel/events/core.c | 38 +++++++++++++++++++++------------
4 files changed, 43 insertions(+), 17 deletions(-)

diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index 332a49965130..fd02f8423243 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -671,7 +671,8 @@ static void cpumsf_output_event_pid(struct perf_event *event,
/* Protect callchain buffers, tasks */
rcu_read_lock();

- perf_prepare_sample(&header, data, event, regs);
+ perf_prepare_sample(data, event, regs);
+ perf_prepare_header(&header, data, event, regs);
if (perf_output_begin(&handle, data, event, header.size))
goto out;

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 07c8a2cdc3ee..183efa914b99 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -807,7 +807,8 @@ int intel_pmu_drain_bts_buffer(void)
* the sample.
*/
rcu_read_lock();
- perf_prepare_sample(&header, &data, event, &regs);
+ perf_prepare_sample(&data, event, &regs);
+ perf_prepare_header(&header, &data, event, &regs);

if (perf_output_begin(&handle, &data, event,
header.size * (top - base - skip)))
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 7db0e9cc2682..d5628a7b5eaa 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1250,6 +1250,17 @@ static inline void perf_sample_save_brstack(struct perf_sample_data *data,
data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
}

+static inline u32 perf_sample_data_size(struct perf_sample_data *data,
+ struct perf_event *event)
+{
+ u32 size = sizeof(struct perf_event_header);
+
+ size += event->header_size + event->id_header_size;
+ size += data->dyn_size;
+
+ return size;
+}
+
/*
* Clear all bitfields in the perf_branch_entry.
* The to and from fields are not cleared because they are
@@ -1271,7 +1282,10 @@ extern void perf_output_sample(struct perf_output_handle *handle,
struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event);
-extern void perf_prepare_sample(struct perf_event_header *header,
+extern void perf_prepare_sample(struct perf_sample_data *data,
+ struct perf_event *event,
+ struct pt_regs *regs);
+extern void perf_prepare_header(struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event,
struct pt_regs *regs);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9cc55122188f..73c40ce84c48 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7575,20 +7575,13 @@ static __always_inline u64 __cond_set(u64 flags, u64 s, u64 d)
return d * !!(flags & s);
}

-void perf_prepare_sample(struct perf_event_header *header,
- struct perf_sample_data *data,
+void perf_prepare_sample(struct perf_sample_data *data,
struct perf_event *event,
struct pt_regs *regs)
{
u64 sample_type = event->attr.sample_type;
u64 filtered_sample_type;

- header->type = PERF_RECORD_SAMPLE;
- header->size = sizeof(*header) + event->header_size + event->id_header_size;
-
- header->misc = 0;
- header->misc |= perf_misc_flags(regs);
-
/*
* Add the sample flags that are dependent to others. And clear the
* sample flags that have already been done by the PMU driver.
@@ -7602,6 +7595,12 @@ void perf_prepare_sample(struct perf_event_header *header,
PERF_SAMPLE_REGS_USER);
filtered_sample_type &= ~data->sample_flags;

+ if (filtered_sample_type == 0) {
+ /* Make sure it has the correct data->type for output */
+ data->type = event->attr.sample_type;
+ return;
+ }
+
__perf_event_header__init_id(data, event, filtered_sample_type);

if (filtered_sample_type & PERF_SAMPLE_IP) {
@@ -7653,9 +7652,10 @@ void perf_prepare_sample(struct perf_event_header *header,
* up the rest of the sample size.
*/
u16 stack_size = event->attr.sample_stack_user;
+ u16 header_size = perf_sample_data_size(data, event);
u16 size = sizeof(u64);

- stack_size = perf_sample_ustack_size(stack_size, header->size,
+ stack_size = perf_sample_ustack_size(stack_size, header_size,
data->regs_user.regs);

/*
@@ -7740,8 +7740,9 @@ void perf_prepare_sample(struct perf_event_header *header,

if (filtered_sample_type & PERF_SAMPLE_AUX) {
u64 size;
+ u16 header_size = perf_sample_data_size(data, event);

- header->size += sizeof(u64); /* size */
+ header_size += sizeof(u64); /* size */

/*
* Given the 16bit nature of header::size, an AUX sample can
@@ -7749,17 +7750,25 @@ void perf_prepare_sample(struct perf_event_header *header,
* Make sure this doesn't happen by using up to U16_MAX bytes
* per sample in total (rounded down to 8 byte boundary).
*/
- size = min_t(size_t, U16_MAX - header->size,
+ size = min_t(size_t, U16_MAX - header_size,
event->attr.aux_sample_size);
size = rounddown(size, 8);
size = perf_prepare_sample_aux(event, data, size);

- WARN_ON_ONCE(size + header->size > U16_MAX);
+ WARN_ON_ONCE(size + header_size > U16_MAX);
data->dyn_size += size + sizeof(u64); /* size above */
data->sample_flags |= PERF_SAMPLE_AUX;
}
+}

- header->size += data->dyn_size;
+void perf_prepare_header(struct perf_event_header *header,
+ struct perf_sample_data *data,
+ struct perf_event *event,
+ struct pt_regs *regs)
+{
+ header->type = PERF_RECORD_SAMPLE;
+ header->size = perf_sample_data_size(data, event);
+ header->misc = perf_misc_flags(regs);

/*
* If you're adding more sample types here, you likely need to do
@@ -7788,7 +7797,8 @@ __perf_event_output(struct perf_event *event,
/* protect the callchain buffers */
rcu_read_lock();

- perf_prepare_sample(&header, data, event, regs);
+ perf_prepare_sample(data, event, regs);
+ perf_prepare_header(&header, data, event, regs);

err = output_begin(&handle, data, event, header.size);
if (err)
--
2.39.0.314.g84b9a713c41-goog

2023-01-18 07:33:00

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 8/8] perf/core: Call perf_prepare_sample() before running BPF

As BPF can access sample data, it needs to populate the data. Also
remove the logic to get the callchain specifically as it's covered by
the perf_prepare_sample() now.

Cc: [email protected]
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
kernel/events/core.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 73c40ce84c48..5af61d0292ab 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10363,13 +10363,7 @@ static void bpf_overflow_handler(struct perf_event *event,
rcu_read_lock();
prog = READ_ONCE(event->prog);
if (prog) {
- if (prog->call_get_stack &&
- (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
- !(data->sample_flags & PERF_SAMPLE_CALLCHAIN)) {
- data->callchain = perf_callchain(event, regs);
- data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
- }
-
+ perf_prepare_sample(data, event, regs);
ret = bpf_prog_run(prog, &ctx);
}
rcu_read_unlock();
--
2.39.0.314.g84b9a713c41-goog

2023-01-18 07:34:02

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 2/8] perf/core: Add perf_sample_save_callchain() helper

When it saves the callchain to the perf sample data, it needs to update
the sample flags and the dynamic size. To make sure this, add the
perf_sample_save_callchain() helper and convert all call sites.

Cc: [email protected]
Suggested-by: Peter Zijlstra <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
arch/x86/events/amd/ibs.c | 6 ++----
arch/x86/events/intel/ds.c | 12 ++++--------
include/linux/perf_event.h | 16 +++++++++++++++-
kernel/events/core.c | 12 ++----------
4 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index da3f5ebac4e1..417c80bd3274 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1122,10 +1122,8 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
* recorded as part of interrupt regs. Thus we need to use rip from
* interrupt regs while unwinding call stack.
*/
- if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) {
- data.callchain = perf_callchain(event, iregs);
- data.sample_flags |= PERF_SAMPLE_CALLCHAIN;
- }
+ if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
+ perf_sample_save_callchain(&data, event, iregs);

throttle = perf_event_overflow(event, &data, &regs);
out:
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 6ec326b47e2e..158cf845fc80 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1617,10 +1617,8 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
* previous PMI context or an (I)RET happened between the record and
* PMI.
*/
- if (sample_type & PERF_SAMPLE_CALLCHAIN) {
- data->callchain = perf_callchain(event, iregs);
- data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
- }
+ if (sample_type & PERF_SAMPLE_CALLCHAIN)
+ perf_sample_save_callchain(data, event, iregs);

/*
* We use the interrupt regs as a base because the PEBS record does not
@@ -1795,10 +1793,8 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
* previous PMI context or an (I)RET happened between the record and
* PMI.
*/
- if (sample_type & PERF_SAMPLE_CALLCHAIN) {
- data->callchain = perf_callchain(event, iregs);
- data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
- }
+ if (sample_type & PERF_SAMPLE_CALLCHAIN)
+ perf_sample_save_callchain(data, event, iregs);

*regs = *iregs;
/* The ip in basic is EventingIP */
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 16b980014449..a9419608402b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1095,6 +1095,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
extern u64 perf_event_read_value(struct perf_event *event,
u64 *enabled, u64 *running);

+extern struct perf_callchain_entry *perf_callchain(struct perf_event *event, struct pt_regs *regs);
+

struct perf_sample_data {
/*
@@ -1167,6 +1169,19 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
}
}

+static inline void perf_sample_save_callchain(struct perf_sample_data *data,
+ struct perf_event *event,
+ struct pt_regs *regs)
+{
+ int size = 1;
+
+ data->callchain = perf_callchain(event, regs);
+ size += data->callchain->nr;
+
+ data->dyn_size += size * sizeof(u64);
+ data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
+}
+
/*
* Clear all bitfields in the perf_branch_entry.
* The to and from fields are not cleared because they are
@@ -1408,7 +1423,6 @@ extern void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct
extern struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
u32 max_stack, bool crosstask, bool add_mark);
-extern struct perf_callchain_entry *perf_callchain(struct perf_event *event, struct pt_regs *regs);
extern int get_callchain_buffers(int max_stack);
extern void put_callchain_buffers(void);
extern struct perf_callchain_entry *get_callchain_entry(int *rctx);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8c8de26f04ab..0fba98b9cd65 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7585,16 +7585,8 @@ void perf_prepare_sample(struct perf_event_header *header,
if (sample_type & (PERF_SAMPLE_IP | PERF_SAMPLE_CODE_PAGE_SIZE))
data->ip = perf_instruction_pointer(regs);

- if (sample_type & PERF_SAMPLE_CALLCHAIN) {
- int size = 1;
-
- if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
- data->callchain = perf_callchain(event, regs);
-
- size += data->callchain->nr;
-
- data->dyn_size += size * sizeof(u64);
- }
+ if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
+ perf_sample_save_callchain(data, event, regs);

if (sample_type & PERF_SAMPLE_RAW) {
struct perf_raw_record *raw = data->raw;
--
2.39.0.314.g84b9a713c41-goog

2023-01-18 10:48:24

by Athira Rajeev

[permalink] [raw]
Subject: Re: [PATCH 4/8] perf/core: Add perf_sample_save_brstack() helper



> On 18-Jan-2023, at 11:35 AM, Namhyung Kim <[email protected]> wrote:
>
> When it saves the branch stack to the perf sample data, it needs to
> update the sample flags and the dynamic size. To make sure this,
> add the perf_sample_save_brstack() helper and convert all call sites.
>
> Cc: [email protected]
> Cc: [email protected]
> Suggested-by: Peter Zijlstra <[email protected]>
> Acked-by: Jiri Olsa <[email protected]>
> Tested-by: Jiri Olsa <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>

Hi Namhyung,

Changes looks good to me.

Acked-by: Athira Rajeev <[email protected]>

Thanks
Athira

> ---
> arch/powerpc/perf/core-book3s.c | 3 +-
> arch/x86/events/amd/core.c | 6 +--
> arch/x86/events/intel/core.c | 6 +--
> arch/x86/events/intel/ds.c | 9 ++---
> include/linux/perf_event.h | 66 ++++++++++++++++++++-------------
> kernel/events/core.c | 16 +++-----
> 6 files changed, 53 insertions(+), 53 deletions(-)
>
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index bf318dd9b709..8c1f7def596e 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -2313,8 +2313,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
> struct cpu_hw_events *cpuhw;
> cpuhw = this_cpu_ptr(&cpu_hw_events);
> power_pmu_bhrb_read(event, cpuhw);
> - data.br_stack = &cpuhw->bhrb_stack;
> - data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> + perf_sample_save_brstack(&data, event, &cpuhw->bhrb_stack);
> }
>
> if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
> diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
> index d6f3703e4119..463f3eb8bbd7 100644
> --- a/arch/x86/events/amd/core.c
> +++ b/arch/x86/events/amd/core.c
> @@ -928,10 +928,8 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
> if (!x86_perf_event_set_period(event))
> continue;
>
> - if (has_branch_stack(event)) {
> - data.br_stack = &cpuc->lbr_stack;
> - data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> - }
> + if (has_branch_stack(event))
> + perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
>
> if (perf_event_overflow(event, &data, regs))
> x86_pmu_stop(event, 0);
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 29d2d0411caf..14f0a746257d 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3036,10 +3036,8 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
>
> perf_sample_data_init(&data, 0, event->hw.last_period);
>
> - if (has_branch_stack(event)) {
> - data.br_stack = &cpuc->lbr_stack;
> - data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> - }
> + if (has_branch_stack(event))
> + perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
>
> if (perf_event_overflow(event, &data, regs))
> x86_pmu_stop(event, 0);
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index 158cf845fc80..07c8a2cdc3ee 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -1720,10 +1720,8 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
> data->sample_flags |= PERF_SAMPLE_TIME;
> }
>
> - if (has_branch_stack(event)) {
> - data->br_stack = &cpuc->lbr_stack;
> - data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> - }
> + if (has_branch_stack(event))
> + perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
> }
>
> static void adaptive_pebs_save_regs(struct pt_regs *regs,
> @@ -1883,8 +1881,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
>
> if (has_branch_stack(event)) {
> intel_pmu_store_pebs_lbrs(lbr);
> - data->br_stack = &cpuc->lbr_stack;
> - data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> + perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
> }
> }
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 569dfac5887f..7db0e9cc2682 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1102,6 +1102,31 @@ extern u64 perf_event_read_value(struct perf_event *event,
>
> extern struct perf_callchain_entry *perf_callchain(struct perf_event *event, struct pt_regs *regs);
>
> +static inline bool branch_sample_no_flags(const struct perf_event *event)
> +{
> + return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_FLAGS;
> +}
> +
> +static inline bool branch_sample_no_cycles(const struct perf_event *event)
> +{
> + return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_CYCLES;
> +}
> +
> +static inline bool branch_sample_type(const struct perf_event *event)
> +{
> + return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_TYPE_SAVE;
> +}
> +
> +static inline bool branch_sample_hw_index(const struct perf_event *event)
> +{
> + return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
> +}
> +
> +static inline bool branch_sample_priv(const struct perf_event *event)
> +{
> + return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_PRIV_SAVE;
> +}
> +
>
> struct perf_sample_data {
> /*
> @@ -1210,6 +1235,21 @@ static inline void perf_sample_save_raw_data(struct perf_sample_data *data,
> data->sample_flags |= PERF_SAMPLE_RAW;
> }
>
> +static inline void perf_sample_save_brstack(struct perf_sample_data *data,
> + struct perf_event *event,
> + struct perf_branch_stack *brs)
> +{
> + int size = sizeof(u64); /* nr */
> +
> + if (branch_sample_hw_index(event))
> + size += sizeof(u64);
> + size += brs->nr * sizeof(struct perf_branch_entry);
> +
> + data->br_stack = brs;
> + data->dyn_size += size;
> + data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> +}
> +
> /*
> * Clear all bitfields in the perf_branch_entry.
> * The to and from fields are not cleared because they are
> @@ -1827,30 +1867,4 @@ static inline void perf_lopwr_cb(bool mode)
> }
> #endif
>
> -#ifdef CONFIG_PERF_EVENTS
> -static inline bool branch_sample_no_flags(const struct perf_event *event)
> -{
> - return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_FLAGS;
> -}
> -
> -static inline bool branch_sample_no_cycles(const struct perf_event *event)
> -{
> - return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_CYCLES;
> -}
> -
> -static inline bool branch_sample_type(const struct perf_event *event)
> -{
> - return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_TYPE_SAVE;
> -}
> -
> -static inline bool branch_sample_hw_index(const struct perf_event *event)
> -{
> - return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
> -}
> -
> -static inline bool branch_sample_priv(const struct perf_event *event)
> -{
> - return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_PRIV_SAVE;
> -}
> -#endif /* CONFIG_PERF_EVENTS */
> #endif /* _LINUX_PERF_EVENT_H */
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 133894ae5e30..0218b6ffaf36 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7317,7 +7317,7 @@ void perf_output_sample(struct perf_output_handle *handle,
> }
>
> if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
> - if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
> + if (data->br_stack) {
> size_t size;
>
> size = data->br_stack->nr
> @@ -7594,16 +7594,10 @@ void perf_prepare_sample(struct perf_event_header *header,
> data->sample_flags |= PERF_SAMPLE_RAW;
> }
>
> - if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
> - int size = sizeof(u64); /* nr */
> - if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
> - if (branch_sample_hw_index(event))
> - size += sizeof(u64);
> -
> - size += data->br_stack->nr
> - * sizeof(struct perf_branch_entry);
> - }
> - data->dyn_size += size;
> + if (filtered_sample_type & PERF_SAMPLE_BRANCH_STACK) {
> + data->br_stack = NULL;
> + data->dyn_size += sizeof(u64);
> + data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> }
>
> if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))
> --
> 2.39.0.314.g84b9a713c41-goog
>

2023-01-18 11:40:46

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCHSET 0/8] perf/core: Prepare sample data for BPF (v3)

On Tue, Jan 17, 2023 at 10:05:51PM -0800, Namhyung Kim wrote:
> Namhyung Kim (8):
> perf/core: Save the dynamic parts of sample data size
> perf/core: Add perf_sample_save_callchain() helper
> perf/core: Add perf_sample_save_raw_data() helper
> perf/core: Add perf_sample_save_brstack() helper
> perf/core: Set data->sample_flags in perf_prepare_sample()
> perf/core: Do not pass header for sample id init
> perf/core: Introduce perf_prepare_header()
> perf/core: Call perf_prepare_sample() before running BPF

Thanks!,

Acked-by: Peter Zijlstra (Intel) <[email protected]>

Subject: [tip: perf/core] perf/core: Introduce perf_prepare_header()

The following commit has been merged into the perf/core branch of tip:

Commit-ID: f6e707156e1d5d150f288823987bee1ba0104c4c
Gitweb: https://git.kernel.org/tip/f6e707156e1d5d150f288823987bee1ba0104c4c
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 17 Jan 2023 22:05:58 -08:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 18 Jan 2023 11:57:20 +01:00

perf/core: Introduce perf_prepare_header()

Factor out perf_prepare_header() so that it can call
perf_prepare_sample() without a header if not needed.

Also it checks the filtered_sample_type to avoid duplicate
work when perf_prepare_sample() is called twice (or more).

Suggested-by: Peter Zijlstr <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/s390/kernel/perf_cpum_sf.c | 3 ++-
arch/x86/events/intel/ds.c | 3 ++-
include/linux/perf_event.h | 16 ++++++++++++-
kernel/events/core.c | 38 ++++++++++++++++++++------------
4 files changed, 43 insertions(+), 17 deletions(-)

diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index ce886a0..ead6eea 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -672,7 +672,8 @@ static void cpumsf_output_event_pid(struct perf_event *event,
/* Protect callchain buffers, tasks */
rcu_read_lock();

- perf_prepare_sample(&header, data, event, regs);
+ perf_prepare_sample(data, event, regs);
+ perf_prepare_header(&header, data, event, regs);
if (perf_output_begin(&handle, data, event, header.size))
goto out;

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 07c8a2c..183efa9 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -807,7 +807,8 @@ int intel_pmu_drain_bts_buffer(void)
* the sample.
*/
rcu_read_lock();
- perf_prepare_sample(&header, &data, event, &regs);
+ perf_prepare_sample(&data, event, &regs);
+ perf_prepare_header(&header, &data, event, &regs);

if (perf_output_begin(&handle, &data, event,
header.size * (top - base - skip)))
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 7db0e9c..d5628a7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1250,6 +1250,17 @@ static inline void perf_sample_save_brstack(struct perf_sample_data *data,
data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
}

+static inline u32 perf_sample_data_size(struct perf_sample_data *data,
+ struct perf_event *event)
+{
+ u32 size = sizeof(struct perf_event_header);
+
+ size += event->header_size + event->id_header_size;
+ size += data->dyn_size;
+
+ return size;
+}
+
/*
* Clear all bitfields in the perf_branch_entry.
* The to and from fields are not cleared because they are
@@ -1271,7 +1282,10 @@ extern void perf_output_sample(struct perf_output_handle *handle,
struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event);
-extern void perf_prepare_sample(struct perf_event_header *header,
+extern void perf_prepare_sample(struct perf_sample_data *data,
+ struct perf_event *event,
+ struct pt_regs *regs);
+extern void perf_prepare_header(struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event,
struct pt_regs *regs);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 47bfd99..4aa73ed 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7568,20 +7568,13 @@ static __always_inline u64 __cond_set(u64 flags, u64 s, u64 d)
return d * !!(flags & s);
}

-void perf_prepare_sample(struct perf_event_header *header,
- struct perf_sample_data *data,
+void perf_prepare_sample(struct perf_sample_data *data,
struct perf_event *event,
struct pt_regs *regs)
{
u64 sample_type = event->attr.sample_type;
u64 filtered_sample_type;

- header->type = PERF_RECORD_SAMPLE;
- header->size = sizeof(*header) + event->header_size + event->id_header_size;
-
- header->misc = 0;
- header->misc |= perf_misc_flags(regs);
-
/*
* Add the sample flags that are dependent to others. And clear the
* sample flags that have already been done by the PMU driver.
@@ -7595,6 +7588,12 @@ void perf_prepare_sample(struct perf_event_header *header,
PERF_SAMPLE_REGS_USER);
filtered_sample_type &= ~data->sample_flags;

+ if (filtered_sample_type == 0) {
+ /* Make sure it has the correct data->type for output */
+ data->type = event->attr.sample_type;
+ return;
+ }
+
__perf_event_header__init_id(data, event, filtered_sample_type);

if (filtered_sample_type & PERF_SAMPLE_IP) {
@@ -7646,9 +7645,10 @@ void perf_prepare_sample(struct perf_event_header *header,
* up the rest of the sample size.
*/
u16 stack_size = event->attr.sample_stack_user;
+ u16 header_size = perf_sample_data_size(data, event);
u16 size = sizeof(u64);

- stack_size = perf_sample_ustack_size(stack_size, header->size,
+ stack_size = perf_sample_ustack_size(stack_size, header_size,
data->regs_user.regs);

/*
@@ -7733,8 +7733,9 @@ void perf_prepare_sample(struct perf_event_header *header,

if (filtered_sample_type & PERF_SAMPLE_AUX) {
u64 size;
+ u16 header_size = perf_sample_data_size(data, event);

- header->size += sizeof(u64); /* size */
+ header_size += sizeof(u64); /* size */

/*
* Given the 16bit nature of header::size, an AUX sample can
@@ -7742,17 +7743,25 @@ void perf_prepare_sample(struct perf_event_header *header,
* Make sure this doesn't happen by using up to U16_MAX bytes
* per sample in total (rounded down to 8 byte boundary).
*/
- size = min_t(size_t, U16_MAX - header->size,
+ size = min_t(size_t, U16_MAX - header_size,
event->attr.aux_sample_size);
size = rounddown(size, 8);
size = perf_prepare_sample_aux(event, data, size);

- WARN_ON_ONCE(size + header->size > U16_MAX);
+ WARN_ON_ONCE(size + header_size > U16_MAX);
data->dyn_size += size + sizeof(u64); /* size above */
data->sample_flags |= PERF_SAMPLE_AUX;
}
+}

- header->size += data->dyn_size;
+void perf_prepare_header(struct perf_event_header *header,
+ struct perf_sample_data *data,
+ struct perf_event *event,
+ struct pt_regs *regs)
+{
+ header->type = PERF_RECORD_SAMPLE;
+ header->size = perf_sample_data_size(data, event);
+ header->misc = perf_misc_flags(regs);

/*
* If you're adding more sample types here, you likely need to do
@@ -7781,7 +7790,8 @@ __perf_event_output(struct perf_event *event,
/* protect the callchain buffers */
rcu_read_lock();

- perf_prepare_sample(&header, data, event, regs);
+ perf_prepare_sample(data, event, regs);
+ perf_prepare_header(&header, data, event, regs);

err = output_begin(&handle, data, event, header.size);
if (err)

Subject: [tip: perf/core] perf/core: Set data->sample_flags in perf_prepare_sample()

The following commit has been merged into the perf/core branch of tip:

Commit-ID: bb447c27a4674628ea50341cfa4b31618f2010af
Gitweb: https://git.kernel.org/tip/bb447c27a4674628ea50341cfa4b31618f2010af
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 17 Jan 2023 22:05:56 -08:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 18 Jan 2023 11:57:20 +01:00

perf/core: Set data->sample_flags in perf_prepare_sample()

The perf_prepare_sample() function sets the perf_sample_data according
to the attr->sample_type before copying it to the ring buffer. But BPF
also wants to access the sample data so it needs to prepare the sample
even before the regular path.

That means perf_prepare_sample() can be called more than once. Set
the data->sample_flags consistently so that it can indicate which fields
are set already and skip them if sets.

Also update the filtered_sample_type to have the dependent flags to
reduce the number of branches.

Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/events/core.c | 85 ++++++++++++++++++++++++++++++++-----------
1 file changed, 65 insertions(+), 20 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index bd20705..7135cb9 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7046,12 +7046,21 @@ out_put:
ring_buffer_put(rb);
}

+/*
+ * A set of common sample data types saved even for non-sample records
+ * when event->attr.sample_id_all is set.
+ */
+#define PERF_SAMPLE_ID_ALL (PERF_SAMPLE_TID | PERF_SAMPLE_TIME | \
+ PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
+ PERF_SAMPLE_CPU | PERF_SAMPLE_IDENTIFIER)
+
static void __perf_event_header__init_id(struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event,
u64 sample_type)
{
data->type = event->attr.sample_type;
+ data->sample_flags |= data->type & PERF_SAMPLE_ID_ALL;
header->size += event->id_header_size;

if (sample_type & PERF_SAMPLE_TID) {
@@ -7554,6 +7563,11 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
return callchain ?: &__empty_callchain;
}

+static __always_inline u64 __cond_set(u64 flags, u64 s, u64 d)
+{
+ return d * !!(flags & s);
+}
+
void perf_prepare_sample(struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event,
@@ -7569,14 +7583,24 @@ void perf_prepare_sample(struct perf_event_header *header,
header->misc |= perf_misc_flags(regs);

/*
- * Clear the sample flags that have already been done by the
- * PMU driver.
+ * Add the sample flags that are dependent to others. And clear the
+ * sample flags that have already been done by the PMU driver.
*/
- filtered_sample_type = sample_type & ~data->sample_flags;
+ filtered_sample_type = sample_type;
+ filtered_sample_type |= __cond_set(sample_type, PERF_SAMPLE_CODE_PAGE_SIZE,
+ PERF_SAMPLE_IP);
+ filtered_sample_type |= __cond_set(sample_type, PERF_SAMPLE_DATA_PAGE_SIZE |
+ PERF_SAMPLE_PHYS_ADDR, PERF_SAMPLE_ADDR);
+ filtered_sample_type |= __cond_set(sample_type, PERF_SAMPLE_STACK_USER,
+ PERF_SAMPLE_REGS_USER);
+ filtered_sample_type &= ~data->sample_flags;
+
__perf_event_header__init_id(header, data, event, filtered_sample_type);

- if (sample_type & (PERF_SAMPLE_IP | PERF_SAMPLE_CODE_PAGE_SIZE))
+ if (filtered_sample_type & PERF_SAMPLE_IP) {
data->ip = perf_instruction_pointer(regs);
+ data->sample_flags |= PERF_SAMPLE_IP;
+ }

if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
perf_sample_save_callchain(data, event, regs);
@@ -7593,10 +7617,15 @@ void perf_prepare_sample(struct perf_event_header *header,
data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
}

- if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))
+ if (filtered_sample_type & PERF_SAMPLE_REGS_USER)
perf_sample_regs_user(&data->regs_user, regs);

- if (sample_type & PERF_SAMPLE_REGS_USER) {
+ /*
+ * It cannot use the filtered_sample_type here as REGS_USER can be set
+ * by STACK_USER (using __cond_set() above) and we don't want to update
+ * the dyn_size if it's not requested by users.
+ */
+ if ((sample_type & ~data->sample_flags) & PERF_SAMPLE_REGS_USER) {
/* regs dump ABI info */
int size = sizeof(u64);

@@ -7606,9 +7635,10 @@ void perf_prepare_sample(struct perf_event_header *header,
}

data->dyn_size += size;
+ data->sample_flags |= PERF_SAMPLE_REGS_USER;
}

- if (sample_type & PERF_SAMPLE_STACK_USER) {
+ if (filtered_sample_type & PERF_SAMPLE_STACK_USER) {
/*
* Either we need PERF_SAMPLE_STACK_USER bit to be always
* processed as the last one or have additional check added
@@ -7631,23 +7661,30 @@ void perf_prepare_sample(struct perf_event_header *header,

data->stack_user_size = stack_size;
data->dyn_size += size;
+ data->sample_flags |= PERF_SAMPLE_STACK_USER;
}

- if (filtered_sample_type & PERF_SAMPLE_WEIGHT_TYPE)
+ if (filtered_sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
data->weight.full = 0;
+ data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+ }

- if (filtered_sample_type & PERF_SAMPLE_DATA_SRC)
+ if (filtered_sample_type & PERF_SAMPLE_DATA_SRC) {
data->data_src.val = PERF_MEM_NA;
+ data->sample_flags |= PERF_SAMPLE_DATA_SRC;
+ }

- if (filtered_sample_type & PERF_SAMPLE_TRANSACTION)
+ if (filtered_sample_type & PERF_SAMPLE_TRANSACTION) {
data->txn = 0;
+ data->sample_flags |= PERF_SAMPLE_TRANSACTION;
+ }

- if (sample_type & (PERF_SAMPLE_ADDR | PERF_SAMPLE_PHYS_ADDR | PERF_SAMPLE_DATA_PAGE_SIZE)) {
- if (filtered_sample_type & PERF_SAMPLE_ADDR)
- data->addr = 0;
+ if (filtered_sample_type & PERF_SAMPLE_ADDR) {
+ data->addr = 0;
+ data->sample_flags |= PERF_SAMPLE_ADDR;
}

- if (sample_type & PERF_SAMPLE_REGS_INTR) {
+ if (filtered_sample_type & PERF_SAMPLE_REGS_INTR) {
/* regs dump ABI info */
int size = sizeof(u64);

@@ -7660,19 +7697,22 @@ void perf_prepare_sample(struct perf_event_header *header,
}

data->dyn_size += size;
+ data->sample_flags |= PERF_SAMPLE_REGS_INTR;
}

- if (sample_type & PERF_SAMPLE_PHYS_ADDR &&
- filtered_sample_type & PERF_SAMPLE_PHYS_ADDR)
+ if (filtered_sample_type & PERF_SAMPLE_PHYS_ADDR) {
data->phys_addr = perf_virt_to_phys(data->addr);
+ data->sample_flags |= PERF_SAMPLE_PHYS_ADDR;
+ }

#ifdef CONFIG_CGROUP_PERF
- if (sample_type & PERF_SAMPLE_CGROUP) {
+ if (filtered_sample_type & PERF_SAMPLE_CGROUP) {
struct cgroup *cgrp;

/* protected by RCU */
cgrp = task_css_check(current, perf_event_cgrp_id, 1)->cgroup;
data->cgroup = cgroup_id(cgrp);
+ data->sample_flags |= PERF_SAMPLE_CGROUP;
}
#endif

@@ -7681,13 +7721,17 @@ void perf_prepare_sample(struct perf_event_header *header,
* require PERF_SAMPLE_ADDR, kernel implicitly retrieve the data->addr,
* but the value will not dump to the userspace.
*/
- if (sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)
+ if (filtered_sample_type & PERF_SAMPLE_DATA_PAGE_SIZE) {
data->data_page_size = perf_get_page_size(data->addr);
+ data->sample_flags |= PERF_SAMPLE_DATA_PAGE_SIZE;
+ }

- if (sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)
+ if (filtered_sample_type & PERF_SAMPLE_CODE_PAGE_SIZE) {
data->code_page_size = perf_get_page_size(data->ip);
+ data->sample_flags |= PERF_SAMPLE_CODE_PAGE_SIZE;
+ }

- if (sample_type & PERF_SAMPLE_AUX) {
+ if (filtered_sample_type & PERF_SAMPLE_AUX) {
u64 size;

header->size += sizeof(u64); /* size */
@@ -7705,6 +7749,7 @@ void perf_prepare_sample(struct perf_event_header *header,

WARN_ON_ONCE(size + header->size > U16_MAX);
data->dyn_size += size + sizeof(u64); /* size above */
+ data->sample_flags |= PERF_SAMPLE_AUX;
}

header->size += data->dyn_size;

Subject: [tip: perf/core] perf/core: Add perf_sample_save_brstack() helper

The following commit has been merged into the perf/core branch of tip:

Commit-ID: eb55b455ef9c7123bdfa7e8a7f1ebeaa8034eb83
Gitweb: https://git.kernel.org/tip/eb55b455ef9c7123bdfa7e8a7f1ebeaa8034eb83
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 17 Jan 2023 22:05:55 -08:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 18 Jan 2023 11:57:20 +01:00

perf/core: Add perf_sample_save_brstack() helper

When we saves the branch stack to the perf sample data, we needs to
update the sample flags and the dynamic size. To make sure this is
done consistently, add the perf_sample_save_brstack() helper and
convert all call sites.

Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Athira Rajeev <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/powerpc/perf/core-book3s.c | 3 +-
arch/x86/events/amd/core.c | 6 +--
arch/x86/events/intel/core.c | 6 +--
arch/x86/events/intel/ds.c | 9 +---
include/linux/perf_event.h | 66 +++++++++++++++++++-------------
kernel/events/core.c | 16 ++------
6 files changed, 53 insertions(+), 53 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index bf318dd..8c1f7de 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2313,8 +2313,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
struct cpu_hw_events *cpuhw;
cpuhw = this_cpu_ptr(&cpu_hw_events);
power_pmu_bhrb_read(event, cpuhw);
- data.br_stack = &cpuhw->bhrb_stack;
- data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+ perf_sample_save_brstack(&data, event, &cpuhw->bhrb_stack);
}

if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 4386b10..8c45b19 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -928,10 +928,8 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
if (!x86_perf_event_set_period(event))
continue;

- if (has_branch_stack(event)) {
- data.br_stack = &cpuc->lbr_stack;
- data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
- }
+ if (has_branch_stack(event))
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);

if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 29d2d04..14f0a74 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3036,10 +3036,8 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)

perf_sample_data_init(&data, 0, event->hw.last_period);

- if (has_branch_stack(event)) {
- data.br_stack = &cpuc->lbr_stack;
- data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
- }
+ if (has_branch_stack(event))
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);

if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 158cf84..07c8a2c 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1720,10 +1720,8 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
data->sample_flags |= PERF_SAMPLE_TIME;
}

- if (has_branch_stack(event)) {
- data->br_stack = &cpuc->lbr_stack;
- data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
- }
+ if (has_branch_stack(event))
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
}

static void adaptive_pebs_save_regs(struct pt_regs *regs,
@@ -1883,8 +1881,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,

if (has_branch_stack(event)) {
intel_pmu_store_pebs_lbrs(lbr);
- data->br_stack = &cpuc->lbr_stack;
- data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
}
}

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 569dfac..7db0e9c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1102,6 +1102,31 @@ extern u64 perf_event_read_value(struct perf_event *event,

extern struct perf_callchain_entry *perf_callchain(struct perf_event *event, struct pt_regs *regs);

+static inline bool branch_sample_no_flags(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_FLAGS;
+}
+
+static inline bool branch_sample_no_cycles(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_CYCLES;
+}
+
+static inline bool branch_sample_type(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_TYPE_SAVE;
+}
+
+static inline bool branch_sample_hw_index(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
+}
+
+static inline bool branch_sample_priv(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_PRIV_SAVE;
+}
+

struct perf_sample_data {
/*
@@ -1210,6 +1235,21 @@ static inline void perf_sample_save_raw_data(struct perf_sample_data *data,
data->sample_flags |= PERF_SAMPLE_RAW;
}

+static inline void perf_sample_save_brstack(struct perf_sample_data *data,
+ struct perf_event *event,
+ struct perf_branch_stack *brs)
+{
+ int size = sizeof(u64); /* nr */
+
+ if (branch_sample_hw_index(event))
+ size += sizeof(u64);
+ size += brs->nr * sizeof(struct perf_branch_entry);
+
+ data->br_stack = brs;
+ data->dyn_size += size;
+ data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+}
+
/*
* Clear all bitfields in the perf_branch_entry.
* The to and from fields are not cleared because they are
@@ -1827,30 +1867,4 @@ static inline void perf_lopwr_cb(bool mode)
}
#endif

-#ifdef CONFIG_PERF_EVENTS
-static inline bool branch_sample_no_flags(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_FLAGS;
-}
-
-static inline bool branch_sample_no_cycles(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_CYCLES;
-}
-
-static inline bool branch_sample_type(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_TYPE_SAVE;
-}
-
-static inline bool branch_sample_hw_index(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
-}
-
-static inline bool branch_sample_priv(const struct perf_event *event)
-{
- return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_PRIV_SAVE;
-}
-#endif /* CONFIG_PERF_EVENTS */
#endif /* _LINUX_PERF_EVENT_H */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 17108a2..bd20705 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7310,7 +7310,7 @@ void perf_output_sample(struct perf_output_handle *handle,
}

if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
- if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
+ if (data->br_stack) {
size_t size;

size = data->br_stack->nr
@@ -7587,16 +7587,10 @@ void perf_prepare_sample(struct perf_event_header *header,
data->sample_flags |= PERF_SAMPLE_RAW;
}

- if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
- int size = sizeof(u64); /* nr */
- if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
- if (branch_sample_hw_index(event))
- size += sizeof(u64);
-
- size += data->br_stack->nr
- * sizeof(struct perf_branch_entry);
- }
- data->dyn_size += size;
+ if (filtered_sample_type & PERF_SAMPLE_BRANCH_STACK) {
+ data->br_stack = NULL;
+ data->dyn_size += sizeof(u64);
+ data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
}

if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))

Subject: [tip: perf/core] perf/core: Call perf_prepare_sample() before running BPF

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 0eed28220598cd990d094b7b9f8c832c425080c0
Gitweb: https://git.kernel.org/tip/0eed28220598cd990d094b7b9f8c832c425080c0
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 17 Jan 2023 22:05:59 -08:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 18 Jan 2023 11:57:21 +01:00

perf/core: Call perf_prepare_sample() before running BPF

As BPF can access sample data, it needs to populate the data. Also
remove the logic to get the callchain specifically as it's covered by
the perf_prepare_sample() now.

Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/events/core.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4aa73ed..380476a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10356,13 +10356,7 @@ static void bpf_overflow_handler(struct perf_event *event,
rcu_read_lock();
prog = READ_ONCE(event->prog);
if (prog) {
- if (prog->call_get_stack &&
- (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
- !(data->sample_flags & PERF_SAMPLE_CALLCHAIN)) {
- data->callchain = perf_callchain(event, regs);
- data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
- }
-
+ perf_prepare_sample(data, event, regs);
ret = bpf_prog_run(prog, &ctx);
}
rcu_read_unlock();

Subject: [tip: perf/core] perf/core: Do not pass header for sample ID init

The following commit has been merged into the perf/core branch of tip:

Commit-ID: a7c8d0daa87581cab8435c83cc6ecbfbcb8b60cf
Gitweb: https://git.kernel.org/tip/a7c8d0daa87581cab8435c83cc6ecbfbcb8b60cf
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 17 Jan 2023 22:05:57 -08:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 18 Jan 2023 11:57:20 +01:00

perf/core: Do not pass header for sample ID init

The only thing it does for header in __perf_event_header__init_id() is
to update the header size with event->id_header_size. We can do this
outside and get rid of the argument for the later change.

Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/events/core.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7135cb9..47bfd99 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7054,14 +7054,12 @@ out_put:
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
PERF_SAMPLE_CPU | PERF_SAMPLE_IDENTIFIER)

-static void __perf_event_header__init_id(struct perf_event_header *header,
- struct perf_sample_data *data,
+static void __perf_event_header__init_id(struct perf_sample_data *data,
struct perf_event *event,
u64 sample_type)
{
data->type = event->attr.sample_type;
data->sample_flags |= data->type & PERF_SAMPLE_ID_ALL;
- header->size += event->id_header_size;

if (sample_type & PERF_SAMPLE_TID) {
/* namespace issues */
@@ -7088,8 +7086,10 @@ void perf_event_header__init_id(struct perf_event_header *header,
struct perf_sample_data *data,
struct perf_event *event)
{
- if (event->attr.sample_id_all)
- __perf_event_header__init_id(header, data, event, event->attr.sample_type);
+ if (event->attr.sample_id_all) {
+ header->size += event->id_header_size;
+ __perf_event_header__init_id(data, event, event->attr.sample_type);
+ }
}

static void __perf_event__output_id_sample(struct perf_output_handle *handle,
@@ -7577,7 +7577,7 @@ void perf_prepare_sample(struct perf_event_header *header,
u64 filtered_sample_type;

header->type = PERF_RECORD_SAMPLE;
- header->size = sizeof(*header) + event->header_size;
+ header->size = sizeof(*header) + event->header_size + event->id_header_size;

header->misc = 0;
header->misc |= perf_misc_flags(regs);
@@ -7595,7 +7595,7 @@ void perf_prepare_sample(struct perf_event_header *header,
PERF_SAMPLE_REGS_USER);
filtered_sample_type &= ~data->sample_flags;

- __perf_event_header__init_id(header, data, event, filtered_sample_type);
+ __perf_event_header__init_id(data, event, filtered_sample_type);

if (filtered_sample_type & PERF_SAMPLE_IP) {
data->ip = perf_instruction_pointer(regs);

Subject: [tip: perf/core] perf/core: Add perf_sample_save_callchain() helper

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 31046500c1864b8ab25d1b9846ad10aa3f7b1821
Gitweb: https://git.kernel.org/tip/31046500c1864b8ab25d1b9846ad10aa3f7b1821
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 17 Jan 2023 22:05:53 -08:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 18 Jan 2023 11:57:19 +01:00

perf/core: Add perf_sample_save_callchain() helper

When we save the callchain to the perf sample data, we need to update
the sample flags and the dynamic size. To ensure this is done consistently,
add the perf_sample_save_callchain() helper and convert all call sites.

Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/events/amd/ibs.c | 6 ++----
arch/x86/events/intel/ds.c | 12 ++++--------
include/linux/perf_event.h | 16 +++++++++++++++-
kernel/events/core.c | 12 ++----------
4 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index da3f5eb..417c80b 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1122,10 +1122,8 @@ fail:
* recorded as part of interrupt regs. Thus we need to use rip from
* interrupt regs while unwinding call stack.
*/
- if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) {
- data.callchain = perf_callchain(event, iregs);
- data.sample_flags |= PERF_SAMPLE_CALLCHAIN;
- }
+ if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
+ perf_sample_save_callchain(&data, event, iregs);

throttle = perf_event_overflow(event, &data, &regs);
out:
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 6ec326b..158cf84 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1617,10 +1617,8 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
* previous PMI context or an (I)RET happened between the record and
* PMI.
*/
- if (sample_type & PERF_SAMPLE_CALLCHAIN) {
- data->callchain = perf_callchain(event, iregs);
- data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
- }
+ if (sample_type & PERF_SAMPLE_CALLCHAIN)
+ perf_sample_save_callchain(data, event, iregs);

/*
* We use the interrupt regs as a base because the PEBS record does not
@@ -1795,10 +1793,8 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
* previous PMI context or an (I)RET happened between the record and
* PMI.
*/
- if (sample_type & PERF_SAMPLE_CALLCHAIN) {
- data->callchain = perf_callchain(event, iregs);
- data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
- }
+ if (sample_type & PERF_SAMPLE_CALLCHAIN)
+ perf_sample_save_callchain(data, event, iregs);

*regs = *iregs;
/* The ip in basic is EventingIP */
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 16b9800..a941960 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1095,6 +1095,8 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
extern u64 perf_event_read_value(struct perf_event *event,
u64 *enabled, u64 *running);

+extern struct perf_callchain_entry *perf_callchain(struct perf_event *event, struct pt_regs *regs);
+

struct perf_sample_data {
/*
@@ -1167,6 +1169,19 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
}
}

+static inline void perf_sample_save_callchain(struct perf_sample_data *data,
+ struct perf_event *event,
+ struct pt_regs *regs)
+{
+ int size = 1;
+
+ data->callchain = perf_callchain(event, regs);
+ size += data->callchain->nr;
+
+ data->dyn_size += size * sizeof(u64);
+ data->sample_flags |= PERF_SAMPLE_CALLCHAIN;
+}
+
/*
* Clear all bitfields in the perf_branch_entry.
* The to and from fields are not cleared because they are
@@ -1408,7 +1423,6 @@ extern void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct
extern struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
u32 max_stack, bool crosstask, bool add_mark);
-extern struct perf_callchain_entry *perf_callchain(struct perf_event *event, struct pt_regs *regs);
extern int get_callchain_buffers(int max_stack);
extern void put_callchain_buffers(void);
extern struct perf_callchain_entry *get_callchain_entry(int *rctx);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 827082d..12b7d51 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7578,16 +7578,8 @@ void perf_prepare_sample(struct perf_event_header *header,
if (sample_type & (PERF_SAMPLE_IP | PERF_SAMPLE_CODE_PAGE_SIZE))
data->ip = perf_instruction_pointer(regs);

- if (sample_type & PERF_SAMPLE_CALLCHAIN) {
- int size = 1;
-
- if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
- data->callchain = perf_callchain(event, regs);
-
- size += data->callchain->nr;
-
- data->dyn_size += size * sizeof(u64);
- }
+ if (filtered_sample_type & PERF_SAMPLE_CALLCHAIN)
+ perf_sample_save_callchain(data, event, regs);

if (sample_type & PERF_SAMPLE_RAW) {
struct perf_raw_record *raw = data->raw;

Subject: [tip: perf/core] perf/core: Save the dynamic parts of sample data size

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 4cf7a136115e96241f9f1089d2b53c47accf3823
Gitweb: https://git.kernel.org/tip/4cf7a136115e96241f9f1089d2b53c47accf3823
Author: Namhyung Kim <[email protected]>
AuthorDate: Tue, 17 Jan 2023 22:05:52 -08:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 18 Jan 2023 11:57:19 +01:00

perf/core: Save the dynamic parts of sample data size

The perf sample data can be divided into parts. The event->header_size
and event->id_header_size keep the static part of the sample data which
is determined by the sample_type flags.

But other parts like CALLCHAIN and BRANCH_STACK are changing dynamically
so it needs to see the actual data. In preparation of handling repeated
calls for perf_prepare_sample(), it can save the dynamic size to the
perf sample data to avoid the duplicate work.

Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
include/linux/perf_event.h | 2 ++
kernel/events/core.c | 17 ++++++++++-------
2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 03949d0..16b9800 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1103,6 +1103,7 @@ struct perf_sample_data {
*/
u64 sample_flags;
u64 period;
+ u64 dyn_size;

/*
* Fields commonly set by __perf_event_header__init_id(),
@@ -1158,6 +1159,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
/* remaining struct members initialized in perf_prepare_sample() */
data->sample_flags = PERF_SAMPLE_PERIOD;
data->period = period;
+ data->dyn_size = 0;

if (addr) {
data->addr = addr;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index d56328e..827082d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7586,7 +7586,7 @@ void perf_prepare_sample(struct perf_event_header *header,

size += data->callchain->nr;

- header->size += size * sizeof(u64);
+ data->dyn_size += size * sizeof(u64);
}

if (sample_type & PERF_SAMPLE_RAW) {
@@ -7612,7 +7612,7 @@ void perf_prepare_sample(struct perf_event_header *header,
data->raw = NULL;
}

- header->size += size;
+ data->dyn_size += size;
}

if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
@@ -7624,7 +7624,7 @@ void perf_prepare_sample(struct perf_event_header *header,
size += data->br_stack->nr
* sizeof(struct perf_branch_entry);
}
- header->size += size;
+ data->dyn_size += size;
}

if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))
@@ -7639,7 +7639,7 @@ void perf_prepare_sample(struct perf_event_header *header,
size += hweight64(mask) * sizeof(u64);
}

- header->size += size;
+ data->dyn_size += size;
}

if (sample_type & PERF_SAMPLE_STACK_USER) {
@@ -7664,7 +7664,7 @@ void perf_prepare_sample(struct perf_event_header *header,
size += sizeof(u64) + stack_size;

data->stack_user_size = stack_size;
- header->size += size;
+ data->dyn_size += size;
}

if (filtered_sample_type & PERF_SAMPLE_WEIGHT_TYPE)
@@ -7693,7 +7693,7 @@ void perf_prepare_sample(struct perf_event_header *header,
size += hweight64(mask) * sizeof(u64);
}

- header->size += size;
+ data->dyn_size += size;
}

if (sample_type & PERF_SAMPLE_PHYS_ADDR &&
@@ -7738,8 +7738,11 @@ void perf_prepare_sample(struct perf_event_header *header,
size = perf_prepare_sample_aux(event, data, size);

WARN_ON_ONCE(size + header->size > U16_MAX);
- header->size += size;
+ data->dyn_size += size + sizeof(u64); /* size above */
}
+
+ header->size += data->dyn_size;
+
/*
* If you're adding more sample types here, you likely need to do
* something about the overflowing header::size, like repurpose the

2023-01-18 19:49:54

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCHSET 0/8] perf/core: Prepare sample data for BPF (v3)

On Wed, Jan 18, 2023 at 11:50:37AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 17, 2023 at 10:05:51PM -0800, Namhyung Kim wrote:
> > Namhyung Kim (8):
> > perf/core: Save the dynamic parts of sample data size
> > perf/core: Add perf_sample_save_callchain() helper
> > perf/core: Add perf_sample_save_raw_data() helper
> > perf/core: Add perf_sample_save_brstack() helper
> > perf/core: Set data->sample_flags in perf_prepare_sample()
> > perf/core: Do not pass header for sample id init
> > perf/core: Introduce perf_prepare_header()
> > perf/core: Call perf_prepare_sample() before running BPF
>
> Thanks!,
>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>

Thanks Peter.
The patches look good to me as well.
How do you want to route them bpf-next or tip ?

Namhyung,
I'd also like to see a follow up patch with a selftest/bpf for this.

BPF CI didn't have a chance to really test the set, since
there is a build issue on s390 due to llvm upstream repo.
We're aware and working on that.

2023-01-18 22:47:58

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCHSET 0/8] perf/core: Prepare sample data for BPF (v3)

Hi Alexei,

On Wed, Jan 18, 2023 at 11:43 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Wed, Jan 18, 2023 at 11:50:37AM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 17, 2023 at 10:05:51PM -0800, Namhyung Kim wrote:
> > > Namhyung Kim (8):
> > > perf/core: Save the dynamic parts of sample data size
> > > perf/core: Add perf_sample_save_callchain() helper
> > > perf/core: Add perf_sample_save_raw_data() helper
> > > perf/core: Add perf_sample_save_brstack() helper
> > > perf/core: Set data->sample_flags in perf_prepare_sample()
> > > perf/core: Do not pass header for sample id init
> > > perf/core: Introduce perf_prepare_header()
> > > perf/core: Call perf_prepare_sample() before running BPF
> >
> > Thanks!,
> >
> > Acked-by: Peter Zijlstra (Intel) <[email protected]>
>
> Thanks Peter.
> The patches look good to me as well.
> How do you want to route them bpf-next or tip ?
>
> Namhyung,
> I'd also like to see a follow up patch with a selftest/bpf for this.
>
> BPF CI didn't have a chance to really test the set, since
> there is a build issue on s390 due to llvm upstream repo.
> We're aware and working on that.

Yep, I'll add a test case as soon as it gets the change.

Thanks,
Namhyung