2022-09-21 22:11:47

by Namhyung Kim

Subject: [PATCH 2/2] perf: Use sample_flags for raw_data

Use the new sample_flags to indicate whether the raw data field is
filled by the PMU driver. Although a NULL check would also work,
follow the same rule as the other fields.

Remove the raw field initialization from perf_sample_data_init() to
minimize the number of cache lines touched.

Cc: Kan Liang <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
arch/s390/kernel/perf_cpum_cf.c | 1 +
arch/s390/kernel/perf_pai_crypto.c | 1 +
arch/x86/events/amd/ibs.c | 1 +
include/linux/perf_event.h | 5 ++---
kernel/events/core.c | 3 ++-
5 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
index f7dd3c849e68..f043a7ff220b 100644
--- a/arch/s390/kernel/perf_cpum_cf.c
+++ b/arch/s390/kernel/perf_cpum_cf.c
@@ -664,6 +664,7 @@ static int cfdiag_push_sample(struct perf_event *event,
raw.frag.data = cpuhw->stop;
raw.size = raw.frag.size;
data.raw = &raw;
+ data.sample_flags |= PERF_SAMPLE_RAW;
}

overflow = perf_event_overflow(event, &data, &regs);
diff --git a/arch/s390/kernel/perf_pai_crypto.c b/arch/s390/kernel/perf_pai_crypto.c
index b38b4ae01589..6826e2a69a21 100644
--- a/arch/s390/kernel/perf_pai_crypto.c
+++ b/arch/s390/kernel/perf_pai_crypto.c
@@ -366,6 +366,7 @@ static int paicrypt_push_sample(void)
raw.frag.data = cpump->save;
raw.size = raw.frag.size;
data.raw = &raw;
+ data.sample_flags |= PERF_SAMPLE_RAW;
}

overflow = perf_event_overflow(event, &data, &regs);
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index ce5720bfb350..c29a006954c7 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -781,6 +781,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
},
};
data.raw = &raw;
+ data.sample_flags |= PERF_SAMPLE_RAW;
}

/*
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f4a13579b0e8..e9b151cde491 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1028,7 +1028,6 @@ struct perf_sample_data {
* minimize the cachelines touched.
*/
u64 sample_flags;
- struct perf_raw_record *raw;
u64 period;

/*
@@ -1040,6 +1039,7 @@ struct perf_sample_data {
union perf_mem_data_src data_src;
u64 txn;
u64 addr;
+ struct perf_raw_record *raw;

u64 type;
u64 ip;
@@ -1078,8 +1078,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
u64 addr, u64 period)
{
/* remaining struct members initialized in perf_prepare_sample() */
- data->sample_flags = 0;
- data->raw = NULL;
+ data->sample_flags = PERF_SAMPLE_PERIOD;
data->period = period;

if (addr) {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a91f74db9fe9..04e19a857d4b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7332,7 +7332,7 @@ void perf_prepare_sample(struct perf_event_header *header,
struct perf_raw_record *raw = data->raw;
int size;

- if (raw) {
+ if (raw && (data->sample_flags & PERF_SAMPLE_RAW)) {
struct perf_raw_frag *frag = &raw->frag;
u32 sum = 0;

@@ -7348,6 +7348,7 @@ void perf_prepare_sample(struct perf_event_header *header,
frag->pad = raw->size - sum;
} else {
size = sizeof(u64);
+ data->raw = NULL;
}

header->size += size;
--
2.37.3.968.ga6b4b080e4-goog
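
The contract this patch establishes can be modelled in a few lines of userspace C. This is a minimal sketch, not the kernel code: the `mock_` types, the flag values, and the size rounding are simplified stand-ins assumed for illustration. The init helper no longer clears `raw`, so the consumer has to treat the pointer as valid only when the RAW flag is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; stand-ins for PERF_SAMPLE_PERIOD / PERF_SAMPLE_RAW. */
#define MOCK_SAMPLE_PERIOD (1ULL << 8)
#define MOCK_SAMPLE_RAW    (1ULL << 10)

/* Simplified stand-in for struct perf_raw_record (single fragment only). */
struct mock_raw_record {
	uint32_t size;        /* payload length in bytes */
	const void *payload;
};

/* Simplified stand-in for struct perf_sample_data. */
struct mock_sample_data {
	uint64_t sample_flags;
	uint64_t period;
	struct mock_raw_record *raw; /* not written by init */
};

/* Models the patched init: writes only the flags and the period. */
static void mock_sample_data_init(struct mock_sample_data *data, uint64_t period)
{
	assert(data != NULL);
	data->sample_flags = MOCK_SAMPLE_PERIOD;
	data->period = period;
}

/*
 * Models the consumer: the pointer counts only when the flag is set.
 * Returns the bytes to reserve: a u32 length plus payload, rounded up to 8,
 * or one u64 placeholder when no raw data is attached.
 */
static size_t mock_prepare_raw(struct mock_sample_data *data)
{
	struct mock_raw_record *raw = data->raw;

	if (raw && (data->sample_flags & MOCK_SAMPLE_RAW))
		return (sizeof(uint32_t) + raw->size + 7) & ~(size_t)7;

	data->raw = NULL; /* drop a pointer nobody vouched for */
	return sizeof(uint64_t);
}
```

In this model a driver would follow `mock_sample_data_init()` with `data.raw = &rec; data.sample_flags |= MOCK_SAMPLE_RAW;`, mirroring the two-line pattern in the driver hunks of the patch.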


2022-09-28 07:23:51

by tip-bot2 for Namhyung Kim

Subject: [tip: perf/core] perf: Use sample_flags for raw_data

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 838d9bb62d132ec3baf1b5aba2e95ef9a7a9a3cd
Gitweb: https://git.kernel.org/tip/838d9bb62d132ec3baf1b5aba2e95ef9a7a9a3cd
Author: Namhyung Kim <[email protected]>
AuthorDate: Wed, 21 Sep 2022 15:00:32 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Tue, 27 Sep 2022 22:50:24 +02:00

perf: Use sample_flags for raw_data

Use the new sample_flags to indicate whether the raw data field is
filled by the PMU driver. Although a NULL check would also work,
follow the same rule as the other fields.

Remove the raw field initialization from perf_sample_data_init() to
minimize the number of cache lines touched.

Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/s390/kernel/perf_cpum_cf.c | 1 +
arch/s390/kernel/perf_pai_crypto.c | 1 +
arch/x86/events/amd/ibs.c | 1 +
include/linux/perf_event.h | 5 ++---
kernel/events/core.c | 3 ++-
5 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c
index f7dd3c8..f043a7f 100644
--- a/arch/s390/kernel/perf_cpum_cf.c
+++ b/arch/s390/kernel/perf_cpum_cf.c
@@ -664,6 +664,7 @@ static int cfdiag_push_sample(struct perf_event *event,
raw.frag.data = cpuhw->stop;
raw.size = raw.frag.size;
data.raw = &raw;
+ data.sample_flags |= PERF_SAMPLE_RAW;
}

overflow = perf_event_overflow(event, &data, &regs);
diff --git a/arch/s390/kernel/perf_pai_crypto.c b/arch/s390/kernel/perf_pai_crypto.c
index b38b4ae..6826e2a 100644
--- a/arch/s390/kernel/perf_pai_crypto.c
+++ b/arch/s390/kernel/perf_pai_crypto.c
@@ -366,6 +366,7 @@ static int paicrypt_push_sample(void)
raw.frag.data = cpump->save;
raw.size = raw.frag.size;
data.raw = &raw;
+ data.sample_flags |= PERF_SAMPLE_RAW;
}

overflow = perf_event_overflow(event, &data, &regs);
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index ce5720b..c29a006 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -781,6 +781,7 @@ fail:
},
};
data.raw = &raw;
+ data.sample_flags |= PERF_SAMPLE_RAW;
}

/*
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f4a1357..e9b151c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1028,7 +1028,6 @@ struct perf_sample_data {
* minimize the cachelines touched.
*/
u64 sample_flags;
- struct perf_raw_record *raw;
u64 period;

/*
@@ -1040,6 +1039,7 @@ struct perf_sample_data {
union perf_mem_data_src data_src;
u64 txn;
u64 addr;
+ struct perf_raw_record *raw;

u64 type;
u64 ip;
@@ -1078,8 +1078,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
u64 addr, u64 period)
{
/* remaining struct members initialized in perf_prepare_sample() */
- data->sample_flags = 0;
- data->raw = NULL;
+ data->sample_flags = PERF_SAMPLE_PERIOD;
data->period = period;

if (addr) {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a91f74d..04e19a8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7332,7 +7332,7 @@ void perf_prepare_sample(struct perf_event_header *header,
struct perf_raw_record *raw = data->raw;
int size;

- if (raw) {
+ if (raw && (data->sample_flags & PERF_SAMPLE_RAW)) {
struct perf_raw_frag *frag = &raw->frag;
u32 sum = 0;

@@ -7348,6 +7348,7 @@ void perf_prepare_sample(struct perf_event_header *header,
frag->pad = raw->size - sum;
} else {
size = sizeof(u64);
+ data->raw = NULL;
}

header->size += size;

2022-10-06 16:11:21

by Sumanth Korikkar

Subject: [PATCH] Re: [tip: perf/core] perf: Use sample_flags for raw_data

Hi,

This causes segfaults.

Steps to recreate:
* Run ./samples/bpf/trace_output
BUG pid 9 cookie 1001000000004 sized 4
BUG pid 9 cookie 1001000000004 sized 4
BUG pid 9 cookie 1001000000004 sized 4
Segmentation fault (core dumped)

Problem:
* The commit above sets data->raw to NULL when PERF_SAMPLE_RAW is not set in
sample_flags, i.e. when the raw data was not flagged as filled by the PMU
driver. This leads to stale data.

* Raw data can also be filled by bpf_perf_event_output() and bpf_event_output(),
which set sd->raw without setting the flag:
...
perf_sample_data_init(sd, 0, 0);
sd->raw = &raw;

err = __bpf_perf_event_output(regs, map, flags, sd);
...

* The patch below eliminates the segfaults. However, it contradicts
the description in this commit (filled only by the PMU driver).

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 49fb9ec8366d..1ed08967fb97 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -687,6 +687,7 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,

perf_sample_data_init(sd, 0, 0);
sd->raw = &raw;
+ sd->sample_flags |= PERF_SAMPLE_RAW;

err = __bpf_perf_event_output(regs, map, flags, sd);

@@ -745,6 +746,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
perf_fetch_caller_regs(regs);
perf_sample_data_init(sd, 0, 0);
sd->raw = &raw;
+ sd->sample_flags |= PERF_SAMPLE_RAW;

ret = __bpf_perf_event_output(regs, map, flags, sd);
out:

--
Thanks,
Sumanth
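
The failure mode reported here can be reproduced in a small userspace model. The sketch below is an illustration under stated assumptions, not kernel code: the `mock_` names and the flag value are invented, and `mock_attach_raw()` is a hypothetical helper that sets the pointer and the flag together so that a caller cannot set one without the other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_SAMPLE_RAW (1ULL << 10) /* hypothetical stand-in for PERF_SAMPLE_RAW */

struct mock_raw_record {
	uint32_t size;
	const void *payload;
};

struct mock_sample_data {
	uint64_t sample_flags;
	struct mock_raw_record *raw;
};

/* Models the bpf callers before the fix: pointer set, flag forgotten. */
static void mock_set_raw_unflagged(struct mock_sample_data *data,
				   struct mock_raw_record *raw)
{
	assert(data != NULL);
	data->raw = raw;
}

/* Hypothetical helper: publish the pointer and the flag in one place. */
static void mock_attach_raw(struct mock_sample_data *data,
			    struct mock_raw_record *raw)
{
	assert(data != NULL);
	data->raw = raw;
	data->sample_flags |= MOCK_SAMPLE_RAW;
}

/*
 * Models the check in the consumer. Returns true when the payload would be
 * emitted; otherwise clears the pointer, so the payload is silently lost.
 */
static bool mock_raw_survives(struct mock_sample_data *data)
{
	if (data->raw && (data->sample_flags & MOCK_SAMPLE_RAW))
		return true;

	data->raw = NULL;
	return false;
}
```

With the unflagged setter the model drops the payload; with the helper it survives. Whether a real helper of this shape is preferable to the two-line pattern in the patch is a design choice this sketch does not settle.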

2022-10-06 17:15:14

by Namhyung Kim

Subject: Re: [PATCH] Re: [tip: perf/core] perf: Use sample_flags for raw_data

Hello,

On Thu, Oct 6, 2022 at 9:01 AM Sumanth Korikkar <[email protected]> wrote:
>
> Hi,
>
> This causes segfaults.
>
> Steps to recreate:
> * Run ./samples/bpf/trace_output
> BUG pid 9 cookie 1001000000004 sized 4
> BUG pid 9 cookie 1001000000004 sized 4
> BUG pid 9 cookie 1001000000004 sized 4
> Segmentation fault (core dumped)
>
> Problem:
> * The following commit sets data->raw to NULL, when the raw data is not filled
> by PMU driver. This leads to stale data.
>
> * raw data could also be filled by bpf_perf_event_output(), bpf_event_output()
> ...
> 686 perf_sample_data_init(sd, 0, 0);
> 687 sd->raw = &raw;
> 688
> 689 err = __bpf_perf_event_output(regs, map, flags, sd);
> ...
>
> * The below patch eliminates segfaults. However, contradicts with
> the description mentioned in this commit (Filled by only PMU driver).

Thank you for the fix. Don't worry about the description - it said
it's usually filled by PMU drivers and it should be fine as long as
you set the sample flags after filling the raw data.

Acked-by: Namhyung Kim <[email protected]>

Thanks,
Namhyung


2022-10-06 19:01:05

by Jiri Olsa

Subject: Re: [PATCH] Re: [tip: perf/core] perf: Use sample_flags for raw_data

On Thu, Oct 06, 2022 at 06:00:44PM +0200, Sumanth Korikkar wrote:
> Hi,
>
> This causes segfaults.
>
> Steps to recreate:
> * Run ./samples/bpf/trace_output
> BUG pid 9 cookie 1001000000004 sized 4
> BUG pid 9 cookie 1001000000004 sized 4
> BUG pid 9 cookie 1001000000004 sized 4
> Segmentation fault (core dumped)
>
> Problem:
> * The following commit sets data->raw to NULL, when the raw data is not filled
> by PMU driver. This leads to stale data.
>
> * raw data could also be filled by bpf_perf_event_output(), bpf_event_output()
> ...
> 686 perf_sample_data_init(sd, 0, 0);
> 687 sd->raw = &raw;
> 688
> 689 err = __bpf_perf_event_output(regs, map, flags, sd);
> ...
>
> * The below patch eliminates segfaults. However, contradicts with
> the description mentioned in this commit (Filled by only PMU driver).

hi,
could you please resend the patch with formal changelog and Fixes tag?

thanks,
jirka


2022-10-07 08:26:26

by Sumanth Korikkar

Subject: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

* Raw data is also filled by bpf_perf_event_output.
* Add sample_flags to indicate raw data.
* This eliminates the segfaults as shown below:
Run ./samples/bpf/trace_output
BUG pid 9 cookie 1001000000004 sized 4
BUG pid 9 cookie 1001000000004 sized 4
BUG pid 9 cookie 1001000000004 sized 4
Segmentation fault (core dumped)

Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
Acked-by: Namhyung Kim <[email protected]>
Signed-off-by: Sumanth Korikkar <[email protected]>
---
kernel/trace/bpf_trace.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 49fb9ec8366d..1ed08967fb97 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -687,6 +687,7 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,

perf_sample_data_init(sd, 0, 0);
sd->raw = &raw;
+ sd->sample_flags |= PERF_SAMPLE_RAW;

err = __bpf_perf_event_output(regs, map, flags, sd);

@@ -745,6 +746,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
perf_fetch_caller_regs(regs);
perf_sample_data_init(sd, 0, 0);
sd->raw = &raw;
+ sd->sample_flags |= PERF_SAMPLE_RAW;

ret = __bpf_perf_event_output(regs, map, flags, sd);
out:
--
2.36.1

2022-10-07 10:04:43

by Jiri Olsa

Subject: Re: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

On Fri, Oct 07, 2022 at 10:13:27AM +0200, Sumanth Korikkar wrote:
> * Raw data is also filled by bpf_perf_event_output.
> * Add sample_flags to indicate raw data.
> * This eliminates the segfaults as shown below:
> Run ./samples/bpf/trace_output
> BUG pid 9 cookie 1001000000004 sized 4
> BUG pid 9 cookie 1001000000004 sized 4
> BUG pid 9 cookie 1001000000004 sized 4
> Segmentation fault (core dumped)
>
> Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> Acked-by: Namhyung Kim <[email protected]>
> Signed-off-by: Sumanth Korikkar <[email protected]>

Acked-by: Jiri Olsa <[email protected]>

Peter,
I think this should go through your tree again?
bpf-next/master does not have sample_flags merged yet

thanks,
jirka


2022-10-07 16:00:50

by Peter Zijlstra

Subject: Re: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

On Fri, Oct 07, 2022 at 11:45:36AM +0200, Jiri Olsa wrote:
> On Fri, Oct 07, 2022 at 10:13:27AM +0200, Sumanth Korikkar wrote:
> > * Raw data is also filled by bpf_perf_event_output.
> > * Add sample_flags to indicate raw data.
> > * This eliminates the segfaults as shown below:
> > Run ./samples/bpf/trace_output
> > BUG pid 9 cookie 1001000000004 sized 4
> > BUG pid 9 cookie 1001000000004 sized 4
> > BUG pid 9 cookie 1001000000004 sized 4
> > Segmentation fault (core dumped)
> >
> > Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> > Acked-by: Namhyung Kim <[email protected]>
> > Signed-off-by: Sumanth Korikkar <[email protected]>
>
> Acked-by: Jiri Olsa <[email protected]>
>
> Peter,
> I think this should go through your tree again?
> bpf-next/master does not have sample_flags merged yet

Yep, can do. I'll line it up in perf/urgent (Ingo just sent out
perf/core).

2022-10-17 14:53:51

by tip-bot2 for Sumanth Korikkar

Subject: [tip: perf/urgent] bpf: Fix sample_flags for bpf_perf_event_output

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID: 21da7472a040420f2dc624ffec70291a72c5d6a6
Gitweb: https://git.kernel.org/tip/21da7472a040420f2dc624ffec70291a72c5d6a6
Author: Sumanth Korikkar <[email protected]>
AuthorDate: Fri, 07 Oct 2022 10:13:27 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Mon, 17 Oct 2022 16:32:06 +02:00

bpf: Fix sample_flags for bpf_perf_event_output

* Raw data is also filled by bpf_perf_event_output.
* Add sample_flags to indicate raw data.
* This eliminates the segfaults as shown below:
Run ./samples/bpf/trace_output
BUG pid 9 cookie 1001000000004 sized 4
BUG pid 9 cookie 1001000000004 sized 4
BUG pid 9 cookie 1001000000004 sized 4
Segmentation fault (core dumped)

Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
Signed-off-by: Sumanth Korikkar <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/trace/bpf_trace.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 49fb9ec..1ed0896 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -687,6 +687,7 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, struct bpf_map *, map,

perf_sample_data_init(sd, 0, 0);
sd->raw = &raw;
+ sd->sample_flags |= PERF_SAMPLE_RAW;

err = __bpf_perf_event_output(regs, map, flags, sd);

@@ -745,6 +746,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
perf_fetch_caller_regs(regs);
perf_sample_data_init(sd, 0, 0);
sd->raw = &raw;
+ sd->sample_flags |= PERF_SAMPLE_RAW;

ret = __bpf_perf_event_output(regs, map, flags, sd);
out:

2022-10-17 19:59:00

by SeongJae Park

Subject: Re: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

Hello,


The commit that this patch is fixing[1] also causes yet another segfault for
'perf-script' of tracepoint records. For example:

$ sudo timeout 3 perf record -e exceptions:page_fault_user
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.228 MB perf.data (74 samples) ]
$ sudo perf script
Segmentation fault

Reverting this patch and the original bug commit[1] fixes the issue. I haven't
dug into it yet because I'm not familiar with this area. Does anybody have any
idea about this?

[1] 838d9bb62d13 ("perf: Use sample_flags for raw_data")


Thanks,
SJ


2022-10-17 23:27:54

by Namhyung Kim

Subject: Re: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

Hi SeongJae,

On Mon, Oct 17, 2022 at 12:27 PM SeongJae Park <[email protected]> wrote:
>
> Hello,
>
>
> The commit that this patch is fixing[1] also causes yet another segfault for
> 'perf-script' of tracepoint records. For example:
>
> $ sudo timeout 3 perf record -e exceptions:page_fault_user
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.228 MB perf.data (74 samples) ]
> $ sudo perf script
> Segmentation fault
>
> Reverting this patch and the original bug commit[1] fixes the issue. I haven't
> deep dive yet because I'm not familiar with this area. Anybody has any idea
> about this?
>
> [1] 838d9bb62d13 ("perf: Use sample_flags for raw_data")

Sorry for the trouble. I think you also need to apply the below:

https://lore.kernel.org/r/[email protected]

Thanks,
Namhyung


2022-10-17 23:56:15

by SeongJae Park

Subject: Re: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

On Mon, 17 Oct 2022 15:52:15 -0700 Namhyung Kim <[email protected]> wrote:

> Hi SeongJae,
>
> On Mon, Oct 17, 2022 at 12:27 PM SeongJae Park <[email protected]> wrote:
> >
> > Hello,
> >
> >
> > The commit that this patch is fixing[1] also causes yet another segfault for
> > 'perf-script' of tracepoint records. For example:
> >
> > $ sudo timeout 3 perf record -e exceptions:page_fault_user
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.228 MB perf.data (74 samples) ]
> > $ sudo perf script
> > Segmentation fault
> >
> > Reverting this patch and the original bug commit[1] fixes the issue. I haven't
> > deep dive yet because I'm not familiar with this area. Anybody has any idea
> > about this?
> >
> > [1] 838d9bb62d13 ("perf: Use sample_flags for raw_data")
>
> Sorry for the trouble.

No problem.

> I think you also need to apply the below:
>
> https://lore.kernel.org/r/[email protected]

Thank you for this nice answer. I confirmed that this fixes my issue.


Thanks,
SJ

[...]

2022-10-19 05:05:46

by Alexei Starovoitov

Subject: Re: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

On Fri, Oct 7, 2022 at 8:31 AM Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Oct 07, 2022 at 11:45:36AM +0200, Jiri Olsa wrote:
> > On Fri, Oct 07, 2022 at 10:13:27AM +0200, Sumanth Korikkar wrote:
> > > * Raw data is also filled by bpf_perf_event_output.
> > > * Add sample_flags to indicate raw data.
> > > * This eliminates the segfaults as shown below:
> > > Run ./samples/bpf/trace_output
> > > BUG pid 9 cookie 1001000000004 sized 4
> > > BUG pid 9 cookie 1001000000004 sized 4
> > > BUG pid 9 cookie 1001000000004 sized 4
> > > Segmentation fault (core dumped)
> > >
> > > Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> > > Acked-by: Namhyung Kim <[email protected]>
> > > Signed-off-by: Sumanth Korikkar <[email protected]>
> >
> > Acked-by: Jiri Olsa <[email protected]>
> >
> > Peter,
> > I think this should go through your tree again?
> > bpf-next/master does not have sample_flags merged yet
>
> Yep can do. I'll line it up in perf/urgent (Ingo just send out
> perf/core).

Peter,

Could you please hurry up? 11 days have passed.

This issue now affects everyone the hard way after merging
all the trees: tip -> linus -> net-next -> bpf-next.
The BPF CI is red right now with 5 tests failing because
this fix is still missing.
It's causing a headache for maintainers and developers.

2022-10-19 12:18:14

by Athira Rajeev

Subject: Re: [tip: perf/core] perf: Use sample_flags for raw_data



> On 28-Sep-2022, at 12:27 PM, tip-bot2 for Namhyung Kim <[email protected]> wrote:
>
> The following commit has been merged into the perf/core branch of tip:
>
> Commit-ID: 838d9bb62d132ec3baf1b5aba2e95ef9a7a9a3cd
> Gitweb: https://git.kernel.org/tip/838d9bb62d132ec3baf1b5aba2e95ef9a7a9a3cd
> Author: Namhyung Kim <[email protected]>
> AuthorDate: Wed, 21 Sep 2022 15:00:32 -07:00
> Committer: Peter Zijlstra <[email protected]>
> CommitterDate: Tue, 27 Sep 2022 22:50:24 +02:00
>
> perf: Use sample_flags for raw_data
>
> Use the new sample_flags to indicate whether the raw data field is
> filled by the PMU driver. Although it could check with the NULL,
> follow the same rule with other fields.
>
> Remove the raw field from the perf_sample_data_init() to minimize
> the number of cache lines touched.
>
> Signed-off-by: Namhyung Kim <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Link: https://lkml.kernel.org/r/[email protected]

Hi Namhyung,

This commit ("perf: Use sample_flags for raw_data") added a
PERF_SAMPLE_RAW check in perf_prepare_sample(). To stay in sync
when we output the sample to memory, do we also need to add a
similar check in perf_output_sample()? I am pasting the change below.
Please share your thoughts.

From 46d874bc4a915dd710ddbc5198588cbb66d3ea8e Mon Sep 17 00:00:00 2001
From: Athira Rajeev <[email protected]>
Date: Wed, 19 Oct 2022 13:02:06 +0530
Subject: [PATCH] perf/core: Update sample_flags for raw_data in
perf_output_sample

Commit 838d9bb62d13 ("perf: Use sample_flags for raw_data")
added a check for PERF_SAMPLE_RAW in sample_flags in
perf_prepare_sample(). But the check for sample_flags was not
added in perf_output_sample(), which copies the sample to memory.
Add the same check in perf_output_sample() as well.

Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
Signed-off-by: Athira Rajeev <[email protected]>
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4ec3717003d5..daf387c75d33 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7099,7 +7099,7 @@ void perf_output_sample(struct perf_output_handle *handle,
if (sample_type & PERF_SAMPLE_RAW) {
struct perf_raw_record *raw = data->raw;

- if (raw) {
+ if (raw && (data->sample_flags & PERF_SAMPLE_RAW)) {
struct perf_raw_frag *frag = &raw->frag;

perf_output_put(handle, raw->size);
--
2.31.1

Thanks
Athira
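
The question raised here, whether the output path should apply the same check as the prepare path, can be explored with a userspace model. This is a minimal sketch under stated assumptions, not the kernel implementation: the `mock_` types, the flag value, and the record layout (a u32 length followed by the payload, padded to 8 bytes) are simplified stand-ins. Both phases go through one predicate, so the size reserved and the bytes written cannot disagree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_SAMPLE_RAW (1ULL << 10) /* hypothetical stand-in for PERF_SAMPLE_RAW */

struct mock_raw_record {
	uint32_t len;
	const void *payload;
};

struct mock_sample_data {
	uint64_t sample_flags;
	struct mock_raw_record *raw;
};

/* One predicate shared by both phases, so they cannot disagree. */
static int mock_raw_valid(const struct mock_sample_data *data)
{
	return data->raw && (data->sample_flags & MOCK_SAMPLE_RAW);
}

/* Phase 1: bytes to reserve (u32 length + payload, rounded up to 8). */
static size_t mock_prepare_size(const struct mock_sample_data *data)
{
	if (mock_raw_valid(data))
		return (sizeof(uint32_t) + data->raw->len + 7) & ~(size_t)7;
	return sizeof(uint64_t); /* empty placeholder record */
}

/* Phase 2: write the record; returns the bytes actually written. */
static size_t mock_output(const struct mock_sample_data *data,
			  uint8_t *buf, size_t cap)
{
	size_t total = mock_prepare_size(data);
	uint32_t size = (uint32_t)(total - sizeof(uint32_t));

	assert(buf != NULL && cap >= total);
	memset(buf, 0, total);            /* zero the padding or placeholder */
	memcpy(buf, &size, sizeof(size)); /* length header */
	if (mock_raw_valid(data))
		memcpy(buf + sizeof(size), data->raw->payload, data->raw->len);
	return total;
}
```

In the model, a record whose flag is missing is written as the 8-byte placeholder in both phases, regardless of what the pointer holds.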


2022-10-21 02:05:29

by Alexei Starovoitov

Subject: Re: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

Peter,

Another 2 days have passed and the bpf side is still broken
due to the change that went in during the merge window without
the corresponding fix from the bpf side.
Looks like the patch is sitting in tip:perf/urgent.
Please send it to Linus asap.

We're not sending bpf fixes to avoid breaking bpf tree too.
We've worked around the issue in bpf CI for bpf-next tree only.
Developers still see failures when they run tests locally.

On Tue, Oct 18, 2022 at 9:57 PM Alexei Starovoitov
<[email protected]> wrote:
>
> On Fri, Oct 7, 2022 at 8:31 AM Peter Zijlstra <[email protected]> wrote:
> >
> > On Fri, Oct 07, 2022 at 11:45:36AM +0200, Jiri Olsa wrote:
> > > On Fri, Oct 07, 2022 at 10:13:27AM +0200, Sumanth Korikkar wrote:
> > > > * Raw data is also filled by bpf_perf_event_output.
> > > > * Add sample_flags to indicate raw data.
> > > > * This eliminates the segfaults as shown below:
> > > > Run ./samples/bpf/trace_output
> > > > BUG pid 9 cookie 1001000000004 sized 4
> > > > BUG pid 9 cookie 1001000000004 sized 4
> > > > BUG pid 9 cookie 1001000000004 sized 4
> > > > Segmentation fault (core dumped)
> > > >
> > > > Fixes: 838d9bb62d13 ("perf: Use sample_flags for raw_data")
> > > > Acked-by: Namhyung Kim <[email protected]>
> > > > Signed-off-by: Sumanth Korikkar <[email protected]>
> > >
> > > Acked-by: Jiri Olsa <[email protected]>
> > >
> > > Peter,
> > > I think this should go through your tree again?
> > > bpf-next/master does not have sample_flags merged yet
> >
> > Yep, can do. I'll line it up in perf/urgent (Ingo just sent out
> > perf/core).
>
> Peter,
>
> Could you please hurry up? 11 days have passed.
>
> This issue affects everyone the hard way now after merging
> all the trees: tip -> linus -> net-next -> bpf-next.
> The BPF CI is red right now with 5 tests failing because
> this fix is still missing.
> It's causing a headache to maintainers and developers.

2022-10-23 01:57:09

by Alexei Starovoitov

Subject: bpf+perf is still broken. Was: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

Another 2 days have passed and the fix is still not in Linus's tree.

Peter,
whatever your excuse is for not sending tip:perf/urgent,
this is not acceptable.

Linus,

please apply this fix directly:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=perf/urgent&id=21da7472a040420f2dc624ffec70291a72c5d6a6

or suggest the course of action.

It sucked to have such a breakage in rc1 and we don't want rc2
to stay broken.

Thanks


2022-10-23 17:19:14

by Linus Torvalds

Subject: Re: bpf+perf is still broken. Was: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

On Sat, Oct 22, 2022 at 6:16 PM Alexei Starovoitov
<[email protected]> wrote:
>
> Linus,
>
> please apply this fix directly or suggest the course of action.

I have a pull request from Borislav with the fix that came in
overnight, so this should be all fixed in rc2.

Linus

2022-10-23 17:20:31

by Linus Torvalds

Subject: Re: bpf+perf is still broken. Was: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

On Sun, Oct 23, 2022 at 9:55 AM Linus Torvalds
<[email protected]> wrote:
>
> I have a pull request from Borislav with the fix that came in
> overnight, so this should be all fixed in rc2.

.. and now it has moved from my inbox to my -git tree.

Linus

2022-10-23 18:09:55

by Alexei Starovoitov

Subject: Re: bpf+perf is still broken. Was: [PATCH] bpf: fix sample_flags for bpf_perf_event_output

On Sun, Oct 23, 2022 at 10:20 AM Linus Torvalds
<[email protected]> wrote:
>
> On Sun, Oct 23, 2022 at 9:55 AM Linus Torvalds
> <[email protected]> wrote:
> >
> > I have a pull request from Borislav with the fix that came in
> > overnight, so this should be all fixed in rc2.
>
> .. and now it has moved from my inbox to my -git tree.

Great. Thank you.