2015-07-21 03:09:16

by He Kuang

[permalink] [raw]
Subject: [RFC PATCH v6 0/2] Make eBPF programs output data to perf

Hi,

Previous patch v5 url:
http://thread.gmane.org/gmane.linux.kernel/1995274

The bugfix of dynamic array length in trace event goes to
kernel/git/rostedt/linux-trace.git ftrace/urgent and confirms that the
return value of __get_dynamic_array_len() is the total allocated
length of the dynamic array. For we print the bpf output data in byte
array from patch v5, that problem does not affect our patch any more,
but some comments in patch 1/2 is updated.

Patch 2/2 is acked by Alexei.

Thank you.

He Kuang (2):
tools lib traceevent: Support function __get_dynamic_array_len
bpf: Introduce function for outputing data to perf event

include/trace/events/bpf.h | 30 ++++++++++++
include/uapi/linux/bpf.h | 7 +++
kernel/trace/bpf_trace.c | 23 +++++++++
samples/bpf/bpf_helpers.h | 2 +
tools/lib/traceevent/event-parse.c | 56 +++++++++++++++++++++-
tools/lib/traceevent/event-parse.h | 1 +
.../perf/util/scripting-engines/trace-event-perl.c | 1 +
.../util/scripting-engines/trace-event-python.c | 1 +
8 files changed, 119 insertions(+), 2 deletions(-)
create mode 100644 include/trace/events/bpf.h

--
1.8.5.2


2015-07-21 03:09:34

by He Kuang

[permalink] [raw]
Subject: [RFC PATCH v6 1/2] tools lib traceevent: Support function __get_dynamic_array_len

Support helper function __get_dynamic_array_len() in libtraceevent,
this function is used accompany with __print_array() or __print_hex(),
but currently it is not an available function in the function list of
process_function().

The total allocated length of the dynamic array is embedded in the top
half of __data_loc_##item field. This patch adds new arg type
PRINT_DYNAMIC_ARRAY_LEN to return the length to eval_num_arg(),

Signed-off-by: He Kuang <[email protected]>
---
tools/lib/traceevent/event-parse.c | 56 +++++++++++++++++++++-
tools/lib/traceevent/event-parse.h | 1 +
.../perf/util/scripting-engines/trace-event-perl.c | 1 +
.../util/scripting-engines/trace-event-python.c | 1 +
4 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index cc25f05..3af69cf 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -783,6 +783,7 @@ static void free_arg(struct print_arg *arg)
free(arg->bitmask.bitmask);
break;
case PRINT_DYNAMIC_ARRAY:
+ case PRINT_DYNAMIC_ARRAY_LEN:
free(arg->dynarray.index);
break;
case PRINT_OP:
@@ -2655,6 +2656,42 @@ process_dynamic_array(struct event_format *event, struct print_arg *arg, char **
}

static enum event_type
+process_dynamic_array_len(struct event_format *event, struct print_arg *arg,
+ char **tok)
+{
+ struct format_field *field;
+ enum event_type type;
+ char *token;
+
+ if (read_expect_type(EVENT_ITEM, &token) < 0)
+ goto out_free;
+
+ arg->type = PRINT_DYNAMIC_ARRAY_LEN;
+
+ /* Find the field */
+ field = pevent_find_field(event, token);
+ if (!field)
+ goto out_free;
+
+ arg->dynarray.field = field;
+ arg->dynarray.index = 0;
+
+ if (read_expected(EVENT_DELIM, ")") < 0)
+ goto out_err;
+
+ type = read_token(&token);
+ *tok = token;
+
+ return type;
+
+ out_free:
+ free_token(token);
+ out_err:
+ *tok = NULL;
+ return EVENT_ERROR;
+}
+
+static enum event_type
process_paren(struct event_format *event, struct print_arg *arg, char **tok)
{
struct print_arg *item_arg;
@@ -2901,6 +2938,10 @@ process_function(struct event_format *event, struct print_arg *arg,
free_token(token);
return process_dynamic_array(event, arg, tok);
}
+ if (strcmp(token, "__get_dynamic_array_len") == 0) {
+ free_token(token);
+ return process_dynamic_array_len(event, arg, tok);
+ }

func = find_func_handler(event->pevent, token);
if (func) {
@@ -3581,14 +3622,25 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
goto out_warning_op;
}
break;
+ case PRINT_DYNAMIC_ARRAY_LEN:
+ offset = pevent_read_number(pevent,
+ data + arg->dynarray.field->offset,
+ arg->dynarray.field->size);
+ /*
+ * The total allocated length of the dynamic array is
+ * stored in the top half of the field, and the offset
+ * is in the bottom half of the 32 bit field.
+ */
+ val = (unsigned long long)(offset >> 16);
+ break;
case PRINT_DYNAMIC_ARRAY:
/* Without [], we pass the address to the dynamic data */
offset = pevent_read_number(pevent,
data + arg->dynarray.field->offset,
arg->dynarray.field->size);
/*
- * The actual length of the dynamic array is stored
- * in the top half of the field, and the offset
+ * The total allocated length of the dynamic array is
+ * stored in the top half of the field, and the offset
* is in the bottom half of the 32 bit field.
*/
offset &= 0xffff;
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 063b197..94543aa 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -294,6 +294,7 @@ enum print_arg_type {
PRINT_OP,
PRINT_FUNC,
PRINT_BITMASK,
+ PRINT_DYNAMIC_ARRAY_LEN,
};

struct print_arg {
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index 1bd593b..544509c 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -221,6 +221,7 @@ static void define_event_symbols(struct event_format *event,
break;
case PRINT_BSTRING:
case PRINT_DYNAMIC_ARRAY:
+ case PRINT_DYNAMIC_ARRAY_LEN:
case PRINT_STRING:
case PRINT_BITMASK:
break;
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index ace2484..aa9e125 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -251,6 +251,7 @@ static void define_event_symbols(struct event_format *event,
/* gcc warns for these? */
case PRINT_BSTRING:
case PRINT_DYNAMIC_ARRAY:
+ case PRINT_DYNAMIC_ARRAY_LEN:
case PRINT_FUNC:
case PRINT_BITMASK:
/* we should warn... */
--
1.8.5.2

2015-07-21 03:09:17

by He Kuang

[permalink] [raw]
Subject: [RFC PATCH v6 2/2] bpf: Introduce function for outputing data to perf event

There're scenarios that we need an eBPF program to record not only
kprobe point args, but also the PMU counters, time latencies or the
number of cache misses between two probe points and other information
when the probe point is entered.

This patch adds a new trace event to establish infrastruction for bpf to
output data to perf. Userspace perf tools can detect and use this event
as using the existing tracepoint events.

New bpf trace event entry in debugfs:

/sys/kernel/debug/tracing/events/bpf/bpf_output_data

Userspace perf tools detect the new tracepoint event as:

bpf:bpf_output_data [Tracepoint event]

Data in ring-buffer of perf events added to this event will be polled
out, sample types and other attributes can be adjusted to those events
directly without touching the original kprobe events.

The bpf helper function gives eBPF program ability to output data as
perf sample event. This helper simple call the new trace event and
userspace perf tools can record the BPF ftrace event to collect those
records.

Signed-off-by: He Kuang <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
---
include/trace/events/bpf.h | 30 ++++++++++++++++++++++++++++++
include/uapi/linux/bpf.h | 7 +++++++
kernel/trace/bpf_trace.c | 23 +++++++++++++++++++++++
samples/bpf/bpf_helpers.h | 2 ++
4 files changed, 62 insertions(+)
create mode 100644 include/trace/events/bpf.h

diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
new file mode 100644
index 0000000..6b739b8
--- /dev/null
+++ b/include/trace/events/bpf.h
@@ -0,0 +1,30 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM bpf
+
+#if !defined(_TRACE_BPF_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_BPF_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(bpf_output_data,
+
+ TP_PROTO(u64 *src, int size),
+
+ TP_ARGS(src, size),
+
+ TP_STRUCT__entry(
+ __dynamic_array(u8, buf, size)
+ ),
+
+ TP_fast_assign(
+ memcpy(__get_dynamic_array(buf), src, size);
+ ),
+
+ TP_printk("%s", __print_hex(__get_dynamic_array(buf),
+ __get_dynamic_array_len(buf)))
+);
+
+#endif /* _TRACE_BPF_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 29ef6f9..5068ab1 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -249,6 +249,13 @@ enum bpf_func_id {
* Return: 0 on success
*/
BPF_FUNC_get_current_comm,
+
+ /**
+ * int bpf_output_trace_data(void *src, int size)
+ * Return: 0 on success
+ */
+ BPF_FUNC_output_trace_data,
+
__BPF_FUNC_MAX_ID,
};

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 88a041a..219f670 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -11,7 +11,10 @@
#include <linux/filter.h>
#include <linux/uaccess.h>
#include <linux/ctype.h>
+
#include "trace.h"
+#define CREATE_TRACE_POINTS
+#include <trace/events/bpf.h>

static DEFINE_PER_CPU(int, bpf_prog_active);

@@ -79,6 +82,24 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
.arg3_type = ARG_ANYTHING,
};

+static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
+{
+ void *src = (void *) (long) r1;
+ int size = (int) r2;
+
+ trace_bpf_output_data(src, size);
+
+ return 0;
+}
+
+static const struct bpf_func_proto bpf_output_trace_data_proto = {
+ .func = bpf_output_trace_data,
+ .gpl_only = true,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_STACK,
+ .arg2_type = ARG_CONST_STACK_SIZE,
+};
+
/*
* limited trace_printk()
* only %d %u %x %ld %lu %lx %lld %llu %llx %p conversion specifiers allowed
@@ -169,6 +190,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
return &bpf_map_delete_elem_proto;
case BPF_FUNC_probe_read:
return &bpf_probe_read_proto;
+ case BPF_FUNC_output_trace_data:
+ return &bpf_output_trace_data_proto;
case BPF_FUNC_ktime_get_ns:
return &bpf_ktime_get_ns_proto;
case BPF_FUNC_tail_call:
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index bdf1c16..0aeaebe 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -59,5 +59,7 @@ static int (*bpf_l3_csum_replace)(void *ctx, int off, int from, int to, int flag
(void *) BPF_FUNC_l3_csum_replace;
static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flags) =
(void *) BPF_FUNC_l4_csum_replace;
+static int (*bpf_output_trace_data)(void *src, int size) =
+ (void *) BPF_FUNC_output_trace_data;

#endif
--
1.8.5.2

2015-07-21 15:10:58

by Namhyung Kim

[permalink] [raw]
Subject: Re: [RFC PATCH v6 1/2] tools lib traceevent: Support function __get_dynamic_array_len

On Tue, Jul 21, 2015 at 03:08:49AM +0000, He Kuang wrote:
> Support helper function __get_dynamic_array_len() in libtraceevent,
> this function is used accompany with __print_array() or __print_hex(),
> but currently it is not an available function in the function list of
> process_function().
>
> The total allocated length of the dynamic array is embedded in the top
> half of __data_loc_##item field. This patch adds new arg type
> PRINT_DYNAMIC_ARRAY_LEN to return the length to eval_num_arg(),
>
> Signed-off-by: He Kuang <[email protected]>

Acked-by: Namhyung Kim <[email protected]>

Thanks,
Namhyung


> ---
> tools/lib/traceevent/event-parse.c | 56 +++++++++++++++++++++-
> tools/lib/traceevent/event-parse.h | 1 +
> .../perf/util/scripting-engines/trace-event-perl.c | 1 +
> .../util/scripting-engines/trace-event-python.c | 1 +
> 4 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
> index cc25f05..3af69cf 100644
> --- a/tools/lib/traceevent/event-parse.c
> +++ b/tools/lib/traceevent/event-parse.c
> @@ -783,6 +783,7 @@ static void free_arg(struct print_arg *arg)
> free(arg->bitmask.bitmask);
> break;
> case PRINT_DYNAMIC_ARRAY:
> + case PRINT_DYNAMIC_ARRAY_LEN:
> free(arg->dynarray.index);
> break;
> case PRINT_OP:
> @@ -2655,6 +2656,42 @@ process_dynamic_array(struct event_format *event, struct print_arg *arg, char **
> }
>
> static enum event_type
> +process_dynamic_array_len(struct event_format *event, struct print_arg *arg,
> + char **tok)
> +{
> + struct format_field *field;
> + enum event_type type;
> + char *token;
> +
> + if (read_expect_type(EVENT_ITEM, &token) < 0)
> + goto out_free;
> +
> + arg->type = PRINT_DYNAMIC_ARRAY_LEN;
> +
> + /* Find the field */
> + field = pevent_find_field(event, token);
> + if (!field)
> + goto out_free;
> +
> + arg->dynarray.field = field;
> + arg->dynarray.index = 0;
> +
> + if (read_expected(EVENT_DELIM, ")") < 0)
> + goto out_err;
> +
> + type = read_token(&token);
> + *tok = token;
> +
> + return type;
> +
> + out_free:
> + free_token(token);
> + out_err:
> + *tok = NULL;
> + return EVENT_ERROR;
> +}
> +
> +static enum event_type
> process_paren(struct event_format *event, struct print_arg *arg, char **tok)
> {
> struct print_arg *item_arg;
> @@ -2901,6 +2938,10 @@ process_function(struct event_format *event, struct print_arg *arg,
> free_token(token);
> return process_dynamic_array(event, arg, tok);
> }
> + if (strcmp(token, "__get_dynamic_array_len") == 0) {
> + free_token(token);
> + return process_dynamic_array_len(event, arg, tok);
> + }
>
> func = find_func_handler(event->pevent, token);
> if (func) {
> @@ -3581,14 +3622,25 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
> goto out_warning_op;
> }
> break;
> + case PRINT_DYNAMIC_ARRAY_LEN:
> + offset = pevent_read_number(pevent,
> + data + arg->dynarray.field->offset,
> + arg->dynarray.field->size);
> + /*
> + * The total allocated length of the dynamic array is
> + * stored in the top half of the field, and the offset
> + * is in the bottom half of the 32 bit field.
> + */
> + val = (unsigned long long)(offset >> 16);
> + break;
> case PRINT_DYNAMIC_ARRAY:
> /* Without [], we pass the address to the dynamic data */
> offset = pevent_read_number(pevent,
> data + arg->dynarray.field->offset,
> arg->dynarray.field->size);
> /*
> - * The actual length of the dynamic array is stored
> - * in the top half of the field, and the offset
> + * The total allocated length of the dynamic array is
> + * stored in the top half of the field, and the offset
> * is in the bottom half of the 32 bit field.
> */
> offset &= 0xffff;
> diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
> index 063b197..94543aa 100644
> --- a/tools/lib/traceevent/event-parse.h
> +++ b/tools/lib/traceevent/event-parse.h
> @@ -294,6 +294,7 @@ enum print_arg_type {
> PRINT_OP,
> PRINT_FUNC,
> PRINT_BITMASK,
> + PRINT_DYNAMIC_ARRAY_LEN,
> };
>
> struct print_arg {
> diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
> index 1bd593b..544509c 100644
> --- a/tools/perf/util/scripting-engines/trace-event-perl.c
> +++ b/tools/perf/util/scripting-engines/trace-event-perl.c
> @@ -221,6 +221,7 @@ static void define_event_symbols(struct event_format *event,
> break;
> case PRINT_BSTRING:
> case PRINT_DYNAMIC_ARRAY:
> + case PRINT_DYNAMIC_ARRAY_LEN:
> case PRINT_STRING:
> case PRINT_BITMASK:
> break;
> diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
> index ace2484..aa9e125 100644
> --- a/tools/perf/util/scripting-engines/trace-event-python.c
> +++ b/tools/perf/util/scripting-engines/trace-event-python.c
> @@ -251,6 +251,7 @@ static void define_event_symbols(struct event_format *event,
> /* gcc warns for these? */
> case PRINT_BSTRING:
> case PRINT_DYNAMIC_ARRAY:
> + case PRINT_DYNAMIC_ARRAY_LEN:
> case PRINT_FUNC:
> case PRINT_BITMASK:
> /* we should warn... */
> --
> 1.8.5.2
>

2015-07-21 15:15:34

by Namhyung Kim

[permalink] [raw]
Subject: Re: [RFC PATCH v6 2/2] bpf: Introduce function for outputing data to perf event

On Tue, Jul 21, 2015 at 03:08:50AM +0000, He Kuang wrote:
> There're scenarios that we need an eBPF program to record not only
> kprobe point args, but also the PMU counters, time latencies or the
> number of cache misses between two probe points and other information
> when the probe point is entered.
>
> This patch adds a new trace event to establish infrastruction for bpf to
> output data to perf. Userspace perf tools can detect and use this event
> as using the existing tracepoint events.
>
> New bpf trace event entry in debugfs:
>
> /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>
> Userspace perf tools detect the new tracepoint event as:
>
> bpf:bpf_output_data [Tracepoint event]
>
> Data in ring-buffer of perf events added to this event will be polled
> out, sample types and other attributes can be adjusted to those events
> directly without touching the original kprobe events.
>
> The bpf helper function gives eBPF program ability to output data as
> perf sample event. This helper simple call the new trace event and
> userspace perf tools can record the BPF ftrace event to collect those
> records.
>
> Signed-off-by: He Kuang <[email protected]>
> Acked-by: Alexei Starovoitov <[email protected]>

Acked-by: Namhyung Kim <[email protected]>

Thanks,
Namhyung


> ---
> include/trace/events/bpf.h | 30 ++++++++++++++++++++++++++++++
> include/uapi/linux/bpf.h | 7 +++++++
> kernel/trace/bpf_trace.c | 23 +++++++++++++++++++++++
> samples/bpf/bpf_helpers.h | 2 ++
> 4 files changed, 62 insertions(+)
> create mode 100644 include/trace/events/bpf.h
>
> diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
> new file mode 100644
> index 0000000..6b739b8
> --- /dev/null
> +++ b/include/trace/events/bpf.h
> @@ -0,0 +1,30 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM bpf
> +
> +#if !defined(_TRACE_BPF_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_BPF_H
> +
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(bpf_output_data,
> +
> + TP_PROTO(u64 *src, int size),
> +
> + TP_ARGS(src, size),
> +
> + TP_STRUCT__entry(
> + __dynamic_array(u8, buf, size)
> + ),
> +
> + TP_fast_assign(
> + memcpy(__get_dynamic_array(buf), src, size);
> + ),
> +
> + TP_printk("%s", __print_hex(__get_dynamic_array(buf),
> + __get_dynamic_array_len(buf)))
> +);
> +
> +#endif /* _TRACE_BPF_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 29ef6f9..5068ab1 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -249,6 +249,13 @@ enum bpf_func_id {
> * Return: 0 on success
> */
> BPF_FUNC_get_current_comm,
> +
> + /**
> + * int bpf_output_trace_data(void *src, int size)
> + * Return: 0 on success
> + */
> + BPF_FUNC_output_trace_data,
> +
> __BPF_FUNC_MAX_ID,
> };
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 88a041a..219f670 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -11,7 +11,10 @@
> #include <linux/filter.h>
> #include <linux/uaccess.h>
> #include <linux/ctype.h>
> +
> #include "trace.h"
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/bpf.h>
>
> static DEFINE_PER_CPU(int, bpf_prog_active);
>
> @@ -79,6 +82,24 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
> .arg3_type = ARG_ANYTHING,
> };
>
> +static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
> +{
> + void *src = (void *) (long) r1;
> + int size = (int) r2;
> +
> + trace_bpf_output_data(src, size);
> +
> + return 0;
> +}
> +
> +static const struct bpf_func_proto bpf_output_trace_data_proto = {
> + .func = bpf_output_trace_data,
> + .gpl_only = true,
> + .ret_type = RET_INTEGER,
> + .arg1_type = ARG_PTR_TO_STACK,
> + .arg2_type = ARG_CONST_STACK_SIZE,
> +};
> +
> /*
> * limited trace_printk()
> * only %d %u %x %ld %lu %lx %lld %llu %llx %p conversion specifiers allowed
> @@ -169,6 +190,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
> return &bpf_map_delete_elem_proto;
> case BPF_FUNC_probe_read:
> return &bpf_probe_read_proto;
> + case BPF_FUNC_output_trace_data:
> + return &bpf_output_trace_data_proto;
> case BPF_FUNC_ktime_get_ns:
> return &bpf_ktime_get_ns_proto;
> case BPF_FUNC_tail_call:
> diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
> index bdf1c16..0aeaebe 100644
> --- a/samples/bpf/bpf_helpers.h
> +++ b/samples/bpf/bpf_helpers.h
> @@ -59,5 +59,7 @@ static int (*bpf_l3_csum_replace)(void *ctx, int off, int from, int to, int flag
> (void *) BPF_FUNC_l3_csum_replace;
> static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flags) =
> (void *) BPF_FUNC_l4_csum_replace;
> +static int (*bpf_output_trace_data)(void *src, int size) =
> + (void *) BPF_FUNC_output_trace_data;
>
> #endif
> --
> 1.8.5.2
>

2015-08-13 21:27:48

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH v6 0/2] Make eBPF programs output data to perf

On 7/20/15 8:08 PM, He Kuang wrote:
> Hi,
>
> Previous patch v5 url:
> http://thread.gmane.org/gmane.linux.kernel/1995274
>
> The bugfix of dynamic array length in trace event goes to
> kernel/git/rostedt/linux-trace.git ftrace/urgent and confirms that the
> return value of __get_dynamic_array_len() is the total allocated
> length of the dynamic array. For we print the bpf output data in byte
> array from patch v5, that problem does not affect our patch any more,
> but some comments in patch 1/2 is updated.
>
> Patch 2/2 is acked by Alexei.

what is the status of it? If I remember correctly patch 1 was fixed
differently in Steven's tree. Patch 2 probably needs refreshing?

2015-08-13 21:37:09

by Wang Nan

[permalink] [raw]
Subject: Re: [RFC PATCH v6 0/2] Make eBPF programs output data to perf



?????ҵ? iPhone

> ?? 2015??8??14?գ?????5:27??Alexei Starovoitov <[email protected]> д????
>
>> On 7/20/15 8:08 PM, He Kuang wrote:
>> Hi,
>>
>> Previous patch v5 url:
>> http://thread.gmane.org/gmane.linux.kernel/1995274
>>
>> The bugfix of dynamic array length in trace event goes to
>> kernel/git/rostedt/linux-trace.git ftrace/urgent and confirms that the
>> return value of __get_dynamic_array_len() is the total allocated
>> length of the dynamic array. For we print the bpf output data in byte
>> array from patch v5, that problem does not affect our patch any more,
>> but some comments in patch 1/2 is updated.
>>
>> Patch 2/2 is acked by Alexei.
>
> what is the status of it? If I remember correctly patch 1 was fixed
> differently in Steven's tree. Patch 2 probably needs refreshing?
>

I was thinking about whether to add a "type" field there, so we will have an explicit
mov const instruction before the call instruction, which can act as a mark. Also, if
we generate the type code automatically, a type field in this API can make things
easier since we don't need wrap the user structure in BPF stack. However, the
LLVM side is not ready yet, so we haven't post the new version.

Thank you.

2015-08-13 21:49:39

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH v6 0/2] Make eBPF programs output data to perf

On 8/13/15 2:35 PM, pi3orama wrote:
> I was thinking about whether to add a "type" field there, so we will have an explicit
> mov const instruction before the call instruction, which can act as a mark. Also, if
> we generate the type code automatically, a type field in this API can make things
> easier since we don't need wrap the user structure in BPF stack. However, the
> LLVM side is not ready yet, so we haven't post the new version.

I think the helper was clean enough. Any type info probably needs to be
done as a side channel and not part of the helper anyway.
But, ok, let's figure out the type stuff first.
Also I don't think you can rely on extra insn in front of a call insn.
Compiler can freely insert other insns there. You don't want to
introduce data flow analysis in elf parser.

2015-08-19 03:06:01

by Wang Nan

[permalink] [raw]
Subject: Re: [RFC PATCH v6 0/2] Make eBPF programs output data to perf



On 2015/8/14 5:49, Alexei Starovoitov wrote:
> On 8/13/15 2:35 PM, pi3orama wrote:
>> I was thinking about whether to add a "type" field there, so we will
>> have an explicit
>> mov const instruction before the call instruction, which can act as a
>> mark. Also, if
>> we generate the type code automatically, a type field in this API can
>> make things
>> easier since we don't need wrap the user structure in BPF stack.
>> However, the
>> LLVM side is not ready yet, so we haven't post the new version.
>
> I think the helper was clean enough. Any type info probably needs to be
> done as a side channel and not part of the helper anyway.
> But, ok, let's figure out the type stuff first.
> Also I don't think you can rely on extra insn in front of a call insn.
> Compiler can freely insert other insns there. You don't want to
> introduce data flow analysis in elf parser.
>
I agree with you. I think we can rely on user providing correct type
information.

Then we should make this two patches go into kernel. Both 1/2 and 2/2 are
required.

The bug mentioned in patch 1/3 of v5 series
(http://lkml.kernel.org/r/[email protected])
has already been fixed by adjusting sample
(d6726c8145290bef950ae2538ea6ae1d96a1944b)

So we only need these two patches. Currently they can be applied to
mainline master
clearly since you haven't add new BPF functions.

Thank you.