2015-07-10 10:04:12

by He Kuang

[permalink] [raw]
Subject: [RFC PATCH v4 0/3] Make eBPF programs output data to perf event

Hi,

Previous discussion url(patch v3):
http://thread.gmane.org/gmane.linux.kernel/1990197/focus=1991022

We found that creating new trace event for bpf subsystem is more
simple than adding new ftrace:bpf entry, the only thing we should do
for outputing bpf sample events is to call a predefined trace event
function. Then it is easy to extend new output formats and not
restricted to one fixed format like ftrace:function.

By using trace events, we can also benifit from the dynamic array
field type for outputing results by number of items we filled in, and
achieve the purpose of standardization of output format. Function for
getting number of items in dynamic array is added to libtraceevent and
a improper result of macro __get_dynamic_array_len is corrected.

eBPF sample code, tests outputing multiple items:

SEC("generic_perform_write=generic_perform_write")
int NODE_generic_perform_write(struct pt_regs *ctx)
{
char fmt[] = "generic_perform_write last=%lld, cur=%lld, del=%lld\n";
u64 cur_time, del_time, result[3] = {0};
int ind =0;
struct time_table *last = bpf_map_lookup_elem(&global_time_table, &ind);
struct time_table output;

if (!last)
return 0;

cur_time = bpf_ktime_get_ns();

if (!last->last_time)
del_time = 0;
else
del_time = cur_time - last->last_time;

/* For debug */
bpf_trace_printk(fmt, sizeof(fmt), last->last_time, cur_time, del_time);
result[0] = last->last_time;

/* Table update */
output.last_time = cur_time;
bpf_map_update_elem(&global_time_table, &ind, &output, BPF_ANY);

/* This is a casual condition to show the funciton */
if (del_time < 1000)
return 0;

result[1] = cur_time;
result[2] = del_time;
bpf_output_trace_data(result, sizeof(result));

return 0;
}

Record bpf events:

$ perf record -e bpf:bpf_output_data -e sample.o --
dd if=/dev/zero of=test bs=4k count=3

Results in /sys/kernel/debug/tracing/trace:

dd-984 [000] d... 60.894097: : generic_perform_write
last=60560862578, cur=60654629075, del=93766497
dd-984 [000] d... 60.896957: : generic_perform_write
last=60654629075, cur=60657510709, del=2881634
dd-984 [000] d... 60.897276: : generic_perform_write
last=60657510709, cur=60657829953, del=319244

Results showed in perf-script:

dd 984 [000] 60.655211: bpf:bpf_output_data: 60560862578 60654629075 93766497
dd 984 [000] 60.657552: bpf:bpf_output_data: 60654629075 60657510709 2881634
dd 984 [000] 60.657898: bpf:bpf_output_data: 60657510709 60657829953 319244

Thank you.

He Kuang (3):
tracing/events: Fix wrong sample output by storing array length
instead of size
tools lib traceevent: Add function to get dynamic arrays length
bpf: Introduce function for outputing data to perf event

include/trace/events/bpf.h | 30 +++++++++++++
include/trace/trace_events.h | 5 ++-
include/uapi/linux/bpf.h | 7 +++
kernel/trace/bpf_trace.c | 23 ++++++++++
samples/bpf/bpf_helpers.h | 2 +
tools/lib/traceevent/event-parse.c | 52 ++++++++++++++++++++++
tools/lib/traceevent/event-parse.h | 1 +
.../util/scripting-engines/trace-event-python.c | 1 +
8 files changed, 119 insertions(+), 2 deletions(-)
create mode 100644 include/trace/events/bpf.h

--
1.8.5.2


2015-07-10 10:04:40

by He Kuang

[permalink] [raw]
Subject: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size

The output result of trace_foo_bar event in traceevent samples is
wrong. This problem can be reproduced as following:

(Build kernel with SAMPLE_TRACE_EVENTS=m)

$ insmod trace-events-sample.ko

$ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable

$ cat /sys/kernel/debug/tracing/trace

event-sample-980 [000] .... 43.649559: foo_bar: foo hello 21 0x15
BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The array length is not right, should be {0x1}.
(ffffffff,ffffffff)

event-sample-980 [000] .... 44.653827: foo_bar: foo hello 22 0x16
BIT2|BIT3|0x10
{0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The array length is not right, should be {0x1,0x2}.
Gandalf (ffffffff,ffffffff)

The event defined in samples/trace_events/trace-events-sample.h uses
this helper function to output dynamic list:

__print_array(__get_dynamic_array(list),
__get_dynamic_array_len(list),
sizeof(int))

Currently, __get_dynamic_array_len() returns the total size of the
array instead of the number of items by referencing the high 16 bits
of entry->data_loc_##item. The element size for calculating the number
of items can not be fetched by referencing fields from __entry, so
macro __get_dynamic_array_len can not return the expected value.

This patch stores array item number instead of the total size in
entry->__data_loc_##item, and makes __get_dynamic_array_len get the
right value directly. Because the function __get_bitmask() is affected
by this change, __bitmask_size is assigned to the array len by
multiplied bitmask type size.

After this patch:

event-sample-993 [000] .... 692.348562: foo_bar: foo hello 201
0xc9 BIT1|BIT4|0xc0 {0x1} Snoopy (ffffffff,ffffffff)
^^^^^
Array length fixed.

event-sample-993 [000] .... 693.349276: foo_bar: foo hello 202
0xca BIT2|BIT4|0xc0 {0x1,0x2} Gandalf (ffffffff,ffffffff)
^^^^^^^^^
Array length fixed.

Signed-off-by: He Kuang <[email protected]>
---
include/trace/trace_events.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
index 43be3b0..5abe027 100644
--- a/include/trace/trace_events.h
+++ b/include/trace/trace_events.h
@@ -257,7 +257,8 @@ TRACE_MAKE_SYSTEM_STR();
({ \
void *__bitmask = __get_dynamic_array(field); \
unsigned int __bitmask_size; \
- __bitmask_size = __get_dynamic_array_len(field); \
+ __bitmask_size = (__get_dynamic_array_len(field) * \
+ sizeof(unsigned long)); \
trace_print_bitmask_seq(p, __bitmask, __bitmask_size); \
})

@@ -453,7 +454,7 @@ trace_event_define_fields_##call(struct trace_event_call *event_call) \
__item_length = (len) * sizeof(type); \
__data_offsets->item = __data_size + \
offsetof(typeof(*entry), __data); \
- __data_offsets->item |= __item_length << 16; \
+ __data_offsets->item |= (len) << 16; \
__data_size += __item_length;

#undef __string
--
1.8.5.2

2015-07-10 10:04:25

by He Kuang

[permalink] [raw]
Subject: [RFC PATCH v4 2/3] tools lib traceevent: Add function to get dynamic arrays length

To print a trace event with a dynamic array, __print_array(array, len,
element_size) requires the number of items in the array, which can be
got by the helper function __get_dynamic_array_len(), currently it is
not an available function in the function list in process_function().

Add new arg type PRINT_DYNAMIC_ARRAY_LEN which returns the array
length embedded in the __data_loc_##item field to eval_num_arg().

Signed-off-by: He Kuang <[email protected]>
---
tools/lib/traceevent/event-parse.c | 52 ++++++++++++++++++++++
tools/lib/traceevent/event-parse.h | 1 +
.../util/scripting-engines/trace-event-python.c | 1 +
3 files changed, 54 insertions(+)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index cc25f05..55fc3b6c 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -783,6 +783,7 @@ static void free_arg(struct print_arg *arg)
free(arg->bitmask.bitmask);
break;
case PRINT_DYNAMIC_ARRAY:
+ case PRINT_DYNAMIC_ARRAY_LEN:
free(arg->dynarray.index);
break;
case PRINT_OP:
@@ -2655,6 +2656,42 @@ process_dynamic_array(struct event_format *event, struct print_arg *arg, char **
}

static enum event_type
+process_dynamic_array_len(struct event_format *event, struct print_arg *arg,
+ char **tok)
+{
+ struct format_field *field;
+ enum event_type type;
+ char *token;
+
+ if (read_expect_type(EVENT_ITEM, &token) < 0)
+ goto out_free;
+
+ arg->type = PRINT_DYNAMIC_ARRAY_LEN;
+
+ /* Find the field */
+ field = pevent_find_field(event, token);
+ if (!field)
+ goto out_free;
+
+ arg->dynarray.field = field;
+ arg->dynarray.index = 0;
+
+ if (read_expected(EVENT_DELIM, ")") < 0)
+ goto out_err;
+
+ type = read_token(&token);
+ *tok = token;
+
+ return type;
+
+ out_free:
+ free_token(token);
+ out_err:
+ *tok = NULL;
+ return EVENT_ERROR;
+}
+
+static enum event_type
process_paren(struct event_format *event, struct print_arg *arg, char **tok)
{
struct print_arg *item_arg;
@@ -2901,6 +2938,10 @@ process_function(struct event_format *event, struct print_arg *arg,
free_token(token);
return process_dynamic_array(event, arg, tok);
}
+ if (strcmp(token, "__get_dynamic_array_len") == 0) {
+ free_token(token);
+ return process_dynamic_array_len(event, arg, tok);
+ }

func = find_func_handler(event->pevent, token);
if (func) {
@@ -3581,6 +3622,17 @@ eval_num_arg(void *data, int size, struct event_format *event, struct print_arg
goto out_warning_op;
}
break;
+ case PRINT_DYNAMIC_ARRAY_LEN:
+ offset = pevent_read_number(pevent,
+ data + arg->dynarray.field->offset,
+ arg->dynarray.field->size);
+ /*
+ * The actual length of the dynamic array is stored
+ * in the top half of the field, and the offset
+ * is in the bottom half of the 32 bit field.
+ */
+ val = (unsigned long long)(offset >> 16);
+ break;
case PRINT_DYNAMIC_ARRAY:
/* Without [], we pass the address to the dynamic data */
offset = pevent_read_number(pevent,
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 063b197..94543aa 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -294,6 +294,7 @@ enum print_arg_type {
PRINT_OP,
PRINT_FUNC,
PRINT_BITMASK,
+ PRINT_DYNAMIC_ARRAY_LEN,
};

struct print_arg {
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index ace2484..d309341 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -253,6 +253,7 @@ static void define_event_symbols(struct event_format *event,
case PRINT_DYNAMIC_ARRAY:
case PRINT_FUNC:
case PRINT_BITMASK:
+ case PRINT_DYNAMIC_ARRAY_LEN:
/* we should warn... */
return;
}
--
1.8.5.2

2015-07-10 10:04:31

by He Kuang

[permalink] [raw]
Subject: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

There're scenarios that we need an eBPF program to record not only
kprobe point args, but also the PMU counters, time latencies or the
number of cache misses between two probe points and other information
when the probe point is entered.

This patch adds a new trace event to establish infrastruction for bpf to
output data to perf. Userspace perf tools can detect and use this event
as using the existing tracepoint events.

New bpf trace event entry in debugfs:

/sys/kernel/debug/tracing/events/bpf/bpf_output_data

Userspace perf tools detect the new tracepoint event as:

bpf:bpf_output_data [Tracepoint event]

Data in ring-buffer of perf events added to this event will be polled
out, sample types and other attributes can be adjusted to those events
directly without touching the original kprobe events.

The bpf helper function gives eBPF program ability to output data as
perf sample event. This helper simple call the new trace event and
userspace perf tools can record the BPF ftrace event to collect those
records.

Signed-off-by: He Kuang <[email protected]>
---
include/trace/events/bpf.h | 30 ++++++++++++++++++++++++++++++
include/uapi/linux/bpf.h | 7 +++++++
kernel/trace/bpf_trace.c | 23 +++++++++++++++++++++++
samples/bpf/bpf_helpers.h | 2 ++
4 files changed, 62 insertions(+)
create mode 100644 include/trace/events/bpf.h

diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
new file mode 100644
index 0000000..a659a91
--- /dev/null
+++ b/include/trace/events/bpf.h
@@ -0,0 +1,30 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM bpf
+
+#if !defined(_TRACE_BPF_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_BPF_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(bpf_output_data,
+
+ TP_PROTO(u64 *src, int len),
+
+ TP_ARGS(src, len),
+
+ TP_STRUCT__entry(
+ __dynamic_array(u64, buf, len)
+ ),
+
+ TP_fast_assign(
+ memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
+ ),
+
+ TP_printk("%s", __print_array(__get_dynamic_array(buf),
+ __get_dynamic_array_len(buf), 8))
+);
+
+#endif /* _TRACE_BPF_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 29ef6f9..5068ab1 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -249,6 +249,13 @@ enum bpf_func_id {
* Return: 0 on success
*/
BPF_FUNC_get_current_comm,
+
+ /**
+ * int bpf_output_trace_data(void *src, int size)
+ * Return: 0 on success
+ */
+ BPF_FUNC_output_trace_data,
+
__BPF_FUNC_MAX_ID,
};

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 88a041a..31fc31a 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -11,7 +11,10 @@
#include <linux/filter.h>
#include <linux/uaccess.h>
#include <linux/ctype.h>
+
#include "trace.h"
+#define CREATE_TRACE_POINTS
+#include <trace/events/bpf.h>

static DEFINE_PER_CPU(int, bpf_prog_active);

@@ -79,6 +82,24 @@ static const struct bpf_func_proto bpf_probe_read_proto = {
.arg3_type = ARG_ANYTHING,
};

+static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
+{
+ void *src = (void *) (long) r1;
+ int size = (int) r2;
+
+ trace_bpf_output_data(src, size / sizeof(u64));
+
+ return 0;
+}
+
+static const struct bpf_func_proto bpf_output_trace_data_proto = {
+ .func = bpf_output_trace_data,
+ .gpl_only = true,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_STACK,
+ .arg2_type = ARG_CONST_STACK_SIZE,
+};
+
/*
* limited trace_printk()
* only %d %u %x %ld %lu %lx %lld %llu %llx %p conversion specifiers allowed
@@ -169,6 +190,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
return &bpf_map_delete_elem_proto;
case BPF_FUNC_probe_read:
return &bpf_probe_read_proto;
+ case BPF_FUNC_output_trace_data:
+ return &bpf_output_trace_data_proto;
case BPF_FUNC_ktime_get_ns:
return &bpf_ktime_get_ns_proto;
case BPF_FUNC_tail_call:
diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index bdf1c16..0aeaebe 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -59,5 +59,7 @@ static int (*bpf_l3_csum_replace)(void *ctx, int off, int from, int to, int flag
(void *) BPF_FUNC_l3_csum_replace;
static int (*bpf_l4_csum_replace)(void *ctx, int off, int from, int to, int flags) =
(void *) BPF_FUNC_l4_csum_replace;
+static int (*bpf_output_trace_data)(void *src, int size) =
+ (void *) BPF_FUNC_output_trace_data;

#endif
--
1.8.5.2

2015-07-10 22:10:46

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/10/15 3:03 AM, He Kuang wrote:
> There're scenarios that we need an eBPF program to record not only
> kprobe point args, but also the PMU counters, time latencies or the
> number of cache misses between two probe points and other information
> when the probe point is entered.
>
> This patch adds a new trace event to establish infrastruction for bpf to
> output data to perf. Userspace perf tools can detect and use this event
> as using the existing tracepoint events.
>
> New bpf trace event entry in debugfs:
>
> /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>
> Userspace perf tools detect the new tracepoint event as:
>
> bpf:bpf_output_data [Tracepoint event]

Nice! This approach looks cleanest so far.

> +TRACE_EVENT(bpf_output_data,
> +
> + TP_PROTO(u64 *src, int len),
> +
> + TP_ARGS(src, len),
> +
> + TP_STRUCT__entry(
> + __dynamic_array(u64, buf, len)
> + ),
> +
> + TP_fast_assign(
> + memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));

may be make it 'u8' array? The extra multiply and...

> +static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
> +{
> + void *src = (void *) (long) r1;
> + int size = (int) r2;
> +
> + trace_bpf_output_data(src, size / sizeof(u64));

.. and this silent round down could be confusing to use.
With array of u8, the program can push any structured data into it
and let user space interpret it.

2015-07-13 04:37:10

by He Kuang

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

hi, Alexei

On 2015/7/11 6:10, Alexei Starovoitov wrote:
> On 7/10/15 3:03 AM, He Kuang wrote:
>> There're scenarios that we need an eBPF program to record not only
>> kprobe point args, but also the PMU counters, time latencies or the
>> number of cache misses between two probe points and other information
>> when the probe point is entered.
>>
>> This patch adds a new trace event to establish infrastruction for bpf to
>> output data to perf. Userspace perf tools can detect and use this event
>> as using the existing tracepoint events.
>>
>> New bpf trace event entry in debugfs:
>>
>> /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>>
>> Userspace perf tools detect the new tracepoint event as:
>>
>> bpf:bpf_output_data [Tracepoint event]
>
> Nice! This approach looks cleanest so far.
>
>> +TRACE_EVENT(bpf_output_data,
>> +
>> + TP_PROTO(u64 *src, int len),
>> +
>> + TP_ARGS(src, len),
>> +
>> + TP_STRUCT__entry(
>> + __dynamic_array(u64, buf, len)
>> + ),
>> +
>> + TP_fast_assign(
>> + memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
>
> may be make it 'u8' array? The extra multiply and...

OK

So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
0x623eb6d) will be this:

dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00

And users are not restricted to u64 type elements. I'll change that.

>
>> +static u64 bpf_output_trace_data(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
>> +{
>> + void *src = (void *) (long) r1;
>> + int size = (int) r2;
>> +
>> + trace_bpf_output_data(src, size / sizeof(u64));
>
> .. and this silent round down could be confusing to use.
> With array of u8, the program can push any structured data into it
> and let user space interpret it.
>
>

2015-07-13 08:30:16

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On Fri, Jul 10, 2015 at 10:03:07AM +0000, He Kuang wrote:

> include/trace/events/bpf.h | 30 ++++++++++++++++++++++++++++++
> include/uapi/linux/bpf.h | 7 +++++++
> kernel/trace/bpf_trace.c | 23 +++++++++++++++++++++++
> samples/bpf/bpf_helpers.h | 2 ++
> 4 files changed, 62 insertions(+)
> create mode 100644 include/trace/events/bpf.h

Note that the subject is misleading, nothing in this entire patch series
has anything to do with perf events.

2015-07-13 13:54:39

by Namhyung Kim

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Hi,

On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
> hi, Alexei
>
> On 2015/7/11 6:10, Alexei Starovoitov wrote:
> >On 7/10/15 3:03 AM, He Kuang wrote:
> >>There're scenarios that we need an eBPF program to record not only
> >>kprobe point args, but also the PMU counters, time latencies or the
> >>number of cache misses between two probe points and other information
> >>when the probe point is entered.
> >>
> >>This patch adds a new trace event to establish infrastruction for bpf to
> >>output data to perf. Userspace perf tools can detect and use this event
> >>as using the existing tracepoint events.
> >>
> >>New bpf trace event entry in debugfs:
> >>
> >> /sys/kernel/debug/tracing/events/bpf/bpf_output_data
> >>
> >>Userspace perf tools detect the new tracepoint event as:
> >>
> >> bpf:bpf_output_data [Tracepoint event]
> >
> >Nice! This approach looks cleanest so far.
> >
> >>+TRACE_EVENT(bpf_output_data,
> >>+
> >>+ TP_PROTO(u64 *src, int len),
> >>+
> >>+ TP_ARGS(src, len),
> >>+
> >>+ TP_STRUCT__entry(
> >>+ __dynamic_array(u64, buf, len)
> >>+ ),
> >>+
> >>+ TP_fast_assign(
> >>+ memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
> >
> >may be make it 'u8' array? The extra multiply and...
>
> OK
>
> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
> 0x623eb6d) will be this:
>
> dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
> f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
>
> And users are not restricted to u64 type elements. I'll change that.

While this general event format works well, I think it might be hard
to know which output came from which program when more than one bpf
programs used.

I was thinking about providing custom event formats for each bpf
program (if needed). The event format definitions might be in a
specific directory or a bpf object itself. Then perf can read those
formats and print the output data according to the formats. Maybe we
need to add some dynamic event id to match format and data.

Thanks,
Namhyung

2015-07-13 14:03:23

by Wang Nan

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



?????ҵ? iPhone

> ?? 2015??7??13?գ?????9:52??Namhyung Kim <[email protected]> д????
>
> Hi,
>
>> On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
>> hi, Alexei
>>
>>> On 2015/7/11 6:10, Alexei Starovoitov wrote:
>>>> On 7/10/15 3:03 AM, He Kuang wrote:
>>>> There're scenarios that we need an eBPF program to record not only
>>>> kprobe point args, but also the PMU counters, time latencies or the
>>>> number of cache misses between two probe points and other information
>>>> when the probe point is entered.
>>>>
>>>> This patch adds a new trace event to establish infrastruction for bpf to
>>>> output data to perf. Userspace perf tools can detect and use this event
>>>> as using the existing tracepoint events.
>>>>
>>>> New bpf trace event entry in debugfs:
>>>>
>>>> /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>>>>
>>>> Userspace perf tools detect the new tracepoint event as:
>>>>
>>>> bpf:bpf_output_data [Tracepoint event]
>>>
>>> Nice! This approach looks cleanest so far.
>>>
>>>> +TRACE_EVENT(bpf_output_data,
>>>> +
>>>> + TP_PROTO(u64 *src, int len),
>>>> +
>>>> + TP_ARGS(src, len),
>>>> +
>>>> + TP_STRUCT__entry(
>>>> + __dynamic_array(u64, buf, len)
>>>> + ),
>>>> +
>>>> + TP_fast_assign(
>>>> + memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
>>>
>>> may be make it 'u8' array? The extra multiply and...
>>
>> OK
>>
>> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
>> 0x623eb6d) will be this:
>>
>> dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
>> f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
>>
>> And users are not restricted to u64 type elements. I'll change that.
>
> While this general event format works well, I think it might be hard
> to know which output came from which program when more than one bpf
> programs used.
>
> I was thinking about providing custom event formats for each bpf
> program (if needed). The event format definitions might be in a
> specific directory or a bpf object itself. Then perf can read those
> formats and print the output data according to the formats. Maybe we
> need to add some dynamic event id to match format and data.
>

I think we can do it in perf side. Let BPF programs themselves encode format information into the array and make perf read and decode them. In kernel side simply support raw data should be enough, so we can make kernel code as simple as possible.

Thanks.

> Thanks,
> Namhyung

2015-07-13 14:11:59

by Namhyung Kim

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On Mon, Jul 13, 2015 at 10:01:26PM +0800, pi3orama wrote:
>
>
> 发自我的 iPhone
>
> > 在 2015年7月13日,下午9:52,Namhyung Kim <[email protected]> 写道:
> >
> > Hi,
> >
> >> On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
> >> hi, Alexei
> >>
> >>> On 2015/7/11 6:10, Alexei Starovoitov wrote:
> >>>> On 7/10/15 3:03 AM, He Kuang wrote:
> >>>> There're scenarios that we need an eBPF program to record not only
> >>>> kprobe point args, but also the PMU counters, time latencies or the
> >>>> number of cache misses between two probe points and other information
> >>>> when the probe point is entered.
> >>>>
> >>>> This patch adds a new trace event to establish infrastruction for bpf to
> >>>> output data to perf. Userspace perf tools can detect and use this event
> >>>> as using the existing tracepoint events.
> >>>>
> >>>> New bpf trace event entry in debugfs:
> >>>>
> >>>> /sys/kernel/debug/tracing/events/bpf/bpf_output_data
> >>>>
> >>>> Userspace perf tools detect the new tracepoint event as:
> >>>>
> >>>> bpf:bpf_output_data [Tracepoint event]
> >>>
> >>> Nice! This approach looks cleanest so far.
> >>>
> >>>> +TRACE_EVENT(bpf_output_data,
> >>>> +
> >>>> + TP_PROTO(u64 *src, int len),
> >>>> +
> >>>> + TP_ARGS(src, len),
> >>>> +
> >>>> + TP_STRUCT__entry(
> >>>> + __dynamic_array(u64, buf, len)
> >>>> + ),
> >>>> +
> >>>> + TP_fast_assign(
> >>>> + memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
> >>>
> >>> may be make it 'u8' array? The extra multiply and...
> >>
> >> OK
> >>
> >> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
> >> 0x623eb6d) will be this:
> >>
> >> dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
> >> f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
> >>
> >> And users are not restricted to u64 type elements. I'll change that.
> >
> > While this general event format works well, I think it might be hard
> > to know which output came from which program when more than one bpf
> > programs used.
> >
> > I was thinking about providing custom event formats for each bpf
> > program (if needed). The event format definitions might be in a
> > specific directory or a bpf object itself. Then perf can read those
> > formats and print the output data according to the formats. Maybe we
> > need to add some dynamic event id to match format and data.
> >
>
> I think we can do it in perf side. Let BPF programs themselves
> encode format information into the array and make perf read and
> decode them. In kernel side simply support raw data should be
> enough, so we can make kernel code as simple as possible.

Yes, of course, I also meant that doing those work all in perf side. :)

Thanks,
Namhyung

2015-07-13 14:32:46

by Wang Nan

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



?????ҵ? iPhone

> ?? 2015??7??13?գ?????10:09??Namhyung Kim <[email protected]> д????
>
>> On Mon, Jul 13, 2015 at 10:01:26PM +0800, pi3orama wrote:
>>
>>
>> ?????ҵ? iPhone
>>
>>> ?? 2015??7??13?գ?????9:52??Namhyung Kim <[email protected]> д????
>>>
>>> Hi,
>>>
>>>> On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
>>>> hi, Alexei
>>>>
>>>>>> On 2015/7/11 6:10, Alexei Starovoitov wrote:
>>>>>> On 7/10/15 3:03 AM, He Kuang wrote:
>>>>>> There're scenarios that we need an eBPF program to record not only
>>>>>> kprobe point args, but also the PMU counters, time latencies or the
>>>>>> number of cache misses between two probe points and other information
>>>>>> when the probe point is entered.
>>>>>>
>>>>>> This patch adds a new trace event to establish infrastruction for bpf to
>>>>>> output data to perf. Userspace perf tools can detect and use this event
>>>>>> as using the existing tracepoint events.
>>>>>>
>>>>>> New bpf trace event entry in debugfs:
>>>>>>
>>>>>> /sys/kernel/debug/tracing/events/bpf/bpf_output_data
>>>>>>
>>>>>> Userspace perf tools detect the new tracepoint event as:
>>>>>>
>>>>>> bpf:bpf_output_data [Tracepoint event]
>>>>>
>>>>> Nice! This approach looks cleanest so far.
>>>>>
>>>>>> +TRACE_EVENT(bpf_output_data,
>>>>>> +
>>>>>> + TP_PROTO(u64 *src, int len),
>>>>>> +
>>>>>> + TP_ARGS(src, len),
>>>>>> +
>>>>>> + TP_STRUCT__entry(
>>>>>> + __dynamic_array(u64, buf, len)
>>>>>> + ),
>>>>>> +
>>>>>> + TP_fast_assign(
>>>>>> + memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
>>>>>
>>>>> may be make it 'u8' array? The extra multiply and...
>>>>
>>>> OK
>>>>
>>>> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
>>>> 0x623eb6d) will be this:
>>>>
>>>> dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
>>>> f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
>>>>
>>>> And users are not restricted to u64 type elements. I'll change that.
>>>
>>> While this general event format works well, I think it might be hard
>>> to know which output came from which program when more than one bpf
>>> programs used.
>>>
>>> I was thinking about providing custom event formats for each bpf
>>> program (if needed). The event format definitions might be in a
>>> specific directory or a bpf object itself. Then perf can read those
>>> formats and print the output data according to the formats. Maybe we
>>> need to add some dynamic event id to match format and data.
>>
>> I think we can do it in perf side. Let BPF programs themselves
>> encode format information into the array and make perf read and
>> decode them. In kernel side simply support raw data should be
>> enough, so we can make kernel code as simple as possible.
>
> Yes, of course, I also meant that doing those work all in perf side. :)
>

I have an idea on it:

struct x{
int a;
int b;
};
struct x __x;

SEC(...)
int func(void *ctx)
{
struct x myx;
...
myx.a = 1;
myx.b = 2;
OUTPUT(&myx, &__x);
...
}

In the above program, the '&' operator will emit a relocation, so libbpf will have a chance to know the exact type of the output data. It then can translate into a unique number. The OUTPUT macro should pass the number through the raw data. When decoding, by reading the first word in the raw data perf knows the format. According to it perf can then retrieve the structure format through dwarf information. We can use more macro to make the above code simpler.

We will start working on it after this patch get accepted.

Thank you.


> Thanks,
> Namhyung

2015-07-14 01:43:08

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/13/15 7:29 AM, pi3orama wrote:
>>>> I was thinking about providing custom event formats for each bpf
>>>> >>>program (if needed). The event format definitions might be in a
>>>> >>>specific directory or a bpf object itself. Then perf can read those
>>>> >>>formats and print the output data according to the formats. Maybe we
>>>> >>>need to add some dynamic event id to match format and data.
>>> >>
>>> >>I think we can do it in perf side. Let BPF programs themselves
>>> >>encode format information into the array and make perf read and
>>> >>decode them. In kernel side simply support raw data should be
>>> >>enough, so we can make kernel code as simple as possible.
>> >
>> >Yes, of course, I also meant that doing those work all in perf side. :)

sounds like a great idea. +1

> I have an idea on it:
>
> struct x{
> int a;
> int b;
> };
> struct x __x;
>
> SEC(...)
> int func(void *ctx)
> {
> struct x myx;
> ...
> myx.a = 1;
> myx.b = 2;
> OUTPUT(&myx, &__x);
> ...
> }
>
> In the above program, the '&' operator will emit a relocation,

relocation once emitted will be painful to remove from compiled code.
Why not to use dwarf info from the struct passed into
bpf_output_trace_data()? No extra macros would be needed from users.
I'm not sure llvm generates proper dwarf along with bpf code (I didn't
test that part. If there are any issues they should be fixable. If you
can prepapre a patch for llvm that would be even better :)

2015-07-14 11:55:01

by He Kuang

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Hi, Alexei

On 2015/7/14 9:43, Alexei Starovoitov wrote:
> On 7/13/15 7:29 AM, pi3orama wrote:
>>>>> I was thinking about providing custom event formats for each bpf
>>>>> >>>program (if needed). The event format definitions might be in a
>>>>> >>>specific directory or a bpf object itself. Then perf can read those
>>>>> >>>formats and print the output data according to the formats. Maybe we
>>>>> >>>need to add some dynamic event id to match format and data.
>>>> >>
>>>> >>I think we can do it in perf side. Let BPF programs themselves
>>>> >>encode format information into the array and make perf read and
>>>> >>decode them. In kernel side simply support raw data should be
>>>> >>enough, so we can make kernel code as simple as possible.
>>> >
>>> >Yes, of course, I also meant that doing those work all in perf side. :)
>
> sounds like a great idea. +1
>
>> I have an idea on it:
>>
>> struct x{
>> int a;
>> int b;
>> };
>> struct x __x;
>>
>> SEC(...)
>> int func(void *ctx)
>> {
>> struct x myx;
>> ...
>> myx.a = 1;
>> myx.b = 2;
>> OUTPUT(&myx, &__x);
>> ...
>> }
>>
>> In the above program, the '&' operator will emit a relocation,
>
> relocation once emitted will be painful to remove from compiled code.
> Why not to use dwarf info from the struct passed into
> bpf_output_trace_data()? No extra macros would be needed from users.

We are turning to obtain infomation from dwarf_info, but still we
should store the struct type in the bpf output raw data, otherwise
perf tools can not distinguish different bpf trace points if
there're multiple trace points in bpf program, because all the
bpf_output sample entries has the same id.

> I'm not sure llvm generates proper dwarf along with bpf code (I didn't
> test that part. If there are any issues they should be fixable. If you
> can prepapre a patch for llvm that would be even better :)
>

I found objdump can't get dwarf info from bpf object file:

$ objdump --dwarf=info bpf.o
bpf.o: file format elf64-little

$ readelf -a bpf.o |grep debug_info
<EMPTY>

'-g' and '-gdwarf=4' options are added to clang flags, and if we
compile object file for other platform such as x86_64, there's no problem.

$ objdump --dwarf=info x86_64_xx.o |wc -l
179

$ readelf -a x86_64_xx.o |grep debug_info
[10] .debug_info PROGBITS 0000000000000000 000002b9

I'm not very familar with llvm so can you give me some
suggestions?

Thank you.

2015-07-17 04:11:52

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/14/15 4:54 AM, He Kuang wrote:
>> I'm not sure llvm generates proper dwarf along with bpf code (I didn't
>> test that part. If there are any issues they should be fixable. If you
>> can prepapre a patch for llvm that would be even better :)
>>
>
> I found objdump can't get dwarf info from bpf object file:
>
> $ objdump --dwarf=info bpf.o
> bpf.o: file format elf64-little
>
> $ readelf -a bpf.o |grep debug_info
> <EMPTY>

yeah. looks like this part is not working.
Interesting that when I do: clang -O2 -target bpf a.c -g -S
there is some minimal debug info in the .s, but .o lacks
debuginfo completely. Digging further...

2015-07-17 04:15:23

by Wang Nan

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



On 2015/7/17 12:11, Alexei Starovoitov wrote:
> On 7/14/15 4:54 AM, He Kuang wrote:
>>> I'm not sure llvm generates proper dwarf along with bpf code (I didn't
>>> test that part. If there are any issues they should be fixable. If you
>>> can prepapre a patch for llvm that would be even better :)
>>>
>>
>> I found objdump can't get dwarf info from bpf object file:
>>
>> $ objdump --dwarf=info bpf.o
>> bpf.o: file format elf64-little
>>
>> $ readelf -a bpf.o |grep debug_info
>> <EMPTY>
>
> yeah. looks like this part is not working.
> Interesting that when I do: clang -O2 -target bpf a.c -g -S
> there is some minimal debug info in the .s, but .o lacks
> debuginfo completely. Digging further...

Glad to see you start look at it. We are not familiar with LLVM, but I was
told that LLVM has a clean structure and very easy to introduce new
features.
Could you please give us some hits on it so we can work together?

Thank you.

2015-07-17 04:27:57

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/16/15 9:14 PM, Wangnan (F) wrote:
>
>
> On 2015/7/17 12:11, Alexei Starovoitov wrote:
>> On 7/14/15 4:54 AM, He Kuang wrote:
>>>> I'm not sure llvm generates proper dwarf along with bpf code (I didn't
>>>> test that part. If there are any issues they should be fixable. If you
>>>> can prepapre a patch for llvm that would be even better :)
>>>>
>>>
>>> I found objdump can't get dwarf info from bpf object file:
>>>
>>> $ objdump --dwarf=info bpf.o
>>> bpf.o: file format elf64-little
>>>
>>> $ readelf -a bpf.o |grep debug_info
>>> <EMPTY>
>>
>> yeah. looks like this part is not working.
>> Interesting that when I do: clang -O2 -target bpf a.c -g -S
>> there is some minimal debug info in the .s, but .o lacks
>> debuginfo completely. Digging further...
>
> Glad to see you start look at it. We are not familiar with LLVM, but I was
> told that LLVM has a clean structure and very easy to introduce new
> features.
> Could you please give us some hits on it so we can work together?

sure. that would be awesome.
In general llmv is very well documented:
http://llvm.org/docs/

In this particular case start with:
clang -O2 -emit-llvm -g a.c -S -o a.ll
in a.ll you'll see llvm bitcode with corresponding debug tags.
note debug info in llvm ir in general is not compatible between
releases. So clang and llc need to match very closely.

Then use:
llc -march=bpf -print-after-all a.ll
you'll see something like:
BB#0: derived from LLVM BB %0
DBG_VALUE %R1, %noreg, !"a", <!16>; line no:6
DBG_VALUE %R2, %noreg, !"b", <!16>; line no:6
DBG_VALUE %R3, %noreg, !"c", <!16>; line no:6
%R1<def> = MOV_ri 0
%R2<def> = MOV_ri 3
JAL <ga:@bar>, %R0<imp-def,dead>, %R1<imp-def,dead>, %R2<imp-def,dead>,
%R3<imp-def,dead>, %R11<imp-use>, %R1<imp-use>, %R2<imp-use>, ...;
dbg:a.c:8:2
%R1<def> = MOV_ri 1; dbg:a.c:9:2

which means that debug info about line numbers and variable
names/types mapping to bpf registers was preserved all the way
till the last pass of the compiler.

It means that the problem is somewhere in 'machine code emitter'
in lib/Target/BPF/MCTargetDesc/*
Likely just some switch is saying to the rest of llvm infra that
this backend is not capable of emitting debug info.

btw, if you can implement 32-bit subregister support for the backend
it would be really awesome. Many programs will benefit and will
become faster.

2015-07-17 14:32:19

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size

On Fri, 10 Jul 2015 10:03:05 +0000
He Kuang <[email protected]> wrote:

> The output result of trace_foo_bar event in traceevent samples is
> wrong. This problem can be reproduced as following:
>
> (Build kernel with SAMPLE_TRACE_EVENTS=m)
>
> $ insmod trace-events-sample.ko
>
> $ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable
>
> $ cat /sys/kernel/debug/tracing/trace
>
> event-sample-980 [000] .... 43.649559: foo_bar: foo hello 21 0x15
> BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The array length is not right, should be {0x1}.
> (ffffffff,ffffffff)
>
> event-sample-980 [000] .... 44.653827: foo_bar: foo hello 22 0x16
> BIT2|BIT3|0x10
> {0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The array length is not right, should be {0x1,0x2}.
> Gandalf (ffffffff,ffffffff)
>
> The event defined in samples/trace_events/trace-events-sample.h uses
> this helper function to output dynamic list:
>
> __print_array(__get_dynamic_array(list),
> __get_dynamic_array_len(list),
> sizeof(int))
>
> Currently, __get_dynamic_array_len() returns the total size of the
> array instead of the number of items by referencing the high 16 bits
> of entry->data_loc_##item. The element size for calculating the number
> of items can not be fetched by referencing fields from __entry, so
> macro __get_dynamic_array_len can not return the expected value.
>
> This patch stores array item number instead of the total size in
> entry->__data_loc_##item, and makes __get_dynamic_array_len get the
> right value directly. Because the function __get_bitmask() is affected
> by this change, __bitmask_size is assigned to the array len by
> multiplied bitmask type size.
>
> After this patch:
>
> event-sample-993 [000] .... 692.348562: foo_bar: foo hello 201
> 0xc9 BIT1|BIT4|0xc0 {0x1} Snoopy (ffffffff,ffffffff)
> ^^^^^
> Array length fixed.
>
> event-sample-993 [000] .... 693.349276: foo_bar: foo hello 202
> 0xca BIT2|BIT4|0xc0 {0x1,0x2} Gandalf (ffffffff,ffffffff)
> ^^^^^^^^^
> Array length fixed.
>
> Signed-off-by: He Kuang <[email protected]>
> ---
> include/trace/trace_events.h | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
> index 43be3b0..5abe027 100644
> --- a/include/trace/trace_events.h
> +++ b/include/trace/trace_events.h
> @@ -257,7 +257,8 @@ TRACE_MAKE_SYSTEM_STR();
> ({ \
> void *__bitmask = __get_dynamic_array(field); \
> unsigned int __bitmask_size; \
> - __bitmask_size = __get_dynamic_array_len(field); \
> + __bitmask_size = (__get_dynamic_array_len(field) * \
> + sizeof(unsigned long)); \
> trace_print_bitmask_seq(p, __bitmask, __bitmask_size); \
> })
>
> @@ -453,7 +454,7 @@ trace_event_define_fields_##call(struct trace_event_call *event_call) \
> __item_length = (len) * sizeof(type); \
> __data_offsets->item = __data_size + \
> offsetof(typeof(*entry), __data); \
> - __data_offsets->item |= __item_length << 16; \
> + __data_offsets->item |= (len) << 16; \

This change affects all callers of dymanic_array, not just bitmasks.

> __data_size += __item_length;
>
> #undef __string

BTW, if I revert commit ac01ce1410fc2 "tracing: Make
ftrace_print_array_seq compute buf_len" it works again.

I'm going to look into this some more, and maybe the answer is to go
back and just pass in buffer length here. I can't see what was broken
before that change.

-- Steve

2015-07-17 17:25:47

by Sara Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size

On Fri, 17 Jul 2015 10:32:15 -0400
Steven Rostedt <[email protected]> wrote:

> > @@ -453,7 +454,7 @@ trace_event_define_fields_##call(struct trace_event_call *event_call) \
> > __item_length = (len) * sizeof(type); \
> > __data_offsets->item = __data_size + \
> > offsetof(typeof(*entry), __data); \
> > - __data_offsets->item |= __item_length << 16; \
> > + __data_offsets->item |= (len) << 16; \
>
> This change affects all callers of dymanic_array, not just bitmasks.

This also affects what goes into the trace record and changes what
tools expect. That size is to be the size of the entire allocated
space, not the size of an individual element or the number of them.

>
> > __data_size += __item_length;
> >
> > #undef __string
>
> BTW, if I revert commit ac01ce1410fc2 "tracing: Make
> ftrace_print_array_seq compute buf_len" it works again.
>
> I'm going to look into this some more, and maybe the answer is to go
> back and just pass in buffer length here. I can't see what was broken
> before that change.

I'm thinking the best thing to do is to revert that patch and make the
second argument of __print_array() the size and not the count.

-- Steve

2015-07-17 18:13:19

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size

On Fri, 17 Jul 2015 10:32:15 -0400
Steven Rostedt <[email protected]> wrote:


> This change affects all callers of dymanic_array, not just bitmasks.
>
> > __data_size += __item_length;
> >
> > #undef __string
>
> BTW, if I revert commit ac01ce1410fc2 "tracing: Make
> ftrace_print_array_seq compute buf_len" it works again.
>
> I'm going to look into this some more, and maybe the answer is to go
> back and just pass in buffer length here. I can't see what was broken
> before that change.

OK, the print_array() code is already being used by the thermal events
and can't be changed. But we can't make the proposed change because
that changes the user interface.

What we can change is the sample code!

-- Steve

>From 95de1e9721a2f9d05831a53d228e181a33001c55 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (Red Hat)" <[email protected]>
Date: Fri, 17 Jul 2015 14:03:26 -0400
Subject: [PATCH] tracing: Fix sample output of dynamic arrays

He Kuang noticed that the trace event samples for arrays was broken:

"The output result of trace_foo_bar event in traceevent samples is
wrong. This problem can be reproduced as following:

(Build kernel with SAMPLE_TRACE_EVENTS=m)

$ insmod trace-events-sample.ko

$ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable

$ cat /sys/kernel/debug/tracing/trace

event-sample-980 [000] .... 43.649559: foo_bar: foo hello 21 0x15
BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The array length is not right, should be {0x1}.
(ffffffff,ffffffff)

event-sample-980 [000] .... 44.653827: foo_bar: foo hello 22 0x16
BIT2|BIT3|0x10
{0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The array length is not right, should be {0x1,0x2}.
Gandalf (ffffffff,ffffffff)"

This was caused by an update to have __print_array()'s second parameter
be the count of items in the array and not the size of the array.

As there is already users of __print_array(), it can not change. But
the sample code can and we can also improve on the documentation about
__print_array() and __get_dynamic_array_len().

Link: http://lkml.kernel.org/r/[email protected]

Fixes: ac01ce1410fc2 ("tracing: Make ftrace_print_array_seq compute buf_len")
Reported-by: He Kuang <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
---
samples/trace_events/trace-events-sample.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/samples/trace_events/trace-events-sample.h b/samples/trace_events/trace-events-sample.h
index 8965d1bb8811..125d6402f64f 100644
--- a/samples/trace_events/trace-events-sample.h
+++ b/samples/trace_events/trace-events-sample.h
@@ -168,7 +168,10 @@
*
* For __dynamic_array(int, foo, bar) use __get_dynamic_array(foo)
* Use __get_dynamic_array_len(foo) to get the length of the array
- * saved.
+ * saved. Note, __get_dynamic_array_len() returns the total allocated
+ * length of the dynamic array; __print_array() expects the second
+ * parameter to be the number of elements. To get that, the array length
+ * needs to be divided by the element size.
*
* For __string(foo, bar) use __get_str(foo)
*
@@ -288,7 +291,7 @@ TRACE_EVENT(foo_bar,
* This prints out the array that is defined by __array in a nice format.
*/
__print_array(__get_dynamic_array(list),
- __get_dynamic_array_len(list),
+ __get_dynamic_array_len(list) / sizeof(int),
sizeof(int)),
__get_str(str), __get_bitmask(cpus))
);
--
1.8.3.1

2015-07-23 11:55:18

by He Kuang

[permalink] [raw]
Subject: Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Hi, Alexi

Thank you for your guidence, and by referencing your last mail
and other llvm backends, I found setting
BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h
and fix some unhandeled switch can make llc output debug_info,
but important information is missing in the result:

bpf:
<1><2a>: Abbrev Number: 2 (DW_TAG_subprogram)
<2b> DW_AT_low_pc : 0x0
<33> DW_AT_high_pc : 0x60
<37> Unknown AT value: 3fe7: 1
<37> DW_AT_frame_base : 1 byte block: 5a (DW_OP_reg10 (r10))
<39> DW_AT_name : (indirect string, offset: 0x0): clang
version 3.7.0 (http://llvm.org/git/clang.git..
<3d> DW_AT_decl_file : 1
<3e> DW_AT_decl_line : 3
<3f> DW_AT_prototyped : 1
<3f> DW_AT_type : <0x65>
<43> DW_AT_external : 1
<43> Unknown AT value: 3fe1: 1
<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
<44> DW_AT_name : (indirect string, offset: 0x0): clang
version 3.7.0 (http://llvm.org/git/clang.git..
<48> DW_AT_decl_file : 1
<49> DW_AT_decl_line : 3
<4a> DW_AT_type : <0x65>

Compares to x86 platform result:

<1><26>: Abbrev Number: 2 (DW_TAG_subprogram)
<27> DW_AT_low_pc : 0x0
<2b> DW_AT_high_pc : 0x16
<2f> Unknown AT value: 3fe7: 1
<2f> DW_AT_frame_base : 1 byte block: 54 (DW_OP_reg4 (esp))
<31> DW_AT_name : (indirect string, offset: 0xcf): testprog
<35> DW_AT_decl_file : 1
<36> DW_AT_decl_line : 3
<37> DW_AT_prototyped : 1
<37> DW_AT_type : <0x65>
<3b> DW_AT_external : 1
<3b> Unknown AT value: 3fe1: 1
<2><3b>: Abbrev Number: 3 (DW_TAG_formal_parameter)
<3c> DW_AT_location : 2 byte block: 91 4 (DW_OP_fbreg: 4)
<3f> DW_AT_name : (indirect string, offset: 0xdc): myvar_a
<43> DW_AT_decl_file : 1
<44> DW_AT_decl_line : 3
<45> DW_AT_type : <0x65>

The bpf result lacks of DW_AT_location, and DW_AT_name gives no
infomation.

Then I used 'llc print-after*' command to check each pass and
wanted to find by which step the debug infomation is dropped,
things looks similar until the passes between 'verify' and
'expand-isel-pseudos':

x86:
$ llc -march=x86 --print-before-all -print-after-all
-stop-after=expand-isel-pseudos test.ll

# *** IR Dump Before Expand ISel Pseudo-instructions ***:
# Machine code for function testprog: SSA
Frame Objects:
fi#-2: size=4, align=4, fixed, at location [SP+8]
fi#-1: size=4, align=16, fixed, at location [SP+4]

BB#0: derived from LLVM BB %entry
DBG_VALUE <fi#-1>, 0, !"myvar_a", <!15>; line no:3
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DBG_VALUE <fi#-2>, 0, !"myvar_b", <!15>; line no:3
%vreg0<def> = MOV32rm <fi#-2>, 1, %noreg, 0, %noreg; mem:LD4[FixedStack-2]
; GR32:%vreg0

bpf:
$ llc -march=bpf --print-before-all -print-after-all
-stop-after=expand-isel-pseudos test.ll

# *** IR Dump Before Expand ISel Pseudo-instructions ***:
# Machine code for function testprog: SSA
Function Live Ins: %R1 in %vreg0, %R2 in %vreg1

BB#0: derived from LLVM BB %entry
Live Ins: %R1 %R2
%vreg1<def> = COPY %R2; GPR:%vreg1
%vreg0<def> = COPY %R1; GPR:%vreg0
%vreg2<def> = LD_imm64 2147483648; GPR:%vreg2

I think maybe this missing 'DBG_VALUE' causes the problem, but
I'm stuck here and hope you can give more advice.

Thank you!

On 2015/7/17 12:27, Alexei Starovoitov wrote:
> clang -O2 -emit-llvm -g a.c -S -o a.ll

2015-07-23 19:36:27

by Alex Bennée

[permalink] [raw]
Subject: Re: [RFC PATCH v4 1/3] tracing/events: Fix wrong sample output by storing array length instead of size


Steven Rostedt <[email protected]> writes:

> On Fri, 17 Jul 2015 10:32:15 -0400
> Steven Rostedt <[email protected]> wrote:
>
>
>> This change affects all callers of dymanic_array, not just bitmasks.
>>
>> > __data_size += __item_length;
>> >
>> > #undef __string
>>
>> BTW, if I revert commit ac01ce1410fc2 "tracing: Make
>> ftrace_print_array_seq compute buf_len" it works again.
>>
>> I'm going to look into this some more, and maybe the answer is to go
>> back and just pass in buffer length here. I can't see what was broken
>> before that change.
>
> OK, the print_array() code is already being used by the thermal events
> and can't be changed. But we can't make the proposed change because
> that changes the user interface.
>
> What we can change is the sample code!
>
> -- Steve
>
> From 95de1e9721a2f9d05831a53d228e181a33001c55 Mon Sep 17 00:00:00 2001
> From: "Steven Rostedt (Red Hat)" <[email protected]>
> Date: Fri, 17 Jul 2015 14:03:26 -0400
> Subject: [PATCH] tracing: Fix sample output of dynamic arrays
>
> He Kuang noticed that the trace event samples for arrays was broken:
>
> "The output result of trace_foo_bar event in traceevent samples is
> wrong. This problem can be reproduced as following:
>
> (Build kernel with SAMPLE_TRACE_EVENTS=m)
>
> $ insmod trace-events-sample.ko
>
> $ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable
>
> $ cat /sys/kernel/debug/tracing/trace
>
> event-sample-980 [000] .... 43.649559: foo_bar: foo hello 21 0x15
> BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The array length is not right, should be {0x1}.
> (ffffffff,ffffffff)
>
> event-sample-980 [000] .... 44.653827: foo_bar: foo hello 22 0x16
> BIT2|BIT3|0x10
> {0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> The array length is not right, should be {0x1,0x2}.
> Gandalf (ffffffff,ffffffff)"
>
> This was caused by an update to have __print_array()'s second parameter
> be the count of items in the array and not the size of the array.
>
> As there is already users of __print_array(), it can not change. But
> the sample code can and we can also improve on the documentation about
> __print_array() and __get_dynamic_array_len().
>
> Link: http://lkml.kernel.org/r/[email protected]
>
> Fixes: ac01ce1410fc2 ("tracing: Make ftrace_print_array_seq compute buf_len")
> Reported-by: He Kuang <[email protected]>
> Signed-off-by: Steven Rostedt <[email protected]>
> ---
> samples/trace_events/trace-events-sample.h | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/samples/trace_events/trace-events-sample.h b/samples/trace_events/trace-events-sample.h
> index 8965d1bb8811..125d6402f64f 100644
> --- a/samples/trace_events/trace-events-sample.h
> +++ b/samples/trace_events/trace-events-sample.h
> @@ -168,7 +168,10 @@
> *
> * For __dynamic_array(int, foo, bar) use __get_dynamic_array(foo)
> * Use __get_dynamic_array_len(foo) to get the length of the array
> - * saved.
> + * saved. Note, __get_dynamic_array_len() returns the total allocated
> + * length of the dynamic array; __print_array() expects the second
> + * parameter to be the number of elements. To get that, the array length
> + * needs to be divided by the element size.
> *
> * For __string(foo, bar) use __get_str(foo)
> *
> @@ -288,7 +291,7 @@ TRACE_EVENT(foo_bar,
> * This prints out the array that is defined by __array in a nice format.
> */
> __print_array(__get_dynamic_array(list),
> - __get_dynamic_array_len(list),
> + __get_dynamic_array_len(list) / sizeof(int),
> sizeof(int)),
> __get_str(str), __get_bitmask(cpus))
> );

Reviewed-by: Alex Bennée <[email protected]>

--
Alex Bennée

2015-07-23 20:49:39

by Alexei Starovoitov

[permalink] [raw]
Subject: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/23/15 4:54 AM, He Kuang wrote:

trimmed cc-list, since it's not related to kernel.

> Thank you for your guidence, and by referencing your last mail
> and other llvm backends, I found setting
> BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h

thanks! yes. it was missing.

> and fix some unhandeled switch can make llc output debug_info,

what do you mean ?

> but important information is missing in the result:

hmm. I see slightly different picture.
With 'clang -O2 -target bpf -g -S a.c'
I see all the right info inside .s file.
with '-c a.c' for some reasons it produces bogus offset:
Abbrev Offset: 0xffff0000
Pointer Size: 8
/usr/local/bin/objdump: Warning: Debug info is corrupted, abbrev offset
(ffff0000) is larger than abbrev section size (4b)

and objdump fails to parse .o
I'm using llvm trunk 3.8. Do you see this as well?

2015-07-24 03:20:15

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/23/15 1:49 PM, Alexei Starovoitov wrote:
> On 7/23/15 4:54 AM, He Kuang wrote:
>
> trimmed cc-list, since it's not related to kernel.
>
>> Thank you for your guidence, and by referencing your last mail
>> and other llvm backends, I found setting
>> BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h
>
> thanks! yes. it was missing.
>
>> and fix some unhandeled switch can make llc output debug_info,
>
> what do you mean ?
>
>> but important information is missing in the result:
>
> hmm. I see slightly different picture.
> With 'clang -O2 -target bpf -g -S a.c'
> I see all the right info inside .s file.
> with '-c a.c' for some reasons it produces bogus offset:
> Abbrev Offset: 0xffff0000
> Pointer Size: 8
> /usr/local/bin/objdump: Warning: Debug info is corrupted, abbrev offset
> (ffff0000) is larger than abbrev section size (4b)
>
> and objdump fails to parse .o
> I'm using llvm trunk 3.8. Do you see this as well?

there were few issues related to relocations.
Fixed it up and pushed to llvm trunk r243087.
Please pull and give it a try.
AT_location should be correct, but AT_name still looks buggy.

2015-07-24 04:16:38

by He Kuang

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Hi, Alexei

On 2015/7/24 11:20, Alexei Starovoitov wrote:
> On 7/23/15 1:49 PM, Alexei Starovoitov wrote:
>> On 7/23/15 4:54 AM, He Kuang wrote:
>>
>> trimmed cc-list, since it's not related to kernel.
>>
>>> Thank you for your guidence, and by referencing your last mail
>>> and other llvm backends, I found setting
>>> BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h
>>
>> thanks! yes. it was missing.
>>
>>> and fix some unhandeled switch can make llc output debug_info,
>>
>> what do you mean ?
>>
>>> but important information is missing in the result:
>>
>> hmm. I see slightly different picture.
>> With 'clang -O2 -target bpf -g -S a.c'
>> I see all the right info inside .s file.
>> with '-c a.c' for some reasons it produces bogus offset:
>> Abbrev Offset: 0xffff0000
>> Pointer Size: 8
>> /usr/local/bin/objdump: Warning: Debug info is corrupted, abbrev offset
>> (ffff0000) is larger than abbrev section size (4b)
>>
>> and objdump fails to parse .o
>> I'm using llvm trunk 3.8. Do you see this as well?
>
> there were few issues related to relocations.
> Fixed it up and pushed to llvm trunk r243087.
> Please pull and give it a try.
> AT_location should be correct, but AT_name still looks buggy.

I've pulled the lastest version "[bpf] initial support for
debug_info" and tested it. This version can output debug_info but
still not generate correct AT_location, I tested as following:

$ cat > main.c <<EOF
int testprog(int myvar_a, int myvar_b)
{
int myvar_c;
myvar_c = myvar_a + myvar_b;
return myvar_c;
}
EOF

$ clang -g -O2 -target bpf -c main.c -o test.obj.bpf
$ clang -g -O2 -c main.c -o test.obj.x86

$ objdump --dwarf=info test.obj.x86 > test.obj.x86.dump
$ objdump --dwarf=info test.obj.bpf > test.obj.bpf.dump

Compare those two dump files:

test.obj.x86.dump:

<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
<44> DW_AT_location : 3 byte block: 55 93 4 (DW_OP_reg5 (rdi);
DW_OP_piece: 4)
<48> DW_AT_name : (indirect string, offset: 0xdc): myvar_a
<4c> DW_AT_decl_file : 1
<4d> DW_AT_decl_line : 1
<4e> DW_AT_type : <0x71>

test.obj.bpf.dump:

<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
<44> DW_AT_name : (indirect string, offset: 0x0): clang
version 3.8.0 (http://llvm.org/git/clang.git
3a7c733b80f156a547f3f1517e6fbce9c0a33026)
(http://llvm.org/git/llvm.git
90908cb34d73460d3 a83e2194a58d82c6d1f199)
<48> DW_AT_decl_file : 1
<49> DW_AT_decl_line : 1
<4a> DW_AT_type : <0x65>


No DW_AT_location info for formal parameters, but if we change
the function 'testprog' to 'main', DW_AT_location of formal
parameters appear but that of local variables are still missed,
don't know why..

$ cat > main.c <<EOF
#include <stdio.h>

int main(int argc, char *argv[])
{
int myvar_a, myvar_b;
int myvar_c;
myvar_c = myvar_a + myvar_b;
return myvar_c;
}

test.obj.bpf.dump:
<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
<44> DW_AT_location : 1 byte block: 51 (DW_OP_reg1 (r1))
<46> DW_AT_name : (indirect string, offset: 0x0): clang version 3.8.
..
<2><5d>: Abbrev Number: 4 (DW_TAG_variable)
<5e> DW_AT_name : (indirect string, offset: 0x0): clang version 3.8.

test.obj.x86.dump:

<2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
<44> DW_AT_location : 3 byte block: 55 93 4 (DW_OP_reg5 (rdi);
DW_OP_piece: 4)
<48> DW_AT_name : (indirect string, offset: 0xd8): argc
..
<2><5f>: Abbrev Number: 4 (DW_TAG_variable)
<60> DW_AT_name : (indirect string, offset: 0xe7): myvar_a
..

Thank you.

2015-07-25 10:05:14

by He Kuang

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Hi, Alexei

On 2015/7/24 12:16, He Kuang wrote:
> Hi, Alexei
>
> On 2015/7/24 11:20, Alexei Starovoitov wrote:
>> On 7/23/15 1:49 PM, Alexei Starovoitov wrote:
>>> On 7/23/15 4:54 AM, He Kuang wrote:
>>>
>>> trimmed cc-list, since it's not related to kernel.
>>>
>>>> Thank you for your guidence, and by referencing your last mail
>>>> and other llvm backends, I found setting
>>>> BPFMCAsmInfo::SupportsDebugInformation = true in BPFMCAsmInfo.h
>>>
>>> thanks! yes. it was missing.
>>>
>>>> and fix some unhandeled switch can make llc output debug_info,
>>>
>>> what do you mean ?
>>>
>>>> but important information is missing in the result:
>>>
>>> hmm. I see slightly different picture.
>>> With 'clang -O2 -target bpf -g -S a.c'
>>> I see all the right info inside .s file.
>>> with '-c a.c' for some reasons it produces bogus offset:
>>> Abbrev Offset: 0xffff0000
>>> Pointer Size: 8
>>> /usr/local/bin/objdump: Warning: Debug info is corrupted, abbrev offset
>>> (ffff0000) is larger than abbrev section size (4b)
>>>
>>> and objdump fails to parse .o
>>> I'm using llvm trunk 3.8. Do you see this as well?
>>
>> there were few issues related to relocations.
>> Fixed it up and pushed to llvm trunk r243087.
>> Please pull and give it a try.
>> AT_location should be correct, but AT_name still looks buggy.
>
> I've pulled the lastest version "[bpf] initial support for
> debug_info" and tested it. This version can output debug_info but
> still not generate correct AT_location, I tested as following:
>
> $ cat > main.c <<EOF
> int testprog(int myvar_a, int myvar_b)
> {
> int myvar_c;
> myvar_c = myvar_a + myvar_b;
> return myvar_c;
> }
> EOF
>
> $ clang -g -O2 -target bpf -c main.c -o test.obj.bpf
> $ clang -g -O2 -c main.c -o test.obj.x86
>
> $ objdump --dwarf=info test.obj.x86 > test.obj.x86.dump
> $ objdump --dwarf=info test.obj.bpf > test.obj.bpf.dump
>
> Compare those two dump files:
>
> test.obj.x86.dump:
>
> <2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
> <44> DW_AT_location : 3 byte block: 55 93 4 (DW_OP_reg5 (rdi);
> DW_OP_piece: 4)
> <48> DW_AT_name : (indirect string, offset: 0xdc): myvar_a
> <4c> DW_AT_decl_file : 1
> <4d> DW_AT_decl_line : 1
> <4e> DW_AT_type : <0x71>
>
> test.obj.bpf.dump:
>
> <2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
> <44> DW_AT_name : (indirect string, offset: 0x0): clang
> version 3.8.0 (http://llvm.org/git/clang.git
> 3a7c733b80f156a547f3f1517e6fbce9c0a33026)
> (http://llvm.org/git/llvm.git
> 90908cb34d73460d3 a83e2194a58d82c6d1f199)
> <48> DW_AT_decl_file : 1
> <49> DW_AT_decl_line : 1
> <4a> DW_AT_type : <0x65>
>
>
> No DW_AT_location info for formal parameters, but if we change
> the function 'testprog' to 'main', DW_AT_location of formal
> parameters appear but that of local variables are still missed,
> don't know why..
>
> $ cat > main.c <<EOF
> #include <stdio.h>
>
> int main(int argc, char *argv[])
> {
> int myvar_a, myvar_b;
> int myvar_c;
> myvar_c = myvar_a + myvar_b;
> return myvar_c;
> }
>
> test.obj.bpf.dump:
> <2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
> <44> DW_AT_location : 1 byte block: 51 (DW_OP_reg1 (r1))
> <46> DW_AT_name : (indirect string, offset: 0x0): clang version 3.8.
> ..
> <2><5d>: Abbrev Number: 4 (DW_TAG_variable)
> <5e> DW_AT_name : (indirect string, offset: 0x0): clang version 3.8.
>
> test.obj.x86.dump:
>
> <2><43>: Abbrev Number: 3 (DW_TAG_formal_parameter)
> <44> DW_AT_location : 3 byte block: 55 93 4 (DW_OP_reg5 (rdi);
> DW_OP_piece: 4)
> <48> DW_AT_name : (indirect string, offset: 0xd8): argc
> ..
> <2><5f>: Abbrev Number: 4 (DW_TAG_variable)
> <60> DW_AT_name : (indirect string, offset: 0xe7): myvar_a
> ..
>
> Thank you.
>
>

Share some infomation on debuging bpf debuginfo problem.

An error "error: failed to compute relocation: Unknown" can be
found if we use llvm-dwarfdump tools to dump the object file,
debuging on that error, it seems there's no support for type
'BPF' in llvm/include/llvm/Support/MachO.h, and llvm-dwarfdump
fails to find the corresponding VisitElf method. Then I have a
rough test which forces RelocVisitor to use 'visitELF_386_32',
and got the correct DW_AT_name in the output:

0x00000043: DW_TAG_formal_parameter [3]
DW_AT_name [DW_FORM_strp] ( .debug_str[0x000000dc] = "myvar_a")
DW_AT_decl_file [DW_FORM_data1] ("testllvm/main.c")
DW_AT_decl_line [DW_FORM_data1] (3)
DW_AT_type [DW_FORM_ref4] (cu + 0x0065 => {0x00000065})

I noticed that for 64-bit elf format, the reloc sections have
'Addend' in the entry, but there's no 'Addend' info in bpf elf
file(64bit). I think there must be something wrong in the process
of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
AT_name now, DW_AT_LOCATION still missed and need your help.

Thank you.


2015-07-28 02:18:48

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/25/15 3:04 AM, He Kuang wrote:
> I noticed that for 64-bit elf format, the reloc sections have
> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
> file(64bit). I think there must be something wrong in the process
> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
> AT_name now, DW_AT_LOCATION still missed and need your help.

looks like objdump/llvm-dwarfdump can only read known EM,
but that that shouldn't be the problem for your dwarf reader right?
It should be able to recognize id-s of ELF::R_X86_64_* relo used right?
As far as AT_location for testprog.c it seems there is no info for
local variables because they were optimized away.
With -O0 I see AT_location being emitted.
Also line number info seems to be good in both cases.
But in our case, we don't need this anyway, no? we need to see
the types of structs mainly or you have some other use cases?
I think line number info would be great to correlate the error reported
by verifier into specific line in C.

2015-07-29 09:38:37

by He Kuang

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Hi, Alexei

On 2015/7/28 10:18, Alexei Starovoitov wrote:
> On 7/25/15 3:04 AM, He Kuang wrote:
>> I noticed that for 64-bit elf format, the reloc sections have
>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>> file(64bit). I think there must be something wrong in the process
>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>> AT_name now, DW_AT_LOCATION still missed and need your help.
>
> looks like objdump/llvm-dwarfdump can only read known EM,
> but that that shouldn't be the problem for your dwarf reader right?
> It should be able to recognize id-s of ELF::R_X86_64_* relo used right?
> As far as AT_location for testprog.c it seems there is no info for
> local variables because they were optimized away.
> With -O0 I see AT_location being emitted.
> Also line number info seems to be good in both cases.
> But in our case, we don't need this anyway, no? we need to see
> the types of structs mainly or you have some other use cases?
> I think line number info would be great to correlate the error reported
> by verifier into specific line in C.
>

Yes, without AT_location, we can lookup the user output data type
by line number, but there're some issues when we look deep.

There're two steps of work that should be done in user space,
first we embed data type into bpf output record, then we use this
type, or index or some other identifier to lookup the type from
dwarf info, so we got a few plans.

* Plan A. Use line number to identify the user data type

Predefined macros:

#define DEFINE_BPF_OUTPUT_DATA(type, var) \
const int BPF_OUTPUT_LINE__##var = __LINE__; type var;

#define BPF_OUTPUT_TRACE_DATA(data, size) \
__bpf_output_trace_data(BPF_OUTPUT_LINE__##data, &data, size)

User defined BPF code:

struct user_define_struct {
...
};

int testprog(int myvar_a, long myvar_b)
{
DEFINE_BPF_OUTPUT_DATA(struct user_define_struct, myvar_c);

BPF_OUTPUT_TRACE_DATA(myvar_c, sizeof(myvar_c));

...

We use macros to embed linenum implicitly, which leads an extra
restriction that user should not define multiple variables in the
same line and not split the macro over multiple lines, like this:

22 DEFINE_BPF_OUTPUT_DATA(struct xxtype, a); DEFINE_BPF_OUTPUT_DATA(struct xxtype, b);

Or

22 DEFINE_BPF_OUTPUT_DATA(struct user_define_struct,
23 myvar_c);

DW_AT_decl_line = 22, while __LINE__ = 23

So we should add verifier in the llvm BPF backend to warn on the
above codes.

* Plan B. Lookup variable type from dwarf AT_location info

We can make use of the output data variable's address, for bpf is
a minus offset to frame base. Then lookup matched offset from
location info(e.g. "DW_OP_fbreg: -32") to identify the variable
type.

For getting the frame base address, we can use builtin functions
like __builtin_frame_base() and __builtin_dwarf_cfa() which
returns the call frame base address. Currently those builtin
functions are not implemented in BPF lower operation yet, so we
tested our bpf program by using a variable tag on frame base, as
following:

struct user_define_struct {
...
};

typedef struct {} frame_base_tag;

int testprog(void)
{
frame_base_tag BPF_FRAME_BASE;
struct user_define_struct myvar_a;

__bpf_trace_output_data((void *)&myvar_a - (void *)&BPF_FRAME_BASE,
&myvar_a, sizeof(myvar_a));
...

The first argument of __bpf_trace_output_data() will be caculated
and it's easy to traverse the variable DIEs in dwarf info and
check each DW_AT_location attribute to find the corresponding
variable type.

The things let us worry about is the opimization may reuse the
stack space which can cause different variables share the same
address, by some rough tests that kind of optimization does not
appear.

* Comparison

Plan A needs less effort and easy to implement, but requires more
check to ensure user not use multiple definition in the same line
and not use macro cross lines.

The advantages of plan B is that we do not need introduce macros
showed in above example and all the things are done implicitly,
but the AT_location info is the prerequisite of this plan, I'm
not sure whether we can guarantee this info in dwarf or not.

Another way we can think of is adding new builtin functions to
indicate the compilier to generate codes return the dwarf type
index directly:

__bpf_trace_output_data(__builtin_dwarf_type(myvar_a), &myvar_a, size);

What's your opinion on those plans, and do you have more
suggestion?

Thank you.



2015-07-29 17:13:28

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/29/15 2:38 AM, He Kuang wrote:
> Hi, Alexei
>
> On 2015/7/28 10:18, Alexei Starovoitov wrote:
>> On 7/25/15 3:04 AM, He Kuang wrote:
>>> I noticed that for 64-bit elf format, the reloc sections have
>>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>>> file(64bit). I think there must be something wrong in the process
>>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>>> AT_name now, DW_AT_LOCATION still missed and need your help.
>>
>> looks like objdump/llvm-dwarfdump can only read known EM,
>> but that that shouldn't be the problem for your dwarf reader right?
>> It should be able to recognize id-s of ELF::R_X86_64_* relo used right?
>> As far as AT_location for testprog.c it seems there is no info for
>> local variables because they were optimized away.
>> With -O0 I see AT_location being emitted.
>> Also line number info seems to be good in both cases.
>> But in our case, we don't need this anyway, no? we need to see
>> the types of structs mainly or you have some other use cases?
>> I think line number info would be great to correlate the error reported
>> by verifier into specific line in C.
>>
>
> Yes, without AT_location, we can lookup the user output data type
> by line number, but there're some issues when we look deep.
>
> There're two steps of work that should be done in user space,
> first we embed data type into bpf output record, then we use this
> type, or index or some other identifier to lookup the type from
> dwarf info, so we got a few plans.
>
> * Plan A. Use line number to identify the user data type
>
> Predefined macros:
>
> #define DEFINE_BPF_OUTPUT_DATA(type,
> var) \
> const int BPF_OUTPUT_LINE__##var = __LINE__; type var;
>
> #define BPF_OUTPUT_TRACE_DATA(data, size) \
> __bpf_output_trace_data(BPF_OUTPUT_LINE__##data, &data, size)
>
> User defined BPF code:
>
> struct user_define_struct {
> ...
> };
>
> int testprog(int myvar_a, long myvar_b)
> {
> DEFINE_BPF_OUTPUT_DATA(struct user_define_struct, myvar_c);
>
> BPF_OUTPUT_TRACE_DATA(myvar_c, sizeof(myvar_c));
>
> ...
>
> We use macros to embed linenum implicitly, which leads an extra
> restriction that user should not define multiple variables in the
> same line and not split the macro over multiple lines, like this:
>
> 22 DEFINE_BPF_OUTPUT_DATA(struct xxtype, a);
> DEFINE_BPF_OUTPUT_DATA(struct xxtype, b);
>
> Or
>
> 22 DEFINE_BPF_OUTPUT_DATA(struct user_define_struct,
> 23 myvar_c);
>
> DW_AT_decl_line = 22, while __LINE__ = 23
>
> So we should add verifier in the llvm BPF backend to warn on the
> above codes.
>
> * Plan B. Lookup variable type from dwarf AT_location info
>
> We can make use of the output data variable's address, for bpf is
> a minus offset to frame base. Then lookup matched offset from
> location info(e.g. "DW_OP_fbreg: -32") to identify the variable
> type.
>
> For getting the frame base address, we can use builtin functions
> like __builtin_frame_base() and __builtin_dwarf_cfa() which
> returns the call frame base address. Currently those builtin
> functions are not implemented in BPF lower operation yet, so we
> tested our bpf program by using a variable tag on frame base, as
> following:
>
> struct user_define_struct {
> ...
> };
>
> typedef struct {} frame_base_tag;
>
> int testprog(void)
> {
> frame_base_tag BPF_FRAME_BASE;
> struct user_define_struct myvar_a;
>
> __bpf_trace_output_data((void *)&myvar_a - (void *)&BPF_FRAME_BASE,
> &myvar_a, sizeof(myvar_a));
> ...
>
> The first argument of __bpf_trace_output_data() will be caculated
> and it's easy to traverse the variable DIEs in dwarf info and
> check each DW_AT_location attribute to find the corresponding
> variable type.
>
> The things let us worry about is the opimization may reuse the
> stack space which can cause different variables share the same
> address, by some rough tests that kind of optimization does not
> appear.
>
> * Comparison
>
> Plan A needs less effort and easy to implement, but requires more
> check to ensure user not use multiple definition in the same line
> and not use macro cross lines.
>
> The advantages of plan B is that we do not need introduce macros
> showed in above example and all the things are done implicitly,
> but the AT_location info is the prerequisite of this plan, I'm
> not sure whether we can guarantee this info in dwarf or not.
>
> Another way we can think of is adding new builtin functions to
> indicate the compilier to generate codes return the dwarf type
> index directly:
>
> __bpf_trace_output_data(__builtin_dwarf_type(myvar_a), &myvar_a, size);
>

probably both A and B won't really work when programs get bigger
and optimizations will start moving lines around.
the builtin_dwarf_type idea is actually quite interesting.
Potentially that builtin can stringify type name and later we can
search it in dwarf. Please take a look how to add such builtin.
There are few similar builtins that deal with exception handling
and need type info. May be they can be reused. Like:
int_eh_typeid_for and int_eh_dwarf_cfa

2015-07-29 20:00:20

by Wang Nan

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



?????ҵ? iPhone

> ?? 2015??7??30?գ?????1:13??Alexei Starovoitov <[email protected]> д????
>
>> On 7/29/15 2:38 AM, He Kuang wrote:
>> Hi, Alexei
>>
>>> On 2015/7/28 10:18, Alexei Starovoitov wrote:
>>>> On 7/25/15 3:04 AM, He Kuang wrote:
>>>> I noticed that for 64-bit elf format, the reloc sections have
>>>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>>>> file(64bit). I think there must be something wrong in the process
>>>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>>>> AT_name now, DW_AT_LOCATION still missed and need your help.
>>>
>>> looks like objdump/llvm-dwarfdump can only read known EM,
>>> but that that shouldn't be the problem for your dwarf reader right?
>>> It should be able to recognize id-s of ELF::R_X86_64_* relo used right?
>>> As far as AT_location for testprog.c it seems there is no info for
>>> local variables because they were optimized away.
>>> With -O0 I see AT_location being emitted.
>>> Also line number info seems to be good in both cases.
>>> But in our case, we don't need this anyway, no? we need to see
>>> the types of structs mainly or you have some other use cases?
>>> I think line number info would be great to correlate the error reported
>>> by verifier into specific line in C.
>>
>> Yes, without AT_location, we can lookup the user output data type
>> by line number, but there're some issues when we look deep.
>>
>> There're two steps of work that should be done in user space,
>> first we embed data type into bpf output record, then we use this
>> type, or index or some other identifier to lookup the type from
>> dwarf info, so we got a few plans.
>>
>> * Plan A. Use line number to identify the user data type
>>
>> Predefined macros:
>>
>> #define DEFINE_BPF_OUTPUT_DATA(type,
>> var) \
>> const int BPF_OUTPUT_LINE__##var = __LINE__; type var;
>>
>> #define BPF_OUTPUT_TRACE_DATA(data, size) \
>> __bpf_output_trace_data(BPF_OUTPUT_LINE__##data, &data, size)
>>
>> User defined BPF code:
>>
>> struct user_define_struct {
>> ...
>> };
>>
>> int testprog(int myvar_a, long myvar_b)
>> {
>> DEFINE_BPF_OUTPUT_DATA(struct user_define_struct, myvar_c);
>>
>> BPF_OUTPUT_TRACE_DATA(myvar_c, sizeof(myvar_c));
>>
>> ...
>>
>> We use macros to embed linenum implicitly, which leads an extra
>> restriction that user should not define multiple variables in the
>> same line and not split the macro over multiple lines, like this:
>>
>> 22 DEFINE_BPF_OUTPUT_DATA(struct xxtype, a);
>> DEFINE_BPF_OUTPUT_DATA(struct xxtype, b);
>>
>> Or
>>
>> 22 DEFINE_BPF_OUTPUT_DATA(struct user_define_struct,
>> 23 myvar_c);
>>
>> DW_AT_decl_line = 22, while __LINE__ = 23
>>
>> So we should add verifier in the llvm BPF backend to warn on the
>> above codes.
>>
>> * Plan B. Lookup variable type from dwarf AT_location info
>>
>> We can make use of the output data variable's address, for bpf is
>> a minus offset to frame base. Then lookup matched offset from
>> location info(e.g. "DW_OP_fbreg: -32") to identify the variable
>> type.
>>
>> For getting the frame base address, we can use builtin functions
>> like __builtin_frame_base() and __builtin_dwarf_cfa() which
>> returns the call frame base address. Currently those builtin
>> functions are not implemented in BPF lower operation yet, so we
>> tested our bpf program by using a variable tag on frame base, as
>> following:
>>
>> struct user_define_struct {
>> ...
>> };
>>
>> typedef struct {} frame_base_tag;
>>
>> int testprog(void)
>> {
>> frame_base_tag BPF_FRAME_BASE;
>> struct user_define_struct myvar_a;
>>
>> __bpf_trace_output_data((void *)&myvar_a - (void *)&BPF_FRAME_BASE,
>> &myvar_a, sizeof(myvar_a));
>> ...
>>
>> The first argument of __bpf_trace_output_data() will be caculated
>> and it's easy to traverse the variable DIEs in dwarf info and
>> check each DW_AT_location attribute to find the corresponding
>> variable type.
>>
>> The things let us worry about is the opimization may reuse the
>> stack space which can cause different variables share the same
>> address, by some rough tests that kind of optimization does not
>> appear.
>>
>> * Comparison
>>
>> Plan A needs less effort and easy to implement, but requires more
>> check to ensure user not use multiple definition in the same line
>> and not use macro cross lines.
>>
>> The advantages of plan B is that we do not need introduce macros
>> showed in above example and all the things are done implicitly,
>> but the AT_location info is the prerequisite of this plan, I'm
>> not sure whether we can guarantee this info in dwarf or not.
>>
>> Another way we can think of is adding new builtin functions to
>> indicate the compilier to generate codes return the dwarf type
>> index directly:
>>
>> __bpf_trace_output_data(__builtin_dwarf_type(myvar_a), &myvar_a, size);
>
> probably both A and B won't really work when programs get bigger
> and optimizations will start moving lines around.
> the builtin_dwarf_type idea is actually quite interesting.
> Potentially that builtin can stringify type name and later we can
> search it in dwarf. Please take a look how to add such builtin.
> There are few similar builtins that deal with exception handling
> and need type info. May be they can be reused. Like:
> int_eh_typeid_for and int_eh_dwarf_cfa
>

Hi Alexei,

I was wondering if you could give us a hint on adding BPF specific builtins?

Doesn't like other machines, currently there's no Builtins.def for BPF to hold
builtins for that specific target. If we start creating such builtins, we can bring
more there. One builtins I want to see should be __builtin_bpf_strcmp(char*, char*),
with that we can filter events base on name of a task. What I really need is filtering
events based on comm of the main thread, which should be useful on Android
since in Android name of working threads are always something like "Binder_?",
only the main threads names are meaningful. But let's work on strcmp first.

Thank you.

2015-07-29 22:20:23

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/29/15 1:00 PM, pi3orama wrote:
> I was wondering if you could give us a hint on adding BPF specific builtins?
>
> Doesn't like other machines, currently there's no Builtins.def for BPF to hold
> builtins for that specific target. If we start creating such builtins, we can bring
> more there. One builtins I want to see should be __builtin_bpf_strcmp(char*, char*),
> with that we can filter events base on name of a task. What I really need is filtering
> events based on comm of the main thread, which should be useful on Android
> since in Android name of working threads are always something like "Binder_?",
> only the main threads names are meaningful. But let's work on strcmp first.

Currently there are 4 of them in IntrinsicsBPF.td
but I don't see how strcmp can work with current no-loops restriction.
As an alternative to filter binder threads you can use
memcmp(ptr, "Binder_", 6) since it will get unrolled fully.

2015-07-31 10:19:24

by Wang Nan

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



On 2015/7/30 1:13, Alexei Starovoitov wrote:

[SNIP]
> probably both A and B won't really work when programs get bigger
> and optimizations will start moving lines around.
> the builtin_dwarf_type idea is actually quite interesting.
> Potentially that builtin can stringify type name and later we can
> search it in dwarf. Please take a look how to add such builtin.
> There are few similar builtins that deal with exception handling
> and need type info. May be they can be reused. Like:
> int_eh_typeid_for and int_eh_dwarf_cfa
>

Hi Alexei,

I have tested int_eh_dwarf_cfa and int_eh_typeid_for.

By implementing ISD::FRAMEADDR support in LLVM BPF backend users are
allowed to
fetch frame address R11. __builtin_frame_addr(0) and __builtin_dwarf_cfa()
will be enabled.

By emitting llvm.eh_typeid_for in clang we can utilize it go generate an
unique
ID from a pointer of user defined type. However, we can't use pointer of
local
variables.

I attach a sample program and the resuling asm output at the bottom of
this mail.

Looks like llvm.eh_typeid_for meets our goal further. However, currently
it is
still ugly:

1. llvm.eh_typeid_for can be used on global variables only. So for each
output
structure we have to define a global varable.

2. We still need to find a way to connect the fetchd typeid with DWARF info.
Inserting that ID into DWARF may workable?

However with the newest llvm + clang the DWARF info is still incorrect:

$ objdump --dwarf=info ./out.o
...
<1><3f>: Abbrev Number: 3 (DW_TAG_structure_type)
<40> DW_AT_name : (indirect string, offset: 0x0): clang
version 3.8.0 (http://llvm.org/git/clang.git
f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git
c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
<44> DW_AT_byte_size : 8
<45> DW_AT_decl_file : 1
<46> DW_AT_decl_line : 4
<2><47>: Abbrev Number: 4 (DW_TAG_member)
<48> DW_AT_name : (indirect string, offset: 0x0): clang
version 3.8.0 (http://llvm.org/git/clang.git
f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git
c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
<4c> DW_AT_type : <0x60>
<50> DW_AT_decl_file : 1
<51> DW_AT_decl_line : 5
<52> DW_AT_data_member_location: 0
<2><53>: Abbrev Number: 4 (DW_TAG_member)
<54> DW_AT_name : (indirect string, offset: 0x0): clang
version 3.8.0 (http://llvm.org/git/clang.git
f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git
c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
<58> DW_AT_type : <0x60>
<5c> DW_AT_decl_file : 1
<5d> DW_AT_decl_line : 6
<5e> DW_AT_data_member_location: 4
...

The DW_AT_name is missing.

I'll post 2 LLVM patches by replying this mail. Please have a look and
help me
send them to LLVM if you think my code is correct.

Following is the sample code and resuling ASM:

========================================================

static int (*test_func)(unsigned long) = (void *) 0x1234;

struct my_str {
int x;
int y;
};
struct my_str __str_my_str;

struct my_str2 {
int x;
int y;
int z;
};
struct my_str2 __str_my_str2;

int func(int *ctx)
{
struct my_str var_a;
struct my_str2 var_b;
test_func((void*)&var_a - __builtin_dwarf_cfa());
test_func(__builtin_bpf_typeid(&__str_my_str));
test_func(__builtin_bpf_typeid(&__str_my_str2));
return 0;
}

==================== the resuling asm code =============

.text
.globl func
.align 8
func: # @func
# BB#0: # %entry
mov r1, r10
addi r1, -8
sub r1, r11
call 4660
mov r1, 1
call 4660
mov r1, 2
call 4660
mov r0, 0
ret

.comm __str_my_str,8,4 # @__str_my_str
.comm __str_my_str2,12,4 # @__str_my_str2

2015-07-31 10:20:33

by Wang Nan

[permalink] [raw]
Subject: [LLVM PATCH] BPF: add FRAMEADDR support

After this patch now BPF backend support __builtin_frame_address()
and __builtin_dwarf_cfa().

Signed-off-by: Wang Nan <[email protected]>
---
lib/Target/BPF/BPFISelLowering.cpp | 20 ++++++++++++++++++++
lib/Target/BPF/BPFISelLowering.h | 1 +
2 files changed, 21 insertions(+)

diff --git a/lib/Target/BPF/BPFISelLowering.cpp b/lib/Target/BPF/BPFISelLowering.cpp
index 58498a1..f1934a2 100644
--- a/lib/Target/BPF/BPFISelLowering.cpp
+++ b/lib/Target/BPF/BPFISelLowering.cpp
@@ -169,6 +169,9 @@ BPFTargetLowering::BPFTargetLowering(const TargetMachine &TM,
MaxStoresPerMemset = MaxStoresPerMemsetOptSize = 128;
MaxStoresPerMemcpy = MaxStoresPerMemcpyOptSize = 128;
MaxStoresPerMemmove = MaxStoresPerMemmoveOptSize = 128;
+
+ // support __builtin_frame_address(0)
+ setOperationAction(ISD::FRAMEADDR, MVT::i32, Custom);
}

SDValue BPFTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
@@ -179,6 +182,8 @@ SDValue BPFTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
return LowerGlobalAddress(Op, DAG);
case ISD::SELECT_CC:
return LowerSELECT_CC(Op, DAG);
+ case ISD::FRAMEADDR:
+ return LowerFRAMEADDR(Op, DAG);
default:
llvm_unreachable("unimplemented operand");
}
@@ -509,6 +514,21 @@ SDValue BPFTargetLowering::LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const {
return DAG.getNode(BPFISD::SELECT_CC, DL, VTs, Ops);
}

+SDValue BPFTargetLowering::LowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const {
+ EVT VT = Op.getValueType();
+ unsigned FrameReg = BPF::R11;
+ unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
+
+ if (Depth != 0) {
+ SDLoc DL(Op);
+ MachineFunction &MF = DAG.getMachineFunction();
+ DiagnosticInfoUnsupported Err(DL, *MF.getFunction(),
+ "only frame 0 address can be fetched", SDValue());
+ DAG.getContext()->diagnose(Err);
+ }
+ return DAG.getRegister(FrameReg, VT);
+}
+
const char *BPFTargetLowering::getTargetNodeName(unsigned Opcode) const {
switch ((BPFISD::NodeType)Opcode) {
case BPFISD::FIRST_NUMBER:
diff --git a/lib/Target/BPF/BPFISelLowering.h b/lib/Target/BPF/BPFISelLowering.h
index ec71dca..e4bf73d 100644
--- a/lib/Target/BPF/BPFISelLowering.h
+++ b/lib/Target/BPF/BPFISelLowering.h
@@ -50,6 +50,7 @@ private:
SDValue LowerBR_CC(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;
+ SDValue LowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const;

// Lower the result values of a call, copying them out of physregs into vregs
SDValue LowerCallResult(SDValue Chain, SDValue InFlag,
--
1.8.3.4

2015-07-31 10:22:12

by Wang Nan

[permalink] [raw]
Subject: [LLVM CLANG PATCH] BPF: add __builtin_bpf_typeid()

This patch introduces a new builtin function __builtin_bpf_typeid()
for BPF. This function emit a 'llvm.eh_typeid_for', and will be
converted to typeid of a global variable.

Signed-off-by: Wang Nan <[email protected]>
---
include/clang/Basic/BuiltinsBPF.def | 16 ++++++++++++++++
include/clang/Basic/TargetBuiltins.h | 9 +++++++++
lib/Basic/Targets.cpp | 14 +++++++++++++-
lib/CodeGen/CGBuiltin.cpp | 16 ++++++++++++++++
lib/CodeGen/CodeGenFunction.h | 1 +
5 files changed, 55 insertions(+), 1 deletion(-)
create mode 100644 include/clang/Basic/BuiltinsBPF.def

diff --git a/include/clang/Basic/BuiltinsBPF.def b/include/clang/Basic/BuiltinsBPF.def
new file mode 100644
index 0000000..9e53ee0
--- /dev/null
+++ b/include/clang/Basic/BuiltinsBPF.def
@@ -0,0 +1,16 @@
+//===--- BuiltinsBPF.def - BPF Builtin function database ----*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the BPF-specific builtin function database. Users of
+// this file must define the BUILTIN macro to make use of this information.
+//
+//===----------------------------------------------------------------------===//
+
+BUILTIN(__builtin_bpf_typeid, "Wiv*", "nc")
+
diff --git a/include/clang/Basic/TargetBuiltins.h b/include/clang/Basic/TargetBuiltins.h
index b4740c5..550f921 100644
--- a/include/clang/Basic/TargetBuiltins.h
+++ b/include/clang/Basic/TargetBuiltins.h
@@ -93,6 +93,15 @@ namespace clang {
};
}

+ namespace BPF {
+ enum {
+ LastTIBuiltin = clang::Builtin::FirstTSBuiltin-1,
+#define BUILTIN(ID, TYPE, ATTRS) BI##ID,
+#include "clang/Basic/BuiltinsBPF.def"
+ LastTSBuiltin
+ };
+ }
+
/// \brief Flags to identify the types for overloaded Neon builtins.
///
/// These must be kept in sync with the flags in utils/TableGen/NeonEmitter.h.
diff --git a/lib/Basic/Targets.cpp b/lib/Basic/Targets.cpp
index f229c99..aa7fffb 100644
--- a/lib/Basic/Targets.cpp
+++ b/lib/Basic/Targets.cpp
@@ -6111,6 +6111,7 @@ validateAsmConstraint(const char *&Name,
};

class BPFTargetInfo : public TargetInfo {
+ static const Builtin::Info BuiltinInfo[];
public:
BPFTargetInfo(const llvm::Triple &Triple) : TargetInfo(Triple) {
LongWidth = LongAlign = PointerWidth = PointerAlign = 64;
@@ -6141,7 +6142,10 @@ public:
}

void getTargetBuiltins(const Builtin::Info *&Records,
- unsigned &NumRecords) const override {}
+ unsigned &NumRecords) const override {
+ Records = BuiltinInfo;
+ NumRecords = clang::BPF::LastTSBuiltin - Builtin::FirstTSBuiltin;
+ }
const char *getClobbers() const override {
return "";
}
@@ -6164,6 +6168,14 @@ public:
}
};

+const Builtin::Info BPFTargetInfo::BuiltinInfo[] = {
+#define BUILTIN(ID, TYPE, ATTRS) { #ID, TYPE, ATTRS, 0, ALL_LANGUAGES },
+#define LIBBUILTIN(ID, TYPE, ATTRS, HEADER) { #ID, TYPE, ATTRS, HEADER,\
+ ALL_LANGUAGES },
+#include "clang/Basic/BuiltinsBPF.def"
+};
+
+
class MipsTargetInfoBase : public TargetInfo {
virtual void setDescriptionString() = 0;

diff --git a/lib/CodeGen/CGBuiltin.cpp b/lib/CodeGen/CGBuiltin.cpp
index 1555d29..a46f638 100644
--- a/lib/CodeGen/CGBuiltin.cpp
+++ b/lib/CodeGen/CGBuiltin.cpp
@@ -1882,6 +1882,9 @@ Value *CodeGenFunction::EmitTargetBuiltinExpr(unsigned BuiltinID,
case llvm::Triple::nvptx:
case llvm::Triple::nvptx64:
return EmitNVPTXBuiltinExpr(BuiltinID, E);
+ case llvm::Triple::bpfel:
+ case llvm::Triple::bpfeb:
+ return EmitBPFBuiltinExpr(BuiltinID, E);
default:
return nullptr;
}
@@ -7056,3 +7059,16 @@ Value *CodeGenFunction::EmitNVPTXBuiltinExpr(unsigned BuiltinID,
return nullptr;
}
}
+
+Value *CodeGenFunction::EmitBPFBuiltinExpr(unsigned BuiltinID,
+ const CallExpr *E) {
+ switch (BuiltinID) {
+ default:
+ return nullptr;
+ case BPF::BI__builtin_bpf_typeid:
+ assert(E->getArg(0)->getType()->isPointerType());
+ llvm::Value *llvm_eh_typeid_for =
+ CGM.getIntrinsic(llvm::Intrinsic::eh_typeid_for);
+ return Builder.CreateCall(llvm_eh_typeid_for, EmitScalarExpr(E->getArg(0)));
+ }
+}
diff --git a/lib/CodeGen/CodeGenFunction.h b/lib/CodeGen/CodeGenFunction.h
index b7d6bbd..863420d 100644
--- a/lib/CodeGen/CodeGenFunction.h
+++ b/lib/CodeGen/CodeGenFunction.h
@@ -2622,6 +2622,7 @@ public:
llvm::Value *EmitAMDGPUBuiltinExpr(unsigned BuiltinID, const CallExpr *E);
llvm::Value *EmitSystemZBuiltinExpr(unsigned BuiltinID, const CallExpr *E);
llvm::Value *EmitNVPTXBuiltinExpr(unsigned BuiltinID, const CallExpr *E);
+ llvm::Value *EmitBPFBuiltinExpr(unsigned BuiltinID, const CallExpr *E);

llvm::Value *EmitObjCProtocolExpr(const ObjCProtocolExpr *E);
llvm::Value *EmitObjCStringLiteral(const ObjCStringLiteral *E);
--
1.8.3.4

2015-07-31 10:48:56

by Wang Nan

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Just for your information:

The buggy DW_AT_name seems to be a problem in assembler.

I built the C file using -S:

$ clang -target bpf -g -c -O2 test.c

Then in resuming test.s, replaced all BPF op by "nop";

Also adjust .file directive;

Then assemble the .s file using as:

$ as -c test.s -o test.o

Then check dwarf info using objdump:

$ objdump --dwarf=info test.o

DW_AT_name looks all correct.

Thank you.


?????ҵ? iPhone

> ?? 2015??7??31?գ?????6:18??Wangnan (F) <[email protected]> д????
>
>
>
> On 2015/7/30 1:13, Alexei Starovoitov wrote:
>
> [SNIP]
>> probably both A and B won't really work when programs get bigger
>> and optimizations will start moving lines around.
>> the builtin_dwarf_type idea is actually quite interesting.
>> Potentially that builtin can stringify type name and later we can
>> search it in dwarf. Please take a look how to add such builtin.
>> There are few similar builtins that deal with exception handling
>> and need type info. May be they can be reused. Like:
>> int_eh_typeid_for and int_eh_dwarf_cfa
>
> Hi Alexei,
>
> I have tested int_eh_dwarf_cfa and int_eh_typeid_for.
>
> By implementing ISD::FRAMEADDR support in LLVM BPF backend users are allowed to
> fetch frame address R11. __builtin_frame_addr(0) and __builtin_dwarf_cfa()
> will be enabled.
>
> By emitting llvm.eh_typeid_for in clang we can utilize it go generate an unique
> ID from a pointer of user defined type. However, we can't use pointer of local
> variables.
>
> I attach a sample program and the resuling asm output at the bottom of this mail.
>
> Looks like llvm.eh_typeid_for meets our goal further. However, currently it is
> still ugly:
>
> 1. llvm.eh_typeid_for can be used on global variables only. So for each output
> structure we have to define a global varable.
>
> 2. We still need to find a way to connect the fetchd typeid with DWARF info.
> Inserting that ID into DWARF may workable?
>
> However with the newest llvm + clang the DWARF info is still incorrect:
>
> $ objdump --dwarf=info ./out.o
> ...
> <1><3f>: Abbrev Number: 3 (DW_TAG_structure_type)
> <40> DW_AT_name : (indirect string, offset: 0x0): clang version 3.8.0 (http://llvm.org/git/clang.git f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
> <44> DW_AT_byte_size : 8
> <45> DW_AT_decl_file : 1
> <46> DW_AT_decl_line : 4
> <2><47>: Abbrev Number: 4 (DW_TAG_member)
> <48> DW_AT_name : (indirect string, offset: 0x0): clang version 3.8.0 (http://llvm.org/git/clang.git f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
> <4c> DW_AT_type : <0x60>
> <50> DW_AT_decl_file : 1
> <51> DW_AT_decl_line : 5
> <52> DW_AT_data_member_location: 0
> <2><53>: Abbrev Number: 4 (DW_TAG_member)
> <54> DW_AT_name : (indirect string, offset: 0x0): clang version 3.8.0 (http://llvm.org/git/clang.git f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
> <58> DW_AT_type : <0x60>
> <5c> DW_AT_decl_file : 1
> <5d> DW_AT_decl_line : 6
> <5e> DW_AT_data_member_location: 4
> ...
>
> The DW_AT_name is missing.
>
> I'll post 2 LLVM patches by replying this mail. Please have a look and help me
> send them to LLVM if you think my code is correct.
>
> Following is the sample code and resuling ASM:
>
> ========================================================
>
> static int (*test_func)(unsigned long) = (void *) 0x1234;
>
> struct my_str {
> int x;
> int y;
> };
> struct my_str __str_my_str;
>
> struct my_str2 {
> int x;
> int y;
> int z;
> };
> struct my_str2 __str_my_str2;
>
> int func(int *ctx)
> {
> struct my_str var_a;
> struct my_str2 var_b;
> test_func((void*)&var_a - __builtin_dwarf_cfa());
> test_func(__builtin_bpf_typeid(&__str_my_str));
> test_func(__builtin_bpf_typeid(&__str_my_str2));
> return 0;
> }
>
> ==================== the resuling asm code =============
>
> .text
> .globl func
> .align 8
> func: # @func
> # BB#0: # %entry
> mov r1, r10
> addi r1, -8
> sub r1, r11
> call 4660
> mov r1, 1
> call 4660
> mov r1, 2
> call 4660
> mov r0, 0
> ret
>
> .comm __str_my_str,8,4 # @__str_my_str
> .comm __str_my_str2,12,4 # @__str_my_str2
>
>

2015-08-03 19:44:37

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 7/31/15 3:18 AM, Wangnan (F) wrote:
>
> However with the newest llvm + clang the DWARF info is still incorrect:
>
> $ objdump --dwarf=info ./out.o
> ...
> <1><3f>: Abbrev Number: 3 (DW_TAG_structure_type)
> <40> DW_AT_name : (indirect string, offset: 0x0): clang
> version 3.8.0 (http://llvm.org/git/clang.git
> f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git
> c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
> <44> DW_AT_byte_size : 8
> <45> DW_AT_decl_file : 1
> <46> DW_AT_decl_line : 4
> <2><47>: Abbrev Number: 4 (DW_TAG_member)
> <48> DW_AT_name : (indirect string, offset: 0x0): clang
> version 3.8.0 (http://llvm.org/git/clang.git
> f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git
> c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
> <4c> DW_AT_type : <0x60>
> <50> DW_AT_decl_file : 1
> <51> DW_AT_decl_line : 5
> <52> DW_AT_data_member_location: 0
> <2><53>: Abbrev Number: 4 (DW_TAG_member)
> <54> DW_AT_name : (indirect string, offset: 0x0): clang
> version 3.8.0 (http://llvm.org/git/clang.git
> f0fcd3432cbed83500df70c18f275d8affb89e5e) (http://llvm.org/git/llvm.git
> c8ccd78d31d4949fa1c14e954ccb06253e18cf37)
> <58> DW_AT_type : <0x60>
> <5c> DW_AT_decl_file : 1
> <5d> DW_AT_decl_line : 6
> <5e> DW_AT_data_member_location: 4
> ...
>
> The DW_AT_name is missing.

didn't have time to look at it.
from your llvm patches looks like you've got quite experienced
with it already :)

> I'll post 2 LLVM patches by replying this mail. Please have a look and
> help me
> send them to LLVM if you think my code is correct.

patch 1:
I don't quite understand the purpose of builtin_dwarf_cfa
returning R11. It's a special register seen inside llvm codegen
only. It doesn't have kernel meaning.

patch 2:
do we really need to hack clang?
Can you just define a function that aliases to intrinsic,
like we do for ld_abs/ld_ind ?
void bpf_store_half(void *skb, u64 off, u64 val) asm("llvm.bpf.store.half");
then no extra patches necessary.

> struct my_str {
> int x;
> int y;
> };
> struct my_str __str_my_str;
>
> struct my_str2 {
> int x;
> int y;
> int z;
> };
> struct my_str2 __str_my_str2;
>
> test_func(__builtin_bpf_typeid(&__str_my_str));
> test_func(__builtin_bpf_typeid(&__str_my_str2));
> mov r1, 1
> call 4660
> mov r1, 2
> call 4660

this part looks good. I think it's usable.

> 1. llvm.eh_typeid_for can be used on global variables only. So for each
> output
> structure we have to define a global varable.

why? I think it should work with local pointers as well.

> 2. We still need to find a way to connect the fetchd typeid with DWARF
> info.
> Inserting that ID into DWARF may workable?

hmm, that id should be the same id we're seeing in dwarf, right?
I think it's used in exception handling which is reusing some of
the dwarf stuff for this, so the must be a way to connect that id
to actual type info. Though last time I looked at EH was
during g++ hacking days. No idea how llvm does it exactly, but
I'm assuming the logic for rtti should be similar.

btw, for any deep llvm stuff you may need to move the thread to llvmdev.
May be folks there will have other ideas.

2015-08-04 09:05:47

by Wang Nan

[permalink] [raw]
Subject: Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

For people who in llvmdev:

This mail is belong to a thread in linux kernel mailing list, the first
message
can be retrived from:

http://lkml.kernel.org/r/[email protected]

Our goal is to fild a way to make BPF program get an unique ID for each type
so it can pass the ID to other part of kernel, then we can retrive the
type and
decode the structure using DWARF information. Currently we have two problem
needs to solve:

1. Dwarf information generated by BPF backend lost all DW_AT_name field;

2. How to get typeid from local variable? I tried llvm.eh_typeid_for
but it support global variable only.

Following is my response to Alexei.

On 2015/8/4 3:44, Alexei Starovoitov wrote:
> On 7/31/15 3:18 AM, Wangnan (F) wrote:
>

[SNIP]

> didn't have time to look at it.
> from your llvm patches looks like you've got quite experienced
> with it already :)
>
>> I'll post 2 LLVM patches by replying this mail. Please have a look and
>> help me
>> send them to LLVM if you think my code is correct.
>
> patch 1:
> I don't quite understand the purpose of builtin_dwarf_cfa
> returning R11. It's a special register seen inside llvm codegen
> only. It doesn't have kernel meaning.
>

Kernel side verifier allows us to do arithmetic computation using two
local variable
address or local variable address and R11. Therefore, we can compute the
location
of a local variable using:

mark = &my_var_a - __builtin_frame_address(0);

If the stack allocation is fixed (if the location is never reused), the
above 'mark'
can be uniquely identify a local variable. That's why I'm interesting in
it. However
I'm not sure whether the prerequestion is hold.

> patch 2:
> do we really need to hack clang?
> Can you just define a function that aliases to intrinsic,
> like we do for ld_abs/ld_ind ?
> void bpf_store_half(void *skb, u64 off, u64 val)
> asm("llvm.bpf.store.half");
> then no extra patches necessary.
>
>> struct my_str {
>> int x;
>> int y;
>> };
>> struct my_str __str_my_str;
>>
>> struct my_str2 {
>> int x;
>> int y;
>> int z;
>> };
>> struct my_str2 __str_my_str2;
>>
>> test_func(__builtin_bpf_typeid(&__str_my_str));
>> test_func(__builtin_bpf_typeid(&__str_my_str2));
>> mov r1, 1
>> call 4660
>> mov r1, 2
>> call 4660
>
> this part looks good. I think it's usable.
>
> > 1. llvm.eh_typeid_for can be used on global variables only. So for each
> > output
> > structure we have to define a global varable.
>
> why? I think it should work with local pointers as well.
>

It is defined by LLVM, in lib/CodeGen/Analysis.cpp:

/// ExtractTypeInfo - Returns the type info, possibly bitcast, encoded in V.
GlobalValue *llvm::ExtractTypeInfo(Value *V) {
...
assert((GV || isa<ConstantPointerNull>(V)) &&
"TypeInfo must be a global variable or NULL"); <-- we can
use only constant pointers
return GV;
}

So from llvm::Intrinsic::eh_typeid_for we can get type of global
variables only.

We may need a new intrinsic for that.


> > 2. We still need to find a way to connect the fetchd typeid with DWARF
> > info.
> > Inserting that ID into DWARF may workable?
>
> hmm, that id should be the same id we're seeing in dwarf, right?

There's no 'typeid' field in dwarf originally. I'm still looking for a way
to inject this ID into dwarf infromation.

> I think it's used in exception handling which is reusing some of
> the dwarf stuff for this, so the must be a way to connect that id
> to actual type info. Though last time I looked at EH was
> during g++ hacking days. No idea how llvm does it exactly, but
> I'm assuming the logic for rtti should be similar.
>

I'm not sure whether RTTI use dwarf to deduce type information. I think not,
because dwarf infos can be stripped out.

> btw, for any deep llvm stuff you may need to move the thread to llvmdev.
> May be folks there will have other ideas.
>

2015-08-05 01:59:28

by Wang Nan

[permalink] [raw]
Subject: Re: Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



On 2015/8/4 17:01, Wangnan (F) wrote:
> For people who in llvmdev:
>
> This mail is belong to a thread in linux kernel mailing list, the
> first message
> can be retrived from:
>
> http://lkml.kernel.org/r/[email protected]
>
> Our goal is to fild a way to make BPF program get an unique ID for
> each type
> so it can pass the ID to other part of kernel, then we can retrive the
> type and
> decode the structure using DWARF information. Currently we have two
> problem
> needs to solve:
>
> 1. Dwarf information generated by BPF backend lost all DW_AT_name field;
>
> 2. How to get typeid from local variable? I tried llvm.eh_typeid_for
> but it support global variable only.
>
> Following is my response to Alexei.
>
> On 2015/8/4 3:44, Alexei Starovoitov wrote:
>> On 7/31/15 3:18 AM, Wangnan (F) wrote:
>>
>
> [SNIP]
>
>> didn't have time to look at it.
>> from your llvm patches looks like you've got quite experienced
>> with it already :)
>>
>>> I'll post 2 LLVM patches by replying this mail. Please have a look and
>>> help me
>>> send them to LLVM if you think my code is correct.
>>
>> patch 1:
>> I don't quite understand the purpose of builtin_dwarf_cfa
>> returning R11. It's a special register seen inside llvm codegen
>> only. It doesn't have kernel meaning.
>>
>
> Kernel side verifier allows us to do arithmetic computation using two
> local variable
> address or local variable address and R11. Therefore, we can compute
> the location
> of a local variable using:
>
> mark = &my_var_a - __builtin_frame_address(0);
>
> If the stack allocation is fixed (if the location is never reused),
> the above 'mark'
> can be uniquely identify a local variable. That's why I'm interesting
> in it. However
> I'm not sure whether the prerequestion is hold.
>
>> patch 2:
>> do we really need to hack clang?
>> Can you just define a function that aliases to intrinsic,
>> like we do for ld_abs/ld_ind ?
>> void bpf_store_half(void *skb, u64 off, u64 val)
>> asm("llvm.bpf.store.half");
>> then no extra patches necessary.
>>
>>> struct my_str {
>>> int x;
>>> int y;
>>> };
>>> struct my_str __str_my_str;
>>>
>>> struct my_str2 {
>>> int x;
>>> int y;
>>> int z;
>>> };
>>> struct my_str2 __str_my_str2;
>>>
>>> test_func(__builtin_bpf_typeid(&__str_my_str));
>>> test_func(__builtin_bpf_typeid(&__str_my_str2));
>>> mov r1, 1
>>> call 4660
>>> mov r1, 2
>>> call 4660
>>
>> this part looks good. I think it's usable.
>>
>> > 1. llvm.eh_typeid_for can be used on global variables only. So for
>> each
>> > output
>> > structure we have to define a global varable.
>>
>> why? I think it should work with local pointers as well.
>>
>
> It is defined by LLVM, in lib/CodeGen/Analysis.cpp:
>
> /// ExtractTypeInfo - Returns the type info, possibly bitcast, encoded
> in V.
> GlobalValue *llvm::ExtractTypeInfo(Value *V) {
> ...
> assert((GV || isa<ConstantPointerNull>(V)) &&
> "TypeInfo must be a global variable or NULL"); <-- we can
> use only constant pointers
> return GV;
> }
>
> So from llvm::Intrinsic::eh_typeid_for we can get type of global
> variables only.
>
> We may need a new intrinsic for that.
>
>
>> > 2. We still need to find a way to connect the fetchd typeid with DWARF
>> > info.
>> > Inserting that ID into DWARF may workable?
>>
>> hmm, that id should be the same id we're seeing in dwarf, right?
>
> There's no 'typeid' field in dwarf originally. I'm still looking for a
> way
> to inject this ID into dwarf infromation.
>
>> I think it's used in exception handling which is reusing some of
>> the dwarf stuff for this, so the must be a way to connect that id
>> to actual type info. Though last time I looked at EH was
>> during g++ hacking days. No idea how llvm does it exactly, but
>> I'm assuming the logic for rtti should be similar.
>>
>
> I'm not sure whether RTTI use dwarf to deduce type information. I
> think not,
> because dwarf infos can be stripped out.
>

Hi Alexei,

Just found that llvm::Intrinsic::eh_typeid_for is function specific. ID
of same type in
different functions may be different. Here is an example:

static int (*bpf_output_event)(unsigned long, void *buf, int size) =
(void *) 0x1234;

struct my_str {
int x;
int y;
};
struct my_str __str_my_str;

struct my_str2 {
int x;
int y;
int z;
};
struct my_str2 __str_my_str2;

int func(int *ctx)
{
struct my_str var_a;
struct my_str2 var_b;
bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a,
sizeof(var_a));
bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b,
sizeof(var_b));
return 0;
}

int func2(int *ctx)
{
struct my_str var_a;
struct my_str2 var_b;

/* change order here */
bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b,
sizeof(var_b));
bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a,
sizeof(var_a))
return 0;
}

This program uses __builtin_bpf_typeid(llvm::Intrinsic::eh_typeid_for)
in func and func2
for same two types but in different order. We expect same type get same ID.

Compiled with:

$ clang -target bpf -S -O2 -c ./test_bpf_typeid.c

The result is:

.text
.globl func
.align 8
func: # @func
# BB#0: # %entry
mov r2, r10
addi r2, -8
mov r1, 1
mov r3, 8
call 4660
mov r2, r10
addi r2, -24
mov r1, 2
mov r3, 12
call 4660
mov r0, 0
ret

.globl func2
.align 8
func2: # @func2
# BB#0: # %entry
mov r2, r10
addi r2, -24
mov r1, 1 <--- we want 2 here.
mov r3, 12
call 4660
mov r2, r10
addi r2, -8
mov r1, 2 <--- we want 1 here.
mov r3, 8
call 4660
mov r0, 0
ret

.comm __str_my_str,8,4 # @__str_my_str
.comm __str_my_str2,12,4 # @__str_my_str2


Conclusion: llvm::Intrinsic::eh_typeid_for is not on the right direction...

Thank you.

2015-08-05 02:05:40

by Wang Nan

[permalink] [raw]
Subject: Re: Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Send again since llvmdev is moved to [email protected]

On 2015/8/5 9:58, Wangnan (F) wrote:
>
>
> On 2015/8/4 17:01, Wangnan (F) wrote:
>> For people who in llvmdev:
>>
>> This mail is belong to a thread in linux kernel mailing list, the
>> first message
>> can be retrived from:
>>
>> http://lkml.kernel.org/r/[email protected]
>>
>> Our goal is to fild a way to make BPF program get an unique ID for
>> each type
>> so it can pass the ID to other part of kernel, then we can retrive
>> the type and
>> decode the structure using DWARF information. Currently we have two
>> problem
>> needs to solve:
>>
>> 1. Dwarf information generated by BPF backend lost all DW_AT_name field;
>>
>> 2. How to get typeid from local variable? I tried llvm.eh_typeid_for
>> but it support global variable only.
>>
>> Following is my response to Alexei.
>>
>> On 2015/8/4 3:44, Alexei Starovoitov wrote:
>>> On 7/31/15 3:18 AM, Wangnan (F) wrote:
>>>
>>
>> [SNIP]
>>
>>> didn't have time to look at it.
>>> from your llvm patches looks like you've got quite experienced
>>> with it already :)
>>>
>>>> I'll post 2 LLVM patches by replying this mail. Please have a look and
>>>> help me
>>>> send them to LLVM if you think my code is correct.
>>>
>>> patch 1:
>>> I don't quite understand the purpose of builtin_dwarf_cfa
>>> returning R11. It's a special register seen inside llvm codegen
>>> only. It doesn't have kernel meaning.
>>>
>>
>> Kernel side verifier allows us to do arithmetic computation using two
>> local variable
>> address or local variable address and R11. Therefore, we can compute
>> the location
>> of a local variable using:
>>
>> mark = &my_var_a - __builtin_frame_address(0);
>>
>> If the stack allocation is fixed (if the location is never reused),
>> the above 'mark'
>> can be uniquely identify a local variable. That's why I'm interesting
>> in it. However
>> I'm not sure whether the prerequestion is hold.
>>
>>> patch 2:
>>> do we really need to hack clang?
>>> Can you just define a function that aliases to intrinsic,
>>> like we do for ld_abs/ld_ind ?
>>> void bpf_store_half(void *skb, u64 off, u64 val)
>>> asm("llvm.bpf.store.half");
>>> then no extra patches necessary.
>>>
>>>> struct my_str {
>>>> int x;
>>>> int y;
>>>> };
>>>> struct my_str __str_my_str;
>>>>
>>>> struct my_str2 {
>>>> int x;
>>>> int y;
>>>> int z;
>>>> };
>>>> struct my_str2 __str_my_str2;
>>>>
>>>> test_func(__builtin_bpf_typeid(&__str_my_str));
>>>> test_func(__builtin_bpf_typeid(&__str_my_str2));
>>>> mov r1, 1
>>>> call 4660
>>>> mov r1, 2
>>>> call 4660
>>>
>>> this part looks good. I think it's usable.
>>>
>>> > 1. llvm.eh_typeid_for can be used on global variables only. So for
>>> each
>>> > output
>>> > structure we have to define a global varable.
>>>
>>> why? I think it should work with local pointers as well.
>>>
>>
>> It is defined by LLVM, in lib/CodeGen/Analysis.cpp:
>>
>> /// ExtractTypeInfo - Returns the type info, possibly bitcast,
>> encoded in V.
>> GlobalValue *llvm::ExtractTypeInfo(Value *V) {
>> ...
>> assert((GV || isa<ConstantPointerNull>(V)) &&
>> "TypeInfo must be a global variable or NULL"); <-- we can
>> use only constant pointers
>> return GV;
>> }
>>
>> So from llvm::Intrinsic::eh_typeid_for we can get type of global
>> variables only.
>>
>> We may need a new intrinsic for that.
>>
>>
>>> > 2. We still need to find a way to connect the fetchd typeid with
>>> DWARF
>>> > info.
>>> > Inserting that ID into DWARF may workable?
>>>
>>> hmm, that id should be the same id we're seeing in dwarf, right?
>>
>> There's no 'typeid' field in dwarf originally. I'm still looking for
>> a way
>> to inject this ID into dwarf infromation.
>>
>>> I think it's used in exception handling which is reusing some of
>>> the dwarf stuff for this, so the must be a way to connect that id
>>> to actual type info. Though last time I looked at EH was
>>> during g++ hacking days. No idea how llvm does it exactly, but
>>> I'm assuming the logic for rtti should be similar.
>>>
>>
>> I'm not sure whether RTTI use dwarf to deduce type information. I
>> think not,
>> because dwarf infos can be stripped out.
>>
>
> Hi Alexei,
>
> Just found that llvm::Intrinsic::eh_typeid_for is function specific.
> ID of same type in
> different functions may be different. Here is an example:
>
> static int (*bpf_output_event)(unsigned long, void *buf, int size) =
> (void *) 0x1234;
>
> struct my_str {
> int x;
> int y;
> };
> struct my_str __str_my_str;
>
> struct my_str2 {
> int x;
> int y;
> int z;
> };
> struct my_str2 __str_my_str2;
>
> int func(int *ctx)
> {
> struct my_str var_a;
> struct my_str2 var_b;
> bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a,
> sizeof(var_a));
> bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b,
> sizeof(var_b));
> return 0;
> }
>
> int func2(int *ctx)
> {
> struct my_str var_a;
> struct my_str2 var_b;
>
> /* change order here */
> bpf_output_event(__builtin_bpf_typeid(&__str_my_str2), &var_b,
> sizeof(var_b));
> bpf_output_event(__builtin_bpf_typeid(&__str_my_str), &var_a,
> sizeof(var_a))
> return 0;
> }
>
> This program uses __builtin_bpf_typeid(llvm::Intrinsic::eh_typeid_for)
> in func and func2
> for same two types but in different order. We expect same type get
> same ID.
>
> Compiled with:
>
> $ clang -target bpf -S -O2 -c ./test_bpf_typeid.c
>
> The result is:
>
> .text
> .globl func
> .align 8
> func: # @func
> # BB#0: # %entry
> mov r2, r10
> addi r2, -8
> mov r1, 1
> mov r3, 8
> call 4660
> mov r2, r10
> addi r2, -24
> mov r1, 2
> mov r3, 12
> call 4660
> mov r0, 0
> ret
>
> .globl func2
> .align 8
> func2: # @func2
> # BB#0: # %entry
> mov r2, r10
> addi r2, -24
> mov r1, 1 <--- we want 2 here.
> mov r3, 12
> call 4660
> mov r2, r10
> addi r2, -8
> mov r1, 2 <--- we want 1 here.
> mov r3, 8
> call 4660
> mov r0, 0
> ret
>
> .comm __str_my_str,8,4 # @__str_my_str
> .comm __str_my_str2,12,4 # @__str_my_str2
>
>
> Conclusion: llvm::Intrinsic::eh_typeid_for is not on the right
> direction...
>
> Thank you.

2015-08-05 06:51:19

by Wang Nan

[permalink] [raw]
Subject: Re: [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



On 2015/8/5 10:05, Wangnan (F) wrote:
> Send again since llvmdev is moved to [email protected]
>
> On 2015/8/5 9:58, Wangnan (F) wrote:
>>
>>>
>>> On 2015/8/4 3:44, Alexei Starovoitov wrote:
>>>> On 7/31/15 3:18 AM, Wangnan (F) wrote:
>>>>
>>>
>>> [SNIP]
>>>
>>>> didn't have time to look at it.
>>>> from your llvm patches looks like you've got quite experienced
>>>> with it already :)
>>>>
>>>>> I'll post 2 LLVM patches by replying this mail. Please have a look
>>>>> and
>>>>> help me
>>>>> send them to LLVM if you think my code is correct.
>>>>
>>>> patch 1:
>>>> I don't quite understand the purpose of builtin_dwarf_cfa
>>>> returning R11. It's a special register seen inside llvm codegen
>>>> only. It doesn't have kernel meaning.
>>>>
>>>
>>> Kernel side verifier allows us to do arithmetic computation using
>>> two local variable
>>> address or local variable address and R11. Therefore, we can compute
>>> the location
>>> of a local variable using:
>>>
>>> mark = &my_var_a - __builtin_frame_address(0);
>>>
>>> If the stack allocation is fixed (if the location is never reused),
>>> the above 'mark'
>>> can be uniquely identify a local variable. That's why I'm
>>> interesting in it. However
>>> I'm not sure whether the prerequestion is hold.
>>>
>>>> patch 2:
>>>> do we really need to hack clang?
>>>> Can you just define a function that aliases to intrinsic,
>>>> like we do for ld_abs/ld_ind ?
>>>> void bpf_store_half(void *skb, u64 off, u64 val)
>>>> asm("llvm.bpf.store.half");
>>>> then no extra patches necessary.
>>>>

And for this:

I tried this test function:

void bpf_store_half(void *skb, int off, int val) asm("llvm.bpf.store.half");
int func()
{
bpf_store_half(0, 0, 0);
return 0;
}

Compiled with:

$ clang -g -target bpf -O2 -S -c test.c

And get this:

.text
.globl func
.align 8
func: # @func
# BB#0: # %entry
mov r1, 0
mov r2, 0
mov r3, 0
call llvm.bpf.store.half
mov r0, 0
ret

Without -S, it generate a function relocation:

$ objdump -r ./test.o

./test.o: file format elf64-little

RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000000000000018 UNKNOWN llvm.bpf.store.half


It doesn't work as you suggestion. I think we still need to do something
in clang frontend, or it can only be used in '.ll'.

Thank you.

2015-08-05 07:11:24

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On 8/4/15 11:51 PM, Wangnan (F) wrote:
> void bpf_store_half(void *skb, int off, int val)
> asm("llvm.bpf.store.half");
> int func()
> {
> bpf_store_half(0, 0, 0);
> return 0;
> }
>
> Compiled with:
>
> $ clang -g -target bpf -O2 -S -c test.c
>
> And get this:
>
> .text
> .globl func
> .align 8
> func: # @func
> # BB#0: # %entry
> mov r1, 0
> mov r2, 0
> mov r3, 0
> call llvm.bpf.store.half

it didn't work because number and types of args were incompatible.
Every samples/bpf/sockex[0-9]_kern.c is using llvm.bpf.load.* intrinsics.

the typeid changing ids with order is surprising.
I think the assertion in ExtractTypeInfo() is not hard.
Just there were no such use cases. May be we can do something
similar to what LowerIntrinsicCall() does and lower it differently
in the backend.

2015-08-05 08:35:10

by Wang Nan

[permalink] [raw]
Subject: Re: [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



On 2015/8/5 15:11, Alexei Starovoitov wrote:
> On 8/4/15 11:51 PM, Wangnan (F) wrote:
>> void bpf_store_half(void *skb, int off, int val)
>> asm("llvm.bpf.store.half");
>> int func()
>> {
>> bpf_store_half(0, 0, 0);
>> return 0;
>> }
>>
>> Compiled with:
>>
>> $ clang -g -target bpf -O2 -S -c test.c
>>
>> And get this:
>>
>> .text
>> .globl func
>> .align 8
>> func: # @func
>> # BB#0: # %entry
>> mov r1, 0
>> mov r2, 0
>> mov r3, 0
>> call llvm.bpf.store.half
>
> it didn't work because number and types of args were incompatible.
> Every samples/bpf/sockex[0-9]_kern.c is using llvm.bpf.load.* intrinsics.
>

Make it work. Thank you.

It doesn't work for me at first since in my llvm there's only
llvm.bpf.load.*.

I think llvm.bpf.store.* belone to some patches you haven't posted yet?

> the typeid changing ids with order is surprising.
> I think the assertion in ExtractTypeInfo() is not hard.
> Just there were no such use cases. May be we can do something
> similar to what LowerIntrinsicCall() does and lower it differently
> in the backend.
>
But in backend can we still get type information? I thought type is
meaningful in frontend only, and backend behaviors is unable to affect
DWARF generation, right?

Thank you.

2015-08-05 08:59:48

by He Kuang

[permalink] [raw]
Subject: Re: [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Hi, Alexei

On 2015/7/30 1:13, Alexei Starovoitov wrote:
> On 7/29/15 2:38 AM, He Kuang wrote:
>> Hi, Alexei
>>
>> On 2015/7/28 10:18, Alexei Starovoitov wrote:
>>> On 7/25/15 3:04 AM, He Kuang wrote:
>>>> I noticed that for 64-bit elf format, the reloc sections have
>>>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>>>> file(64bit). I think there must be something wrong in the process
>>>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>>>> AT_name now, DW_AT_LOCATION still missed and need your help.
>>>

Another thing about DW_AT_name, we've already found that the name
string is stored indirectly and needs relocation which is
architecture specific, while the e_machine info in bpf obj file
is "unknown", both objdump and libdw cannot parse DW_AT_name
correctly.

Should we just use a known architeture for bpf object file
instead of "unknown"? If so, we can use the existing relocation
codes in libdw and get DIE name by simply invoking
dwarf_diename(). The drawback of this method is that, e.g. we
use "x86-64" instead, is hard to distinguish bpf obj file with
x86-64 elf file. Do you think this is ok?

Otherwise, for not touching libdw, we should reimplement the
relocation codes already in libdw for bpf elf file with "unknown"
machine info specially in perf. I wonder whether it is worth doing
this and what's your opinion?

Thank you.

>> index directly:
>>
>> __bpf_trace_output_data(__builtin_dwarf_type(myvar_a), &myvar_a, size);
>>
>
> probably both A and B won't really work when programs get bigger
> and optimizations will start moving lines around.
> the builtin_dwarf_type idea is actually quite interesting.
> Potentially that builtin can stringify type name and later we can
> search it in dwarf. Please take a look how to add such builtin.
> There are few similar builtins that deal with exception handling
> and need type info. May be they can be reused. Like:
> int_eh_typeid_for and int_eh_dwarf_cfa
>
>

2015-08-06 03:22:27

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On Wed, Aug 05, 2015 at 04:28:13PM +0800, Wangnan (F) wrote:
>
> It doesn't work for me at first since in my llvm there's only
> llvm.bpf.load.*.
>
> I think llvm.bpf.store.* belone to some patches you haven't posted yet?

nope. only loads have special instructions ld_abs/ld_ind
which are represented by these intrinsics.
stores, so far, are done via single bpf_store_bytes() helper function.

> >the typeid changing ids with order is surprising.
> >I think the assertion in ExtractTypeInfo() is not hard.
> >Just there were no such use cases. May be we can do something
> >similar to what LowerIntrinsicCall() does and lower it differently
> >in the backend.
> >
> But in backend can we still get type information? I thought type is
> meaningful in frontend only, and backend behaviors is unable to affect
> DWARF generation, right?

why do we need to affect type generation? we just need to know dwarf
type id in the backend, so we can emit it as a constant.
I still think lowering eh_typeid_for differently may work.
Like instead of doing
GV = ExtractTypeInfo(I.getArgOperand(0)) followed by
getMachineFunction().getMMI().getTypeIDFor(GV)
we can get dwarf type id from I.getArgOperand(0) if it's
any pointer to struct type.
I'm not familiar with dwarf handling part of llvm, but feels possible.

2015-08-06 03:41:31

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On Wed, Aug 05, 2015 at 04:59:01PM +0800, He Kuang wrote:
> Hi, Alexei
>
> On 2015/7/30 1:13, Alexei Starovoitov wrote:
> >On 7/29/15 2:38 AM, He Kuang wrote:
> >>Hi, Alexei
> >>
> >>On 2015/7/28 10:18, Alexei Starovoitov wrote:
> >>>On 7/25/15 3:04 AM, He Kuang wrote:
> >>>>I noticed that for 64-bit elf format, the reloc sections have
> >>>>'Addend' in the entry, but there's no 'Addend' info in bpf elf
> >>>>file(64bit). I think there must be something wrong in the process
> >>>>of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
> >>>>AT_name now, DW_AT_LOCATION still missed and need your help.
> >>>
>
> Another thing about DW_AT_name, we've already found that the name
> string is stored indirectly and needs relocation which is
> architecture specific, while the e_machine info in bpf obj file
> is "unknown", both objdump and libdw cannot parse DW_AT_name
> correctly.
>
> Should we just use a known architeture for bpf object file
> instead of "unknown"? If so, we can use the existing relocation
> codes in libdw and get DIE name by simply invoking
> dwarf_diename(). The drawback of this method is that, e.g. we
> use "x86-64" instead, is hard to distinguish bpf obj file with
> x86-64 elf file. Do you think this is ok?

The only clean way would be to register bpf as an architecture
with elf standards committee. I have no idea who is doing that and
how much such new e_machine registration may cost.
So far using EM_NONE is a hack to avoid bureaucracy.
Are dwarf relocation processor specific?
Then simple hack to elfutils/libdw to treat EM_NONE as X64
should do the trick, right?
If that indeed works, we can tweak bpf backend to use EM_X86_64,
but then the danger that such .o file will be wrongly
recognized by elf utils. imo it's safer to keep it as EM_NONE
until real number is assigned, but even after it's assigned it
will take time to propagate that value. So for now I would try
to find a solution keeping EM_NONE hack.

2015-08-06 04:31:59

by Wang Nan

[permalink] [raw]
Subject: Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



On 2015/8/6 11:41, Alexei Starovoitov wrote:
> On Wed, Aug 05, 2015 at 04:59:01PM +0800, He Kuang wrote:
>> Hi, Alexei
>>
>> On 2015/7/30 1:13, Alexei Starovoitov wrote:
>>> On 7/29/15 2:38 AM, He Kuang wrote:
>>>> Hi, Alexei
>>>>
>>>> On 2015/7/28 10:18, Alexei Starovoitov wrote:
>>>>> On 7/25/15 3:04 AM, He Kuang wrote:
>>>>>> I noticed that for 64-bit elf format, the reloc sections have
>>>>>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>>>>>> file(64bit). I think there must be something wrong in the process
>>>>>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>>>>>> AT_name now, DW_AT_LOCATION still missed and need your help.
>> Another thing about DW_AT_name, we've already found that the name
>> string is stored indirectly and needs relocation which is
>> architecture specific, while the e_machine info in bpf obj file
>> is "unknown", both objdump and libdw cannot parse DW_AT_name
>> correctly.
>>
>> Should we just use a known architeture for bpf object file
>> instead of "unknown"? If so, we can use the existing relocation
>> codes in libdw and get DIE name by simply invoking
>> dwarf_diename(). The drawback of this method is that, e.g. we
>> use "x86-64" instead, is hard to distinguish bpf obj file with
>> x86-64 elf file. Do you think this is ok?
> The only clean way would be to register bpf as an architecture
> with elf standards committee. I have no idea who is doing that and
> how much such new e_machine registration may cost.
> So far using EM_NONE is a hack to avoid bureaucracy.
> Are dwarf relocation processor specific?
> Then simple hack to elfutils/libdw to treat EM_NONE as X64
> should do the trick, right?
> If that indeed works, we can tweak bpf backend to use EM_X86_64,
> but then the danger that such .o file will be wrongly
> recognized by elf utils. imo it's safer to keep it as EM_NONE
> until real number is assigned, but even after it's assigned it
> will take time to propagate that value. So for now I would try
> to find a solution keeping EM_NONE hack.
>

What about hacking ELF binary in memory?

1. load the object into memory;
2. twist the machine code to EM_X86_64;
3. load it using elf_begin;
4. return the twested elf memory image using libdwfl's find_elf callback.

Then libdw will recognise BPF's object file as a X86_64 object file. If
required,
relocation sections can also be twisted in this way. Should not very
hard since
we can only consider one relocation type.

Then let's start thinking how to introduce EM_BPF. We can rely on the
hacking
until EM_BPF symbol reaches elfutils in perf.

What do you think?

Thank you.

2015-08-06 04:36:01

by Wang Nan

[permalink] [raw]
Subject: Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



On 2015/8/6 11:22, Alexei Starovoitov wrote:
> On Wed, Aug 05, 2015 at 04:28:13PM +0800, Wangnan (F) wrote:
>> It doesn't work for me at first since in my llvm there's only
>> llvm.bpf.load.*.
>>
>> I think llvm.bpf.store.* belone to some patches you haven't posted yet?
> nope. only loads have special instructions ld_abs/ld_ind
> which are represented by these intrinsics.
> stores, so far, are done via single bpf_store_bytes() helper function.
>
>>> the typeid changing ids with order is surprising.
>>> I think the assertion in ExtractTypeInfo() is not hard.
>>> Just there were no such use cases. May be we can do something
>>> similar to what LowerIntrinsicCall() does and lower it differently
>>> in the backend.
>>>
>> But in backend can we still get type information? I thought type is
>> meaningful in frontend only, and backend behaviors is unable to affect
>> DWARF generation, right?
> why do we need to affect type generation? we just need to know dwarf
> type id in the backend, so we can emit it as a constant.
> I still think lowering eh_typeid_for differently may work.
> Like instead of doing
> GV = ExtractTypeInfo(I.getArgOperand(0)) followed by
> getMachineFunction().getMMI().getTypeIDFor(GV)
> we can get dwarf type id from I.getArgOperand(0) if it's
> any pointer to struct type.

I have a bad news to tell:

#include <stdio.h>
struct my_str {
int x;
int y;
} __gv_my_str;
struct my_str __gv_my_str_;

struct my_str2 {
int x;
int y;
} __gv_my_str2;

int typeid(void *p) asm("llvm.eh.typeid.for");

int main()
{
printf("%d\n", typeid(&__gv_my_str));
printf("%d\n", typeid(&__gv_my_str_));
printf("%d\n", typeid(&__gv_my_str2));
return 0;
}

Compiled with clang into x86 executable, then:

$ ./a.out
3
2
1

See? I have two types but reported 3 IDs.

And here is the implementation of getTypeIDFor, in
lib/CodeGen/MachineModuleInfo.cpp:

unsigned MachineModuleInfo::getTypeIDFor(const GlobalValue *TI) {
for (unsigned i = 0, N = TypeInfos.size(); i != N; ++i)
if (TypeInfos[i] == TI) return i + 1;

TypeInfos.push_back(TI);
return TypeInfos.size();
}

It only checks value in a stupid way.

Now the dwarf side becomes clear (see my other response), but the
frontend may require
totally reconsidering.

Do you know someone in LLVM-dev who can help us?

Thank you.

> I'm not familiar with dwarf handling part of llvm, but feels possible.
>

2015-08-06 06:50:08

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On Thu, Aug 06, 2015 at 12:31:26PM +0800, Wangnan (F) wrote:
>
>
> What about hacking ELF binary in memory?
>
> 1. load the object into memory;
> 2. twist the machine code to EM_X86_64;
> 3. load it using elf_begin;
> 4. return the twested elf memory image using libdwfl's find_elf callback.
>
> Then libdw will recognise BPF's object file as a X86_64 object file. If
> required,
> relocation sections can also be twisted in this way. Should not very hard
> since
> we can only consider one relocation type.
>
> Then let's start thinking how to introduce EM_BPF. We can rely on the
> hacking
> until EM_BPF symbol reaches elfutils in perf.
>
> What do you think?

sounds crazy, but may work. let's try it :)

2015-08-06 06:55:53

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [llvm-dev] [LLVMdev] Cc llvmdev: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On Thu, Aug 06, 2015 at 12:35:30PM +0800, Wangnan (F) wrote:
>
>
> On 2015/8/6 11:22, Alexei Starovoitov wrote:
> >On Wed, Aug 05, 2015 at 04:28:13PM +0800, Wangnan (F) wrote:
> >>It doesn't work for me at first since in my llvm there's only
> >>llvm.bpf.load.*.
> >>
> >>I think llvm.bpf.store.* belone to some patches you haven't posted yet?
> >nope. only loads have special instructions ld_abs/ld_ind
> >which are represented by these intrinsics.
> >stores, so far, are done via single bpf_store_bytes() helper function.
> >
> >>>the typeid changing ids with order is surprising.
> >>>I think the assertion in ExtractTypeInfo() is not hard.
> >>>Just there were no such use cases. May be we can do something
> >>>similar to what LowerIntrinsicCall() does and lower it differently
> >>>in the backend.
> >>>
> >>But in backend can we still get type information? I thought type is
> >>meaningful in frontend only, and backend behaviors is unable to affect
> >>DWARF generation, right?
> >why do we need to affect type generation? we just need to know dwarf
> >type id in the backend, so we can emit it as a constant.
> >I still think lowering eh_typeid_for differently may work.
> >Like instead of doing
> >GV = ExtractTypeInfo(I.getArgOperand(0)) followed by
> >getMachineFunction().getMMI().getTypeIDFor(GV)
> >we can get dwarf type id from I.getArgOperand(0) if it's
> >any pointer to struct type.
>
> I have a bad news to tell:
>
> #include <stdio.h>
> struct my_str {
> int x;
> int y;
> } __gv_my_str;
> struct my_str __gv_my_str_;
>
> struct my_str2 {
> int x;
> int y;
> } __gv_my_str2;
>
> int typeid(void *p) asm("llvm.eh.typeid.for");
>
> int main()
> {
> printf("%d\n", typeid(&__gv_my_str));
> printf("%d\n", typeid(&__gv_my_str_));
> printf("%d\n", typeid(&__gv_my_str2));
> return 0;
> }
>
> Compiled with clang into x86 executable, then:
>
> $ ./a.out
> 3
> 2
> 1
>
> See? I have two types but reported 3 IDs.

that's expected. We don't have to use default lowering
of typeid_for with getTypeIDFor. bpf backend specific
lowering can be different, though in this case it's odd
that id for __gv_my_str and __gv_my_str_ are different.
__gv_my_str and __gv_my_str2 should be different.

2015-08-12 02:35:17

by Wang Nan

[permalink] [raw]
Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



On 2015/8/4 3:44, Alexei Starovoitov wrote:

[SNIP]
>> I'll post 2 LLVM patches by replying this mail. Please have a look and
>> help me
>> send them to LLVM if you think my code is correct.
>
>
[SNIP]
> patch 2:
> do we really need to hack clang?
> Can you just define a function that aliases to intrinsic,
> like we do for ld_abs/ld_ind ?
> void bpf_store_half(void *skb, u64 off, u64 val)
> asm("llvm.bpf.store.half");
> then no extra patches necessary.

Hi Alexei,

By two weeks researching, I have to give you a sad answer that:

target specific intrinsic is not work.

I tried target specific intrinsic. However, LLVM isolates backend and
frontend, and there's no way to pass language level type information
to backend code.

Think about a program like this:

struct strA { int a; }
struct strB { int b; }
int func() {
struct strA a;
struct strB b;

a.a = 1;
b.b = 2;
bpf_output(gettype(a), &a);
bpf_output(gettype(b), &b);
return 0;
}

BPF backend can't (and needn't) tell the difference between local
variables a and b in theory. In LLVM implementation, it filters type
information out using ComputeValueVTs(). Please have a look at
SelectionDAGBuilder::visitIntrinsicCall in
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp and
SelectionDAGBuilder::visitTargetIntrinsic in the same file. in
visitTargetIntrinsic, ComputeValueVTs acts as a barrier which strips
type information out from CallInst ("I"), and leave SDValue and SDVTList
("Ops" and "VTs") to target code. SDValue and SDVTList are wrappers of
EVT and MVT, all information we concern won't be passed here.

I think now we have 2 choices:

1. Hacking into clang, implement target specific builtin function. Now I
have worked out a ugly but workable patch which setup a builtin
function:
__builtin_bpf_typeid(), which accepts local or global variable then
returns different constant for different types.

2. Implementing an LLVM intrinsic call (llvm.typeid), make it be
processed in
visitIntrinsicCall(). I think we can get something useful if it is
processed
with that function.

The next thing should be generating debug information to map type and
constants which issued by __builtin_bpf_typeid() or llvm.typeid. Now we
have a crazy idea that, if we limit the name of the structure to 8 bytes,
we can insert the name into a u64, then there would be no need to consider
type information in DWARF. For example, in the above sample code, gettype(a)
will issue 0x0000000041727473 because its type is "strA". What do you think?

Thank you.

2015-08-12 04:57:12

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

On Wed, Aug 12, 2015 at 10:34:43AM +0800, Wangnan (F) via llvm-dev wrote:
>
> Think about a program like this:
>
> struct strA { int a; }
> struct strB { int b; }
> int func() {
> struct strA a;
> struct strB b;
>
> a.a = 1;
> b.b = 2;
> bpf_output(gettype(a), &a);
> bpf_output(gettype(b), &b);
> return 0;
> }
>
> BPF backend can't (and needn't) tell the difference between local
> variables a and b in theory. In LLVM implementation, it filters type
> information out using ComputeValueVTs(). Please have a look at
> SelectionDAGBuilder::visitIntrinsicCall in
> lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp and
> SelectionDAGBuilder::visitTargetIntrinsic in the same file. in
> visitTargetIntrinsic, ComputeValueVTs acts as a barrier which strips
> type information out from CallInst ("I"), and leave SDValue and SDVTList
> ("Ops" and "VTs") to target code. SDValue and SDVTList are wrappers of
> EVT and MVT, all information we concern won't be passed here.
>
> I think now we have 2 choices:
>
> 1. Hacking into clang, implement target specific builtin function. Now I
> have worked out a ugly but workable patch which setup a builtin function:
> __builtin_bpf_typeid(), which accepts local or global variable then
> returns different constant for different types.
>
> 2. Implementing an LLVM intrinsic call (llvm.typeid), make it be processed
> in
> visitIntrinsicCall(). I think we can get something useful if it is
> processed
> with that function.

Yeah. You're right about pure target intrinsics.
I think llvm.typeid might work. imo it's cleaner than
doing it at clang level.

> The next thing should be generating debug information to map type and
> constants which issued by __builtin_bpf_typeid() or llvm.typeid. Now we
> have a crazy idea that, if we limit the name of the structure to 8 bytes,
> we can insert the name into a u64, then there would be no need to consider
> type information in DWARF. For example, in the above sample code, gettype(a)
> will issue 0x0000000041727473 because its type is "strA". What do you think?

that's way too hacky.
I was thinking when compiling we can keep llvm ir along with .o
instead of dwarf and extract type info from there.
dwarf has names and other things that we don't need. We only
care about actual field layout of the structs.
But it probably won't be easy to parse llvm ir on perf side
instead of dwarf.

btw, if you haven't looked at iovisor/bcc, there we're solving
similar problem differently. There we use clang rewriter, so all
structs fields are visible at this level, then we use bpf backend
in JIT mode and push bpf instructions into the kernel on the fly
completely skipping ELF and .o
For example in:
https://github.com/iovisor/bcc/blob/master/examples/distributed_bridge/tunnel.c
when you see
struct ethernet_t {
unsigned long long dst:48;
unsigned long long src:48;
unsigned int type:16;
} BPF_PACKET_HEADER;
struct ethernet_t *ethernet = cursor_advance(cursor, sizeof(*ethernet));
... ethernet->src ...
is recognized by clang rewriter and ->src is converted to a different
C code that is sent again into clang.
So there is no need to use dwarf or patch clang/llvm. clang rewriter
has all the info.
I'm not sure you can live with clang/llvm on the host where you
want to run the tracing bits, but if you can that's an easier option.

2015-08-12 05:40:21

by Wang Nan

[permalink] [raw]
Subject: Re: [llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event



On 2015/8/12 12:57, Alexei Starovoitov wrote:
> On Wed, Aug 12, 2015 at 10:34:43AM +0800, Wangnan (F) via llvm-dev wrote:
>> Think about a program like this:
>>
>> struct strA { int a; }
>> struct strB { int b; }
>> int func() {
>> struct strA a;
>> struct strB b;
>>
>> a.a = 1;
>> b.b = 2;
>> bpf_output(gettype(a), &a);
>> bpf_output(gettype(b), &b);
>> return 0;
>> }
>>
>> BPF backend can't (and needn't) tell the difference between local
>> variables a and b in theory. In LLVM implementation, it filters type
>> information out using ComputeValueVTs(). Please have a look at
>> SelectionDAGBuilder::visitIntrinsicCall in
>> lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp and
>> SelectionDAGBuilder::visitTargetIntrinsic in the same file. in
>> visitTargetIntrinsic, ComputeValueVTs acts as a barrier which strips
>> type information out from CallInst ("I"), and leave SDValue and SDVTList
>> ("Ops" and "VTs") to target code. SDValue and SDVTList are wrappers of
>> EVT and MVT, all information we concern won't be passed here.
>>
>> I think now we have 2 choices:
>>
>> 1. Hacking into clang, implement target specific builtin function. Now I
>> have worked out a ugly but workable patch which setup a builtin function:
>> __builtin_bpf_typeid(), which accepts local or global variable then
>> returns different constant for different types.
>>
>> 2. Implementing an LLVM intrinsic call (llvm.typeid), make it be processed
>> in
>> visitIntrinsicCall(). I think we can get something useful if it is
>> processed
>> with that function.
> Yeah. You're right about pure target intrinsics.
> I think llvm.typeid might work. imo it's cleaner than
> doing it at clang level.
>
>> The next thing should be generating debug information to map type and
>> constants which issued by __builtin_bpf_typeid() or llvm.typeid. Now we
>> have a crazy idea that, if we limit the name of the structure to 8 bytes,
>> we can insert the name into a u64, then there would be no need to consider
>> type information in DWARF. For example, in the above sample code, gettype(a)
>> will issue 0x0000000041727473 because its type is "strA". What do you think?
> that's way too hacky.
> I was thinking when compiling we can keep llvm ir along with .o
> instead of dwarf and extract type info from there.
> dwarf has names and other things that we don't need. We only
> care about actual field layout of the structs.
> But it probably won't be easy to parse llvm ir on perf side
> instead of dwarf.

Shipping both llvm IR and .o to perf makes it harder to use. I'm
not sure whether it is a good idea. If we are unable to encode the
structure using a u64, let's still dig into dwarf.

We have another idea that we can utilize dwarf's existing feature.
For example, when __buildin_bpf_typeid() get called, define an enumerate
type in dwarf info, so you'll find:

<1><2a>: Abbrev Number: 2 (DW_TAG_enumeration_type)
<2b> DW_AT_name : (indirect string, offset: 0xec): TYPEINFO
<2f> DW_AT_byte_size : 4
<30> DW_AT_decl_file : 1
<31> DW_AT_decl_line : 3
<2><32>: Abbrev Number: 3 (DW_TAG_enumerator)
<33> DW_AT_name : (indirect string, offset: 0xcc):
__typeinfo_strA
<37> DW_AT_const_value : 2
<2><38>: Abbrev Number: 3 (DW_TAG_enumerator)
<39> DW_AT_name : (indirect string, offset: 0xdc):
__typeinfo_strB
<3d> DW_AT_const_value : 3

or this:

<3><54>: Abbrev Number: 4 (DW_TAG_variable)
<55> DW_AT_const_value : 2
<66> DW_AT_name : (indirect string, offset: 0x1e):
__typeinfo_strA
<6a> DW_AT_decl_file : 1
<6b> DW_AT_decl_line : 29
<6c> DW_AT_type : <0x72>

then from DW_AT_name and DW_AT_const_value we can do the mapping.
Drawback is that
all __typeinfo_ prefixed names become reserved.


> btw, if you haven't looked at iovisor/bcc, there we're solving
> similar problem differently. There we use clang rewriter, so all
> structs fields are visible at this level, then we use bpf backend
> in JIT mode and push bpf instructions into the kernel on the fly
> completely skipping ELF and .o
> For example in:
> https://github.com/iovisor/bcc/blob/master/examples/distributed_bridge/tunnel.c
> when you see
> struct ethernet_t {
> unsigned long long dst:48;
> unsigned long long src:48;
> unsigned int type:16;
> } BPF_PACKET_HEADER;
> struct ethernet_t *ethernet = cursor_advance(cursor, sizeof(*ethernet));
> ... ethernet->src ...
> is recognized by clang rewriter and ->src is converted to a different
> C code that is sent again into clang.
> So there is no need to use dwarf or patch clang/llvm. clang rewriter
> has all the info.

Could you please give us further information about your clang rewriter?
I guess you need a new .so when injecting those code into kernel?

> I'm not sure you can live with clang/llvm on the host where you
> want to run the tracing bits, but if you can that's an easier option.
>

I'm not sure. Our target platform should be embedded devices like
smartphone.
Bringing full clang/llvm environment there is not acceptable.

Thank you.

2015-08-12 13:15:27

by Brenden Blanco

[permalink] [raw]
Subject: Re: [llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Hi Wangnan, I've been authoring the BCC development, so I'll answer
those specific questions.
>
>
> Could you please give us further information about your clang rewriter?
> I guess you need a new .so when injecting those code into kernel?

The rewriter runs all of its passes in a single process, creating no
files on disk and having no external dependencies in terms of
toolchain.
1. Entry point: bpf_module_create() - C API call to create module, can
take filename or directly a c string with the full contents of the
program
2. Convert contents into a clang memory buffer
3. Set up a clang driver::CompilerInvocation in the style of the clang
interpreter example
4. Run a rewriter pass over the memory buffer file, annotating and/or
doing BPF specific magic on the input source
a. Open BPF maps with a call to bpf_create_map directly
b. Convert references to map operations with the specific FD of the new map
c. Convert arguments to bpf_probe_read calls as needed
d. Collect the externed function names to avoid section() hack in the language
5. Re-run the CompilerInvocation on the modified sources
6. JIT the llvm::Module to bpf arch
7. Load the resulting in-memory ".o" to bpf_prog_load, keeping the FD
alive in the compiler process
8. Attach the FD as necessary to perf events, socket, tc, etc.
9. goto 1

The above steps are captured in the BCC github repo in src/cc, with
the clang specific bits inside of the frontends/clang subdirectory.

> I'm not sure. Our target platform should be embedded devices like
> smartphone.
> Bringing full clang/llvm environment there is not acceptable.

The artifact from the build process of BCC is a shared library, which
has the clang/llvm .a embedded within them. It is not yet a single
binary, but not unfeasible to make it so. The clang toolchain itself
does not need to exist on the target. I have not attempted to
cross-compile BCC to any architecture, currently x86_64 only.

If you have more BCC specific questions not involving clang/llvm,
perhaps you can ping Alexei/myself off of the llvm-dev list, in case
this discussion is not relevant to them.

2015-08-13 06:24:39

by Wang Nan

[permalink] [raw]
Subject: Re: [llvm-dev] llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

Thank you for your reply.

Add He Kuang to CC list.

On 2015/8/12 21:15, Brenden Blanco wrote:
> Hi Wangnan, I've been authoring the BCC development, so I'll answer
> those specific questions.
>>
>> Could you please give us further information about your clang rewriter?
>> I guess you need a new .so when injecting those code into kernel?
> The rewriter runs all of its passes in a single process, creating no
> files on disk and having no external dependencies in terms of
> toolchain.
> 1. Entry point: bpf_module_create() - C API call to create module, can
> take filename or directly a c string with the full contents of the
> program
> 2. Convert contents into a clang memory buffer
> 3. Set up a clang driver::CompilerInvocation in the style of the clang
> interpreter example
> 4. Run a rewriter pass over the memory buffer file, annotating and/or
> doing BPF specific magic on the input source
> a. Open BPF maps with a call to bpf_create_map directly
> b. Convert references to map operations with the specific FD of the new map
> c. Convert arguments to bpf_probe_read calls as needed
> d. Collect the externed function names to avoid section() hack in the language
> 5. Re-run the CompilerInvocation on the modified sources
> 6. JIT the llvm::Module to bpf arch
> 7. Load the resulting in-memory ".o" to bpf_prog_load, keeping the FD
> alive in the compiler process
> 8. Attach the FD as necessary to perf events, socket, tc, etc.
> 9. goto 1
>
> The above steps are captured in the BCC github repo in src/cc, with
> the clang specific bits inside of the frontends/clang subdirectory.
>
>> I'm not sure. Our target platform should be embedded devices like
>> smartphone.
>> Bringing full clang/llvm environment there is not acceptable.
> The artifact from the build process of BCC is a shared library, which
> has the clang/llvm .a embedded within them. It is not yet a single
> binary, but not unfeasible to make it so. The clang toolchain itself
> does not need to exist on the target. I have not attempted to
> cross-compile BCC to any architecture, currently x86_64 only.
>
> If you have more BCC specific questions not involving clang/llvm,
> perhaps you can ping Alexei/myself off of the llvm-dev list, in case
> this discussion is not relevant to them.