On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> struct <object-name> {
> /* used by libbpf's skeleton API */
> struct bpf_object_skeleton *skeleton;
> /* bpf_object for libbpf APIs */
> struct bpf_object *obj;
> struct {
> /* for every defined map in BPF object: */
> struct bpf_map *<map-name>;
> } maps;
> struct {
> /* for every program in BPF object: */
> struct bpf_program *<program-name>;
> } progs;
> struct {
> /* for every program in BPF object: */
> struct bpf_link *<program-name>;
> } links;
> /* for every present global data section: */
> struct <object-name>__<one of bss, data, or rodata> {
> /* memory layout of corresponding data section,
> * with every defined variable represented as a struct field
> * with exactly the same type, but without const/volatile
> * modifiers, e.g.:
> */
> int *my_var_1;
> ...
> } *<one of bss, data, or rodata>;
> };
I think I understand how this is useful, but perhaps the problem here
is that we're using C for everything, and simple programs for which
loading the ELF is majority of the code would be better of being
written in a dynamic language like python? Would it perhaps be a
better idea to work on some high-level language bindings than spend
time writing code gens and working around limitations of C?
> This provides great usability improvements:
> - no need to look up maps and programs by name, instead just
> my_obj->maps.my_map or my_obj->progs.my_prog would give necessary
> bpf_map/bpf_program pointers, which user can pass to existing libbpf APIs;
> - pre-defined places for bpf_links, which will be automatically populated for
> program types that libbpf knows how to attach automatically (currently
> tracepoints, kprobe/kretprobe, raw tracepoint and tracing programs). On
> tearing down skeleton, all active bpf_links will be destroyed (meaning BPF
> programs will be detached, if they are attached). For cases in which libbpf
> doesn't know how to auto-attach BPF program, user can manually create link
> after loading skeleton and they will be auto-detached on skeleton
> destruction:
>
> my_obj->links.my_fancy_prog = bpf_program__attach_cgroup_whatever(
> my_obj->progs.my_fancy_prog, <whatever extra param);
>
> - it's extremely easy and convenient to work with global data from userspace
> now. Both for read-only and read/write variables, it's possible to
> pre-initialize them before skeleton is loaded:
>
> skel = my_obj__open(raw_embed_data);
> my_obj->rodata->my_var = 123;
> my_obj__load(skel); /* 123 will be initialization value for my_var */
>
> After load, if kernel supports mmap() for BPF arrays, user can still read
> (and write for .bss and .data) variables values, but at that point it will
> be directly mmap()-ed to BPF array, backing global variables. This allows to
> seamlessly exchange data with BPF side. From userspace program's POV, all
> the pointers and memory contents stay the same, but mapped kernel memory
> changes to point to created map.
> If kernel doesn't yet support mmap() for BPF arrays, it's still possible to
> use those data section structs to pre-initialize .bss, .data, and .rodata,
> but after load their pointers will be reset to NULL, allowing user code to
> gracefully handle this condition, if necessary.
>
> Given a big surface area, skeleton is kept as an experimental non-public
> API for now, until more feedback and real-world experience is collected.
That makes no sense to me. bpftool has the same backward compat
requirements as libbpf. You're just pushing the requirements from
one component to the other. Feedback and real-world use cases have
to be exercised before code is merged to any project with backward
compatibility requirements :(
Also please run checkpatch on your patches, and fix reverse xmas tree.
This is bpftool, not libbpf. Creating a separate tool for this codegen
stuff is also an option IMHO.
On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski
<[email protected]> wrote:
>
> On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > struct <object-name> {
> > /* used by libbpf's skeleton API */
> > struct bpf_object_skeleton *skeleton;
> > /* bpf_object for libbpf APIs */
> > struct bpf_object *obj;
> > struct {
> > /* for every defined map in BPF object: */
> > struct bpf_map *<map-name>;
> > } maps;
> > struct {
> > /* for every program in BPF object: */
> > struct bpf_program *<program-name>;
> > } progs;
> > struct {
> > /* for every program in BPF object: */
> > struct bpf_link *<program-name>;
> > } links;
> > /* for every present global data section: */
> > struct <object-name>__<one of bss, data, or rodata> {
> > /* memory layout of corresponding data section,
> > * with every defined variable represented as a struct field
> > * with exactly the same type, but without const/volatile
> > * modifiers, e.g.:
> > */
> > int *my_var_1;
> > ...
> > } *<one of bss, data, or rodata>;
> > };
>
> I think I understand how this is useful, but perhaps the problem here
> is that we're using C for everything, and simple programs for which
> loading the ELF is majority of the code would be better of being
> written in a dynamic language like python? Would it perhaps be a
> better idea to work on some high-level language bindings than spend
> time writing code gens and working around limitations of C?
None of this work prevents Python bindings and other improvements, is
it? Patches, as always, are greatly appreciated ;)
This skeleton stuff is not just to save code, but in general to
simplify and streamline working with BPF program from userspace side.
Fortunately or not, but there are a lot of real-world applications
written in C and C++ that could benefit from this, so this is still
immensely useful. selftests/bpf themselves benefit a lot from this
work, see few of the last patches in this series.
>
> > This provides great usability improvements:
> > - no need to look up maps and programs by name, instead just
> > my_obj->maps.my_map or my_obj->progs.my_prog would give necessary
> > bpf_map/bpf_program pointers, which user can pass to existing libbpf APIs;
> > - pre-defined places for bpf_links, which will be automatically populated for
> > program types that libbpf knows how to attach automatically (currently
> > tracepoints, kprobe/kretprobe, raw tracepoint and tracing programs). On
> > tearing down skeleton, all active bpf_links will be destroyed (meaning BPF
> > programs will be detached, if they are attached). For cases in which libbpf
> > doesn't know how to auto-attach BPF program, user can manually create link
> > after loading skeleton and they will be auto-detached on skeleton
> > destruction:
> >
> > my_obj->links.my_fancy_prog = bpf_program__attach_cgroup_whatever(
> > my_obj->progs.my_fancy_prog, <whatever extra param);
> >
> > - it's extremely easy and convenient to work with global data from userspace
> > now. Both for read-only and read/write variables, it's possible to
> > pre-initialize them before skeleton is loaded:
> >
> > skel = my_obj__open(raw_embed_data);
> > my_obj->rodata->my_var = 123;
> > my_obj__load(skel); /* 123 will be initialization value for my_var */
> >
> > After load, if kernel supports mmap() for BPF arrays, user can still read
> > (and write for .bss and .data) variables values, but at that point it will
> > be directly mmap()-ed to BPF array, backing global variables. This allows to
> > seamlessly exchange data with BPF side. From userspace program's POV, all
> > the pointers and memory contents stay the same, but mapped kernel memory
> > changes to point to created map.
> > If kernel doesn't yet support mmap() for BPF arrays, it's still possible to
> > use those data section structs to pre-initialize .bss, .data, and .rodata,
> > but after load their pointers will be reset to NULL, allowing user code to
> > gracefully handle this condition, if necessary.
> >
> > Given a big surface area, skeleton is kept as an experimental non-public
> > API for now, until more feedback and real-world experience is collected.
>
> That makes no sense to me. bpftool has the same backward compat
> requirements as libbpf. You're just pushing the requirements from
> one component to the other. Feedback and real-world use cases have
> to be exercised before code is merged to any project with backward
> compatibility requirements :(
To get this feedback we need to have this functionality adopted. To
have it adopted, we need it available in tool users already know,
have, and use. If you feel that "experimental" disclaimer is not
enough, I guess we can add extra flag to bpftool itself to enable
experimental functionality, something like:
bpftool --experimental gen skeleton <bla>
>
> Also please run checkpatch on your patches, and fix reverse xmas tree.
> This is bpftool, not libbpf. Creating a separate tool for this codegen
> stuff is also an option IMHO.
Sure, will fix few small things checkpatch detected. Will reverse
christmas-ize all the variables, of course :)
As for separate tool just for this, you are not serious, right? If
bpftool is not right tool for this, I don't know which one is.
On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > struct <object-name> {
> > > /* used by libbpf's skeleton API */
> > > struct bpf_object_skeleton *skeleton;
> > > /* bpf_object for libbpf APIs */
> > > struct bpf_object *obj;
> > > struct {
> > > /* for every defined map in BPF object: */
> > > struct bpf_map *<map-name>;
> > > } maps;
> > > struct {
> > > /* for every program in BPF object: */
> > > struct bpf_program *<program-name>;
> > > } progs;
> > > struct {
> > > /* for every program in BPF object: */
> > > struct bpf_link *<program-name>;
> > > } links;
> > > /* for every present global data section: */
> > > struct <object-name>__<one of bss, data, or rodata> {
> > > /* memory layout of corresponding data section,
> > > * with every defined variable represented as a struct field
> > > * with exactly the same type, but without const/volatile
> > > * modifiers, e.g.:
> > > */
> > > int *my_var_1;
> > > ...
> > > } *<one of bss, data, or rodata>;
> > > };
> >
> > I think I understand how this is useful, but perhaps the problem here
> > is that we're using C for everything, and simple programs for which
> > loading the ELF is majority of the code would be better of being
> > written in a dynamic language like python? Would it perhaps be a
> > better idea to work on some high-level language bindings than spend
> > time writing code gens and working around limitations of C?
>
> None of this work prevents Python bindings and other improvements, is
> it? Patches, as always, are greatly appreciated ;)
This "do it yourself" shit is not really funny :/
I'll stop providing feedback on BPF patches if you guy keep saying
that :/ Maybe that's what you want.
> This skeleton stuff is not just to save code, but in general to
> simplify and streamline working with BPF program from userspace side.
> Fortunately or not, but there are a lot of real-world applications
> written in C and C++ that could benefit from this, so this is still
> immensely useful. selftests/bpf themselves benefit a lot from this
> work, see few of the last patches in this series.
Maybe those applications are written in C and C++ _because_ there
are no bindings for high level languages. I just wish BPF programming
was less weird and adding some funky codegen is not getting us closer
to that goal.
In my experience code gen is nothing more than a hack to work around
bad APIs, but experiences differ so that's not a solid argument.
> > > This provides great usability improvements:
> > > - no need to look up maps and programs by name, instead just
> > > my_obj->maps.my_map or my_obj->progs.my_prog would give necessary
> > > bpf_map/bpf_program pointers, which user can pass to existing libbpf APIs;
> > > - pre-defined places for bpf_links, which will be automatically populated for
> > > program types that libbpf knows how to attach automatically (currently
> > > tracepoints, kprobe/kretprobe, raw tracepoint and tracing programs). On
> > > tearing down skeleton, all active bpf_links will be destroyed (meaning BPF
> > > programs will be detached, if they are attached). For cases in which libbpf
> > > doesn't know how to auto-attach BPF program, user can manually create link
> > > after loading skeleton and they will be auto-detached on skeleton
> > > destruction:
> > >
> > > my_obj->links.my_fancy_prog = bpf_program__attach_cgroup_whatever(
> > > my_obj->progs.my_fancy_prog, <whatever extra param);
> > >
> > > - it's extremely easy and convenient to work with global data from userspace
> > > now. Both for read-only and read/write variables, it's possible to
> > > pre-initialize them before skeleton is loaded:
> > >
> > > skel = my_obj__open(raw_embed_data);
> > > my_obj->rodata->my_var = 123;
> > > my_obj__load(skel); /* 123 will be initialization value for my_var */
> > >
> > > After load, if kernel supports mmap() for BPF arrays, user can still read
> > > (and write for .bss and .data) variables values, but at that point it will
> > > be directly mmap()-ed to BPF array, backing global variables. This allows to
> > > seamlessly exchange data with BPF side. From userspace program's POV, all
> > > the pointers and memory contents stay the same, but mapped kernel memory
> > > changes to point to created map.
> > > If kernel doesn't yet support mmap() for BPF arrays, it's still possible to
> > > use those data section structs to pre-initialize .bss, .data, and .rodata,
> > > but after load their pointers will be reset to NULL, allowing user code to
> > > gracefully handle this condition, if necessary.
> > >
> > > Given a big surface area, skeleton is kept as an experimental non-public
> > > API for now, until more feedback and real-world experience is collected.
> >
> > That makes no sense to me. bpftool has the same backward compat
> > requirements as libbpf. You're just pushing the requirements from
> > one component to the other. Feedback and real-world use cases have
> > to be exercised before code is merged to any project with backward
> > compatibility requirements :(
>
> To get this feedback we need to have this functionality adopted. To
> have it adopted, we need it available in tool users already know,
> have, and use.
Well you claim you have users for it, just talk to them now. I don't
understand how this is not obvious. It's like saying "we can't test
this unless it's in the tree"..!?
> If you feel that "experimental" disclaimer is not enough, I guess we
> can add extra flag to bpftool itself to enable experimental
> functionality, something like:
>
> bpftool --experimental gen skeleton <bla>
Yeah, world doesn't really work like that. Users start depending on
a feature, it will break people's scripts/Makefiles if it disappears.
This codegen thing is made to be hard coded in Makefiles.. how do you
expect people not to immediately become dependent on it.
> > Also please run checkpatch on your patches, and fix reverse xmas tree.
> > This is bpftool, not libbpf. Creating a separate tool for this codegen
> > stuff is also an option IMHO.
>
> Sure, will fix few small things checkpatch detected.
Running checkpatch should be part of your upstreaming routine, you're
wasting people's time. So stop with the amused tone.
> Will reverse christmas-ize all the variables, of course :)
>
> As for separate tool just for this, you are not serious, right? If
> bpftool is not right tool for this, I don't know which one is.
I am serious. There absolutely nothing this tool needs from BPF, no
JSON needed, no bpffs etc. It can be a separate tool like
libbpf-skel-gen or libbpf-c-skel or something, distributed with libbpf.
That way you can actually soften the backward compat. In case people
become dependent on it they can carry that little tool on their own.
On Tue, Dec 10, 2019 at 10:05 AM Jakub Kicinski
<[email protected]> wrote:
>
> On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > struct <object-name> {
> > > > /* used by libbpf's skeleton API */
> > > > struct bpf_object_skeleton *skeleton;
> > > > /* bpf_object for libbpf APIs */
> > > > struct bpf_object *obj;
> > > > struct {
> > > > /* for every defined map in BPF object: */
> > > > struct bpf_map *<map-name>;
> > > > } maps;
> > > > struct {
> > > > /* for every program in BPF object: */
> > > > struct bpf_program *<program-name>;
> > > > } progs;
> > > > struct {
> > > > /* for every program in BPF object: */
> > > > struct bpf_link *<program-name>;
> > > > } links;
> > > > /* for every present global data section: */
> > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > /* memory layout of corresponding data section,
> > > > * with every defined variable represented as a struct field
> > > > * with exactly the same type, but without const/volatile
> > > > * modifiers, e.g.:
> > > > */
> > > > int *my_var_1;
> > > > ...
> > > > } *<one of bss, data, or rodata>;
> > > > };
> > >
> > > I think I understand how this is useful, but perhaps the problem here
> > > is that we're using C for everything, and simple programs for which
> > > loading the ELF is majority of the code would be better of being
> > > written in a dynamic language like python? Would it perhaps be a
> > > better idea to work on some high-level language bindings than spend
> > > time writing code gens and working around limitations of C?
> >
> > None of this work prevents Python bindings and other improvements, is
> > it? Patches, as always, are greatly appreciated ;)
>
> This "do it yourself" shit is not really funny :/
>
Everyone has different priorities and limited time/resources. So
deciding for someone else where he/she needs to spend time is what's
not funny. As long as this work doesn't prevent any Python
improvements you'd hope (apparently) someone else to do, I don't see a
problem with addressing real needs for C/C++ applications. What am I
missing?
> I'll stop providing feedback on BPF patches if you guy keep saying
> that :/ Maybe that's what you want.
We do value feedback, of course, as long as it's constructive. Thanks.
>
> > This skeleton stuff is not just to save code, but in general to
> > simplify and streamline working with BPF program from userspace side.
> > Fortunately or not, but there are a lot of real-world applications
> > written in C and C++ that could benefit from this, so this is still
> > immensely useful. selftests/bpf themselves benefit a lot from this
> > work, see few of the last patches in this series.
>
> Maybe those applications are written in C and C++ _because_ there
> are no bindings for high level languages. I just wish BPF programming
> was less weird and adding some funky codegen is not getting us closer
> to that goal.
>
> In my experience code gen is nothing more than a hack to work around
> bad APIs, but experiences differ so that's not a solid argument.
Agreed about the last point. Look at all sorts of RPC frameworks
(e.g., grpc, Thrift) and tell people relying on them how codegen is a
bad idea.
>
> > > > This provides great usability improvements:
> > > > - no need to look up maps and programs by name, instead just
> > > > my_obj->maps.my_map or my_obj->progs.my_prog would give necessary
> > > > bpf_map/bpf_program pointers, which user can pass to existing libbpf APIs;
> > > > - pre-defined places for bpf_links, which will be automatically populated for
> > > > program types that libbpf knows how to attach automatically (currently
> > > > tracepoints, kprobe/kretprobe, raw tracepoint and tracing programs). On
> > > > tearing down skeleton, all active bpf_links will be destroyed (meaning BPF
> > > > programs will be detached, if they are attached). For cases in which libbpf
> > > > doesn't know how to auto-attach BPF program, user can manually create link
> > > > after loading skeleton and they will be auto-detached on skeleton
> > > > destruction:
> > > >
> > > > my_obj->links.my_fancy_prog = bpf_program__attach_cgroup_whatever(
> > > > my_obj->progs.my_fancy_prog, <whatever extra param);
> > > >
> > > > - it's extremely easy and convenient to work with global data from userspace
> > > > now. Both for read-only and read/write variables, it's possible to
> > > > pre-initialize them before skeleton is loaded:
> > > >
> > > > skel = my_obj__open(raw_embed_data);
> > > > my_obj->rodata->my_var = 123;
> > > > my_obj__load(skel); /* 123 will be initialization value for my_var */
> > > >
> > > > After load, if kernel supports mmap() for BPF arrays, user can still read
> > > > (and write for .bss and .data) variables values, but at that point it will
> > > > be directly mmap()-ed to BPF array, backing global variables. This allows to
> > > > seamlessly exchange data with BPF side. From userspace program's POV, all
> > > > the pointers and memory contents stay the same, but mapped kernel memory
> > > > changes to point to created map.
> > > > If kernel doesn't yet support mmap() for BPF arrays, it's still possible to
> > > > use those data section structs to pre-initialize .bss, .data, and .rodata,
> > > > but after load their pointers will be reset to NULL, allowing user code to
> > > > gracefully handle this condition, if necessary.
> > > >
> > > > Given a big surface area, skeleton is kept as an experimental non-public
> > > > API for now, until more feedback and real-world experience is collected.
> > >
> > > That makes no sense to me. bpftool has the same backward compat
> > > requirements as libbpf. You're just pushing the requirements from
> > > one component to the other. Feedback and real-world use cases have
> > > to be exercised before code is merged to any project with backward
> > > compatibility requirements :(
> >
> > To get this feedback we need to have this functionality adopted. To
> > have it adopted, we need it available in tool users already know,
> > have, and use.
>
> Well you claim you have users for it, just talk to them now. I don't
> understand how this is not obvious. It's like saying "we can't test
> this unless it's in the tree"..!?
It is useful right now as is, I never claimed I'm not sure if this
stuff is going to be used/useful. What I'm saying is that as we get
some more applications converted to this and used actively over
prolonged period of time, we might identify a bunch of tweaks we might
want to do. Talking with users without having a code for them to use
is not going to provide more insights beyond what we already
collected.
>
> > If you feel that "experimental" disclaimer is not enough, I guess we
> > can add extra flag to bpftool itself to enable experimental
> > functionality, something like:
> >
> > bpftool --experimental gen skeleton <bla>
>
> Yeah, world doesn't really work like that. Users start depending on
> a feature, it will break people's scripts/Makefiles if it disappears.
> This codegen thing is made to be hard coded in Makefiles.. how do you
> expect people not to immediately become dependent on it.
It's about managing expectations, isn't it? That's why I'm putting an
"experimental" disclaimer (or even extra --experimental flag as
described above) on this. If users is afraid of having a minor
breakage due to the need to add extra argument in source code and
recompile (if necessary), they shouldn't opt in.
>
> > > Also please run checkpatch on your patches, and fix reverse xmas tree.
> > > This is bpftool, not libbpf. Creating a separate tool for this codegen
> > > stuff is also an option IMHO.
> >
> > Sure, will fix few small things checkpatch detected.
>
> Running checkpatch should be part of your upstreaming routine, you're
> wasting people's time. So stop with the amused tone.
>
> > Will reverse christmas-ize all the variables, of course :)
> >
> > As for separate tool just for this, you are not serious, right? If
> > bpftool is not right tool for this, I don't know which one is.
>
> I am serious. There absolutely nothing this tool needs from BPF, no
> JSON needed, no bpffs etc. It can be a separate tool like
> libbpf-skel-gen or libbpf-c-skel or something, distributed with libbpf.
> That way you can actually soften the backward compat. In case people
> become dependent on it they can carry that little tool on their own.
We are trying to make users lives easier by having major distributions
distribute bpftool and libbpf properly. Adding extra binaries to
distribute around doesn't seem to be easing any of users pains.
On 12/10, Jakub Kicinski wrote:
> On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > struct <object-name> {
> > > > /* used by libbpf's skeleton API */
> > > > struct bpf_object_skeleton *skeleton;
> > > > /* bpf_object for libbpf APIs */
> > > > struct bpf_object *obj;
> > > > struct {
> > > > /* for every defined map in BPF object: */
> > > > struct bpf_map *<map-name>;
> > > > } maps;
> > > > struct {
> > > > /* for every program in BPF object: */
> > > > struct bpf_program *<program-name>;
> > > > } progs;
> > > > struct {
> > > > /* for every program in BPF object: */
> > > > struct bpf_link *<program-name>;
> > > > } links;
> > > > /* for every present global data section: */
> > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > /* memory layout of corresponding data section,
> > > > * with every defined variable represented as a struct field
> > > > * with exactly the same type, but without const/volatile
> > > > * modifiers, e.g.:
> > > > */
> > > > int *my_var_1;
> > > > ...
> > > > } *<one of bss, data, or rodata>;
> > > > };
> > >
> > > I think I understand how this is useful, but perhaps the problem here
> > > is that we're using C for everything, and simple programs for which
> > > loading the ELF is majority of the code would be better of being
> > > written in a dynamic language like python? Would it perhaps be a
> > > better idea to work on some high-level language bindings than spend
> > > time writing code gens and working around limitations of C?
> >
> > None of this work prevents Python bindings and other improvements, is
> > it? Patches, as always, are greatly appreciated ;)
>
> This "do it yourself" shit is not really funny :/
>
> I'll stop providing feedback on BPF patches if you guy keep saying
> that :/ Maybe that's what you want.
>
> > This skeleton stuff is not just to save code, but in general to
> > simplify and streamline working with BPF program from userspace side.
> > Fortunately or not, but there are a lot of real-world applications
> > written in C and C++ that could benefit from this, so this is still
> > immensely useful. selftests/bpf themselves benefit a lot from this
> > work, see few of the last patches in this series.
>
> Maybe those applications are written in C and C++ _because_ there
> are no bindings for high level languages. I just wish BPF programming
> was less weird and adding some funky codegen is not getting us closer
> to that goal.
>
> In my experience code gen is nothing more than a hack to work around
> bad APIs, but experiences differ so that's not a solid argument.
*nod*
We have a nice set of C++ wrappers around libbpf internally, so we can do
something like BpfMap<key type, value type> and get a much better interface
with type checking. Maybe we should focus on higher level languages instead?
We are open to open-sourcing our C++ bits if you want to collaborate.
(I assume most of the stuff you have at fb is also non-c and one of
c++/python/php/rust/go/whatver).
On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
>
> On 12/10, Jakub Kicinski wrote:
> > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > struct <object-name> {
> > > > > /* used by libbpf's skeleton API */
> > > > > struct bpf_object_skeleton *skeleton;
> > > > > /* bpf_object for libbpf APIs */
> > > > > struct bpf_object *obj;
> > > > > struct {
> > > > > /* for every defined map in BPF object: */
> > > > > struct bpf_map *<map-name>;
> > > > > } maps;
> > > > > struct {
> > > > > /* for every program in BPF object: */
> > > > > struct bpf_program *<program-name>;
> > > > > } progs;
> > > > > struct {
> > > > > /* for every program in BPF object: */
> > > > > struct bpf_link *<program-name>;
> > > > > } links;
> > > > > /* for every present global data section: */
> > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > /* memory layout of corresponding data section,
> > > > > * with every defined variable represented as a struct field
> > > > > * with exactly the same type, but without const/volatile
> > > > > * modifiers, e.g.:
> > > > > */
> > > > > int *my_var_1;
> > > > > ...
> > > > > } *<one of bss, data, or rodata>;
> > > > > };
> > > >
> > > > I think I understand how this is useful, but perhaps the problem here
> > > > is that we're using C for everything, and simple programs for which
> > > > loading the ELF is majority of the code would be better of being
> > > > written in a dynamic language like python? Would it perhaps be a
> > > > better idea to work on some high-level language bindings than spend
> > > > time writing code gens and working around limitations of C?
> > >
> > > None of this work prevents Python bindings and other improvements, is
> > > it? Patches, as always, are greatly appreciated ;)
> >
> > This "do it yourself" shit is not really funny :/
> >
> > I'll stop providing feedback on BPF patches if you guy keep saying
> > that :/ Maybe that's what you want.
> >
> > > This skeleton stuff is not just to save code, but in general to
> > > simplify and streamline working with BPF program from userspace side.
> > > Fortunately or not, but there are a lot of real-world applications
> > > written in C and C++ that could benefit from this, so this is still
> > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > work, see few of the last patches in this series.
> >
> > Maybe those applications are written in C and C++ _because_ there
> > are no bindings for high level languages. I just wish BPF programming
> > was less weird and adding some funky codegen is not getting us closer
> > to that goal.
> >
> > In my experience code gen is nothing more than a hack to work around
> > bad APIs, but experiences differ so that's not a solid argument.
> *nod*
>
> We have a nice set of C++ wrappers around libbpf internally, so we can do
> something like BpfMap<key type, value type> and get a much better interface
> with type checking. Maybe we should focus on higher level languages instead?
> We are open to open-sourcing our C++ bits if you want to collaborate.
Python/C++ bindings and API wrappers are an orthogonal concerns here.
I personally think it would be great to have both Python and C++
specific API that uses libbpf under the cover. The only debatable
thing is the logistics: where the source code lives, how it's kept in
sync with libbpf, how we avoid crippling libbpf itself because
something is hard or inconvenient to adapt w/ Python, etc.
The problem I'm trying to solve here is not really C-specific. I don't
think you can solve it without code generation for C++. How do you
"generate" BPF program-specific layout of .data, .bss, .rodata, etc
data sections in such a way, where it's type safe (to the degree that
language allows that, of course) and is not "stringly-based" API? This
skeleton stuff provides a natural, convenient and type-safe way to
work with global data from userspace pretty much at the same level of
performance and convenience, as from BPF side. How can you achieve
that w/ C++ without code generation? As for Python, sure you can do
dynamic lookups based on just the name of property/method, but amount
of overheads is not acceptable for all applications (and Python itself
is not acceptable for those applications). In addition to that, C is
the best way for other less popular languages (e.g., Rust) to leverage
libbpf without investing lots of effort in re-implementing libbpf in
Rust.
So while having nice high-level language-specific APIs is good, it's not enough.
>
> (I assume most of the stuff you have at fb is also non-c and one of
> c++/python/php/rust/go/whatver).
Yes, C++ using libbpf directly or through very thin wrappers. For
BCC-based stuff, obviously, we rely on C++ parts of BCC. This struct
I'm generating is extremely useful for C++ as well, as it gives very
natural way to access *and initialize* global variables.
On 12/10, Andrii Nakryiko wrote:
> On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> >
> > On 12/10, Jakub Kicinski wrote:
> > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > struct <object-name> {
> > > > > > /* used by libbpf's skeleton API */
> > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > /* bpf_object for libbpf APIs */
> > > > > > struct bpf_object *obj;
> > > > > > struct {
> > > > > > /* for every defined map in BPF object: */
> > > > > > struct bpf_map *<map-name>;
> > > > > > } maps;
> > > > > > struct {
> > > > > > /* for every program in BPF object: */
> > > > > > struct bpf_program *<program-name>;
> > > > > > } progs;
> > > > > > struct {
> > > > > > /* for every program in BPF object: */
> > > > > > struct bpf_link *<program-name>;
> > > > > > } links;
> > > > > > /* for every present global data section: */
> > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > /* memory layout of corresponding data section,
> > > > > > * with every defined variable represented as a struct field
> > > > > > * with exactly the same type, but without const/volatile
> > > > > > * modifiers, e.g.:
> > > > > > */
> > > > > > int *my_var_1;
> > > > > > ...
> > > > > > } *<one of bss, data, or rodata>;
> > > > > > };
> > > > >
> > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > is that we're using C for everything, and simple programs for which
> > > > > loading the ELF is majority of the code would be better of being
> > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > better idea to work on some high-level language bindings than spend
> > > > > time writing code gens and working around limitations of C?
> > > >
> > > > None of this work prevents Python bindings and other improvements, is
> > > > it? Patches, as always, are greatly appreciated ;)
> > >
> > > This "do it yourself" shit is not really funny :/
> > >
> > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > that :/ Maybe that's what you want.
> > >
> > > > This skeleton stuff is not just to save code, but in general to
> > > > simplify and streamline working with BPF program from userspace side.
> > > > Fortunately or not, but there are a lot of real-world applications
> > > > written in C and C++ that could benefit from this, so this is still
> > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > work, see few of the last patches in this series.
> > >
> > > Maybe those applications are written in C and C++ _because_ there
> > > are no bindings for high level languages. I just wish BPF programming
> > > was less weird and adding some funky codegen is not getting us closer
> > > to that goal.
> > >
> > > In my experience code gen is nothing more than a hack to work around
> > > bad APIs, but experiences differ so that's not a solid argument.
> > *nod*
> >
> > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > something like BpfMap<key type, value type> and get a much better interface
> > with type checking. Maybe we should focus on higher level languages instead?
> > We are open to open-sourcing our C++ bits if you want to collaborate.
>
> Python/C++ bindings and API wrappers are an orthogonal concerns here.
> I personally think it would be great to have both Python and C++
> specific API that uses libbpf under the cover. The only debatable
> thing is the logistics: where the source code lives, how it's kept in
> sync with libbpf, how we avoid crippling libbpf itself because
> something is hard or inconvenient to adapt w/ Python, etc.
[..]
> The problem I'm trying to solve here is not really C-specific. I don't
> think you can solve it without code generation for C++. How do you
> "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> data sections in such a way, where it's type safe (to the degree that
> language allows that, of course) and is not "stringly-based" API? This
> skeleton stuff provides a natural, convenient and type-safe way to
> work with global data from userspace pretty much at the same level of
> performance and convenience, as from BPF side. How can you achieve
> that w/ C++ without code generation? As for Python, sure you can do
> dynamic lookups based on just the name of property/method, but amount
> of overheads is not acceptable for all applications (and Python itself
> is not acceptable for those applications). In addition to that, C is
> the best way for other less popular languages (e.g., Rust) to leverage
> libbpf without investing lots of effort in re-implementing libbpf in
> Rust.
I'd say that a libbpf API similar to dlopen/dlsym is a more
straightforward thing to do. Have a way to "open" a section and
a way to find a symbol in it. Yes, it's a string-based API,
but there is nothing wrong with it. IMO, this is easier to
use/understand and I suppose Python/C++ wrappers are trivial.
As for type-safety: it's C, forget about it :-)
On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
>
> On 12/10, Andrii Nakryiko wrote:
> > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > >
> > > On 12/10, Jakub Kicinski wrote:
> > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > struct <object-name> {
> > > > > > > /* used by libbpf's skeleton API */
> > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > struct bpf_object *obj;
> > > > > > > struct {
> > > > > > > /* for every defined map in BPF object: */
> > > > > > > struct bpf_map *<map-name>;
> > > > > > > } maps;
> > > > > > > struct {
> > > > > > > /* for every program in BPF object: */
> > > > > > > struct bpf_program *<program-name>;
> > > > > > > } progs;
> > > > > > > struct {
> > > > > > > /* for every program in BPF object: */
> > > > > > > struct bpf_link *<program-name>;
> > > > > > > } links;
> > > > > > > /* for every present global data section: */
> > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > /* memory layout of corresponding data section,
> > > > > > > * with every defined variable represented as a struct field
> > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > * modifiers, e.g.:
> > > > > > > */
> > > > > > > int *my_var_1;
> > > > > > > ...
> > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > };
> > > > > >
> > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > is that we're using C for everything, and simple programs for which
> > > > > > loading the ELF is majority of the code would be better of being
> > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > better idea to work on some high-level language bindings than spend
> > > > > > time writing code gens and working around limitations of C?
> > > > >
> > > > > None of this work prevents Python bindings and other improvements, is
> > > > > it? Patches, as always, are greatly appreciated ;)
> > > >
> > > > This "do it yourself" shit is not really funny :/
> > > >
> > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > that :/ Maybe that's what you want.
> > > >
> > > > > This skeleton stuff is not just to save code, but in general to
> > > > > simplify and streamline working with BPF program from userspace side.
> > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > written in C and C++ that could benefit from this, so this is still
> > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > work, see few of the last patches in this series.
> > > >
> > > > Maybe those applications are written in C and C++ _because_ there
> > > > are no bindings for high level languages. I just wish BPF programming
> > > > was less weird and adding some funky codegen is not getting us closer
> > > > to that goal.
> > > >
> > > > In my experience code gen is nothing more than a hack to work around
> > > > bad APIs, but experiences differ so that's not a solid argument.
> > > *nod*
> > >
> > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > something like BpfMap<key type, value type> and get a much better interface
> > > with type checking. Maybe we should focus on higher level languages instead?
> > > We are open to open-sourcing our C++ bits if you want to collaborate.
> >
> > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > I personally think it would be great to have both Python and C++
> > specific API that uses libbpf under the cover. The only debatable
> > thing is the logistics: where the source code lives, how it's kept in
> > sync with libbpf, how we avoid crippling libbpf itself because
> > something is hard or inconvenient to adapt w/ Python, etc.
>
> [..]
> > The problem I'm trying to solve here is not really C-specific. I don't
> > think you can solve it without code generation for C++. How do you
> > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > data sections in such a way, where it's type safe (to the degree that
> > language allows that, of course) and is not "stringly-based" API? This
> > skeleton stuff provides a natural, convenient and type-safe way to
> > work with global data from userspace pretty much at the same level of
> > performance and convenience, as from BPF side. How can you achieve
> > that w/ C++ without code generation? As for Python, sure you can do
> > dynamic lookups based on just the name of property/method, but amount
> > of overheads is not acceptable for all applications (and Python itself
> > is not acceptable for those applications). In addition to that, C is
> > the best way for other less popular languages (e.g., Rust) to leverage
> > libbpf without investing lots of effort in re-implementing libbpf in
> > Rust.
> I'd say that a libbpf API similar to dlopen/dlsym is a more
> straightforward thing to do. Have a way to "open" a section and
> a way to find a symbol in it. Yes, it's a string-based API,
> but there is nothing wrong with it. IMO, this is easier to
> use/understand and I suppose Python/C++ wrappers are trivial.
Without digging through libbpf source code (or actually, look at code,
but don't run any test program), what's the name of the map
corresponding to .bss section, if object file is
some_bpf_object_file.o? If you got it right (congrats, btw, it took me
multiple attempts to memorize the pattern), how much time did you
spend looking it up? Now compare it to `skel->maps.bss`. Further, if
you use anonymous structs for your global vars, good luck maintaining
two copies of that: one for BPF side and one for userspace.
I never said there is anything wrong with current straightforward
libbpf API, but I also never said it's the easiest and most
user-friendly way to work with BPF either. So we'll have both
code-generated interface and existing API. Furthermore, they are
interoperable (you can pass skel->maps.whatever to any of the existing
libbpf APIs, same for progs, links, obj itself). But there isn't much
that can beat performance and usability of code-generated .data, .bss,
.rodata (and now .extern) layout.
>
> As for type-safety: it's C, forget about it :-)
C is weakly, but still typed language. There are types and they are
helpful. Yes, you can disregard them and re-interpret values as
anything, but that's beside the point.
On 12/10, Andrii Nakryiko wrote:
> On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> >
> > On 12/10, Andrii Nakryiko wrote:
> > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > >
> > > > On 12/10, Jakub Kicinski wrote:
> > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > struct <object-name> {
> > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > struct bpf_object *obj;
> > > > > > > > struct {
> > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > } maps;
> > > > > > > > struct {
> > > > > > > > /* for every program in BPF object: */
> > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > } progs;
> > > > > > > > struct {
> > > > > > > > /* for every program in BPF object: */
> > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > } links;
> > > > > > > > /* for every present global data section: */
> > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > * modifiers, e.g.:
> > > > > > > > */
> > > > > > > > int *my_var_1;
> > > > > > > > ...
> > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > };
> > > > > > >
> > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > time writing code gens and working around limitations of C?
> > > > > >
> > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > >
> > > > > This "do it yourself" shit is not really funny :/
> > > > >
> > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > that :/ Maybe that's what you want.
> > > > >
> > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > work, see few of the last patches in this series.
> > > > >
> > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > to that goal.
> > > > >
> > > > > In my experience code gen is nothing more than a hack to work around
> > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > *nod*
> > > >
> > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > something like BpfMap<key type, value type> and get a much better interface
> > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > >
> > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > I personally think it would be great to have both Python and C++
> > > specific API that uses libbpf under the cover. The only debatable
> > > thing is the logistics: where the source code lives, how it's kept in
> > > sync with libbpf, how we avoid crippling libbpf itself because
> > > something is hard or inconvenient to adapt w/ Python, etc.
> >
> > [..]
> > > The problem I'm trying to solve here is not really C-specific. I don't
> > > think you can solve it without code generation for C++. How do you
> > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > data sections in such a way, where it's type safe (to the degree that
> > > language allows that, of course) and is not "stringly-based" API? This
> > > skeleton stuff provides a natural, convenient and type-safe way to
> > > work with global data from userspace pretty much at the same level of
> > > performance and convenience, as from BPF side. How can you achieve
> > > that w/ C++ without code generation? As for Python, sure you can do
> > > dynamic lookups based on just the name of property/method, but amount
> > > of overheads is not acceptable for all applications (and Python itself
> > > is not acceptable for those applications). In addition to that, C is
> > > the best way for other less popular languages (e.g., Rust) to leverage
> > > libbpf without investing lots of effort in re-implementing libbpf in
> > > Rust.
> > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > straightforward thing to do. Have a way to "open" a section and
> > a way to find a symbol in it. Yes, it's a string-based API,
> > but there is nothing wrong with it. IMO, this is easier to
> > use/understand and I suppose Python/C++ wrappers are trivial.
>
> Without digging through libbpf source code (or actually, look at code,
> but don't run any test program), what's the name of the map
> corresponding to .bss section, if object file is
> some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> multiple attempts to memorize the pattern), how much time did you
> spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> you use anonymous structs for your global vars, good luck maintaining
> two copies of that: one for BPF side and one for userspace.
As your average author of BPF programs I don't really care
which section my symbol ends up into. Just give me an api
to mmap all "global" sections (or a call per section which does all the
naming magic inside) and lookup symbol by name; I can cast it to a proper
type and set it.
RE anonymous structs: maybe don't use them if you want to share the data
between bpf and userspace?
> I never said there is anything wrong with current straightforward
> libbpf API, but I also never said it's the easiest and most
> user-friendly way to work with BPF either. So we'll have both
> code-generated interface and existing API. Furthermore, they are
> interoperable (you can pass skel->maps.whatever to any of the existing
> libbpf APIs, same for progs, links, obj itself). But there isn't much
> that can beat performance and usability of code-generated .data, .bss,
> .rodata (and now .extern) layout.
I haven't looked closely enough, but is there a libbpf api to get
an offset of a variable? Suppose I have the following in bpf.c:
int a;
int b;
Can I get an offset of 'b' in the .bss without manually parsing BTF?
TBH, I don't buy the performance argument for these global maps.
When you did the mmap patchset for the array, you said it yourself
that it's about convenience and not performance.
> > As for type-safety: it's C, forget about it :-)
>
> C is weakly, but still typed language. There are types and they are
> helpful. Yes, you can disregard them and re-interpret values as
> anything, but that's beside the point.
My point was that there is a certain mental model when working
with this type of external symbols which feels "natural" for C
(dlopen/dlsym).
But I agree with you, that as long as code-gen is optional
and there is an alternative api in libbpf, we should be good.
On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <[email protected]> wrote:
>
> On 12/10, Andrii Nakryiko wrote:
> > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> > >
> > > On 12/10, Andrii Nakryiko wrote:
> > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > > >
> > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > struct <object-name> {
> > > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > > struct bpf_object *obj;
> > > > > > > > > struct {
> > > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > > } maps;
> > > > > > > > > struct {
> > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > > } progs;
> > > > > > > > > struct {
> > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > > } links;
> > > > > > > > > /* for every present global data section: */
> > > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > > * modifiers, e.g.:
> > > > > > > > > */
> > > > > > > > > int *my_var_1;
> > > > > > > > > ...
> > > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > > };
> > > > > > > >
> > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > time writing code gens and working around limitations of C?
> > > > > > >
> > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > >
> > > > > > This "do it yourself" shit is not really funny :/
> > > > > >
> > > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > > that :/ Maybe that's what you want.
> > > > > >
> > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > work, see few of the last patches in this series.
> > > > > >
> > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > to that goal.
> > > > > >
> > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > *nod*
> > > > >
> > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > >
> > > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > > I personally think it would be great to have both Python and C++
> > > > specific API that uses libbpf under the cover. The only debatable
> > > > thing is the logistics: where the source code lives, how it's kept in
> > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > something is hard or inconvenient to adapt w/ Python, etc.
> > >
> > > [..]
> > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > think you can solve it without code generation for C++. How do you
> > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > data sections in such a way, where it's type safe (to the degree that
> > > > language allows that, of course) and is not "stringly-based" API? This
> > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > work with global data from userspace pretty much at the same level of
> > > > performance and convenience, as from BPF side. How can you achieve
> > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > dynamic lookups based on just the name of property/method, but amount
> > > > of overheads is not acceptable for all applications (and Python itself
> > > > is not acceptable for those applications). In addition to that, C is
> > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > Rust.
> > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > straightforward thing to do. Have a way to "open" a section and
> > > a way to find a symbol in it. Yes, it's a string-based API,
> > > but there is nothing wrong with it. IMO, this is easier to
> > > use/understand and I suppose Python/C++ wrappers are trivial.
> >
> > Without digging through libbpf source code (or actually, look at code,
> > but don't run any test program), what's the name of the map
> > corresponding to .bss section, if object file is
> > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > multiple attempts to memorize the pattern), how much time did you
> > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > you use anonymous structs for your global vars, good luck maintaining
> > two copies of that: one for BPF side and one for userspace.
> As your average author of BPF programs I don't really care
> which section my symbol ends up into. Just give me an api
> to mmap all "global" sections (or a call per section which does all the
> naming magic inside) and lookup symbol by name; I can cast it to a proper
> type and set it.
I'd like to not have to know about bss/rodata/data as well, but that's
how things are done for global variables. In skeleton we can try to
make an illusion like they are part of one big datasection/struct, but
that seems like a bit too much magic at this point. But then again,
one of the reasons I want this as an experimental feature, so that we
can actually judge from real experience how inconvenient some things
are, and not just based on "I think it would be ...".
re: "Just give me ...". Following the spirit of "C is hard" from your
previous arguments, you already have that API: mmap() syscall. C
programmers have to be able to figure out the rest ;) But on the
serious note, this auto-generated code in skeleton actually addresses
all concerns (and more) that you mentioned: mmaping, knowing offsets,
knowing names and types, etc. And it doesn't preclude adding more
"conventional" additional APIs to do everything more dynamically,
based on string names.
>
> RE anonymous structs: maybe don't use them if you want to share the data
> between bpf and userspace?
Alright.
>
> > I never said there is anything wrong with current straightforward
> > libbpf API, but I also never said it's the easiest and most
> > user-friendly way to work with BPF either. So we'll have both
> > code-generated interface and existing API. Furthermore, they are
> > interoperable (you can pass skel->maps.whatever to any of the existing
> > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > that can beat performance and usability of code-generated .data, .bss,
> > .rodata (and now .extern) layout.
> I haven't looked closely enough, but is there a libbpf api to get
> an offset of a variable? Suppose I have the following in bpf.c:
>
> int a;
> int b;
>
> Can I get an offset of 'b' in the .bss without manually parsing BTF?
No there isn't right now. There isn't even an API to know that there
is such a variable called "b". Except for this skeleton, of course.
>
> TBH, I don't buy the performance argument for these global maps.
> When you did the mmap patchset for the array, you said it yourself
> that it's about convenience and not performance.
Yes, it's first and foremost about convenience, addressing exactly the
problems you mentioned above. But performance is critical for some use
cases, and nothing can beat memory-mapped view of BPF map for those.
Think about the case of frequently polling (or even atomically
exchanging) some stats from userspace, as one possible example. E.g.,
like some map statistics (number of filled elements, p50 of whatever
of those elements, etc). I'm not sure what's there to buy: doing
syscall to get **entire** global data map contents vs just fetching
single integer from memory-mapped region, guess which one is cheaper?
>
> > > As for type-safety: it's C, forget about it :-)
> >
> > C is weakly, but still typed language. There are types and they are
> > helpful. Yes, you can disregard them and re-interpret values as
> > anything, but that's beside the point.
> My point was that there is a certain mental model when working
> with this type of external symbols which feels "natural" for C
> (dlopen/dlsym).
>
> But I agree with you, that as long as code-gen is optional
> and there is an alternative api in libbpf, we should be good.
On 12/11, Andrii Nakryiko wrote:
> On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <[email protected]> wrote:
> >
> > On 12/10, Andrii Nakryiko wrote:
> > > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> > > >
> > > > On 12/10, Andrii Nakryiko wrote:
> > > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > >
> > > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > struct <object-name> {
> > > > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > > > struct bpf_object *obj;
> > > > > > > > > > struct {
> > > > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > > > } maps;
> > > > > > > > > > struct {
> > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > > > } progs;
> > > > > > > > > > struct {
> > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > > > } links;
> > > > > > > > > > /* for every present global data section: */
> > > > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > > > * modifiers, e.g.:
> > > > > > > > > > */
> > > > > > > > > > int *my_var_1;
> > > > > > > > > > ...
> > > > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > > > };
> > > > > > > > >
> > > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > > time writing code gens and working around limitations of C?
> > > > > > > >
> > > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > > >
> > > > > > > This "do it yourself" shit is not really funny :/
> > > > > > >
> > > > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > > > that :/ Maybe that's what you want.
> > > > > > >
> > > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > > work, see few of the last patches in this series.
> > > > > > >
> > > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > > to that goal.
> > > > > > >
> > > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > > *nod*
> > > > > >
> > > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > > >
> > > > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > > > I personally think it would be great to have both Python and C++
> > > > > specific API that uses libbpf under the cover. The only debatable
> > > > > thing is the logistics: where the source code lives, how it's kept in
> > > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > > something is hard or inconvenient to adapt w/ Python, etc.
> > > >
> > > > [..]
> > > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > > think you can solve it without code generation for C++. How do you
> > > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > > data sections in such a way, where it's type safe (to the degree that
> > > > > language allows that, of course) and is not "stringly-based" API? This
> > > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > > work with global data from userspace pretty much at the same level of
> > > > > performance and convenience, as from BPF side. How can you achieve
> > > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > > dynamic lookups based on just the name of property/method, but amount
> > > > > of overheads is not acceptable for all applications (and Python itself
> > > > > is not acceptable for those applications). In addition to that, C is
> > > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > > Rust.
> > > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > > straightforward thing to do. Have a way to "open" a section and
> > > > a way to find a symbol in it. Yes, it's a string-based API,
> > > > but there is nothing wrong with it. IMO, this is easier to
> > > > use/understand and I suppose Python/C++ wrappers are trivial.
> > >
> > > Without digging through libbpf source code (or actually, look at code,
> > > but don't run any test program), what's the name of the map
> > > corresponding to .bss section, if object file is
> > > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > > multiple attempts to memorize the pattern), how much time did you
> > > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > > you use anonymous structs for your global vars, good luck maintaining
> > > two copies of that: one for BPF side and one for userspace.
> > As your average author of BPF programs I don't really care
> > which section my symbol ends up into. Just give me an api
> > to mmap all "global" sections (or a call per section which does all the
> > naming magic inside) and lookup symbol by name; I can cast it to a proper
> > type and set it.
>
> I'd like to not have to know about bss/rodata/data as well, but that's
> how things are done for global variables. In skeleton we can try to
> make an illusion like they are part of one big datasection/struct, but
> that seems like a bit too much magic at this point. But then again,
> one of the reasons I want this as an experimental feature, so that we
> can actually judge from real experience how inconvenient some things
> are, and not just based on "I think it would be ...".
>
> re: "Just give me ...". Following the spirit of "C is hard" from your
> previous arguments, you already have that API: mmap() syscall. C
> programmers have to be able to figure out the rest ;) But on the
> serious note, this auto-generated code in skeleton actually addresses
> all concerns (and more) that you mentioned: mmaping, knowing offsets,
> knowing names and types, etc. And it doesn't preclude adding more
> "conventional" additional APIs to do everything more dynamically,
> based on string names.
We have different understanding of what's difficult :-)
To me, doing transparent data/rodata/bss mmap in bpf_object__load and then
adding a single libbpf api call to lookup symbol by string name is simple
(both from user perspective and from libbpf code complexity). Because in
order to use the codegen I need to teach our build system to spit it
out (which means I need to add bpftool to it and keep it
updated/etc/etc). You can use it as an example of "real experience how
inconvenient some things are".
> > RE anonymous structs: maybe don't use them if you want to share the data
> > between bpf and userspace?
>
> Alright.
>
> >
> > > I never said there is anything wrong with current straightforward
> > > libbpf API, but I also never said it's the easiest and most
> > > user-friendly way to work with BPF either. So we'll have both
> > > code-generated interface and existing API. Furthermore, they are
> > > interoperable (you can pass skel->maps.whatever to any of the existing
> > > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > > that can beat performance and usability of code-generated .data, .bss,
> > > .rodata (and now .extern) layout.
> > I haven't looked closely enough, but is there a libbpf api to get
> > an offset of a variable? Suppose I have the following in bpf.c:
> >
> > int a;
> > int b;
> >
> > Can I get an offset of 'b' in the .bss without manually parsing BTF?
>
> No there isn't right now. There isn't even an API to know that there
> is such a variable called "b". Except for this skeleton, of course.
>
> >
> > TBH, I don't buy the performance argument for these global maps.
> > When you did the mmap patchset for the array, you said it yourself
> > that it's about convenience and not performance.
>
> Yes, it's first and foremost about convenience, addressing exactly the
> problems you mentioned above. But performance is critical for some use
> cases, and nothing can beat memory-mapped view of BPF map for those.
> Think about the case of frequently polling (or even atomically
> exchanging) some stats from userspace, as one possible example. E.g.,
> like some map statistics (number of filled elements, p50 of whatever
> of those elements, etc). I'm not sure what's there to buy: doing
> syscall to get **entire** global data map contents vs just fetching
> single integer from memory-mapped region, guess which one is cheaper?
My understanding was that when you were talking about performance, you
were talking about doing symbol offset lookup at runtime vs having a
generated struct with fixed offsets; not about mmap vs old api with copy
(this debate is settled since your patches are accepted).
But to your original reply: you do understand that if you have multiple
threads that write to this global data you have a bigger problem, right?
On Wed, Dec 11, 2019 at 11:15 AM Stanislav Fomichev <[email protected]> wrote:
>
> On 12/11, Andrii Nakryiko wrote:
> > On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <[email protected]> wrote:
> > >
> > > On 12/10, Andrii Nakryiko wrote:
> > > > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> > > > >
> > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > >
> > > > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > struct <object-name> {
> > > > > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > > > > struct bpf_object *obj;
> > > > > > > > > > > struct {
> > > > > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > > > > } maps;
> > > > > > > > > > > struct {
> > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > > > > } progs;
> > > > > > > > > > > struct {
> > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > > > > } links;
> > > > > > > > > > > /* for every present global data section: */
> > > > > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > > > > * modifiers, e.g.:
> > > > > > > > > > > */
> > > > > > > > > > > int *my_var_1;
> > > > > > > > > > > ...
> > > > > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > > > > };
> > > > > > > > > >
> > > > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > > > time writing code gens and working around limitations of C?
> > > > > > > > >
> > > > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > > > >
> > > > > > > > This "do it yourself" shit is not really funny :/
> > > > > > > >
> > > > > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > > > > that :/ Maybe that's what you want.
> > > > > > > >
> > > > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > > > work, see few of the last patches in this series.
> > > > > > > >
> > > > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > > > to that goal.
> > > > > > > >
> > > > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > > > *nod*
> > > > > > >
> > > > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > > > >
> > > > > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > > > > I personally think it would be great to have both Python and C++
> > > > > > specific API that uses libbpf under the cover. The only debatable
> > > > > > thing is the logistics: where the source code lives, how it's kept in
> > > > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > > > something is hard or inconvenient to adapt w/ Python, etc.
> > > > >
> > > > > [..]
> > > > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > > > think you can solve it without code generation for C++. How do you
> > > > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > > > data sections in such a way, where it's type safe (to the degree that
> > > > > > language allows that, of course) and is not "stringly-based" API? This
> > > > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > > > work with global data from userspace pretty much at the same level of
> > > > > > performance and convenience, as from BPF side. How can you achieve
> > > > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > > > dynamic lookups based on just the name of property/method, but amount
> > > > > > of overheads is not acceptable for all applications (and Python itself
> > > > > > is not acceptable for those applications). In addition to that, C is
> > > > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > > > Rust.
> > > > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > > > straightforward thing to do. Have a way to "open" a section and
> > > > > a way to find a symbol in it. Yes, it's a string-based API,
> > > > > but there is nothing wrong with it. IMO, this is easier to
> > > > > use/understand and I suppose Python/C++ wrappers are trivial.
> > > >
> > > > Without digging through libbpf source code (or actually, look at code,
> > > > but don't run any test program), what's the name of the map
> > > > corresponding to .bss section, if object file is
> > > > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > > > multiple attempts to memorize the pattern), how much time did you
> > > > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > > > you use anonymous structs for your global vars, good luck maintaining
> > > > two copies of that: one for BPF side and one for userspace.
> > > As your average author of BPF programs I don't really care
> > > which section my symbol ends up into. Just give me an api
> > > to mmap all "global" sections (or a call per section which does all the
> > > naming magic inside) and lookup symbol by name; I can cast it to a proper
> > > type and set it.
> >
> > I'd like to not have to know about bss/rodata/data as well, but that's
> > how things are done for global variables. In skeleton we can try to
> > make an illusion like they are part of one big datasection/struct, but
> > that seems like a bit too much magic at this point. But then again,
> > one of the reasons I want this as an experimental feature, so that we
> > can actually judge from real experience how inconvenient some things
> > are, and not just based on "I think it would be ...".
> >
> > re: "Just give me ...". Following the spirit of "C is hard" from your
> > previous arguments, you already have that API: mmap() syscall. C
> > programmers have to be able to figure out the rest ;) But on the
> > serious note, this auto-generated code in skeleton actually addresses
> > all concerns (and more) that you mentioned: mmaping, knowing offsets,
> > knowing names and types, etc. And it doesn't preclude adding more
> > "conventional" additional APIs to do everything more dynamically,
> > based on string names.
> We have different understanding of what's difficult :-)
Well, clearly... See below.
>
> To me, doing transparent data/rodata/bss mmap in bpf_object__load and then
> adding a single libbpf api call to lookup symbol by string name is simple
> (both from user perspective and from libbpf code complexity). Because in
> order to use the codegen I need to teach our build system to spit it
> out (which means I need to add bpftool to it and keep it
> updated/etc/etc). You can use it as an example of "real experience how
> inconvenient some things are".
Yes, you need to integrate bpftool in your build process. Which is
exactly what I'm doing internally for Facebook as well. But it's a
mostly one-time cost, which benefits lots of users who have much
better time with these changes, as opposed to make things simpler for
us, libbpf developers, at the expense of more convoluted user
experience for end users. I certainly prefer more complicated
libbpf/bpftool code, if the resulting user experience is simpler for
BPF application developers, no doubt about it.
>
> > > RE anonymous structs: maybe don't use them if you want to share the data
> > > between bpf and userspace?
> >
> > Alright.
> >
> > >
> > > > I never said there is anything wrong with current straightforward
> > > > libbpf API, but I also never said it's the easiest and most
> > > > user-friendly way to work with BPF either. So we'll have both
> > > > code-generated interface and existing API. Furthermore, they are
> > > > interoperable (you can pass skel->maps.whatever to any of the existing
> > > > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > > > that can beat performance and usability of code-generated .data, .bss,
> > > > .rodata (and now .extern) layout.
> > > I haven't looked closely enough, but is there a libbpf api to get
> > > an offset of a variable? Suppose I have the following in bpf.c:
> > >
> > > int a;
> > > int b;
> > >
> > > Can I get an offset of 'b' in the .bss without manually parsing BTF?
> >
> > No there isn't right now. There isn't even an API to know that there
> > is such a variable called "b". Except for this skeleton, of course.
> >
> > >
> > > TBH, I don't buy the performance argument for these global maps.
> > > When you did the mmap patchset for the array, you said it yourself
> > > that it's about convenience and not performance.
> >
> > Yes, it's first and foremost about convenience, addressing exactly the
> > problems you mentioned above. But performance is critical for some use
> > cases, and nothing can beat memory-mapped view of BPF map for those.
> > Think about the case of frequently polling (or even atomically
> > exchanging) some stats from userspace, as one possible example. E.g.,
> > like some map statistics (number of filled elements, p50 of whatever
> > of those elements, etc). I'm not sure what's there to buy: doing
> > syscall to get **entire** global data map contents vs just fetching
> > single integer from memory-mapped region, guess which one is cheaper?
> My understanding was that when you were talking about performance, you
> were talking about doing symbol offset lookup at runtime vs having a
> generated struct with fixed offsets; not about mmap vs old api with copy
> (this debate is settled since your patches are accepted).
Oh, I see. No, I didn't intend to claim that performance of looking up
variable by name in BTF is a big performance concern. Settled then :)
>
> But to your original reply: you do understand that if you have multiple
> threads that write to this global data you have a bigger problem, right?
Not necessarily. BPF has atomic increment instruction, doesn't it? And
can't we still do atomic swap from user-space (it's just a memory,
after all), right? I haven't tried, tbh, but don't see why it wouldn't
work.
On 12/11, Andrii Nakryiko wrote:
> On Wed, Dec 11, 2019 at 11:15 AM Stanislav Fomichev <[email protected]> wrote:
> >
> > On 12/11, Andrii Nakryiko wrote:
> > > On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <[email protected]> wrote:
> > > >
> > > > On 12/10, Andrii Nakryiko wrote:
> > > > > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > >
> > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > struct <object-name> {
> > > > > > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > > > > > struct bpf_object *obj;
> > > > > > > > > > > > struct {
> > > > > > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > > > > > } maps;
> > > > > > > > > > > > struct {
> > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > > > > > } progs;
> > > > > > > > > > > > struct {
> > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > > > > > } links;
> > > > > > > > > > > > /* for every present global data section: */
> > > > > > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > > > > > * modifiers, e.g.:
> > > > > > > > > > > > */
> > > > > > > > > > > > int *my_var_1;
> > > > > > > > > > > > ...
> > > > > > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > > > > > };
> > > > > > > > > > >
> > > > > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > > > > time writing code gens and working around limitations of C?
> > > > > > > > > >
> > > > > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > > > > >
> > > > > > > > > This "do it yourself" shit is not really funny :/
> > > > > > > > >
> > > > > > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > > > > > that :/ Maybe that's what you want.
> > > > > > > > >
> > > > > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > > > > work, see few of the last patches in this series.
> > > > > > > > >
> > > > > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > > > > to that goal.
> > > > > > > > >
> > > > > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > > > > *nod*
> > > > > > > >
> > > > > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > > > > >
> > > > > > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > > > > > I personally think it would be great to have both Python and C++
> > > > > > > specific API that uses libbpf under the cover. The only debatable
> > > > > > > thing is the logistics: where the source code lives, how it's kept in
> > > > > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > > > > something is hard or inconvenient to adapt w/ Python, etc.
> > > > > >
> > > > > > [..]
> > > > > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > > > > think you can solve it without code generation for C++. How do you
> > > > > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > > > > data sections in such a way, where it's type safe (to the degree that
> > > > > > > language allows that, of course) and is not "stringly-based" API? This
> > > > > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > > > > work with global data from userspace pretty much at the same level of
> > > > > > > performance and convenience, as from BPF side. How can you achieve
> > > > > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > > > > dynamic lookups based on just the name of property/method, but amount
> > > > > > > of overheads is not acceptable for all applications (and Python itself
> > > > > > > is not acceptable for those applications). In addition to that, C is
> > > > > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > > > > Rust.
> > > > > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > > > > straightforward thing to do. Have a way to "open" a section and
> > > > > > a way to find a symbol in it. Yes, it's a string-based API,
> > > > > > but there is nothing wrong with it. IMO, this is easier to
> > > > > > use/understand and I suppose Python/C++ wrappers are trivial.
> > > > >
> > > > > Without digging through libbpf source code (or actually, look at code,
> > > > > but don't run any test program), what's the name of the map
> > > > > corresponding to .bss section, if object file is
> > > > > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > > > > multiple attempts to memorize the pattern), how much time did you
> > > > > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > > > > you use anonymous structs for your global vars, good luck maintaining
> > > > > two copies of that: one for BPF side and one for userspace.
> > > > As your average author of BPF programs I don't really care
> > > > which section my symbol ends up into. Just give me an api
> > > > to mmap all "global" sections (or a call per section which does all the
> > > > naming magic inside) and lookup symbol by name; I can cast it to a proper
> > > > type and set it.
> > >
> > > I'd like to not have to know about bss/rodata/data as well, but that's
> > > how things are done for global variables. In skeleton we can try to
> > > make an illusion like they are part of one big datasection/struct, but
> > > that seems like a bit too much magic at this point. But then again,
> > > one of the reasons I want this as an experimental feature, so that we
> > > can actually judge from real experience how inconvenient some things
> > > are, and not just based on "I think it would be ...".
> > >
> > > re: "Just give me ...". Following the spirit of "C is hard" from your
> > > previous arguments, you already have that API: mmap() syscall. C
> > > programmers have to be able to figure out the rest ;) But on the
> > > serious note, this auto-generated code in skeleton actually addresses
> > > all concerns (and more) that you mentioned: mmaping, knowing offsets,
> > > knowing names and types, etc. And it doesn't preclude adding more
> > > "conventional" additional APIs to do everything more dynamically,
> > > based on string names.
> > We have different understanding of what's difficult :-)
>
> Well, clearly... See below.
>
> >
> > To me, doing transparent data/rodata/bss mmap in bpf_object__load and then
> > adding a single libbpf api call to lookup symbol by string name is simple
> > (both from user perspective and from libbpf code complexity). Because in
> > order to use the codegen I need to teach our build system to spit it
> > out (which means I need to add bpftool to it and keep it
> > updated/etc/etc). You can use it as an example of "real experience how
> > inconvenient some things are".
>
> Yes, you need to integrate bpftool in your build process. Which is
> exactly what I'm doing internally for Facebook as well. But it's a
> mostly one-time cost, which benefits lots of users who have much
> better time with these changes, as opposed to make things simpler for
> us, libbpf developers, at the expense of more convoluted user
> experience for end users. I certainly prefer more complicated
> libbpf/bpftool code, if the resulting user experience is simpler for
> BPF application developers, no doubt about it.
I'm in the process of going through this with pahole to get proper BTF.
I don't think I'm willing yet (without a good reason) to go through
this process again :-D (I saw that you've converted a bunch of tests
to it which means I might not be able to run them).
I just hope bpftool codegen doesn't become a requirement for
any new useful feature; same happened to BTF, which was optional
for a while and now I can't run a single selftest without it.
I can totally understand the BTF requirement though, but I don't buy the
"codegen makes user experience simple for bpf application developers",
sorry (I guess, at this point, it's all about preference).
> > > > RE anonymous structs: maybe don't use them if you want to share the data
> > > > between bpf and userspace?
> > >
> > > Alright.
> > >
> > > >
> > > > > I never said there is anything wrong with current straightforward
> > > > > libbpf API, but I also never said it's the easiest and most
> > > > > user-friendly way to work with BPF either. So we'll have both
> > > > > code-generated interface and existing API. Furthermore, they are
> > > > > interoperable (you can pass skel->maps.whatever to any of the existing
> > > > > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > > > > that can beat performance and usability of code-generated .data, .bss,
> > > > > .rodata (and now .extern) layout.
> > > > I haven't looked closely enough, but is there a libbpf api to get
> > > > an offset of a variable? Suppose I have the following in bpf.c:
> > > >
> > > > int a;
> > > > int b;
> > > >
> > > > Can I get an offset of 'b' in the .bss without manually parsing BTF?
> > >
> > > No there isn't right now. There isn't even an API to know that there
> > > is such a variable called "b". Except for this skeleton, of course.
> > >
> > > >
> > > > TBH, I don't buy the performance argument for these global maps.
> > > > When you did the mmap patchset for the array, you said it yourself
> > > > that it's about convenience and not performance.
> > >
> > > Yes, it's first and foremost about convenience, addressing exactly the
> > > problems you mentioned above. But performance is critical for some use
> > > cases, and nothing can beat memory-mapped view of BPF map for those.
> > > Think about the case of frequently polling (or even atomically
> > > exchanging) some stats from userspace, as one possible example. E.g.,
> > > like some map statistics (number of filled elements, p50 of whatever
> > > of those elements, etc). I'm not sure what's there to buy: doing
> > > syscall to get **entire** global data map contents vs just fetching
> > > single integer from memory-mapped region, guess which one is cheaper?
> > My understanding was that when you were talking about performance, you
> > were talking about doing symbol offset lookup at runtime vs having a
> > generated struct with fixed offsets; not about mmap vs old api with copy
> > (this debate is settled since your patches are accepted).
>
> Oh, I see. No, I didn't intend to claim that performance of looking up
> variable by name in BTF is a big performance concern. Settled then :)
>
> >
> > But to your original reply: you do understand that if you have multiple
> > threads that write to this global data you have a bigger problem, right?
>
> Not necessarily. BPF has atomic increment instruction, doesn't it? And
> can't we still do atomic swap from user-space (it's just a memory,
> after all), right? I haven't tried, tbh, but don't see why it wouldn't
> work.
Atomics are even worse because you get all these nice cache bouncing effects.
That's why I didn't understand initialy the argument about performance.
On Wed, Dec 11, 2019 at 12:09 PM Stanislav Fomichev <[email protected]> wrote:
>
> On 12/11, Andrii Nakryiko wrote:
> > On Wed, Dec 11, 2019 at 11:15 AM Stanislav Fomichev <[email protected]> wrote:
> > >
> > > On 12/11, Andrii Nakryiko wrote:
> > > > On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <[email protected]> wrote:
> > > > >
> > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > >
> > > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > > struct <object-name> {
> > > > > > > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > > > > > > struct bpf_object *obj;
> > > > > > > > > > > > > struct {
> > > > > > > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > > > > > > } maps;
> > > > > > > > > > > > > struct {
> > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > > > > > > } progs;
> > > > > > > > > > > > > struct {
> > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > > > > > > } links;
> > > > > > > > > > > > > /* for every present global data section: */
> > > > > > > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > > > > > > * modifiers, e.g.:
> > > > > > > > > > > > > */
> > > > > > > > > > > > > int *my_var_1;
> > > > > > > > > > > > > ...
> > > > > > > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > > > > > > };
> > > > > > > > > > > >
> > > > > > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > > > > > time writing code gens and working around limitations of C?
> > > > > > > > > > >
> > > > > > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > > > > > >
> > > > > > > > > > This "do it yourself" shit is not really funny :/
> > > > > > > > > >
> > > > > > > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > > > > > > that :/ Maybe that's what you want.
> > > > > > > > > >
> > > > > > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > > > > > work, see few of the last patches in this series.
> > > > > > > > > >
> > > > > > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > > > > > to that goal.
> > > > > > > > > >
> > > > > > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > > > > > *nod*
> > > > > > > > >
> > > > > > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > > > > > >
> > > > > > > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > > > > > > I personally think it would be great to have both Python and C++
> > > > > > > > specific API that uses libbpf under the cover. The only debatable
> > > > > > > > thing is the logistics: where the source code lives, how it's kept in
> > > > > > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > > > > > something is hard or inconvenient to adapt w/ Python, etc.
> > > > > > >
> > > > > > > [..]
> > > > > > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > > > > > think you can solve it without code generation for C++. How do you
> > > > > > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > > > > > data sections in such a way, where it's type safe (to the degree that
> > > > > > > > language allows that, of course) and is not "stringly-based" API? This
> > > > > > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > > > > > work with global data from userspace pretty much at the same level of
> > > > > > > > performance and convenience, as from BPF side. How can you achieve
> > > > > > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > > > > > dynamic lookups based on just the name of property/method, but amount
> > > > > > > > of overheads is not acceptable for all applications (and Python itself
> > > > > > > > is not acceptable for those applications). In addition to that, C is
> > > > > > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > > > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > > > > > Rust.
> > > > > > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > > > > > straightforward thing to do. Have a way to "open" a section and
> > > > > > > a way to find a symbol in it. Yes, it's a string-based API,
> > > > > > > but there is nothing wrong with it. IMO, this is easier to
> > > > > > > use/understand and I suppose Python/C++ wrappers are trivial.
> > > > > >
> > > > > > Without digging through libbpf source code (or actually, look at code,
> > > > > > but don't run any test program), what's the name of the map
> > > > > > corresponding to .bss section, if object file is
> > > > > > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > > > > > multiple attempts to memorize the pattern), how much time did you
> > > > > > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > > > > > you use anonymous structs for your global vars, good luck maintaining
> > > > > > two copies of that: one for BPF side and one for userspace.
> > > > > As your average author of BPF programs I don't really care
> > > > > which section my symbol ends up into. Just give me an api
> > > > > to mmap all "global" sections (or a call per section which does all the
> > > > > naming magic inside) and lookup symbol by name; I can cast it to a proper
> > > > > type and set it.
> > > >
> > > > I'd like to not have to know about bss/rodata/data as well, but that's
> > > > how things are done for global variables. In skeleton we can try to
> > > > make an illusion like they are part of one big datasection/struct, but
> > > > that seems like a bit too much magic at this point. But then again,
> > > > one of the reasons I want this as an experimental feature, so that we
> > > > can actually judge from real experience how inconvenient some things
> > > > are, and not just based on "I think it would be ...".
> > > >
> > > > re: "Just give me ...". Following the spirit of "C is hard" from your
> > > > previous arguments, you already have that API: mmap() syscall. C
> > > > programmers have to be able to figure out the rest ;) But on the
> > > > serious note, this auto-generated code in skeleton actually addresses
> > > > all concerns (and more) that you mentioned: mmaping, knowing offsets,
> > > > knowing names and types, etc. And it doesn't preclude adding more
> > > > "conventional" additional APIs to do everything more dynamically,
> > > > based on string names.
> > > We have different understanding of what's difficult :-)
> >
> > Well, clearly... See below.
> >
> > >
> > > To me, doing transparent data/rodata/bss mmap in bpf_object__load and then
> > > adding a single libbpf api call to lookup symbol by string name is simple
> > > (both from user perspective and from libbpf code complexity). Because in
> > > order to use the codegen I need to teach our build system to spit it
> > > out (which means I need to add bpftool to it and keep it
> > > updated/etc/etc). You can use it as an example of "real experience how
> > > inconvenient some things are".
> >
> > Yes, you need to integrate bpftool in your build process. Which is
> > exactly what I'm doing internally for Facebook as well. But it's a
> > mostly one-time cost, which benefits lots of users who have much
> > better time with these changes, as opposed to make things simpler for
> > us, libbpf developers, at the expense of more convoluted user
> > experience for end users. I certainly prefer more complicated
> > libbpf/bpftool code, if the resulting user experience is simpler for
> > BPF application developers, no doubt about it.
> I'm in the process of going through this with pahole to get proper BTF.
> I don't think I'm willing yet (without a good reason) to go through
> this process again :-D (I saw that you've converted a bunch of tests
> to it which means I might not be able to run them).
A lot of new functionality is depending on BTF for a really good
reason (check Alexei's fentry/fexit/btf_tp stuff, allowing for safe
direct memory reads and extremely low overhead kretprobes). More stuff
is to come and is going to require in-kernel BTF, so even if it's
painful right now, it's worth it. Think long term and keep perspective
in mind.
>
> I just hope bpftool codegen doesn't become a requirement for
> any new useful feature; same happened to BTF, which was optional
> for a while and now I can't run a single selftest without it.
> I can totally understand the BTF requirement though, but I don't buy the
> "codegen makes user experience simple for bpf application developers",
> sorry (I guess, at this point, it's all about preference).
Bpftool is going to be a requirement for selftests. And it's a good
thing because it allows us to continuously test not just libbpf,
kernel, but now also related tooling. I haven't converted all of the
selftests to skeleton, but given enough time I'd do that, just for the
cleaner and shorter plumbing code it gives.
>
> > > > > RE anonymous structs: maybe don't use them if you want to share the data
> > > > > between bpf and userspace?
> > > >
> > > > Alright.
> > > >
> > > > >
> > > > > > I never said there is anything wrong with current straightforward
> > > > > > libbpf API, but I also never said it's the easiest and most
> > > > > > user-friendly way to work with BPF either. So we'll have both
> > > > > > code-generated interface and existing API. Furthermore, they are
> > > > > > interoperable (you can pass skel->maps.whatever to any of the existing
> > > > > > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > > > > > that can beat performance and usability of code-generated .data, .bss,
> > > > > > .rodata (and now .extern) layout.
> > > > > I haven't looked closely enough, but is there a libbpf api to get
> > > > > an offset of a variable? Suppose I have the following in bpf.c:
> > > > >
> > > > > int a;
> > > > > int b;
> > > > >
> > > > > Can I get an offset of 'b' in the .bss without manually parsing BTF?
> > > >
> > > > No there isn't right now. There isn't even an API to know that there
> > > > is such a variable called "b". Except for this skeleton, of course.
> > > >
> > > > >
> > > > > TBH, I don't buy the performance argument for these global maps.
> > > > > When you did the mmap patchset for the array, you said it yourself
> > > > > that it's about convenience and not performance.
> > > >
> > > > Yes, it's first and foremost about convenience, addressing exactly the
> > > > problems you mentioned above. But performance is critical for some use
> > > > cases, and nothing can beat memory-mapped view of BPF map for those.
> > > > Think about the case of frequently polling (or even atomically
> > > > exchanging) some stats from userspace, as one possible example. E.g.,
> > > > like some map statistics (number of filled elements, p50 of whatever
> > > > of those elements, etc). I'm not sure what's there to buy: doing
> > > > syscall to get **entire** global data map contents vs just fetching
> > > > single integer from memory-mapped region, guess which one is cheaper?
> > > My understanding was that when you were talking about performance, you
> > > were talking about doing symbol offset lookup at runtime vs having a
> > > generated struct with fixed offsets; not about mmap vs old api with copy
> > > (this debate is settled since your patches are accepted).
> >
> > Oh, I see. No, I didn't intend to claim that performance of looking up
> > variable by name in BTF is a big performance concern. Settled then :)
> >
> > >
> > > But to your original reply: you do understand that if you have multiple
> > > threads that write to this global data you have a bigger problem, right?
> >
> > Not necessarily. BPF has atomic increment instruction, doesn't it? And
> > can't we still do atomic swap from user-space (it's just a memory,
> > after all), right? I haven't tried, tbh, but don't see why it wouldn't
> > work.
> Atomics are even worse because you get all these nice cache bouncing effects.
> That's why I didn't understand initialy the argument about performance.
Depends on problems you are trying to solve. I bet it's still cheaper
than doing map updates under lock, don't you think?
On 12/11, Andrii Nakryiko wrote:
> On Wed, Dec 11, 2019 at 12:09 PM Stanislav Fomichev <[email protected]> wrote:
> >
> > On 12/11, Andrii Nakryiko wrote:
> > > On Wed, Dec 11, 2019 at 11:15 AM Stanislav Fomichev <[email protected]> wrote:
> > > >
> > > > On 12/11, Andrii Nakryiko wrote:
> > > > > On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <[email protected]> wrote:
> > > > > >
> > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > > > struct <object-name> {
> > > > > > > > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > > > > > > > struct bpf_object *obj;
> > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > > > > > > > } maps;
> > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > > > > > > > } progs;
> > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > > > > > > > } links;
> > > > > > > > > > > > > > /* for every present global data section: */
> > > > > > > > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > > > > > > > * modifiers, e.g.:
> > > > > > > > > > > > > > */
> > > > > > > > > > > > > > int *my_var_1;
> > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > > > > > > > };
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > > > > > > time writing code gens and working around limitations of C?
> > > > > > > > > > > >
> > > > > > > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > > > > > > >
> > > > > > > > > > > This "do it yourself" shit is not really funny :/
> > > > > > > > > > >
> > > > > > > > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > > > > > > > that :/ Maybe that's what you want.
> > > > > > > > > > >
> > > > > > > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > > > > > > work, see few of the last patches in this series.
> > > > > > > > > > >
> > > > > > > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > > > > > > to that goal.
> > > > > > > > > > >
> > > > > > > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > > > > > > *nod*
> > > > > > > > > >
> > > > > > > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > > > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > > > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > > > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > > > > > > >
> > > > > > > > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > > > > > > > I personally think it would be great to have both Python and C++
> > > > > > > > > specific API that uses libbpf under the cover. The only debatable
> > > > > > > > > thing is the logistics: where the source code lives, how it's kept in
> > > > > > > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > > > > > > something is hard or inconvenient to adapt w/ Python, etc.
> > > > > > > >
> > > > > > > > [..]
> > > > > > > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > > > > > > think you can solve it without code generation for C++. How do you
> > > > > > > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > > > > > > data sections in such a way, where it's type safe (to the degree that
> > > > > > > > > language allows that, of course) and is not "stringly-based" API? This
> > > > > > > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > > > > > > work with global data from userspace pretty much at the same level of
> > > > > > > > > performance and convenience, as from BPF side. How can you achieve
> > > > > > > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > > > > > > dynamic lookups based on just the name of property/method, but amount
> > > > > > > > > of overheads is not acceptable for all applications (and Python itself
> > > > > > > > > is not acceptable for those applications). In addition to that, C is
> > > > > > > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > > > > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > > > > > > Rust.
> > > > > > > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > > > > > > straightforward thing to do. Have a way to "open" a section and
> > > > > > > > a way to find a symbol in it. Yes, it's a string-based API,
> > > > > > > > but there is nothing wrong with it. IMO, this is easier to
> > > > > > > > use/understand and I suppose Python/C++ wrappers are trivial.
> > > > > > >
> > > > > > > Without digging through libbpf source code (or actually, look at code,
> > > > > > > but don't run any test program), what's the name of the map
> > > > > > > corresponding to .bss section, if object file is
> > > > > > > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > > > > > > multiple attempts to memorize the pattern), how much time did you
> > > > > > > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > > > > > > you use anonymous structs for your global vars, good luck maintaining
> > > > > > > two copies of that: one for BPF side and one for userspace.
> > > > > > As your average author of BPF programs I don't really care
> > > > > > which section my symbol ends up into. Just give me an api
> > > > > > to mmap all "global" sections (or a call per section which does all the
> > > > > > naming magic inside) and lookup symbol by name; I can cast it to a proper
> > > > > > type and set it.
> > > > >
> > > > > I'd like to not have to know about bss/rodata/data as well, but that's
> > > > > how things are done for global variables. In skeleton we can try to
> > > > > make an illusion like they are part of one big datasection/struct, but
> > > > > that seems like a bit too much magic at this point. But then again,
> > > > > one of the reasons I want this as an experimental feature, so that we
> > > > > can actually judge from real experience how inconvenient some things
> > > > > are, and not just based on "I think it would be ...".
> > > > >
> > > > > re: "Just give me ...". Following the spirit of "C is hard" from your
> > > > > previous arguments, you already have that API: mmap() syscall. C
> > > > > programmers have to be able to figure out the rest ;) But on the
> > > > > serious note, this auto-generated code in skeleton actually addresses
> > > > > all concerns (and more) that you mentioned: mmaping, knowing offsets,
> > > > > knowing names and types, etc. And it doesn't preclude adding more
> > > > > "conventional" additional APIs to do everything more dynamically,
> > > > > based on string names.
> > > > We have different understanding of what's difficult :-)
> > >
> > > Well, clearly... See below.
> > >
> > > >
> > > > To me, doing transparent data/rodata/bss mmap in bpf_object__load and then
> > > > adding a single libbpf api call to lookup symbol by string name is simple
> > > > (both from user perspective and from libbpf code complexity). Because in
> > > > order to use the codegen I need to teach our build system to spit it
> > > > out (which means I need to add bpftool to it and keep it
> > > > updated/etc/etc). You can use it as an example of "real experience how
> > > > inconvenient some things are".
> > >
> > > Yes, you need to integrate bpftool in your build process. Which is
> > > exactly what I'm doing internally for Facebook as well. But it's a
> > > mostly one-time cost, which benefits lots of users who have much
> > > better time with these changes, as opposed to make things simpler for
> > > us, libbpf developers, at the expense of more convoluted user
> > > experience for end users. I certainly prefer more complicated
> > > libbpf/bpftool code, if the resulting user experience is simpler for
> > > BPF application developers, no doubt about it.
> > I'm in the process of going through this with pahole to get proper BTF.
> > I don't think I'm willing yet (without a good reason) to go through
> > this process again :-D (I saw that you've converted a bunch of tests
> > to it which means I might not be able to run them).
>
> A lot of new functionality is depending on BTF for a really good
> reason (check Alexei's fentry/fexit/btf_tp stuff, allowing for safe
> direct memory reads and extremely low overhead kretprobes). More stuff
> is to come and is going to require in-kernel BTF, so even if it's
> painful right now, it's worth it. Think long term and keep perspective
> in mind.
Oh yeah, that's totally understandable with BTF, that's why I just started
adding it to our build. Still was a bit surprising that one day most
our our testing went red.
> > I just hope bpftool codegen doesn't become a requirement for
> > any new useful feature; same happened to BTF, which was optional
> > for a while and now I can't run a single selftest without it.
> > I can totally understand the BTF requirement though, but I don't buy the
> > "codegen makes user experience simple for bpf application developers",
> > sorry (I guess, at this point, it's all about preference).
>
> Bpftool is going to be a requirement for selftests. And it's a good
> thing because it allows us to continuously test not just libbpf,
> kernel, but now also related tooling. I haven't converted all of the
> selftests to skeleton, but given enough time I'd do that, just for the
> cleaner and shorter plumbing code it gives.
Then why all the talk about --experimantal flags if you've set up your
mind on converting everything already?
Btw, how hard it would be to do this generation with a new python
script instead of bpftool? Something along the lines of
scripts/bpf_helpers_doc.py that parses BTF and spits out this C header
(shouldn't be that hard to write custom BTF parser in python, right)?
Python might be easier to integrate with other projects (our build system,
cilium, etc).
> > > > > > RE anonymous structs: maybe don't use them if you want to share the data
> > > > > > between bpf and userspace?
> > > > >
> > > > > Alright.
> > > > >
> > > > > >
> > > > > > > I never said there is anything wrong with current straightforward
> > > > > > > libbpf API, but I also never said it's the easiest and most
> > > > > > > user-friendly way to work with BPF either. So we'll have both
> > > > > > > code-generated interface and existing API. Furthermore, they are
> > > > > > > interoperable (you can pass skel->maps.whatever to any of the existing
> > > > > > > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > > > > > > that can beat performance and usability of code-generated .data, .bss,
> > > > > > > .rodata (and now .extern) layout.
> > > > > > I haven't looked closely enough, but is there a libbpf api to get
> > > > > > an offset of a variable? Suppose I have the following in bpf.c:
> > > > > >
> > > > > > int a;
> > > > > > int b;
> > > > > >
> > > > > > Can I get an offset of 'b' in the .bss without manually parsing BTF?
> > > > >
> > > > > No there isn't right now. There isn't even an API to know that there
> > > > > is such a variable called "b". Except for this skeleton, of course.
> > > > >
> > > > > >
> > > > > > TBH, I don't buy the performance argument for these global maps.
> > > > > > When you did the mmap patchset for the array, you said it yourself
> > > > > > that it's about convenience and not performance.
> > > > >
> > > > > Yes, it's first and foremost about convenience, addressing exactly the
> > > > > problems you mentioned above. But performance is critical for some use
> > > > > cases, and nothing can beat memory-mapped view of BPF map for those.
> > > > > Think about the case of frequently polling (or even atomically
> > > > > exchanging) some stats from userspace, as one possible example. E.g.,
> > > > > like some map statistics (number of filled elements, p50 of whatever
> > > > > of those elements, etc). I'm not sure what's there to buy: doing
> > > > > syscall to get **entire** global data map contents vs just fetching
> > > > > single integer from memory-mapped region, guess which one is cheaper?
> > > > My understanding was that when you were talking about performance, you
> > > > were talking about doing symbol offset lookup at runtime vs having a
> > > > generated struct with fixed offsets; not about mmap vs old api with copy
> > > > (this debate is settled since your patches are accepted).
> > >
> > > Oh, I see. No, I didn't intend to claim that performance of looking up
> > > variable by name in BTF is a big performance concern. Settled then :)
> > >
> > > >
> > > > But to your original reply: you do understand that if you have multiple
> > > > threads that write to this global data you have a bigger problem, right?
> > >
> > > Not necessarily. BPF has atomic increment instruction, doesn't it? And
> > > can't we still do atomic swap from user-space (it's just a memory,
> > > after all), right? I haven't tried, tbh, but don't see why it wouldn't
> > > work.
> > Atomics are even worse because you get all these nice cache bouncing effects.
> > That's why I didn't understand initialy the argument about performance.
>
> Depends on problems you are trying to solve. I bet it's still cheaper
> than doing map updates under lock, don't you think?
Exactly, depends on the problem. But still, if you need frequent reads
of that global data, it means there are frequent writes, which is
a problem on its own.
On Wed, Dec 11, 2019 at 6:57 PM Stanislav Fomichev <[email protected]> wrote:
>
> On 12/11, Andrii Nakryiko wrote:
> > On Wed, Dec 11, 2019 at 12:09 PM Stanislav Fomichev <[email protected]> wrote:
> > >
> > > On 12/11, Andrii Nakryiko wrote:
> > > > On Wed, Dec 11, 2019 at 11:15 AM Stanislav Fomichev <[email protected]> wrote:
> > > > >
> > > > > On 12/11, Andrii Nakryiko wrote:
> > > > > > On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <[email protected]> wrote:
> > > > > > >
> > > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > > > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > > > > struct <object-name> {
> > > > > > > > > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > > > > > > > > struct bpf_object *obj;
> > > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > > > > > > > > } maps;
> > > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > > > > > > > > } progs;
> > > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > > > > > > > > } links;
> > > > > > > > > > > > > > > /* for every present global data section: */
> > > > > > > > > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > > > > > > > > * modifiers, e.g.:
> > > > > > > > > > > > > > > */
> > > > > > > > > > > > > > > int *my_var_1;
> > > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > > > > > > > > };
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > > > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > > > > > > > time writing code gens and working around limitations of C?
> > > > > > > > > > > > >
> > > > > > > > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > > > > > > > >
> > > > > > > > > > > > This "do it yourself" shit is not really funny :/
> > > > > > > > > > > >
> > > > > > > > > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > > > > > > > > that :/ Maybe that's what you want.
> > > > > > > > > > > >
> > > > > > > > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > > > > > > > work, see few of the last patches in this series.
> > > > > > > > > > > >
> > > > > > > > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > > > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > > > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > > > > > > > to that goal.
> > > > > > > > > > > >
> > > > > > > > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > > > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > > > > > > > *nod*
> > > > > > > > > > >
> > > > > > > > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > > > > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > > > > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > > > > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > > > > > > > >
> > > > > > > > > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > > > > > > > > I personally think it would be great to have both Python and C++
> > > > > > > > > > specific API that uses libbpf under the cover. The only debatable
> > > > > > > > > > thing is the logistics: where the source code lives, how it's kept in
> > > > > > > > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > > > > > > > something is hard or inconvenient to adapt w/ Python, etc.
> > > > > > > > >
> > > > > > > > > [..]
> > > > > > > > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > > > > > > > think you can solve it without code generation for C++. How do you
> > > > > > > > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > > > > > > > data sections in such a way, where it's type safe (to the degree that
> > > > > > > > > > language allows that, of course) and is not "stringly-based" API? This
> > > > > > > > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > > > > > > > work with global data from userspace pretty much at the same level of
> > > > > > > > > > performance and convenience, as from BPF side. How can you achieve
> > > > > > > > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > > > > > > > dynamic lookups based on just the name of property/method, but amount
> > > > > > > > > > of overheads is not acceptable for all applications (and Python itself
> > > > > > > > > > is not acceptable for those applications). In addition to that, C is
> > > > > > > > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > > > > > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > > > > > > > Rust.
> > > > > > > > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > > > > > > > straightforward thing to do. Have a way to "open" a section and
> > > > > > > > > a way to find a symbol in it. Yes, it's a string-based API,
> > > > > > > > > but there is nothing wrong with it. IMO, this is easier to
> > > > > > > > > use/understand and I suppose Python/C++ wrappers are trivial.
> > > > > > > >
> > > > > > > > Without digging through libbpf source code (or actually, look at code,
> > > > > > > > but don't run any test program), what's the name of the map
> > > > > > > > corresponding to .bss section, if object file is
> > > > > > > > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > > > > > > > multiple attempts to memorize the pattern), how much time did you
> > > > > > > > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > > > > > > > you use anonymous structs for your global vars, good luck maintaining
> > > > > > > > two copies of that: one for BPF side and one for userspace.
> > > > > > > As your average author of BPF programs I don't really care
> > > > > > > which section my symbol ends up into. Just give me an api
> > > > > > > to mmap all "global" sections (or a call per section which does all the
> > > > > > > naming magic inside) and lookup symbol by name; I can cast it to a proper
> > > > > > > type and set it.
> > > > > >
> > > > > > I'd like to not have to know about bss/rodata/data as well, but that's
> > > > > > how things are done for global variables. In skeleton we can try to
> > > > > > make an illusion like they are part of one big datasection/struct, but
> > > > > > that seems like a bit too much magic at this point. But then again,
> > > > > > one of the reasons I want this as an experimental feature, so that we
> > > > > > can actually judge from real experience how inconvenient some things
> > > > > > are, and not just based on "I think it would be ...".
> > > > > >
> > > > > > re: "Just give me ...". Following the spirit of "C is hard" from your
> > > > > > previous arguments, you already have that API: mmap() syscall. C
> > > > > > programmers have to be able to figure out the rest ;) But on the
> > > > > > serious note, this auto-generated code in skeleton actually addresses
> > > > > > all concerns (and more) that you mentioned: mmaping, knowing offsets,
> > > > > > knowing names and types, etc. And it doesn't preclude adding more
> > > > > > "conventional" additional APIs to do everything more dynamically,
> > > > > > based on string names.
> > > > > We have different understanding of what's difficult :-)
> > > >
> > > > Well, clearly... See below.
> > > >
> > > > >
> > > > > To me, doing transparent data/rodata/bss mmap in bpf_object__load and then
> > > > > adding a single libbpf api call to lookup symbol by string name is simple
> > > > > (both from user perspective and from libbpf code complexity). Because in
> > > > > order to use the codegen I need to teach our build system to spit it
> > > > > out (which means I need to add bpftool to it and keep it
> > > > > updated/etc/etc). You can use it as an example of "real experience how
> > > > > inconvenient some things are".
> > > >
> > > > Yes, you need to integrate bpftool in your build process. Which is
> > > > exactly what I'm doing internally for Facebook as well. But it's a
> > > > mostly one-time cost, which benefits lots of users who have much
> > > > better time with these changes, as opposed to make things simpler for
> > > > us, libbpf developers, at the expense of more convoluted user
> > > > experience for end users. I certainly prefer more complicated
> > > > libbpf/bpftool code, if the resulting user experience is simpler for
> > > > BPF application developers, no doubt about it.
> > > I'm in the process of going through this with pahole to get proper BTF.
> > > I don't think I'm willing yet (without a good reason) to go through
> > > this process again :-D (I saw that you've converted a bunch of tests
> > > to it which means I might not be able to run them).
> >
> > A lot of new functionality is depending on BTF for a really good
> > reason (check Alexei's fentry/fexit/btf_tp stuff, allowing for safe
> > direct memory reads and extremely low overhead kretprobes). More stuff
> > is to come and is going to require in-kernel BTF, so even if it's
> > painful right now, it's worth it. Think long term and keep perspective
> > in mind.
> Oh yeah, that's totally understandable with BTF, that's why I just started
> adding it to our build. Still was a bit surprising that one day most
> our our testing went red.
>
> > > I just hope bpftool codegen doesn't become a requirement for
> > > any new useful feature; same happened to BTF, which was optional
> > > for a while and now I can't run a single selftest without it.
> > > I can totally understand the BTF requirement though, but I don't buy the
> > > "codegen makes user experience simple for bpf application developers",
> > > sorry (I guess, at this point, it's all about preference).
> >
> > Bpftool is going to be a requirement for selftests. And it's a good
> > thing because it allows us to continuously test not just libbpf,
> > kernel, but now also related tooling. I haven't converted all of the
> > selftests to skeleton, but given enough time I'd do that, just for the
> > cleaner and shorter plumbing code it gives.
> Then why all the talk about --experimantal flags if you've set up your
> mind on converting everything already?
Where's contradiction? I'm not converting everything right now, same
as I haven't converted everything into test_progs, right? But I do
think that we should work towards that. But all that is still besides
the point of experimental, because we are talking about selftests, we
can atomically fix them up with whatever changes we do to those
experimental APIs.
The only reason for this experimental disclaimer is for user code
outside of kernel source tree that is going to try and use skeleton.
If we need to change something up a little bit, I'd like to still have
a bit of a wiggle room to adjust things, even if that causes a small
and easily fixable source code breakage (even though I don't see it
happening yet, it might be necessary).
>
> Btw, how hard it would be to do this generation with a new python
> script instead of bpftool? Something along the lines of
> scripts/bpf_helpers_doc.py that parses BTF and spits out this C header
> (shouldn't be that hard to write custom BTF parser in python, right)?
>
Not impossible, but harder than I'd care to deal with. I certainly
don't want to re-implement a good chunk of ELF and BTF parsing (maps,
progs, in addition to datasec stuff). But "it's hard to use bpftool in
our build system" doesn't seem like good enough reason to do all that.
> Python might be easier to integrate with other projects (our build system,
> cilium, etc).
>
> > > > > > > RE anonymous structs: maybe don't use them if you want to share the data
> > > > > > > between bpf and userspace?
> > > > > >
> > > > > > Alright.
> > > > > >
> > > > > > >
> > > > > > > > I never said there is anything wrong with current straightforward
> > > > > > > > libbpf API, but I also never said it's the easiest and most
> > > > > > > > user-friendly way to work with BPF either. So we'll have both
> > > > > > > > code-generated interface and existing API. Furthermore, they are
> > > > > > > > interoperable (you can pass skel->maps.whatever to any of the existing
> > > > > > > > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > > > > > > > that can beat performance and usability of code-generated .data, .bss,
> > > > > > > > .rodata (and now .extern) layout.
> > > > > > > I haven't looked closely enough, but is there a libbpf api to get
> > > > > > > an offset of a variable? Suppose I have the following in bpf.c:
> > > > > > >
> > > > > > > int a;
> > > > > > > int b;
> > > > > > >
> > > > > > > Can I get an offset of 'b' in the .bss without manually parsing BTF?
> > > > > >
> > > > > > No there isn't right now. There isn't even an API to know that there
> > > > > > is such a variable called "b". Except for this skeleton, of course.
> > > > > >
> > > > > > >
> > > > > > > TBH, I don't buy the performance argument for these global maps.
> > > > > > > When you did the mmap patchset for the array, you said it yourself
> > > > > > > that it's about convenience and not performance.
> > > > > >
> > > > > > Yes, it's first and foremost about convenience, addressing exactly the
> > > > > > problems you mentioned above. But performance is critical for some use
> > > > > > cases, and nothing can beat memory-mapped view of BPF map for those.
> > > > > > Think about the case of frequently polling (or even atomically
> > > > > > exchanging) some stats from userspace, as one possible example. E.g.,
> > > > > > like some map statistics (number of filled elements, p50 of whatever
> > > > > > of those elements, etc). I'm not sure what's there to buy: doing
> > > > > > syscall to get **entire** global data map contents vs just fetching
> > > > > > single integer from memory-mapped region, guess which one is cheaper?
> > > > > My understanding was that when you were talking about performance, you
> > > > > were talking about doing symbol offset lookup at runtime vs having a
> > > > > generated struct with fixed offsets; not about mmap vs old api with copy
> > > > > (this debate is settled since your patches are accepted).
> > > >
> > > > Oh, I see. No, I didn't intend to claim that performance of looking up
> > > > variable by name in BTF is a big performance concern. Settled then :)
> > > >
> > > > >
> > > > > But to your original reply: you do understand that if you have multiple
> > > > > threads that write to this global data you have a bigger problem, right?
> > > >
> > > > Not necessarily. BPF has atomic increment instruction, doesn't it? And
> > > > can't we still do atomic swap from user-space (it's just a memory,
> > > > after all), right? I haven't tried, tbh, but don't see why it wouldn't
> > > > work.
> > > Atomics are even worse because you get all these nice cache bouncing effects.
> > > That's why I didn't understand initialy the argument about performance.
> >
> > Depends on problems you are trying to solve. I bet it's still cheaper
> > than doing map updates under lock, don't you think?
> Exactly, depends on the problem. But still, if you need frequent reads
> of that global data, it means there are frequent writes, which is
> a problem on its own.
On 12/11, Andrii Nakryiko wrote:
> On Wed, Dec 11, 2019 at 6:57 PM Stanislav Fomichev <[email protected]> wrote:
> >
> > On 12/11, Andrii Nakryiko wrote:
> > > On Wed, Dec 11, 2019 at 12:09 PM Stanislav Fomichev <[email protected]> wrote:
> > > >
> > > > On 12/11, Andrii Nakryiko wrote:
> > > > > On Wed, Dec 11, 2019 at 11:15 AM Stanislav Fomichev <[email protected]> wrote:
> > > > > >
> > > > > > On 12/11, Andrii Nakryiko wrote:
> > > > > > > On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > > > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > > > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > > > > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > > > > > struct <object-name> {
> > > > > > > > > > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > > > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > > > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > > > > > > > > > struct bpf_object *obj;
> > > > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > > > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > > > > > > > > > } maps;
> > > > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > > > > > > > > > } progs;
> > > > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > > > > > > > > > } links;
> > > > > > > > > > > > > > > > /* for every present global data section: */
> > > > > > > > > > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > > > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > > > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > > > > > > > > > * modifiers, e.g.:
> > > > > > > > > > > > > > > > */
> > > > > > > > > > > > > > > > int *my_var_1;
> > > > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > > > > > > > > > };
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > > > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > > > > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > > > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > > > > > > > > time writing code gens and working around limitations of C?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > > > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > > > > > > > > >
> > > > > > > > > > > > > This "do it yourself" shit is not really funny :/
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > > > > > > > > > that :/ Maybe that's what you want.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > > > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > > > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > > > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > > > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > > > > > > > > work, see few of the last patches in this series.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > > > > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > > > > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > > > > > > > > to that goal.
> > > > > > > > > > > > >
> > > > > > > > > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > > > > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > > > > > > > > *nod*
> > > > > > > > > > > >
> > > > > > > > > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > > > > > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > > > > > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > > > > > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > > > > > > > > >
> > > > > > > > > > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > > > > > > > > > I personally think it would be great to have both Python and C++
> > > > > > > > > > > specific API that uses libbpf under the cover. The only debatable
> > > > > > > > > > > thing is the logistics: where the source code lives, how it's kept in
> > > > > > > > > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > > > > > > > > something is hard or inconvenient to adapt w/ Python, etc.
> > > > > > > > > >
> > > > > > > > > > [..]
> > > > > > > > > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > > > > > > > > think you can solve it without code generation for C++. How do you
> > > > > > > > > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > > > > > > > > data sections in such a way, where it's type safe (to the degree that
> > > > > > > > > > > language allows that, of course) and is not "stringly-based" API? This
> > > > > > > > > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > > > > > > > > work with global data from userspace pretty much at the same level of
> > > > > > > > > > > performance and convenience, as from BPF side. How can you achieve
> > > > > > > > > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > > > > > > > > dynamic lookups based on just the name of property/method, but amount
> > > > > > > > > > > of overheads is not acceptable for all applications (and Python itself
> > > > > > > > > > > is not acceptable for those applications). In addition to that, C is
> > > > > > > > > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > > > > > > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > > > > > > > > Rust.
> > > > > > > > > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > > > > > > > > straightforward thing to do. Have a way to "open" a section and
> > > > > > > > > > a way to find a symbol in it. Yes, it's a string-based API,
> > > > > > > > > > but there is nothing wrong with it. IMO, this is easier to
> > > > > > > > > > use/understand and I suppose Python/C++ wrappers are trivial.
> > > > > > > > >
> > > > > > > > > Without digging through libbpf source code (or actually, look at code,
> > > > > > > > > but don't run any test program), what's the name of the map
> > > > > > > > > corresponding to .bss section, if object file is
> > > > > > > > > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > > > > > > > > multiple attempts to memorize the pattern), how much time did you
> > > > > > > > > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > > > > > > > > you use anonymous structs for your global vars, good luck maintaining
> > > > > > > > > two copies of that: one for BPF side and one for userspace.
> > > > > > > > As your average author of BPF programs I don't really care
> > > > > > > > which section my symbol ends up into. Just give me an api
> > > > > > > > to mmap all "global" sections (or a call per section which does all the
> > > > > > > > naming magic inside) and lookup symbol by name; I can cast it to a proper
> > > > > > > > type and set it.
> > > > > > >
> > > > > > > I'd like to not have to know about bss/rodata/data as well, but that's
> > > > > > > how things are done for global variables. In skeleton we can try to
> > > > > > > make an illusion like they are part of one big datasection/struct, but
> > > > > > > that seems like a bit too much magic at this point. But then again,
> > > > > > > one of the reasons I want this as an experimental feature, so that we
> > > > > > > can actually judge from real experience how inconvenient some things
> > > > > > > are, and not just based on "I think it would be ...".
> > > > > > >
> > > > > > > re: "Just give me ...". Following the spirit of "C is hard" from your
> > > > > > > previous arguments, you already have that API: mmap() syscall. C
> > > > > > > programmers have to be able to figure out the rest ;) But on the
> > > > > > > serious note, this auto-generated code in skeleton actually addresses
> > > > > > > all concerns (and more) that you mentioned: mmaping, knowing offsets,
> > > > > > > knowing names and types, etc. And it doesn't preclude adding more
> > > > > > > "conventional" additional APIs to do everything more dynamically,
> > > > > > > based on string names.
> > > > > > We have different understanding of what's difficult :-)
> > > > >
> > > > > Well, clearly... See below.
> > > > >
> > > > > >
> > > > > > To me, doing transparent data/rodata/bss mmap in bpf_object__load and then
> > > > > > adding a single libbpf api call to lookup symbol by string name is simple
> > > > > > (both from user perspective and from libbpf code complexity). Because in
> > > > > > order to use the codegen I need to teach our build system to spit it
> > > > > > out (which means I need to add bpftool to it and keep it
> > > > > > updated/etc/etc). You can use it as an example of "real experience how
> > > > > > inconvenient some things are".
> > > > >
> > > > > Yes, you need to integrate bpftool in your build process. Which is
> > > > > exactly what I'm doing internally for Facebook as well. But it's a
> > > > > mostly one-time cost, which benefits lots of users who have much
> > > > > better time with these changes, as opposed to make things simpler for
> > > > > us, libbpf developers, at the expense of more convoluted user
> > > > > experience for end users. I certainly prefer more complicated
> > > > > libbpf/bpftool code, if the resulting user experience is simpler for
> > > > > BPF application developers, no doubt about it.
> > > > I'm in the process of going through this with pahole to get proper BTF.
> > > > I don't think I'm willing yet (without a good reason) to go through
> > > > this process again :-D (I saw that you've converted a bunch of tests
> > > > to it which means I might not be able to run them).
> > >
> > > A lot of new functionality is depending on BTF for a really good
> > > reason (check Alexei's fentry/fexit/btf_tp stuff, allowing for safe
> > > direct memory reads and extremely low overhead kretprobes). More stuff
> > > is to come and is going to require in-kernel BTF, so even if it's
> > > painful right now, it's worth it. Think long term and keep perspective
> > > in mind.
> > Oh yeah, that's totally understandable with BTF, that's why I just started
> > adding it to our build. Still was a bit surprising that one day most
> > our our testing went red.
> >
> > > > I just hope bpftool codegen doesn't become a requirement for
> > > > any new useful feature; same happened to BTF, which was optional
> > > > for a while and now I can't run a single selftest without it.
> > > > I can totally understand the BTF requirement though, but I don't buy the
> > > > "codegen makes user experience simple for bpf application developers",
> > > > sorry (I guess, at this point, it's all about preference).
> > >
> > > Bpftool is going to be a requirement for selftests. And it's a good
> > > thing because it allows us to continuously test not just libbpf,
> > > kernel, but now also related tooling. I haven't converted all of the
> > > selftests to skeleton, but given enough time I'd do that, just for the
> > > cleaner and shorter plumbing code it gives.
> > Then why all the talk about --experimantal flags if you've set up your
> > mind on converting everything already?
>
> Where's contradiction? I'm not converting everything right now, same
> as I haven't converted everything into test_progs, right? But I do
> think that we should work towards that. But all that is still besides
> the point of experimental, because we are talking about selftests, we
> can atomically fix them up with whatever changes we do to those
> experimental APIs.
>
> The only reason for this experimental disclaimer is for user code
> outside of kernel source tree that is going to try and use skeleton.
> If we need to change something up a little bit, I'd like to still have
> a bit of a wiggle room to adjust things, even if that causes a small
> and easily fixable source code breakage (even though I don't see it
> happening yet, it might be necessary).
In this case, I might have misunderstood your original intent.
I thought you wanted to try this experimental stuff out in
the wild, collect some feedback and then, based on that, decide
whether you want to commit or not. But it seems that it's not
the case.
> > Btw, how hard it would be to do this generation with a new python
> > script instead of bpftool? Something along the lines of
> > scripts/bpf_helpers_doc.py that parses BTF and spits out this C header
> > (shouldn't be that hard to write custom BTF parser in python, right)?
> >
>
> Not impossible, but harder than I'd care to deal with. I certainly
> don't want to re-implement a good chunk of ELF and BTF parsing (maps,
> progs, in addition to datasec stuff). But "it's hard to use bpftool in
> our build system" doesn't seem like good enough reason to do all that.
You can replace "our build system" with some other project you care about,
like systemd. They'd have the same problem with vendoring in recent enough
bpftool or waiting for every distro to do it. And all this work is
because you think that doing:
my_obj->rodata->my_var = 123;
Is easier / more type safe than doing:
int *my_var = bpf_object__rodata_lookup(obj, "my_var");
*my_var = 123;
I don't think we are going anywhere with this thread, so feel free
to ignore me and don't reply.
> > Python might be easier to integrate with other projects (our build system,
> > cilium, etc).
> >
> > > > > > > > RE anonymous structs: maybe don't use them if you want to share the data
> > > > > > > > between bpf and userspace?
> > > > > > >
> > > > > > > Alright.
> > > > > > >
> > > > > > > >
> > > > > > > > > I never said there is anything wrong with current straightforward
> > > > > > > > > libbpf API, but I also never said it's the easiest and most
> > > > > > > > > user-friendly way to work with BPF either. So we'll have both
> > > > > > > > > code-generated interface and existing API. Furthermore, they are
> > > > > > > > > interoperable (you can pass skel->maps.whatever to any of the existing
> > > > > > > > > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > > > > > > > > that can beat performance and usability of code-generated .data, .bss,
> > > > > > > > > .rodata (and now .extern) layout.
> > > > > > > > I haven't looked closely enough, but is there a libbpf api to get
> > > > > > > > an offset of a variable? Suppose I have the following in bpf.c:
> > > > > > > >
> > > > > > > > int a;
> > > > > > > > int b;
> > > > > > > >
> > > > > > > > Can I get an offset of 'b' in the .bss without manually parsing BTF?
> > > > > > >
> > > > > > > No there isn't right now. There isn't even an API to know that there
> > > > > > > is such a variable called "b". Except for this skeleton, of course.
> > > > > > >
> > > > > > > >
> > > > > > > > TBH, I don't buy the performance argument for these global maps.
> > > > > > > > When you did the mmap patchset for the array, you said it yourself
> > > > > > > > that it's about convenience and not performance.
> > > > > > >
> > > > > > > Yes, it's first and foremost about convenience, addressing exactly the
> > > > > > > problems you mentioned above. But performance is critical for some use
> > > > > > > cases, and nothing can beat memory-mapped view of BPF map for those.
> > > > > > > Think about the case of frequently polling (or even atomically
> > > > > > > exchanging) some stats from userspace, as one possible example. E.g.,
> > > > > > > like some map statistics (number of filled elements, p50 of whatever
> > > > > > > of those elements, etc). I'm not sure what's there to buy: doing
> > > > > > > syscall to get **entire** global data map contents vs just fetching
> > > > > > > single integer from memory-mapped region, guess which one is cheaper?
> > > > > > My understanding was that when you were talking about performance, you
> > > > > > were talking about doing symbol offset lookup at runtime vs having a
> > > > > > generated struct with fixed offsets; not about mmap vs old api with copy
> > > > > > (this debate is settled since your patches are accepted).
> > > > >
> > > > > Oh, I see. No, I didn't intend to claim that performance of looking up
> > > > > variable by name in BTF is a big performance concern. Settled then :)
> > > > >
> > > > > >
> > > > > > But to your original reply: you do understand that if you have multiple
> > > > > > threads that write to this global data you have a bigger problem, right?
> > > > >
> > > > > Not necessarily. BPF has atomic increment instruction, doesn't it? And
> > > > > can't we still do atomic swap from user-space (it's just a memory,
> > > > > after all), right? I haven't tried, tbh, but don't see why it wouldn't
> > > > > work.
> > > > Atomics are even worse because you get all these nice cache bouncing effects.
> > > > That's why I didn't understand initialy the argument about performance.
> > >
> > > Depends on problems you are trying to solve. I bet it's still cheaper
> > > than doing map updates under lock, don't you think?
> > Exactly, depends on the problem. But still, if you need frequent reads
> > of that global data, it means there are frequent writes, which is
> > a problem on its own.
On Thu, Dec 12, 2019 at 8:29 AM Stanislav Fomichev <[email protected]> wrote:
>
> On 12/11, Andrii Nakryiko wrote:
> > On Wed, Dec 11, 2019 at 6:57 PM Stanislav Fomichev <[email protected]> wrote:
> > >
> > > On 12/11, Andrii Nakryiko wrote:
> > > > On Wed, Dec 11, 2019 at 12:09 PM Stanislav Fomichev <[email protected]> wrote:
> > > > >
> > > > > On 12/11, Andrii Nakryiko wrote:
> > > > > > On Wed, Dec 11, 2019 at 11:15 AM Stanislav Fomichev <[email protected]> wrote:
> > > > > > >
> > > > > > > On 12/11, Andrii Nakryiko wrote:
> > > > > > > > On Wed, Dec 11, 2019 at 9:24 AM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > > > > On Tue, Dec 10, 2019 at 2:59 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On 12/10, Andrii Nakryiko wrote:
> > > > > > > > > > > > On Tue, Dec 10, 2019 at 1:44 PM Stanislav Fomichev <[email protected]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On 12/10, Jakub Kicinski wrote:
> > > > > > > > > > > > > > On Tue, 10 Dec 2019 09:11:31 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > > > > On Mon, Dec 9, 2019 at 5:57 PM Jakub Kicinski wrote:
> > > > > > > > > > > > > > > > On Mon, 9 Dec 2019 17:14:34 -0800, Andrii Nakryiko wrote:
> > > > > > > > > > > > > > > > > struct <object-name> {
> > > > > > > > > > > > > > > > > /* used by libbpf's skeleton API */
> > > > > > > > > > > > > > > > > struct bpf_object_skeleton *skeleton;
> > > > > > > > > > > > > > > > > /* bpf_object for libbpf APIs */
> > > > > > > > > > > > > > > > > struct bpf_object *obj;
> > > > > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > > > > /* for every defined map in BPF object: */
> > > > > > > > > > > > > > > > > struct bpf_map *<map-name>;
> > > > > > > > > > > > > > > > > } maps;
> > > > > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > > > > > struct bpf_program *<program-name>;
> > > > > > > > > > > > > > > > > } progs;
> > > > > > > > > > > > > > > > > struct {
> > > > > > > > > > > > > > > > > /* for every program in BPF object: */
> > > > > > > > > > > > > > > > > struct bpf_link *<program-name>;
> > > > > > > > > > > > > > > > > } links;
> > > > > > > > > > > > > > > > > /* for every present global data section: */
> > > > > > > > > > > > > > > > > struct <object-name>__<one of bss, data, or rodata> {
> > > > > > > > > > > > > > > > > /* memory layout of corresponding data section,
> > > > > > > > > > > > > > > > > * with every defined variable represented as a struct field
> > > > > > > > > > > > > > > > > * with exactly the same type, but without const/volatile
> > > > > > > > > > > > > > > > > * modifiers, e.g.:
> > > > > > > > > > > > > > > > > */
> > > > > > > > > > > > > > > > > int *my_var_1;
> > > > > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > > > > } *<one of bss, data, or rodata>;
> > > > > > > > > > > > > > > > > };
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think I understand how this is useful, but perhaps the problem here
> > > > > > > > > > > > > > > > is that we're using C for everything, and simple programs for which
> > > > > > > > > > > > > > > > loading the ELF is majority of the code would be better of being
> > > > > > > > > > > > > > > > written in a dynamic language like python? Would it perhaps be a
> > > > > > > > > > > > > > > > better idea to work on some high-level language bindings than spend
> > > > > > > > > > > > > > > > time writing code gens and working around limitations of C?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > None of this work prevents Python bindings and other improvements, is
> > > > > > > > > > > > > > > it? Patches, as always, are greatly appreciated ;)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This "do it yourself" shit is not really funny :/
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'll stop providing feedback on BPF patches if you guy keep saying
> > > > > > > > > > > > > > that :/ Maybe that's what you want.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This skeleton stuff is not just to save code, but in general to
> > > > > > > > > > > > > > > simplify and streamline working with BPF program from userspace side.
> > > > > > > > > > > > > > > Fortunately or not, but there are a lot of real-world applications
> > > > > > > > > > > > > > > written in C and C++ that could benefit from this, so this is still
> > > > > > > > > > > > > > > immensely useful. selftests/bpf themselves benefit a lot from this
> > > > > > > > > > > > > > > work, see few of the last patches in this series.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Maybe those applications are written in C and C++ _because_ there
> > > > > > > > > > > > > > are no bindings for high level languages. I just wish BPF programming
> > > > > > > > > > > > > > was less weird and adding some funky codegen is not getting us closer
> > > > > > > > > > > > > > to that goal.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > In my experience code gen is nothing more than a hack to work around
> > > > > > > > > > > > > > bad APIs, but experiences differ so that's not a solid argument.
> > > > > > > > > > > > > *nod*
> > > > > > > > > > > > >
> > > > > > > > > > > > > We have a nice set of C++ wrappers around libbpf internally, so we can do
> > > > > > > > > > > > > something like BpfMap<key type, value type> and get a much better interface
> > > > > > > > > > > > > with type checking. Maybe we should focus on higher level languages instead?
> > > > > > > > > > > > > We are open to open-sourcing our C++ bits if you want to collaborate.
> > > > > > > > > > > >
> > > > > > > > > > > > Python/C++ bindings and API wrappers are an orthogonal concerns here.
> > > > > > > > > > > > I personally think it would be great to have both Python and C++
> > > > > > > > > > > > specific API that uses libbpf under the cover. The only debatable
> > > > > > > > > > > > thing is the logistics: where the source code lives, how it's kept in
> > > > > > > > > > > > sync with libbpf, how we avoid crippling libbpf itself because
> > > > > > > > > > > > something is hard or inconvenient to adapt w/ Python, etc.
> > > > > > > > > > >
> > > > > > > > > > > [..]
> > > > > > > > > > > > The problem I'm trying to solve here is not really C-specific. I don't
> > > > > > > > > > > > think you can solve it without code generation for C++. How do you
> > > > > > > > > > > > "generate" BPF program-specific layout of .data, .bss, .rodata, etc
> > > > > > > > > > > > data sections in such a way, where it's type safe (to the degree that
> > > > > > > > > > > > language allows that, of course) and is not "stringly-based" API? This
> > > > > > > > > > > > skeleton stuff provides a natural, convenient and type-safe way to
> > > > > > > > > > > > work with global data from userspace pretty much at the same level of
> > > > > > > > > > > > performance and convenience, as from BPF side. How can you achieve
> > > > > > > > > > > > that w/ C++ without code generation? As for Python, sure you can do
> > > > > > > > > > > > dynamic lookups based on just the name of property/method, but amount
> > > > > > > > > > > > of overheads is not acceptable for all applications (and Python itself
> > > > > > > > > > > > is not acceptable for those applications). In addition to that, C is
> > > > > > > > > > > > the best way for other less popular languages (e.g., Rust) to leverage
> > > > > > > > > > > > libbpf without investing lots of effort in re-implementing libbpf in
> > > > > > > > > > > > Rust.
> > > > > > > > > > > I'd say that a libbpf API similar to dlopen/dlsym is a more
> > > > > > > > > > > straightforward thing to do. Have a way to "open" a section and
> > > > > > > > > > > a way to find a symbol in it. Yes, it's a string-based API,
> > > > > > > > > > > but there is nothing wrong with it. IMO, this is easier to
> > > > > > > > > > > use/understand and I suppose Python/C++ wrappers are trivial.
> > > > > > > > > >
> > > > > > > > > > Without digging through libbpf source code (or actually, look at code,
> > > > > > > > > > but don't run any test program), what's the name of the map
> > > > > > > > > > corresponding to .bss section, if object file is
> > > > > > > > > > some_bpf_object_file.o? If you got it right (congrats, btw, it took me
> > > > > > > > > > multiple attempts to memorize the pattern), how much time did you
> > > > > > > > > > spend looking it up? Now compare it to `skel->maps.bss`. Further, if
> > > > > > > > > > you use anonymous structs for your global vars, good luck maintaining
> > > > > > > > > > two copies of that: one for BPF side and one for userspace.
> > > > > > > > > As your average author of BPF programs I don't really care
> > > > > > > > > which section my symbol ends up into. Just give me an api
> > > > > > > > > to mmap all "global" sections (or a call per section which does all the
> > > > > > > > > naming magic inside) and lookup symbol by name; I can cast it to a proper
> > > > > > > > > type and set it.
> > > > > > > >
> > > > > > > > I'd like to not have to know about bss/rodata/data as well, but that's
> > > > > > > > how things are done for global variables. In skeleton we can try to
> > > > > > > > make an illusion like they are part of one big datasection/struct, but
> > > > > > > > that seems like a bit too much magic at this point. But then again,
> > > > > > > > one of the reasons I want this as an experimental feature, so that we
> > > > > > > > can actually judge from real experience how inconvenient some things
> > > > > > > > are, and not just based on "I think it would be ...".
> > > > > > > >
> > > > > > > > re: "Just give me ...". Following the spirit of "C is hard" from your
> > > > > > > > previous arguments, you already have that API: mmap() syscall. C
> > > > > > > > programmers have to be able to figure out the rest ;) But on the
> > > > > > > > serious note, this auto-generated code in skeleton actually addresses
> > > > > > > > all concerns (and more) that you mentioned: mmaping, knowing offsets,
> > > > > > > > knowing names and types, etc. And it doesn't preclude adding more
> > > > > > > > "conventional" additional APIs to do everything more dynamically,
> > > > > > > > based on string names.
> > > > > > > We have different understanding of what's difficult :-)
> > > > > >
> > > > > > Well, clearly... See below.
> > > > > >
> > > > > > >
> > > > > > > To me, doing transparent data/rodata/bss mmap in bpf_object__load and then
> > > > > > > adding a single libbpf api call to lookup symbol by string name is simple
> > > > > > > (both from user perspective and from libbpf code complexity). Because in
> > > > > > > order to use the codegen I need to teach our build system to spit it
> > > > > > > out (which means I need to add bpftool to it and keep it
> > > > > > > updated/etc/etc). You can use it as an example of "real experience how
> > > > > > > inconvenient some things are".
> > > > > >
> > > > > > Yes, you need to integrate bpftool in your build process. Which is
> > > > > > exactly what I'm doing internally for Facebook as well. But it's a
> > > > > > mostly one-time cost, which benefits lots of users who have much
> > > > > > better time with these changes, as opposed to make things simpler for
> > > > > > us, libbpf developers, at the expense of more convoluted user
> > > > > > experience for end users. I certainly prefer more complicated
> > > > > > libbpf/bpftool code, if the resulting user experience is simpler for
> > > > > > BPF application developers, no doubt about it.
> > > > > I'm in the process of going through this with pahole to get proper BTF.
> > > > > I don't think I'm willing yet (without a good reason) to go through
> > > > > this process again :-D (I saw that you've converted a bunch of tests
> > > > > to it which means I might not be able to run them).
> > > >
> > > > A lot of new functionality is depending on BTF for a really good
> > > > reason (check Alexei's fentry/fexit/btf_tp stuff, allowing for safe
> > > > direct memory reads and extremely low overhead kretprobes). More stuff
> > > > is to come and is going to require in-kernel BTF, so even if it's
> > > > painful right now, it's worth it. Think long term and keep perspective
> > > > in mind.
> > > Oh yeah, that's totally understandable with BTF, that's why I just started
> > > adding it to our build. Still was a bit surprising that one day most
> > > our our testing went red.
> > >
> > > > > I just hope bpftool codegen doesn't become a requirement for
> > > > > any new useful feature; same happened to BTF, which was optional
> > > > > for a while and now I can't run a single selftest without it.
> > > > > I can totally understand the BTF requirement though, but I don't buy the
> > > > > "codegen makes user experience simple for bpf application developers",
> > > > > sorry (I guess, at this point, it's all about preference).
> > > >
> > > > Bpftool is going to be a requirement for selftests. And it's a good
> > > > thing because it allows us to continuously test not just libbpf,
> > > > kernel, but now also related tooling. I haven't converted all of the
> > > > selftests to skeleton, but given enough time I'd do that, just for the
> > > > cleaner and shorter plumbing code it gives.
> > > Then why all the talk about --experimantal flags if you've set up your
> > > mind on converting everything already?
> >
> > Where's contradiction? I'm not converting everything right now, same
> > as I haven't converted everything into test_progs, right? But I do
> > think that we should work towards that. But all that is still besides
> > the point of experimental, because we are talking about selftests, we
> > can atomically fix them up with whatever changes we do to those
> > experimental APIs.
> >
> > The only reason for this experimental disclaimer is for user code
> > outside of kernel source tree that is going to try and use skeleton.
> > If we need to change something up a little bit, I'd like to still have
> > a bit of a wiggle room to adjust things, even if that causes a small
> > and easily fixable source code breakage (even though I don't see it
> > happening yet, it might be necessary).
> In this case, I might have misunderstood your original intent.
> I thought you wanted to try this experimental stuff out in
> the wild, collect some feedback and then, based on that, decide
> whether you want to commit or not. But it seems that it's not
> the case.
>
> > > Btw, how hard it would be to do this generation with a new python
> > > script instead of bpftool? Something along the lines of
> > > scripts/bpf_helpers_doc.py that parses BTF and spits out this C header
> > > (shouldn't be that hard to write custom BTF parser in python, right)?
> > >
> >
> > Not impossible, but harder than I'd care to deal with. I certainly
> > don't want to re-implement a good chunk of ELF and BTF parsing (maps,
> > progs, in addition to datasec stuff). But "it's hard to use bpftool in
> > our build system" doesn't seem like good enough reason to do all that.
> You can replace "our build system" with some other project you care about,
> like systemd. They'd have the same problem with vendoring in recent enough
> bpftool or waiting for every distro to do it. And all this work is
> because you think that doing:
>
> my_obj->rodata->my_var = 123;
>
> Is easier / more type safe than doing:
> int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> *my_var = 123;
Your arguments are confusing me. Did I say that we shouldn't add this
type of "dynamic" interface to variables? Or did I say that every
single BPF application has to adopt skeleton and bpftool? I made no
such claims and it seems like discussion is just based around where I
have to apply my time and efforts... You think it's not useful - don't
integrate bpftool into your build system, simple as that. Skeleton is
used for selftests, but it's up to maintainers to decide whether to
keep this, similar to all the BTF decisions.
So in conclusion. Yes, I think skeleton way is better/safer/easier
than string-based interface you are proposing, but no I'm not claiming
it should be the only way. It just happens to be the way I chose to
implement (first?). If you think API you have in mind makes sense,
feel free to implement it and we'll have a separate discussion for
those.
>
> I don't think we are going anywhere with this thread, so feel free
> to ignore me and don't reply.
>
> > > Python might be easier to integrate with other projects (our build system,
> > > cilium, etc).
> > >
> > > > > > > > > RE anonymous structs: maybe don't use them if you want to share the data
> > > > > > > > > between bpf and userspace?
> > > > > > > >
> > > > > > > > Alright.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > I never said there is anything wrong with current straightforward
> > > > > > > > > > libbpf API, but I also never said it's the easiest and most
> > > > > > > > > > user-friendly way to work with BPF either. So we'll have both
> > > > > > > > > > code-generated interface and existing API. Furthermore, they are
> > > > > > > > > > interoperable (you can pass skel->maps.whatever to any of the existing
> > > > > > > > > > libbpf APIs, same for progs, links, obj itself). But there isn't much
> > > > > > > > > > that can beat performance and usability of code-generated .data, .bss,
> > > > > > > > > > .rodata (and now .extern) layout.
> > > > > > > > > I haven't looked closely enough, but is there a libbpf api to get
> > > > > > > > > an offset of a variable? Suppose I have the following in bpf.c:
> > > > > > > > >
> > > > > > > > > int a;
> > > > > > > > > int b;
> > > > > > > > >
> > > > > > > > > Can I get an offset of 'b' in the .bss without manually parsing BTF?
> > > > > > > >
> > > > > > > > No there isn't right now. There isn't even an API to know that there
> > > > > > > > is such a variable called "b". Except for this skeleton, of course.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > TBH, I don't buy the performance argument for these global maps.
> > > > > > > > > When you did the mmap patchset for the array, you said it yourself
> > > > > > > > > that it's about convenience and not performance.
> > > > > > > >
> > > > > > > > Yes, it's first and foremost about convenience, addressing exactly the
> > > > > > > > problems you mentioned above. But performance is critical for some use
> > > > > > > > cases, and nothing can beat memory-mapped view of BPF map for those.
> > > > > > > > Think about the case of frequently polling (or even atomically
> > > > > > > > exchanging) some stats from userspace, as one possible example. E.g.,
> > > > > > > > like some map statistics (number of filled elements, p50 of whatever
> > > > > > > > of those elements, etc). I'm not sure what's there to buy: doing
> > > > > > > > syscall to get **entire** global data map contents vs just fetching
> > > > > > > > single integer from memory-mapped region, guess which one is cheaper?
> > > > > > > My understanding was that when you were talking about performance, you
> > > > > > > were talking about doing symbol offset lookup at runtime vs having a
> > > > > > > generated struct with fixed offsets; not about mmap vs old api with copy
> > > > > > > (this debate is settled since your patches are accepted).
> > > > > >
> > > > > > Oh, I see. No, I didn't intend to claim that performance of looking up
> > > > > > variable by name in BTF is a big performance concern. Settled then :)
> > > > > >
> > > > > > >
> > > > > > > But to your original reply: you do understand that if you have multiple
> > > > > > > threads that write to this global data you have a bigger problem, right?
> > > > > >
> > > > > > Not necessarily. BPF has atomic increment instruction, doesn't it? And
> > > > > > can't we still do atomic swap from user-space (it's just a memory,
> > > > > > after all), right? I haven't tried, tbh, but don't see why it wouldn't
> > > > > > work.
> > > > > Atomics are even worse because you get all these nice cache bouncing effects.
> > > > > That's why I didn't understand initialy the argument about performance.
> > > >
> > > > Depends on problems you are trying to solve. I bet it's still cheaper
> > > > than doing map updates under lock, don't you think?
> > > Exactly, depends on the problem. But still, if you need frequent reads
> > > of that global data, it means there are frequent writes, which is
> > > a problem on its own.
On Thu, 12 Dec 2019 08:53:22 -0800, Andrii Nakryiko wrote:
> > > > Btw, how hard it would be to do this generation with a new python
> > > > script instead of bpftool? Something along the lines of
> > > > scripts/bpf_helpers_doc.py that parses BTF and spits out this C header
> > > > (shouldn't be that hard to write custom BTF parser in python, right)?
> > > >
> > >
> > > Not impossible, but harder than I'd care to deal with. I certainly
> > > don't want to re-implement a good chunk of ELF and BTF parsing (maps,
> > > progs, in addition to datasec stuff). But "it's hard to use bpftool in
> > > our build system" doesn't seem like good enough reason to do all that.
> > You can replace "our build system" with some other project you care about,
> > like systemd. They'd have the same problem with vendoring in recent enough
> > bpftool or waiting for every distro to do it. And all this work is
> > because you think that doing:
> >
> > my_obj->rodata->my_var = 123;
> >
> > Is easier / more type safe than doing:
> > int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> > *my_var = 123;
>
> Your arguments are confusing me. Did I say that we shouldn't add this
> type of "dynamic" interface to variables? Or did I say that every
> single BPF application has to adopt skeleton and bpftool? I made no
> such claims and it seems like discussion is just based around where I
> have to apply my time and efforts... You think it's not useful - don't
> integrate bpftool into your build system, simple as that. Skeleton is
> used for selftests, but it's up to maintainers to decide whether to
> keep this, similar to all the BTF decisions.
Since we have two people suggesting this functionality to be a separate
tool could you please reconsider my arguments from two days ago?
There absolutely nothing this tool needs from [bpftool], no
JSON needed, no bpffs etc. It can be a separate tool like
libbpf-skel-gen or libbpf-c-skel or something, distributed with libbpf.
That way you can actually soften the backward compat. In case people
become dependent on it they can carry that little tool on their own.
I'd honestly leave the distro packaging problem for people who actually
work on that to complain about.
On 12/12, Jakub Kicinski wrote:
> On Thu, 12 Dec 2019 08:53:22 -0800, Andrii Nakryiko wrote:
> > > > > Btw, how hard it would be to do this generation with a new python
> > > > > script instead of bpftool? Something along the lines of
> > > > > scripts/bpf_helpers_doc.py that parses BTF and spits out this C header
> > > > > (shouldn't be that hard to write custom BTF parser in python, right)?
> > > > >
> > > >
> > > > Not impossible, but harder than I'd care to deal with. I certainly
> > > > don't want to re-implement a good chunk of ELF and BTF parsing (maps,
> > > > progs, in addition to datasec stuff). But "it's hard to use bpftool in
> > > > our build system" doesn't seem like good enough reason to do all that.
> > > You can replace "our build system" with some other project you care about,
> > > like systemd. They'd have the same problem with vendoring in recent enough
> > > bpftool or waiting for every distro to do it. And all this work is
> > > because you think that doing:
> > >
> > > my_obj->rodata->my_var = 123;
> > >
> > > Is easier / more type safe than doing:
> > > int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> > > *my_var = 123;
> >
> > Your arguments are confusing me. Did I say that we shouldn't add this
> > type of "dynamic" interface to variables? Or did I say that every
> > single BPF application has to adopt skeleton and bpftool? I made no
> > such claims and it seems like discussion is just based around where I
> > have to apply my time and efforts... You think it's not useful - don't
> > integrate bpftool into your build system, simple as that. Skeleton is
> > used for selftests, but it's up to maintainers to decide whether to
> > keep this, similar to all the BTF decisions.
>
> Since we have two people suggesting this functionality to be a separate
> tool could you please reconsider my arguments from two days ago?
>
> There absolutely nothing this tool needs from [bpftool], no
> JSON needed, no bpffs etc. It can be a separate tool like
> libbpf-skel-gen or libbpf-c-skel or something, distributed with libbpf.
> That way you can actually soften the backward compat. In case people
> become dependent on it they can carry that little tool on their own.
[..]
> I'd honestly leave the distro packaging problem for people who actually
> work on that to complain about.
I'm representing a 'Google distro' :-D
On Thu, 12 Dec 2019 10:58:31 -0800, Stanislav Fomichev wrote:
> > I'd honestly leave the distro packaging problem for people who actually
> > work on that to complain about.
> I'm representing a 'Google distro' :-D
Suits me :)
On Thu, Dec 12, 2019 at 10:43:34AM -0800, Jakub Kicinski wrote:
> On Thu, 12 Dec 2019 08:53:22 -0800, Andrii Nakryiko wrote:
> > > > > Btw, how hard it would be to do this generation with a new python
> > > > > script instead of bpftool? Something along the lines of
> > > > > scripts/bpf_helpers_doc.py that parses BTF and spits out this C header
> > > > > (shouldn't be that hard to write custom BTF parser in python, right)?
> > > > >
> > > >
> > > > Not impossible, but harder than I'd care to deal with. I certainly
> > > > don't want to re-implement a good chunk of ELF and BTF parsing (maps,
> > > > progs, in addition to datasec stuff). But "it's hard to use bpftool in
> > > > our build system" doesn't seem like good enough reason to do all that.
> > > You can replace "our build system" with some other project you care about,
> > > like systemd. They'd have the same problem with vendoring in recent enough
> > > bpftool or waiting for every distro to do it. And all this work is
> > > because you think that doing:
> > >
> > > my_obj->rodata->my_var = 123;
> > >
> > > Is easier / more type safe than doing:
> > > int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> > > *my_var = 123;
> >
> > Your arguments are confusing me. Did I say that we shouldn't add this
> > type of "dynamic" interface to variables? Or did I say that every
> > single BPF application has to adopt skeleton and bpftool? I made no
> > such claims and it seems like discussion is just based around where I
> > have to apply my time and efforts... You think it's not useful - don't
> > integrate bpftool into your build system, simple as that. Skeleton is
> > used for selftests, but it's up to maintainers to decide whether to
> > keep this, similar to all the BTF decisions.
>
> Since we have two people suggesting this functionality to be a separate
> tool could you please reconsider my arguments from two days ago?
>
> There absolutely nothing this tool needs from [bpftool], no
> JSON needed, no bpffs etc.
To generate vmlinux.h bpftool doesn't need json and doesn't need bpffs.
> It can be a separate tool like
> libbpf-skel-gen or libbpf-c-skel or something, distributed with libbpf.
> That way you can actually soften the backward compat. In case people
> become dependent on it they can carry that little tool on their own.
Jakub,
Could you please consider Andrii's reply to your comment from two days ago:
https://lore.kernel.org/bpf/CAEf4BzbeZbmCTOOo2uQXjm0GL0WDu7aLN6fdUk18Nv2g0kfwVg@mail.gmail.com/
"we are trying to make users lives easier by having major distributions
distribute bpftool and libbpf properly. Adding extra binaries to
distribute around doesn't seem to be easing any of users pains."
My opinion is the following.
bpftool is necessary to write bpf programs already. It's necessary to produce
vmlinux.h for bpf programs to include it. It's part of build process. I can
relate to Stan's complains that he needs to update clang and pahole. He missed
the fact that he needs to update bpftool too if he wants to use all features of
CO-RE. Same thing for skeleton generation. If people need to run the latest
selftest/bpf on the latest kernel they need to upgrade to the latest clang,
pahole, libbpf, bpftool. Nothing new here.
Backwards compat is the same concern for skeleton generation and for vmlinux.h
generation. Obviously no one wants to introduce something that will keep
changing. Is vmlinux.h generation stable? I like to believe so. Same with
skeleton. I wouldn't want to see it changing, but in both cases such chance
exists. We cannot and should not adopt kernel-like ABI guarantees to user space
code. It will paralyze the development.
Now consider if vmlinux.h and skeleton generation is split out of bpftool into
new tool. Effectively it would mean a fork of bpftool. Two binaries doing bpf
elf file processing without clear distinction between them is going to be very
confusing.
One more point from Stan's email:
> You can replace "our build system" with some other project you care about,
> like systemd. They'd have the same problem with vendoring in recent enough
we've been working with systemd folks for ~8 month to integrate libbpf into
their build that is using meson build system and their CI that is github based.
So we're well aware about systemd requirements for libbpf and friends.
> bpftool or waiting for every distro to do it. And all this work is
> because you think that doing:
>
> my_obj->rodata->my_var = 123;
>
> Is easier / more type safe than doing:
> int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> *my_var = 123;
Stan, you conveniently skipped error checking. It should have been:
int *my_var = bpf_object__rodata_lookup(obj, "my_var");
if (IS_ERROR_NULL(my_var))
goto out_cleanup;
*my_var = 123;
Now multiply this code by N variables and consider how cleanup code will look like.
Then multiply this lookup plus cleanup code for all maps, all program and all links.
Today this bpf_object__*lookup*("string") for programs, maps, links is a major
chunk of code not all only for all tests, but for all C, python, C++ services
that use BPF. It's error prone and verbose. Generated skeleton moves the tedious
job of writing lookup accessors from humans into bpftool.
Take a look at Andrii's patch 13/15:
5 files changed, 149 insertions(+), 249 deletions(-)
Those are simple selftests, yet code removal is huge. Bigger project benefits
even more.
bcc and bpftrace are successful projects because barrier of entry is low.
Both allow single line 'hello world' to get going with BPF. We're missing
this completely for networking. Some people suggest that bpftrace like
domain specific language needs to be invented for networking. I don't mind. Yet
I'd like C to be the first choice both for networking and for tracing. To
achieve that the barrier of entry need to be drastically reduced. 'Hello world'
in BPF C and corresponding user space C should fit on one slide. The existing
amount of boiler plate code in libbpf is such barrier. Skeleton generation
solves this usability problem. I do expect a lot more tools to be written in C
because skeleton exists. Can we teach skeleton to generate C++, go, rust,
python wrappers? Surely we can. One step at a time. If we make it work well for
"bpf C" plus "user C" it will work well when user side is written in a
different language.
On Thu, 12 Dec 2019 11:54:16 -0800, Alexei Starovoitov wrote:
> On Thu, Dec 12, 2019 at 10:43:34AM -0800, Jakub Kicinski wrote:
> > On Thu, 12 Dec 2019 08:53:22 -0800, Andrii Nakryiko wrote:
> > > > > > Btw, how hard it would be to do this generation with a new python
> > > > > > script instead of bpftool? Something along the lines of
> > > > > > scripts/bpf_helpers_doc.py that parses BTF and spits out this C header
> > > > > > (shouldn't be that hard to write custom BTF parser in python, right)?
> > > > > >
> > > > >
> > > > > Not impossible, but harder than I'd care to deal with. I certainly
> > > > > don't want to re-implement a good chunk of ELF and BTF parsing (maps,
> > > > > progs, in addition to datasec stuff). But "it's hard to use bpftool in
> > > > > our build system" doesn't seem like good enough reason to do all that.
> > > > You can replace "our build system" with some other project you care about,
> > > > like systemd. They'd have the same problem with vendoring in recent enough
> > > > bpftool or waiting for every distro to do it. And all this work is
> > > > because you think that doing:
> > > >
> > > > my_obj->rodata->my_var = 123;
> > > >
> > > > Is easier / more type safe than doing:
> > > > int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> > > > *my_var = 123;
> > >
> > > Your arguments are confusing me. Did I say that we shouldn't add this
> > > type of "dynamic" interface to variables? Or did I say that every
> > > single BPF application has to adopt skeleton and bpftool? I made no
> > > such claims and it seems like discussion is just based around where I
> > > have to apply my time and efforts... You think it's not useful - don't
> > > integrate bpftool into your build system, simple as that. Skeleton is
> > > used for selftests, but it's up to maintainers to decide whether to
> > > keep this, similar to all the BTF decisions.
> >
> > Since we have two people suggesting this functionality to be a separate
> > tool could you please reconsider my arguments from two days ago?
> >
> > There absolutely nothing this tool needs from [bpftool], no
> > JSON needed, no bpffs etc.
>
> To generate vmlinux.h bpftool doesn't need json and doesn't need bpffs.
At least for header generation it pertains to the running system.
And bpftool was (and still is AFAICT) about interacting with the BPF
state on the running system.
> > It can be a separate tool like
> > libbpf-skel-gen or libbpf-c-skel or something, distributed with libbpf.
> > That way you can actually soften the backward compat. In case people
> > become dependent on it they can carry that little tool on their own.
>
> Jakub,
>
> Could you please consider Andrii's reply to your comment from two days ago:
> https://lore.kernel.org/bpf/CAEf4BzbeZbmCTOOo2uQXjm0GL0WDu7aLN6fdUk18Nv2g0kfwVg@mail.gmail.com/
> "we are trying to make users lives easier by having major distributions
> distribute bpftool and libbpf properly. Adding extra binaries to
> distribute around doesn't seem to be easing any of users pains."
Last time we argued I heard how GH makes libbpf packaging easier.
Only to have that dis-proven once the people in Europe who do distro
packaging woke up:
https://lkml.org/lkml/2019/12/5/101
https://lkml.org/lkml/2019/12/5/312
I feel I'm justified not to take your opinion on this as fact.
> My opinion is the following.
> bpftool is necessary to write bpf programs already. It's necessary to produce
> vmlinux.h for bpf programs to include it. It's part of build process. I can
> relate to Stan's complains that he needs to update clang and pahole. He missed
> the fact that he needs to update bpftool too if he wants to use all features of
> CO-RE. Same thing for skeleton generation. If people need to run the latest
> selftest/bpf on the latest kernel they need to upgrade to the latest clang,
> pahole, libbpf, bpftool. Nothing new here.
They have to update libbpf, so why can't the code gen tool be part of
libbpf? We don't need to build all BPF user space into one binary.
> Backwards compat is the same concern for skeleton generation and for vmlinux.h
> generation. Obviously no one wants to introduce something that will keep
> changing. Is vmlinux.h generation stable? I like to believe so. Same with
> skeleton. I wouldn't want to see it changing, but in both cases such chance
> exists.
vmlinux.h is pretty stable, there isn't much wiggle room there.
It's more of a conversion tool, if you will.
Skeleton OTOH is supposed to make people's lives easier, so it's a
completely different beast. It should be malleable so that users can
improve and hack on it. Baking it into as system tool is counter
productive. Users should be able to grab the skel tool single-file
source and adjust for their project's needs. Distributing your own copy
of bpftool because you want to adjust skel is a heavy lift.
And maybe one day we do have Python/Go/whatever bindings, and we can
convert the skel tool to a higher level language with modern templating.
> We cannot and should not adopt kernel-like ABI guarantees to user space
> code. It will paralyze the development.
Discussion for another time :)
> Now consider if vmlinux.h and skeleton generation is split out of bpftool into
> new tool. Effectively it would mean a fork of bpftool. Two binaries doing bpf
> elf file processing without clear distinction between them is going to be very
> confusing.
To be clear I'm suggesting skel gen is a separate tool, vmlinux and
Quentin's header gen work on the running system, they are not pure
build env tools.
On Thu, Dec 12, 2019 at 12:21:15PM -0800, Jakub Kicinski wrote:
> > >
> > > There absolutely nothing this tool needs from [bpftool], no
> > > JSON needed, no bpffs etc.
> >
> > To generate vmlinux.h bpftool doesn't need json and doesn't need bpffs.
>
> At least for header generation it pertains to the running system.
> And bpftool was (and still is AFAICT) about interacting with the BPF
> state on the running system.
No. Reality is different. vmlinux.h generation doesn't need to touch
kernel on the running system. Part of its job is to generate multiple
vmlinux.h from a set of vmlinux elf files. Different .h for different kernels.
It can generate vmlinux.h from running kernel too, but its less relevant
to make use of CO-RE.
In the future bpftool will be used to merge such multiple .h-s.
Likely it will first merge BTFs from vmlinuxes and then will produce
merged vmlinux_4_x_and_5_x.h
> > > It can be a separate tool like
> > > libbpf-skel-gen or libbpf-c-skel or something, distributed with libbpf.
> > > That way you can actually soften the backward compat. In case people
> > > become dependent on it they can carry that little tool on their own.
> >
> > Jakub,
> >
> > Could you please consider Andrii's reply to your comment from two days ago:
> > https://lore.kernel.org/bpf/CAEf4BzbeZbmCTOOo2uQXjm0GL0WDu7aLN6fdUk18Nv2g0kfwVg@mail.gmail.com/
> > "we are trying to make users lives easier by having major distributions
> > distribute bpftool and libbpf properly. Adding extra binaries to
> > distribute around doesn't seem to be easing any of users pains."
>
> Last time we argued I heard how GH makes libbpf packaging easier.
> Only to have that dis-proven once the people in Europe who do distro
> packaging woke up:
>
> https://lkml.org/lkml/2019/12/5/101
> https://lkml.org/lkml/2019/12/5/312
I think you missed the point of these two comments. It was about packaging
bpftool and libbpf together. Regardless how bpftool is packaged. I still
strongly suggest to use github/libbpf to package libbpf. It's something that is
actually tested whereas libbpf in the kernel tree has unit test coverage only.
>
> > My opinion is the following.
> > bpftool is necessary to write bpf programs already. It's necessary to produce
> > vmlinux.h for bpf programs to include it. It's part of build process. I can
> > relate to Stan's complains that he needs to update clang and pahole. He missed
> > the fact that he needs to update bpftool too if he wants to use all features of
> > CO-RE. Same thing for skeleton generation. If people need to run the latest
> > selftest/bpf on the latest kernel they need to upgrade to the latest clang,
> > pahole, libbpf, bpftool. Nothing new here.
>
> They have to update libbpf, so why can't the code gen tool be part of
> libbpf?
I'm not sure why two answers were not enough.
No idea how to answer this question differently for the third time.
> > Backwards compat is the same concern for skeleton generation and for vmlinux.h
> > generation. Obviously no one wants to introduce something that will keep
> > changing. Is vmlinux.h generation stable? I like to believe so. Same with
> > skeleton. I wouldn't want to see it changing, but in both cases such chance
> > exists.
>
> vmlinux.h is pretty stable, there isn't much wiggle room there.
Do you have experience working with vmlinux.h? I bet the answer is no.
While we have and identified few things that needs improvement.
They require vmlinux.h to be generated differently.
> It's more of a conversion tool, if you will.
>
> Skeleton OTOH is supposed to make people's lives easier, so it's a
> completely different beast. It should be malleable so that users can
> improve and hack on it. Baking it into as system tool is counter
> productive. Users should be able to grab the skel tool single-file
> source and adjust for their project's needs. Distributing your own copy
> of bpftool because you want to adjust skel is a heavy lift.
Adjust generator for their custom needs? essentially fork it for
private use? I'd rather prevent such possibility.
When people start using it I'd prefer they come back to this mailing
list with patches than do 'easy fork'.
> > Now consider if vmlinux.h and skeleton generation is split out of bpftool into
> > new tool. Effectively it would mean a fork of bpftool. Two binaries doing bpf
> > elf file processing without clear distinction between them is going to be very
> > confusing.
>
> To be clear I'm suggesting skel gen is a separate tool, vmlinux and
> Quentin's header gen work on the running system, they are not pure
> build env tools.
You meant to say Andrii's header generator that is based on Quentin's man page
generator. Its output bpf_helper_defs.h makes sense as a part of libbpf
package. The generator script itself doesn't need to be included with any package.
bpftool vmlinux gen consumes vmlinux elf files and is a part of the build.
bpftool skeleton gen consumes bpf elf files and is a part of the same build.
On Thu, 12 Dec 2019 13:28:00 -0800, Alexei Starovoitov wrote:
> On Thu, Dec 12, 2019 at 12:21:15PM -0800, Jakub Kicinski wrote:
> > > > It can be a separate tool like
> > > > libbpf-skel-gen or libbpf-c-skel or something, distributed with libbpf.
> > > > That way you can actually soften the backward compat. In case people
> > > > become dependent on it they can carry that little tool on their own.
> > >
> > > Jakub,
> > >
> > > Could you please consider Andrii's reply to your comment from two days ago:
> > > https://lore.kernel.org/bpf/CAEf4BzbeZbmCTOOo2uQXjm0GL0WDu7aLN6fdUk18Nv2g0kfwVg@mail.gmail.com/
> > > "we are trying to make users lives easier by having major distributions
> > > distribute bpftool and libbpf properly. Adding extra binaries to
> > > distribute around doesn't seem to be easing any of users pains."
> >
> > Last time we argued I heard how GH makes libbpf packaging easier.
> > Only to have that dis-proven once the people in Europe who do distro
> > packaging woke up:
> >
> > https://lkml.org/lkml/2019/12/5/101
> > https://lkml.org/lkml/2019/12/5/312
>
> I think you missed the point of these two comments. It was about packaging
> bpftool and libbpf together. Regardless how bpftool is packaged. I still
> strongly suggest to use github/libbpf to package libbpf. It's something that is
> actually tested whereas libbpf in the kernel tree has unit test coverage only.
I disagree.
> > > My opinion is the following.
> > > bpftool is necessary to write bpf programs already. It's necessary to produce
> > > vmlinux.h for bpf programs to include it. It's part of build process. I can
> > > relate to Stan's complains that he needs to update clang and pahole. He missed
> > > the fact that he needs to update bpftool too if he wants to use all features of
> > > CO-RE. Same thing for skeleton generation. If people need to run the latest
> > > selftest/bpf on the latest kernel they need to upgrade to the latest clang,
> > > pahole, libbpf, bpftool. Nothing new here.
> >
> > They have to update libbpf, so why can't the code gen tool be part of
> > libbpf?
>
> I'm not sure why two answers were not enough.
> No idea how to answer this question differently for the third time.
I'm just presenting what I consider to be a cleaner solution.
> > > Backwards compat is the same concern for skeleton generation and for vmlinux.h
> > > generation. Obviously no one wants to introduce something that will keep
> > > changing. Is vmlinux.h generation stable? I like to believe so. Same with
> > > skeleton. I wouldn't want to see it changing, but in both cases such chance
> > > exists.
> >
> > vmlinux.h is pretty stable, there isn't much wiggle room there.
>
> Do you have experience working with vmlinux.h? I bet the answer is no.
> While we have and identified few things that needs improvement.
> They require vmlinux.h to be generated differently.
>
> > It's more of a conversion tool, if you will.
> >
> > Skeleton OTOH is supposed to make people's lives easier, so it's a
> > completely different beast. It should be malleable so that users can
> > improve and hack on it. Baking it into as system tool is counter
> > productive. Users should be able to grab the skel tool single-file
> > source and adjust for their project's needs. Distributing your own copy
> > of bpftool because you want to adjust skel is a heavy lift.
>
> Adjust generator for their custom needs? essentially fork it for
> private use? I'd rather prevent such possibility.
> When people start using it I'd prefer they come back to this mailing
> list with patches than do 'easy fork'.
>
> > > Now consider if vmlinux.h and skeleton generation is split out of bpftool into
> > > new tool. Effectively it would mean a fork of bpftool. Two binaries doing bpf
> > > elf file processing without clear distinction between them is going to be very
> > > confusing.
> >
> > To be clear I'm suggesting skel gen is a separate tool, vmlinux and
> > Quentin's header gen work on the running system, they are not pure
> > build env tools.
>
> You meant to say Andrii's header generator that is based on Quentin's man page
> generator. Its output bpf_helper_defs.h makes sense as a part of libbpf
> package. The generator script itself doesn't need to be included with any package.
> bpftool vmlinux gen consumes vmlinux elf files and is a part of the build.
> bpftool skeleton gen consumes bpf elf files and is a part of the same build.
I said what I meant to say tools/bpf/bpftool/feature.c
On 12/12, Alexei Starovoitov wrote:
> On Thu, Dec 12, 2019 at 10:43:34AM -0800, Jakub Kicinski wrote:
> One more point from Stan's email:
>
> > You can replace "our build system" with some other project you care about,
> > like systemd. They'd have the same problem with vendoring in recent enough
>
> we've been working with systemd folks for ~8 month to integrate libbpf into
> their build that is using meson build system and their CI that is github based.
> So we're well aware about systemd requirements for libbpf and friends.
Just curious (searching on systemd github for bpftool/libbpf doesn't
show up any code/issues): are you saying that there will be another ~8 months
to bring in bpftool or that it's already being worked on as part of
libbpf integration?
> > bpftool or waiting for every distro to do it. And all this work is
> > because you think that doing:
> >
> > my_obj->rodata->my_var = 123;
> >
> > Is easier / more type safe than doing:
> > int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> > *my_var = 123;
>
> Stan, you conveniently skipped error checking. It should have been:
> int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> if (IS_ERROR_NULL(my_var))
> goto out_cleanup;
> *my_var = 123;
Yeah, but you have a choice, right? You can choose to check the error
and support old programs that don't export some global var and a new
program that has it. Or you can skip the error checks and rely on null
deref crash which is sometimes an option.
(might be not relevant with the introduction of EMBED_FILE which you
seem to be using more and more; ideally, we still would like to be able to
distribute bpf.o and userspace binary separately).
> Take a look at Andrii's patch 13/15:
> 5 files changed, 149 insertions(+), 249 deletions(-)
> Those are simple selftests, yet code removal is huge. Bigger project benefits
> even more.
Excluding fentry/fexit tests (where new find_program_by_title+attach
helper and mmap might help), it looks like the majority of those gains come
from the fact that the patch in question doesn't do any error checking.
You can drop all the CHECK() stuff for existing
find_map_by_name/find_prog_by_name instead and get the same gains.
[as usual, feel free to ignore me, I don't want to keep flaming about
it, but it's hard not to reply]
On Thu, Dec 12, 2019 at 1:46 PM Stanislav Fomichev <[email protected]> wrote:
>
> On 12/12, Alexei Starovoitov wrote:
> > On Thu, Dec 12, 2019 at 10:43:34AM -0800, Jakub Kicinski wrote:
> > One more point from Stan's email:
> >
> > > You can replace "our build system" with some other project you care about,
> > > like systemd. They'd have the same problem with vendoring in recent enough
> >
> > we've been working with systemd folks for ~8 month to integrate libbpf into
> > their build that is using meson build system and their CI that is github based.
> > So we're well aware about systemd requirements for libbpf and friends.
> Just curious (searching on systemd github for bpftool/libbpf doesn't
> show up any code/issues): are you saying that there will be another ~8 months
> to bring in bpftool or that it's already being worked on as part of
> libbpf integration?
>
> > > bpftool or waiting for every distro to do it. And all this work is
> > > because you think that doing:
> > >
> > > my_obj->rodata->my_var = 123;
> > >
> > > Is easier / more type safe than doing:
> > > int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> > > *my_var = 123;
> >
> > Stan, you conveniently skipped error checking. It should have been:
> > int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> > if (IS_ERROR_NULL(my_var))
> > goto out_cleanup;
> > *my_var = 123;
> Yeah, but you have a choice, right? You can choose to check the error
> and support old programs that don't export some global var and a new
> program that has it. Or you can skip the error checks and rely on null
> deref crash which is sometimes an option.
So your point is that between
A. Skeleton way:
my_obj->rodata->my_var = 123;
and
B. look up way:
int *my_var = bpf_object__rodata_lookup(obj, "my_var");
*my_var = 123;
B is **better**, because **if you drop error checking** (which
apparently is a choice within "Google distro") then it comes really
close to succinctness of skeleton, is that right?
Let's follow this through to the end with two pretty typical situations.
1. Someone renames my_var into my_var1.
A. Skeleton: compilation error.
B. Lookup: NULL deref, core dump. If there is helpful in-program
infra, it will dump stack trace and it should be pretty easy to
pinpoint which variable look up failed (provided binary is built with
debug info). If not - gdb is your friend, you'll figure this out
pretty quickly.
2. Someone changes the type of my_var from u64 to u32.
A. Skeleton: compiler will warn on too large constant assignment,
but otherwise it's C, baby. But at least we are only going to write
(potentially truncated) 4 bytes.
B. Lookup: we are happily overwriting 4 bytes of some other variable
(best case -- just padding), without anyone realizing this. Good if we
have exhaustive correctness testing for program logic.
I do think there is a clear winner, it just seems like we disagree which one.
>
> (might be not relevant with the introduction of EMBED_FILE which you
> seem to be using more and more; ideally, we still would like to be able to
> distribute bpf.o and userspace binary separately).
I don't even know what BPF_EMBED_OBJ has to do with skeleton, tbh.
BPF_EMBED_OBJ is just one of the ways to get contents of BPF object
file. You can just as well instantiate skeleton from separate .o file,
if you read its content in memory and create struct bpf_embed_data
with pointer to it (or just mmap() it, which would be even easier). If
this is typical case, we can generate an extra function to instantiate
skeleton from file, of course.
>
> > Take a look at Andrii's patch 13/15:
> > 5 files changed, 149 insertions(+), 249 deletions(-)
> > Those are simple selftests, yet code removal is huge. Bigger project benefits
> > even more.
> Excluding fentry/fexit tests (where new find_program_by_title+attach
> helper and mmap might help), it looks like the majority of those gains come
> from the fact that the patch in question doesn't do any error checking.
> You can drop all the CHECK() stuff for existing
> find_map_by_name/find_prog_by_name instead and get the same gains.
See above. The fact we are not doing error checking with skeleton is
because it can't fail. If programmer screwed up name of variable,
program, or variable -- that's compilation error! That's the reason we
don't check errors, not because it's a cowboy-style "You can choose to
check the error" approach.
>
> [as usual, feel free to ignore me, I don't want to keep flaming about
> it, but it's hard not to reply]
likewise
On Thu, Dec 12, 2019 at 12:21 PM Jakub Kicinski
<[email protected]> wrote:
>
> On Thu, 12 Dec 2019 11:54:16 -0800, Alexei Starovoitov wrote:
> > On Thu, Dec 12, 2019 at 10:43:34AM -0800, Jakub Kicinski wrote:
> > > On Thu, 12 Dec 2019 08:53:22 -0800, Andrii Nakryiko wrote:
> > > > > > > Btw, how hard it would be to do this generation with a new python
> > > > > > > script instead of bpftool? Something along the lines of
> > > > > > > scripts/bpf_helpers_doc.py that parses BTF and spits out this C header
> > > > > > > (shouldn't be that hard to write custom BTF parser in python, right)?
> > > > > > >
> > > > > >
> > > > > > Not impossible, but harder than I'd care to deal with. I certainly
> > > > > > don't want to re-implement a good chunk of ELF and BTF parsing (maps,
> > > > > > progs, in addition to datasec stuff). But "it's hard to use bpftool in
> > > > > > our build system" doesn't seem like good enough reason to do all that.
> > > > > You can replace "our build system" with some other project you care about,
> > > > > like systemd. They'd have the same problem with vendoring in recent enough
> > > > > bpftool or waiting for every distro to do it. And all this work is
> > > > > because you think that doing:
> > > > >
> > > > > my_obj->rodata->my_var = 123;
> > > > >
> > > > > Is easier / more type safe than doing:
> > > > > int *my_var = bpf_object__rodata_lookup(obj, "my_var");
> > > > > *my_var = 123;
> > > >
> > > > Your arguments are confusing me. Did I say that we shouldn't add this
> > > > type of "dynamic" interface to variables? Or did I say that every
> > > > single BPF application has to adopt skeleton and bpftool? I made no
> > > > such claims and it seems like discussion is just based around where I
> > > > have to apply my time and efforts... You think it's not useful - don't
> > > > integrate bpftool into your build system, simple as that. Skeleton is
> > > > used for selftests, but it's up to maintainers to decide whether to
> > > > keep this, similar to all the BTF decisions.
> > >
> > > Since we have two people suggesting this functionality to be a separate
> > > tool could you please reconsider my arguments from two days ago?
> > >
> > > There absolutely nothing this tool needs from [bpftool], no
> > > JSON needed, no bpffs etc.
> >
> > To generate vmlinux.h bpftool doesn't need json and doesn't need bpffs.
>
> At least for header generation it pertains to the running system.
> And bpftool was (and still is AFAICT) about interacting with the BPF
> state on the running system.
As Alexei already mentioned, `bpftool btf dump file
<elf-file-with-BTF-section>` has as much to do with running system,
as, say, `readelf -x .BTF <elf-file-with-BTF-section>`. So if
bpftool's only goal is to interact with BPF state of running system,
as opposed to be, you know, the "all things BPF" multitool, then that
ship has sailed when we added `bpftool btf` subcommand, which I don't
remember hearing too much objections about at that time (which
happened quite a while ago, btw).
>
> > > It can be a separate tool like
> > > libbpf-skel-gen or libbpf-c-skel or something, distributed with libbpf.
> > > That way you can actually soften the backward compat. In case people
> > > become dependent on it they can carry that little tool on their own.
> >
> > Jakub,
> >
> > Could you please consider Andrii's reply to your comment from two days ago:
> > https://lore.kernel.org/bpf/CAEf4BzbeZbmCTOOo2uQXjm0GL0WDu7aLN6fdUk18Nv2g0kfwVg@mail.gmail.com/
> > "we are trying to make users lives easier by having major distributions
> > distribute bpftool and libbpf properly. Adding extra binaries to
> > distribute around doesn't seem to be easing any of users pains."
>
> Last time we argued I heard how GH makes libbpf packaging easier.
> Only to have that dis-proven once the people in Europe who do distro
> packaging woke up:
>
> https://lkml.org/lkml/2019/12/5/101
> https://lkml.org/lkml/2019/12/5/312
>
> I feel I'm justified not to take your opinion on this as fact.
>
> > My opinion is the following.
> > bpftool is necessary to write bpf programs already. It's necessary to produce
> > vmlinux.h for bpf programs to include it. It's part of build process. I can
> > relate to Stan's complains that he needs to update clang and pahole. He missed
> > the fact that he needs to update bpftool too if he wants to use all features of
> > CO-RE. Same thing for skeleton generation. If people need to run the latest
> > selftest/bpf on the latest kernel they need to upgrade to the latest clang,
> > pahole, libbpf, bpftool. Nothing new here.
>
> They have to update libbpf, so why can't the code gen tool be part of
> libbpf? We don't need to build all BPF user space into one binary.
>
> > Backwards compat is the same concern for skeleton generation and for vmlinux.h
> > generation. Obviously no one wants to introduce something that will keep
> > changing. Is vmlinux.h generation stable? I like to believe so. Same with
> > skeleton. I wouldn't want to see it changing, but in both cases such chance
> > exists.
>
> vmlinux.h is pretty stable, there isn't much wiggle room there.
> It's more of a conversion tool, if you will.
>
> Skeleton OTOH is supposed to make people's lives easier, so it's a
> completely different beast. It should be malleable so that users can
Skeleton is also a conversion tool. From compiled BPF object file to
it's "C representation". Just like BTF-to-C converter takes BTF and
creates its C representation. Both make people's lives easier. They
are in the same boat: productivity-enhancing tools, not a system
introspection tools.
> improve and hack on it. Baking it into as system tool is counter
> productive. Users should be able to grab the skel tool single-file
> source and adjust for their project's needs. Distributing your own copy
> of bpftool because you want to adjust skel is a heavy lift.
Skeleton is auto-generated code, it's not supposed to be tweaked or
"adjusted" by hand. Because next time you do tiny change to your BPF
source code (just swap order of two global variables), skeleton
changes. If someone is not satisfied with the way skeleton generation
looks like, they should propose changes and contribute to common
algorithm. Or, of course, they can just go and re-implement it on
their own, if struct bpf_object_skeleton suits them still (which is
what libbpf works with). Then they can do it in Python, Go, Scala,
Java, Perl, whatnot. But somehow I doubt anyone would want to do that.
>
> And maybe one day we do have Python/Go/whatever bindings, and we can
> convert the skel tool to a higher level language with modern templating.
Because high-level implementation is going to be so much simpler and
shorter, really? Is it that complicated in C right now? What's the
real benefit of waiting to be able to do it in "higher level" language
beyond being the contrarian? Apart from \n\ (which is mostly hidden
from view), I don't think high-level templates are going to be much
more clean.
>
> > We cannot and should not adopt kernel-like ABI guarantees to user space
> > code. It will paralyze the development.
>
> Discussion for another time :)
If this "experimental" disclaimer is a real blocker for all of this, I
don't mind making it a public API right now. bpf_object_skeleton is
already designed to be backward/forward compatible with size of struct
itself and all the sub-structs recorded during initialization. I
didn't mean to create impression like this whole approach is so raw
and untried that it will most certainly break and we are still unsure
about it. It's not and it certainly improves set up code for
real-world applications. We might need to add some extra option here
and there, but the stuff that's there already will stay as is.
Would moving all the skeleton-related stuff into libbpf.h and
"stabilizing" it make all this more tolerable for you?
>
> > Now consider if vmlinux.h and skeleton generation is split out of bpftool into
> > new tool. Effectively it would mean a fork of bpftool. Two binaries doing bpf
> > elf file processing without clear distinction between them is going to be very
> > confusing.
>
> To be clear I'm suggesting skel gen is a separate tool, vmlinux and
> Quentin's header gen work on the running system, they are not pure
> build env tools.
On Thu, 12 Dec 2019 22:48:10 -0800, Andrii Nakryiko wrote:
> > improve and hack on it. Baking it into as system tool is counter
> > productive. Users should be able to grab the skel tool single-file
> > source and adjust for their project's needs. Distributing your own copy
> > of bpftool because you want to adjust skel is a heavy lift.
>
> Skeleton is auto-generated code, it's not supposed to be tweaked or
> "adjusted" by hand.
Obviously not, I said adjusting the codegen tool, not the output.
> Because next time you do tiny change to your BPF
> source code (just swap order of two global variables), skeleton
> changes. If someone is not satisfied with the way skeleton generation
> looks like, they should propose changes and contribute to common
> algorithm. Or, of course, they can just go and re-implement it on
> their own, if struct bpf_object_skeleton suits them still (which is
> what libbpf works with). Then they can do it in Python, Go, Scala,
> Java, Perl, whatnot. But somehow I doubt anyone would want to do that.
>
> > And maybe one day we do have Python/Go/whatever bindings, and we can
> > convert the skel tool to a higher level language with modern templating.
>
> Because high-level implementation is going to be so much simpler and
> shorter, really? Is it that complicated in C right now? What's the
> real benefit of waiting to be able to do it in "higher level" language
> beyond being the contrarian?
I did not say wait, I said do C and convert to something else once easy.
You really gotta read responses more carefully :/
> Apart from \n\ (which is mostly hidden
> from view), I don't think high-level templates are going to be much
> more clean.
>
> > > We cannot and should not adopt kernel-like ABI guarantees to user space
> > > code. It will paralyze the development.
> >
> > Discussion for another time :)
>
> If this "experimental" disclaimer is a real blocker for all of this, I
> don't mind making it a public API right now. bpf_object_skeleton is
> already designed to be backward/forward compatible with size of struct
> itself and all the sub-structs recorded during initialization. I
> didn't mean to create impression like this whole approach is so raw
> and untried that it will most certainly break and we are still unsure
> about it. It's not and it certainly improves set up code for
> real-world applications. We might need to add some extra option here
> and there, but the stuff that's there already will stay as is.
As explained the experimental disclaimer is fairly useless and it gives
people precedent for maybe not caring as hard as they should about
ironing the details out before sending code upstream.
I think we can just add a switch or option for improved generation when
needed. You already check there are not extra trailing arguments so we
should be good.
> Would moving all the skeleton-related stuff into libbpf.h and
> "stabilizing" it make all this more tolerable for you?
I think I'm too tired if this to have an option any more.