LinuxLists.cc - [PATCH 0/8] bpf: Add fprobe link

Subject: [PATCH 5/8] libbpf: Add bpf_link_create support for multi kprobes

Add a new fprobe struct to the bpf_link_create_opts object
to pass multi-kprobe data to the link_create attr API.

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/lib/bpf/bpf.c | 7 +++++++
tools/lib/bpf/bpf.h | 9 ++++++++-
2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 418b259166f8..98156709a96c 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -853,6 +853,13 @@ int bpf_link_create(int prog_fd, int target_fd,
 		if (!OPTS_ZEROED(opts, perf_event))
 			return libbpf_err(-EINVAL);
 		break;
+	case BPF_TRACE_FPROBE:
+		attr.link_create.fprobe.syms = OPTS_GET(opts, fprobe.syms, 0);
+		attr.link_create.fprobe.addrs = OPTS_GET(opts, fprobe.addrs, 0);
+		attr.link_create.fprobe.cnt = OPTS_GET(opts, fprobe.cnt, 0);
+		attr.link_create.fprobe.flags = OPTS_GET(opts, fprobe.flags, 0);
+		attr.link_create.fprobe.bpf_cookies = OPTS_GET(opts, fprobe.bpf_cookies, 0);
+		break;
 	default:
 		if (!OPTS_ZEROED(opts, flags))
 			return libbpf_err(-EINVAL);
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index c2e8327010f9..114e828ae027 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -413,10 +413,17 @@ struct bpf_link_create_opts {
 		struct {
 			__u64 bpf_cookie;
 		} perf_event;
+		struct {
+			__u64 syms;
+			__u64 addrs;
+			__u32 cnt;
+			__u32 flags;
+			__u64 bpf_cookies;
+		} fprobe;
 	};
 	size_t :0;
 };
-#define bpf_link_create_opts__last_field perf_event
+#define bpf_link_create_opts__last_field fprobe.bpf_cookies

LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type,
--
2.34.1
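
The OPTS_GET() calls in the hunk above rely on the caller-supplied struct size to decide whether a field is present, which is why the __last_field define has to move whenever a field is appended. Below is a minimal, self-contained sketch of that pattern. Everything prefixed with demo_ or DEMO_ is a hypothetical stand-in written for illustration; it is a simplified re-creation of the idea, not libbpf's actual macros or struct layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct bpf_link_create_opts. */
struct demo_link_opts {
	size_t sz;		/* size of the struct the caller was compiled against */
	uint32_t flags;
	struct {
		uint64_t syms;
		uint64_t addrs;
		uint32_t cnt;
		uint32_t flags;
		uint64_t bpf_cookies;
	} fprobe;
};

/* The last field must lie fully inside the struct for the size check to work. */
static_assert(offsetof(struct demo_link_opts, fprobe.bpf_cookies) +
	      sizeof(uint64_t) <= sizeof(struct demo_link_opts),
	      "last field must lie inside the struct");

/* True when the caller's struct is large enough to contain 'field'. */
#define DEMO_OPTS_HAS(opts, field)					\
	((opts) != NULL &&						\
	 (opts)->sz >= offsetof(struct demo_link_opts, field) +	\
		       sizeof((opts)->field))

/* Read 'field' if the caller provided it, otherwise use the fallback. */
#define DEMO_OPTS_GET(opts, field, fallback)				\
	(DEMO_OPTS_HAS(opts, field) ? (opts)->field : (fallback))

/* Returns the probe count a callee would see for the given options. */
uint32_t demo_fprobe_cnt(const struct demo_link_opts *opts)
{
	return DEMO_OPTS_GET(opts, fprobe.cnt, 0);
}
```

A caller built against an older, shorter struct passes a smaller sz, so the newer fields silently read as the fallback value instead of being read out of bounds.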

2022-02-04 09:54:42

by Steven Rostedt

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Thu, 3 Feb 2022 18:12:11 -0800
Alexei Starovoitov <[email protected]> wrote:

> > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > transparently.
>
> Not true.
> fprobe is nothing but _explicit_ kprobe on ftrace.
> There was an implicit optimization for kprobe when ftrace
> could be used.
> All this new interface is doing is making it explicit.
> So a new name is not warranted here.
>
> > from that viewpoint, fprobe and kprobe interface are similar but different.
>
> What is the difference?
> I don't see it.

IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
abilities that a normal kprobe does not. Namely, "what are the function
parameters?"

You can only reliably get the parameters at function entry. Hence, having
a probe that is unique to functions, as opposed to the middle of a
function, makes sense to me.

That is, the API can change. "Give me parameter X". That along with some
BTF reading, could figure out how to get parameter X, and record that.

-- Steve

2022-02-04 18:40:30

Subject: [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function

Move the kallsyms parsing into an internal libbpf__kallsyms_parse
function, so it can be used from other places.

It will be used in the following changes.

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/lib/bpf/libbpf.c | 62 ++++++++++++++++++++-------------
tools/lib/bpf/libbpf_internal.h | 5 +++
2 files changed, 43 insertions(+), 24 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 1b0936b016d9..7d595cfd03bc 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -7165,12 +7165,10 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
 	return 0;
 }

-static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
+int libbpf__kallsyms_parse(void *arg, kallsyms_cb_t cb)
 {
 	char sym_type, sym_name[500];
 	unsigned long long sym_addr;
-	const struct btf_type *t;
-	struct extern_desc *ext;
 	int ret, err = 0;
 	FILE *f;

@@ -7189,35 +7187,51 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
 		if (ret != 3) {
 			pr_warn("failed to read kallsyms entry: %d\n", ret);
 			err = -EINVAL;
-			goto out;
+			break;
 		}

-		ext = find_extern_by_name(obj, sym_name);
-		if (!ext || ext->type != EXT_KSYM)
-			continue;
-
-		t = btf__type_by_id(obj->btf, ext->btf_id);
-		if (!btf_is_var(t))
-			continue;
-
-		if (ext->is_set && ext->ksym.addr != sym_addr) {
-			pr_warn("extern (ksym) '%s' resolution is ambiguous: 0x%llx or 0x%llx\n",
-				sym_name, ext->ksym.addr, sym_addr);
-			err = -EINVAL;
-			goto out;
-		}
-		if (!ext->is_set) {
-			ext->is_set = true;
-			ext->ksym.addr = sym_addr;
-			pr_debug("extern (ksym) %s=0x%llx\n", sym_name, sym_addr);
-		}
+		err = cb(arg, sym_addr, sym_type, sym_name);
+		if (err)
+			break;
 	}

-out:
 	fclose(f);
 	return err;
 }

+static int kallsyms_cb(void *arg, unsigned long long sym_addr,
+		       char sym_type, const char *sym_name)
+{
+	struct bpf_object *obj = arg;
+	const struct btf_type *t;
+	struct extern_desc *ext;
+
+	ext = find_extern_by_name(obj, sym_name);
+	if (!ext || ext->type != EXT_KSYM)
+		return 0;
+
+	t = btf__type_by_id(obj->btf, ext->btf_id);
+	if (!btf_is_var(t))
+		return 0;
+
+	if (ext->is_set && ext->ksym.addr != sym_addr) {
+		pr_warn("extern (ksym) '%s' resolution is ambiguous: 0x%llx or 0x%llx\n",
+			sym_name, ext->ksym.addr, sym_addr);
+		return -EINVAL;
+	}
+	if (!ext->is_set) {
+		ext->is_set = true;
+		ext->ksym.addr = sym_addr;
+		pr_debug("extern (ksym) %s=0x%llx\n", sym_name, sym_addr);
+	}
+	return 0;
+}
+
+static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
+{
+	return libbpf__kallsyms_parse(obj, kallsyms_cb);
+}
+
 static int find_ksym_btf_id(struct bpf_object *obj, const char *ksym_name,
 			    __u16 kind, struct btf **res_btf,
 			    struct module_btf **res_mod_btf)
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index bc86b82e90d1..fb3b07d401df 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -449,6 +449,11 @@ __s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,

 extern enum libbpf_strict_mode libbpf_mode;

+typedef int (*kallsyms_cb_t)(void *arg, unsigned long long sym_addr,
+			     char sym_type, const char *sym_name);
+
+int libbpf__kallsyms_parse(void *arg, kallsyms_cb_t cb);
+
 /* handle direct returned errors */
 static inline int libbpf_err(int ret)
 {
--
2.34.1
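
The refactoring in this patch separates "walk every kallsyms line" from "decide what to do with each symbol". A minimal standalone sketch of that callback pattern follows. It assumes the usual "address type name" line layout and reads from a caller-supplied FILE * rather than opening /proc/kallsyms itself; the demo_ names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Same callback shape as kallsyms_cb_t in the patch. */
typedef int (*demo_kallsyms_cb_t)(void *arg, unsigned long long sym_addr,
				  char sym_type, const char *sym_name);

/* Parse "addr type name" lines from 'f'; stop at the first callback error. */
int demo_kallsyms_parse(FILE *f, void *arg, demo_kallsyms_cb_t cb)
{
	char sym_type, sym_name[500];
	unsigned long long sym_addr;
	int ret, err = 0;

	assert(f && cb);
	while (1) {
		/* The trailing %*[^\n] skips an optional "[module]" suffix. */
		ret = fscanf(f, "%llx %c %499s%*[^\n]\n",
			     &sym_addr, &sym_type, sym_name);
		if (ret == EOF && feof(f))
			break;
		if (ret != 3) {
			err = -EINVAL;
			break;
		}
		err = cb(arg, sym_addr, sym_type, sym_name);
		if (err)
			break;
	}
	return err;
}

/* Hypothetical callback state: look up the address of one symbol name. */
struct demo_lookup {
	const char *name;
	unsigned long long addr;
	int found;
};

int demo_lookup_cb(void *arg, unsigned long long sym_addr,
		   char sym_type, const char *sym_name)
{
	struct demo_lookup *l = arg;

	(void)sym_type;	/* unused in this sketch */
	if (strcmp(sym_name, l->name) == 0) {
		l->addr = sym_addr;
		l->found = 1;
	}
	return 0;	/* keep walking; a non-zero value would stop the parse */
}
```

With this split, a second user such as symbol-to-address resolution for multi-attach only has to supply its own callback and context.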

2022-02-04 21:20:16

by Alexei Starovoitov

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <[email protected]> wrote:
>
> On Thu, 3 Feb 2022 18:12:11 -0800
> Alexei Starovoitov <[email protected]> wrote:
>
> > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > transparently.
> >
> > Not true.
> > fprobe is nothing but _explicit_ kprobe on ftrace.
> > There was an implicit optimization for kprobe when ftrace
> > could be used.
> > All this new interface is doing is making it explicit.
> > So a new name is not warranted here.
> >
> > > from that viewpoint, fprobe and kprobe interface are similar but different.
> >
> > What is the difference?
> > I don't see it.
>
> IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> abilities that a normal kprobe does not. Namely, "what is the function
> parameters?"
>
> You can only reliably get the parameters at function entry. Hence, by
> having a probe that is unique to functions as opposed to the middle of a
> function, makes sense to me.
>
> That is, the API can change. "Give me parameter X". That along with some
> BTF reading, could figure out how to get parameter X, and record that.

This is more or less a description of kprobe on ftrace :)
The bpf+kprobe users were relying on that for a long time.
See PT_REGS_PARM1() macros in bpf_tracing.h
They're meaningful only with kprobe on ftrace.
So, no, fprobe is not inventing anything new here.

No one is using kprobe in the middle of the function.
It's too difficult to make anything useful out of it,
so no one bothers.
When people say "kprobe" 99 out of 100 they mean
kprobe on ftrace/fentry.
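
The point about parameters being reliable only at function entry can be illustrated with a small mock. The sketch below assumes the x86-64 System V calling convention, where the first two integer arguments arrive in %rdi and %rsi. The struct and macros are hypothetical stand-ins for illustration, not the kernel's struct pt_regs or the real macros from bpf_tracing.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the register snapshot a probe handler receives. */
struct demo_pt_regs {
	uint64_t di, si, dx, cx, r8, r9;	/* integer argument registers, in order */
	uint64_t ax;				/* return value register */
};

/* Mock equivalents of "give me parameter N" accessors. */
#define DEMO_REGS_PARM1(regs) ((regs)->di)
#define DEMO_REGS_PARM2(regs) ((regs)->si)

/* At function entry the argument registers still hold the caller's values. */
uint64_t demo_first_arg_at_entry(const struct demo_pt_regs *regs)
{
	assert(regs);
	return DEMO_REGS_PARM1(regs);
}

/* Simulates the function body reusing %rdi as a scratch register. */
void demo_run_body(struct demo_pt_regs *regs)
{
	regs->di = 0xdeadbeef;
}
```

Once the body has run, the same accessor returns whatever the compiler left in the register, which is why a probe placed mid-function cannot use it to recover arguments.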

2022-02-04 22:59:58

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

Hi Alexei,

On Thu, 3 Feb 2022 18:42:22 -0800
Alexei Starovoitov <[email protected]> wrote:

> On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <[email protected]> wrote:
> >
> > On Thu, 3 Feb 2022 18:12:11 -0800
> > Alexei Starovoitov <[email protected]> wrote:
> >
> > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > transparently.
> > >
> > > Not true.
> > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > There was an implicit optimization for kprobe when ftrace
> > > could be used.
> > > All this new interface is doing is making it explicit.
> > > So a new name is not warranted here.
> > >
> > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > >
> > > What is the difference?
> > > I don't see it.
> >
> > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > abilities that a normal kprobe does not. Namely, "what is the function
> > parameters?"
> >
> > You can only reliably get the parameters at function entry. Hence, by
> > having a probe that is unique to functions as opposed to the middle of a
> > function, makes sense to me.
> >
> > That is, the API can change. "Give me parameter X". That along with some
> > BTF reading, could figure out how to get parameter X, and record that.
>
> This is more or less a description of kprobe on ftrace :)
> The bpf+kprobe users were relying on that for a long time.
> See PT_REGS_PARM1() macros in bpf_tracing.h
> They're meaningful only with kprobe on ftrace.
> So, no, fprobe is not inventing anything new here.

Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can
use it even with CONFIG_FUNCTION_TRACER=n, as long as your kernel is built
with CONFIG_KPROBES=y. It is valid unless you put the probe somewhere other
than the function entry.

> No one is using kprobe in the middle of the function.
> It's too difficult to make anything useful out of it,
> so no one bothers.
> When people say "kprobe" 99 out of 100 they mean
> kprobe on ftrace/fentry.

I see. But a kprobe is a kprobe. It is not designed to support multiple
probe points. If pressed, I can rename struct fprobe to
struct multi_kprobe, but that doesn't change the essence. You may still need
to use both kprobes and the so-called multi_kprobe as appropriate. (Someone
needs to do that.)

Thank you,

--
Masami Hiramatsu <[email protected]>

2022-02-04 23:40:10

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Thu, 3 Feb 2022 17:34:54 -0800
Alexei Starovoitov <[email protected]> wrote:

> On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <[email protected]> wrote:
> >
> > I thought What Alexei pointed was that don't expose the FPROBE name
> > to user space. If so, I agree with that. We can continue to use
> > KPROBE for user space. Using fprobe is just for kernel implementation.
>
> Clearly that intent is not working.

Thanks for the confirmation :-)

> The "fprobe" name is already leaking outside of the kernel internals.
> The module interface is being proposed.

Yes, but that is only for building the example module.
It is easy for me to keep it inside the kernel. I'm preparing KUnit
selftest code for the next version. Once that is integrated, we won't need
the example module anymore.

> You'd need to document it, etc.

Yes, I've added a document describing the APIs to the series. :-)

> I think it's only causing confusion to users.
> The new name serves no additional purpose other than
> being new and unheard of.
> fprobe is kprobe on ftrace. That's it.

No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
transparently.

> Just call it kprobe on ftrace in api and everywhere.
> Please?

Hmm, no, I think that's a job for whoever provides the user interface, isn't it?
Inside the kernel, IMHO, interfaces are named from the programming viewpoint, and
from that viewpoint, the fprobe and kprobe interfaces are similar but different.

I can allow kprobe-event (of ftrace) to accept "func*" (yeah, that's
actually a good idea), but the ftrace interface will not export it as fprobe.
Even if it internally uses fprobe, I won't call it fprobe. It is kprobes from
the viewpoint of the ftrace user. (Yeah, I think it should be called
"dynamic-probe-event-for-kernel", but historically it is called kprobe-event.)

Thank you,

--
Masami Hiramatsu <[email protected]>

2022-02-07 06:18:12

by Alexei Starovoitov

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Thu, Feb 3, 2022 at 6:07 PM Masami Hiramatsu <[email protected]> wrote:
>
> On Thu, 3 Feb 2022 17:34:54 -0800
> Alexei Starovoitov <[email protected]> wrote:
>
> > On Thu, Feb 3, 2022 at 4:46 PM Masami Hiramatsu <[email protected]> wrote:
> > >
> > > I thought What Alexei pointed was that don't expose the FPROBE name
> > > to user space. If so, I agree with that. We can continue to use
> > > KPROBE for user space. Using fprobe is just for kernel implementation.
> >
> > Clearly that intent is not working.
>
> Thanks for confirmation :-)
>
> > The "fprobe" name is already leaking outside of the kernel internals.
> > The module interface is being proposed.
>
> Yes, but that is only for making the example module.
> It is easy for me to enclose it inside kernel. I'm preparing KUnit
> selftest code for next version. After integrated that, we don't need
> that example module anymore.
>
> > You'd need to document it, etc.
>
> Yes, I've added a document of the APIs for the series. :-)
>
> > I think it's only causing confusion to users.
> > The new name serves no additional purpose other than
> > being new and unheard of.
> > fprobe is kprobe on ftrace. That's it.
>
> No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> transparently.

Not true.
fprobe is nothing but _explicit_ kprobe on ftrace.
There was an implicit optimization for kprobe when ftrace
could be used.
All this new interface is doing is making it explicit.
So a new name is not warranted here.

> from that viewpoint, fprobe and kprobe interface are similar but different.

What is the difference?
I don't see it.

2022-02-09 06:36:32

Subject: Re: [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function

On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <[email protected]> wrote:
>
> Move the kallsyms parsing in internal libbpf__kallsyms_parse
> function, so it can be used from other places.
>
> It will be used in following changes.
>
> Signed-off-by: Jiri Olsa <[email protected]>
> ---
> tools/lib/bpf/libbpf.c | 62 ++++++++++++++++++++-------------
> tools/lib/bpf/libbpf_internal.h | 5 +++
> 2 files changed, 43 insertions(+), 24 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 1b0936b016d9..7d595cfd03bc 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -7165,12 +7165,10 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
> return 0;
> }
>
> -static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
> +int libbpf__kallsyms_parse(void *arg, kallsyms_cb_t cb)

please call it libbpf_kallsyms_parse(), internal APIs don't use
"object oriented" double underscore separator

also this "arg" is normally called "ctx" in similar APIs in libbpf and
is passed last, can you please adjust all that for consistency?

> {
> char sym_type, sym_name[500];
> unsigned long long sym_addr;
> - const struct btf_type *t;
> - struct extern_desc *ext;
> int ret, err = 0;
> FILE *f;
>

[...]

2022-02-09 10:30:08

Subject: Re: [PATCH 4/8] libbpf: Add libbpf__kallsyms_parse function

On Mon, Feb 07, 2022 at 10:59:24AM -0800, Andrii Nakryiko wrote:
> On Wed, Feb 2, 2022 at 5:54 AM Jiri Olsa <[email protected]> wrote:
> >
> > Move the kallsyms parsing in internal libbpf__kallsyms_parse
> > function, so it can be used from other places.
> >
> > It will be used in following changes.
> >
> > Signed-off-by: Jiri Olsa <[email protected]>
> > ---
> > tools/lib/bpf/libbpf.c | 62 ++++++++++++++++++++-------------
> > tools/lib/bpf/libbpf_internal.h | 5 +++
> > 2 files changed, 43 insertions(+), 24 deletions(-)
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index 1b0936b016d9..7d595cfd03bc 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -7165,12 +7165,10 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
> > return 0;
> > }
> >
> > -static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
> > +int libbpf__kallsyms_parse(void *arg, kallsyms_cb_t cb)
>
> please call it libbpf_kallsyms_parse(), internal APIs don't use
> "object oriented" double underscore separator
>
> also this "arg" is normally called "ctx" in similar APIs in libbpf and
> is passed the last, can you please adjust all that for consistency?

ok, thanks

jirka

>
> > {
> > char sym_type, sym_name[500];
> > unsigned long long sym_addr;
> > - const struct btf_type *t;
> > - struct extern_desc *ext;
> > int ret, err = 0;
> > FILE *f;
> >
>
> [...]
>

2022-02-15 15:30:22

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> Hi Alexei,
>
> On Thu, 3 Feb 2022 18:42:22 -0800
> Alexei Starovoitov <[email protected]> wrote:
>
> > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <[email protected]> wrote:
> > >
> > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > Alexei Starovoitov <[email protected]> wrote:
> > >
> > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > transparently.
> > > >
> > > > Not true.
> > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > There was an implicit optimization for kprobe when ftrace
> > > > could be used.
> > > > All this new interface is doing is making it explicit.
> > > > So a new name is not warranted here.
> > > >
> > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > >
> > > > What is the difference?
> > > > I don't see it.
> > >
> > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > abilities that a normal kprobe does not. Namely, "what is the function
> > > parameters?"
> > >
> > > You can only reliably get the parameters at function entry. Hence, by
> > > having a probe that is unique to functions as opposed to the middle of a
> > > function, makes sense to me.
> > >
> > > That is, the API can change. "Give me parameter X". That along with some
> > > BTF reading, could figure out how to get parameter X, and record that.
> >
> > This is more or less a description of kprobe on ftrace :)
> > The bpf+kprobe users were relying on that for a long time.
> > See PT_REGS_PARM1() macros in bpf_tracing.h
> > They're meaningful only with kprobe on ftrace.
> > So, no, fprobe is not inventing anything new here.
>
> Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can use
> it even if CONFIG_FUNCTION_TRACER=n if your kernel is built with
> CONFIG_KPROBES=y. It is valid unless you put a probe out of function
> entry.
>
> > No one is using kprobe in the middle of the function.
> > It's too difficult to make anything useful out of it,
> > so no one bothers.
> > When people say "kprobe" 99 out of 100 they mean
> > kprobe on ftrace/fentry.
>
> I see. But the kprobe is kprobe. It is not designed to support multiple
> probe points. If I'm forced to say, I can rename the struct fprobe to
> struct multi_kprobe, but that doesn't change the essence. You may need
> to use both of kprobes and so-called multi_kprobe properly. (Someone
> need to do that.)

hi,
trying to kick things further ;-) I was thinking about the bpf side of this
and we could use the following interface:

  enum bpf_attach_type {
    ...
    BPF_TRACE_KPROBE_MULTI
  };

  enum bpf_link_type {
    ...
    BPF_LINK_TYPE_KPROBE_MULTI
  };

  union bpf_attr {

    struct {
      ...
      struct {
        __aligned_u64   syms;
        __aligned_u64   addrs;
        __aligned_u64   cookies;
        __u32           cnt;
        __u32           flags;
      } kprobe_multi;
    } link_create;
  }

because from the bpf user's POV it's a new link for attaching multiple kprobes,
and I agree that a new 'fprobe' type name here brings more confusion; using
kprobe_multi is straightforward

thoughts?

thanks,
jirka
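
The proposed kprobe_multi layout carries user-space pointers as 64-bit integers, with the three pointer-sized fields first and the two 32-bit fields last so that no padding is needed. The sketch below shows how a caller might fill such a struct. It assumes syms points at an array of cnt symbol-name strings and cookies at a parallel array of cnt values, which the proposal above does not spell out. The demo_ struct and helpers are hypothetical mirrors using plain C types, not the UAPI definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the proposed kprobe_multi layout. */
struct demo_kprobe_multi {
	uint64_t syms;		/* assumed: pointer to 'cnt' symbol-name strings */
	uint64_t addrs;		/* assumed: pointer to 'cnt' addresses */
	uint64_t cookies;	/* assumed: pointer to 'cnt' u64 cookies */
	uint32_t cnt;
	uint32_t flags;
};

/* Three 8-byte fields plus two 4-byte fields, with no padding in between. */
static_assert(sizeof(struct demo_kprobe_multi) == 32,
	      "unexpected padding in demo_kprobe_multi");

/* Carry a user-space pointer in a fixed-width 64-bit field. */
uint64_t demo_ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Fill the struct for attach-by-symbol-name; returns 0 or -1 on bad input. */
int demo_fill_by_syms(struct demo_kprobe_multi *km, const char *const *syms,
		      const uint64_t *cookies, uint32_t cnt)
{
	if (!km || !syms || cnt == 0)
		return -1;
	km->syms = demo_ptr_to_u64(syms);
	km->addrs = 0;				/* not used when attaching by name */
	km->cookies = demo_ptr_to_u64(cookies);	/* may be 0 when no cookies are given */
	km->cnt = cnt;
	km->flags = 0;
	return 0;
}
```

Using fixed-width integers for pointers keeps the layout identical for 32-bit and 64-bit callers, which is presumably why the proposal uses __aligned_u64 for these fields.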

2022-02-16 19:42:09

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <[email protected]> wrote:
>
> On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > Hi Alexei,
> >
> > On Thu, 3 Feb 2022 18:42:22 -0800
> > Alexei Starovoitov <[email protected]> wrote:
> >
> > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <[email protected]> wrote:
> > > >
> > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > Alexei Starovoitov <[email protected]> wrote:
> > > >
> > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > transparently.
> > > > >
> > > > > Not true.
> > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > There was an implicit optimization for kprobe when ftrace
> > > > > could be used.
> > > > > All this new interface is doing is making it explicit.
> > > > > So a new name is not warranted here.
> > > > >
> > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > >
> > > > > What is the difference?
> > > > > I don't see it.
> > > >
> > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > abilities that a normal kprobe does not. Namely, "what is the function
> > > > parameters?"
> > > >
> > > > You can only reliably get the parameters at function entry. Hence, by
> > > > having a probe that is unique to functions as opposed to the middle of a
> > > > function, makes sense to me.
> > > >
> > > > That is, the API can change. "Give me parameter X". That along with some
> > > > BTF reading, could figure out how to get parameter X, and record that.
> > >
> > > This is more or less a description of kprobe on ftrace :)
> > > The bpf+kprobe users were relying on that for a long time.
> > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > They're meaningful only with kprobe on ftrace.
> > > So, no, fprobe is not inventing anything new here.
> >
> > Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can use
> > it even if CONFIG_FUNCTION_TRACER=n if your kernel is built with
> > CONFIG_KPROBES=y. It is valid unless you put a probe out of function
> > entry.
> >
> > > No one is using kprobe in the middle of the function.
> > > It's too difficult to make anything useful out of it,
> > > so no one bothers.
> > > When people say "kprobe" 99 out of 100 they mean
> > > kprobe on ftrace/fentry.
> >
> > I see. But the kprobe is kprobe. It is not designed to support multiple
> > probe points. If I'm forced to say, I can rename the struct fprobe to
> > struct multi_kprobe, but that doesn't change the essence. You may need
> > to use both of kprobes and so-called multi_kprobe properly. (Someone
> > need to do that.)
>
> hi,
> trying to kick things further ;-) I was thinking about bpf side of this
> and we could use following interface:
>
> enum bpf_attach_type {
> ...
> BPF_TRACE_KPROBE_MULTI
> };
>
> enum bpf_link_type {
> ...
> BPF_LINK_TYPE_KPROBE_MULTI
> };
>
> union bpf_attr {
>
> struct {
> ...
> struct {
> __aligned_u64 syms;
> __aligned_u64 addrs;
> __aligned_u64 cookies;
> __u32 cnt;
> __u32 flags;
> } kprobe_multi;
> } link_create;
> }
>
> because from bpf user POV it's new link for attaching multiple kprobes
> and I agree new 'fprobe' type name in here brings more confusion, using
> kprobe_multi is straightforward
>
> thoughts?

I think this makes sense. We do need a new type of link to store the ip ->
cookie mapping anyway.

Is there any chance to support this fast multi-attach for uprobe? If
yes, we might want to reuse the same link for both (so should we name
it more generically? on the other hand BPF program type for uprobe is
BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
consistent with what we have today).

But yeah, the main question is whether there is something preventing
us from supporting multi-attach uprobe as well? It would be really
great for USDT use case.

>
> thanks,
> jirka
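
One way a link could keep the ip -> cookie mapping mentioned above is a table sorted by address, searched with a binary search when a probe fires. The thread does not say which data structure would be used, so the sketch below is only an assumption for illustration, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-link table entry: probed address and its user cookie. */
struct demo_ip_cookie {
	uint64_t ip;
	uint64_t cookie;
};

/* Order entries by address without risking overflow from subtraction. */
static int demo_cmp_ip(const void *a, const void *b)
{
	const struct demo_ip_cookie *x = a, *y = b;

	return (x->ip > y->ip) - (x->ip < y->ip);
}

/* Sort once at attach time so every later lookup is O(log n). */
void demo_sort_map(struct demo_ip_cookie *map, size_t cnt)
{
	qsort(map, cnt, sizeof(*map), demo_cmp_ip);
}

/* Return the cookie registered for 'ip', or 0 when the address is unknown. */
uint64_t demo_cookie_for_ip(const struct demo_ip_cookie *map, size_t cnt,
			    uint64_t ip)
{
	struct demo_ip_cookie key = { .ip = ip };
	const struct demo_ip_cookie *hit;

	assert(map || cnt == 0);
	if (cnt == 0)
		return 0;
	hit = bsearch(&key, map, cnt, sizeof(*map), demo_cmp_ip);
	return hit ? hit->cookie : 0;
}
```

Sorting at attach time moves the cost out of the hot path, which matters when a single link covers many functions.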

2022-02-17 14:39:36

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Wed, 16 Feb 2022 10:27:19 -0800
Andrii Nakryiko <[email protected]> wrote:

> On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <[email protected]> wrote:
> >
> > On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > > Hi Alexei,
> > >
> > > On Thu, 3 Feb 2022 18:42:22 -0800
> > > Alexei Starovoitov <[email protected]> wrote:
> > >
> > > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <[email protected]> wrote:
> > > > >
> > > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > > Alexei Starovoitov <[email protected]> wrote:
> > > > >
> > > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > > transparently.
> > > > > >
> > > > > > Not true.
> > > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > > There was an implicit optimization for kprobe when ftrace
> > > > > > could be used.
> > > > > > All this new interface is doing is making it explicit.
> > > > > > So a new name is not warranted here.
> > > > > >
> > > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > > >
> > > > > > What is the difference?
> > > > > > I don't see it.
> > > > >
> > > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > > abilities that a normal kprobe does not. Namely, "what is the function
> > > > > parameters?"
> > > > >
> > > > > You can only reliably get the parameters at function entry. Hence, by
> > > > > having a probe that is unique to functions as opposed to the middle of a
> > > > > function, makes sense to me.
> > > > >
> > > > > That is, the API can change. "Give me parameter X". That along with some
> > > > > BTF reading, could figure out how to get parameter X, and record that.
> > > >
> > > > This is more or less a description of kprobe on ftrace :)
> > > > The bpf+kprobe users were relying on that for a long time.
> > > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > > They're meaningful only with kprobe on ftrace.
> > > > So, no, fprobe is not inventing anything new here.
> > >
> > > Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can use
> > > it even if CONFIG_FUNCTION_TRACER=n if your kernel is built with
> > > CONFIG_KPROBES=y. It is valid unless you put a probe out of function
> > > entry.
> > >
> > > > No one is using kprobe in the middle of the function.
> > > > It's too difficult to make anything useful out of it,
> > > > so no one bothers.
> > > > When people say "kprobe" 99 out of 100 they mean
> > > > kprobe on ftrace/fentry.
> > >
> > > I see. But the kprobe is kprobe. It is not designed to support multiple
> > > probe points. If I'm forced to say, I can rename the struct fprobe to
> > > struct multi_kprobe, but that doesn't change the essence. You may need
> > > to use both of kprobes and so-called multi_kprobe properly. (Someone
> > > need to do that.)
> >
> > hi,
> > trying to kick things further ;-) I was thinking about bpf side of this
> > and we could use following interface:
> >
> > enum bpf_attach_type {
> > ...
> > BPF_TRACE_KPROBE_MULTI
> > };
> >
> > enum bpf_link_type {
> > ...
> > BPF_LINK_TYPE_KPROBE_MULTI
> > };
> >
> > union bpf_attr {
> >
> > struct {
> > ...
> > struct {
> > __aligned_u64 syms;
> > __aligned_u64 addrs;
> > __aligned_u64 cookies;
> > __u32 cnt;
> > __u32 flags;
> > } kprobe_multi;
> > } link_create;
> > }
> >
> > because from bpf user POV it's new link for attaching multiple kprobes
> > and I agree new 'fprobe' type name in here brings more confusion, using
> > kprobe_multi is straightforward
> >
> > thoughts?
>
> I think this makes sense. We do need new type of link to store ip ->
> cookie mapping anyways.

This looks good to me too.

>
> Is there any chance to support this fast multi-attach for uprobe? If
> yes, we might want to reuse the same link for both (so should we name
> it more generically?

There is no interface to do that, but there is also nothing preventing us
from extending uprobes. For kprobes, there are some limitations at the
function entry because it needs to share the space with ftrace. So
I introduced fprobe to make it easier to use.

> on the other hand BPF program type for uprobe is
> BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> consistent with what we have today).

Hmm, I'm not sure why BPF made such a design choice... (Uprobe needs
the target program.)

> But yeah, the main question is whether there is something preventing
> us from supporting multi-attach uprobe as well? It would be really
> great for USDT use case.

Ah, for USDT it would be useful. But since we will now have "user-event",
which is faster than uprobes, it may be better to consider using that.

I'm not so sure how uprobes probes the target process, but it probably has
to manage some memory pages and task-related things. If we can split
the task-related part from the software-breakpoint part of struct uprobe,
it may be easy to support multiple probes (one task-related part + multiple
software-breakpoint parts).

Thank you,

--
Masami Hiramatsu <[email protected]>

2022-02-17 23:50:54

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Thu, Feb 17, 2022 at 6:04 AM Masami Hiramatsu <[email protected]> wrote:
>
> On Wed, 16 Feb 2022 10:27:19 -0800
> Andrii Nakryiko <[email protected]> wrote:
>
> > On Tue, Feb 15, 2022 at 5:21 AM Jiri Olsa <[email protected]> wrote:
> > >
> > > On Fri, Feb 04, 2022 at 12:59:42PM +0900, Masami Hiramatsu wrote:
> > > > Hi Alexei,
> > > >
> > > > On Thu, 3 Feb 2022 18:42:22 -0800
> > > > Alexei Starovoitov <[email protected]> wrote:
> > > >
> > > > > On Thu, Feb 3, 2022 at 6:19 PM Steven Rostedt <[email protected]> wrote:
> > > > > >
> > > > > > On Thu, 3 Feb 2022 18:12:11 -0800
> > > > > > Alexei Starovoitov <[email protected]> wrote:
> > > > > >
> > > > > > > > No, fprobe is NOT kprobe on ftrace, kprobe on ftrace is already implemented
> > > > > > > > transparently.
> > > > > > >
> > > > > > > Not true.
> > > > > > > fprobe is nothing but _explicit_ kprobe on ftrace.
> > > > > > > There was an implicit optimization for kprobe when ftrace
> > > > > > > could be used.
> > > > > > > All this new interface is doing is making it explicit.
> > > > > > > So a new name is not warranted here.
> > > > > > >
> > > > > > > > from that viewpoint, fprobe and kprobe interface are similar but different.
> > > > > > >
> > > > > > > What is the difference?
> > > > > > > I don't see it.
> > > > > >
> > > > > > IIUC, a kprobe on a function (or ftrace, aka fprobe) gives some extra
> > > > > > abilities that a normal kprobe does not. Namely, "what is the function
> > > > > > parameters?"
> > > > > >
> > > > > > You can only reliably get the parameters at function entry. Hence, by
> > > > > > having a probe that is unique to functions as opposed to the middle of a
> > > > > > function, makes sense to me.
> > > > > >
> > > > > > That is, the API can change. "Give me parameter X". That along with some
> > > > > > BTF reading, could figure out how to get parameter X, and record that.
> > > > >
> > > > > This is more or less a description of kprobe on ftrace :)
> > > > > The bpf+kprobe users were relying on that for a long time.
> > > > > See PT_REGS_PARM1() macros in bpf_tracing.h
> > > > > They're meaningful only with kprobe on ftrace.
> > > > > So, no, fprobe is not inventing anything new here.
> > > >
> > > > Hmm, you may be misunderstanding why the PT_REGS_PARM1() macro works. You can use
> > > > it even if CONFIG_FUNCTION_TRACER=n if your kernel is built with
> > > > CONFIG_KPROBES=y. It is valid unless you put a probe out of function
> > > > entry.
> > > >
> > > > > No one is using kprobe in the middle of the function.
> > > > > It's too difficult to make anything useful out of it,
> > > > > so no one bothers.
> > > > > When people say "kprobe" 99 out of 100 they mean
> > > > > kprobe on ftrace/fentry.
> > > >
> > > > I see. But the kprobe is kprobe. It is not designed to support multiple
> > > > probe points. If I'm forced to say, I can rename the struct fprobe to
> > > > struct multi_kprobe, but that doesn't change the essence. You may need
> > > > to use both of kprobes and so-called multi_kprobe properly. (Someone
> > > > need to do that.)
> > >
> > > hi,
> > > trying to kick things further ;-) I was thinking about the bpf side of this
> > > and we could use the following interface:
> > >
> > > enum bpf_attach_type {
> > > ...
> > > BPF_TRACE_KPROBE_MULTI
> > > };
> > >
> > > enum bpf_link_type {
> > > ...
> > > BPF_LINK_TYPE_KPROBE_MULTI
> > > };
> > >
> > > union bpf_attr {
> > >
> > > struct {
> > > ...
> > > struct {
> > > __aligned_u64 syms;
> > > __aligned_u64 addrs;
> > > __aligned_u64 cookies;
> > > __u32 cnt;
> > > __u32 flags;
> > > } kprobe_multi;
> > > } link_create;
> > > }
> > >
> > > because from the bpf user's POV it's a new link for attaching multiple kprobes,
> > > and I agree the new 'fprobe' type name here brings more confusion; using
> > > kprobe_multi is straightforward
> > >
> > > thoughts?
> >
> > I think this makes sense. We do need new type of link to store ip ->
> > cookie mapping anyways.
>
> This looks good to me too.
>
> >
> > Is there any chance to support this fast multi-attach for uprobe? If
> > yes, we might want to reuse the same link for both (so should we name
> > it more generically?
>
> There is no interface to do that but also there is no limitation to
> expand uprobes. For the kprobes, there are some limitations for the
> function entry because it needs to share the space with ftrace. So
> I introduced fprobe for easier to use.
>
> > on the other hand BPF program type for uprobe is
> > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > consistent with what we have today).
>
> Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> the target program.)
>

We've been talking about sleepable uprobe programs, so we might need
to add uprobe-specific program type, probably. But historically, from
BPF point of view there was no difference between kprobe and uprobe
programs (in terms of how they are run and what's available to them).
From BPF point of view, it was just attaching BPF program to a
perf_event.

>
> > But yeah, the main question is whether there is something preventing
> > us from supporting multi-attach uprobe as well? It would be really
> > great for USDT use case.
>
> Ah, for the USDT, it will be useful. But since now we will have "user-event"
> which is faster than uprobes, we may be better to consider to use it.

Any pointers? I'm not sure what "user-event" refers to.

>
> I'm not so sure how uprobes probes the target process, but maybe it has
> to manage some memory pages and task related things. If we can split
> those task-related part from struct uprobe software-breakpoint part,
> it maybe easy to support multiple probe (one task-related part + multiple
> software-breakpoint parts.)
>
> Thank you,
>
> --
> Masami Hiramatsu <[email protected]>

2022-02-18 04:21:12

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Thu, 17 Feb 2022 14:01:30 -0800
Andrii Nakryiko <[email protected]> wrote:

> > > Is there any chance to support this fast multi-attach for uprobe? If
> > > yes, we might want to reuse the same link for both (so should we name
> > > it more generically?
> >
> > There is no interface to do that but also there is no limitation to
> > expand uprobes. For the kprobes, there are some limitations for the
> > function entry because it needs to share the space with ftrace. So
> > I introduced fprobe for easier to use.
> >
> > > on the other hand BPF program type for uprobe is
> > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > consistent with what we have today).
> >
> > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > the target program.)
> >
>
> We've been talking about sleepable uprobe programs, so we might need
> to add uprobe-specific program type, probably. But historically, from
> BPF point of view there was no difference between kprobe and uprobe
> programs (in terms of how they are run and what's available to them).
> From BPF point of view, it was just attaching BPF program to a
> perf_event.

Got it, so that will reuse the uprobe_events in ftrace. But I think
the uprobe requires a "path" to the attached binary, how is it
specified?

> > > But yeah, the main question is whether there is something preventing
> > > us from supporting multi-attach uprobe as well? It would be really
> > > great for USDT use case.
> >
> > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > which is faster than uprobes, we may be better to consider to use it.
>
> Any pointers? I'm not sure what "user-event" refers to.

Here is the user-events series, which allows a user program to define
raw dynamic events and write raw event data directly from
user space.

https://lore.kernel.org/all/[email protected]/

Thank you,

--
Masami Hiramatsu <[email protected]>

2022-02-19 07:08:38

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Thu, Feb 17, 2022 at 8:07 PM Masami Hiramatsu <[email protected]> wrote:
>
> On Thu, 17 Feb 2022 14:01:30 -0800
> Andrii Nakryiko <[email protected]> wrote:
>
>
> > > > Is there any chance to support this fast multi-attach for uprobe? If
> > > > yes, we might want to reuse the same link for both (so should we name
> > > > it more generically?
> > >
> > > There is no interface to do that but also there is no limitation to
> > > expand uprobes. For the kprobes, there are some limitations for the
> > > function entry because it needs to share the space with ftrace. So
> > > I introduced fprobe for easier to use.
> > >
> > > > on the other hand BPF program type for uprobe is
> > > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > > consistent with what we have today).
> > >
> > > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > > the target program.)
> > >
> >
> > We've been talking about sleepable uprobe programs, so we might need
> > to add uprobe-specific program type, probably. But historically, from
> > BPF point of view there was no difference between kprobe and uprobe
> > programs (in terms of how they are run and what's available to them).
> > From BPF point of view, it was just attaching BPF program to a
> > perf_event.
>
> Got it, so that will reuse the uprobe_events in ftrace. But I think
> the uprobe requires a "path" to the attached binary, how is it
> specified?

It's passed as a string to perf subsystem during perf_event_open() syscall.
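
To make that concrete, here is a rough sketch of how the path ends up in
perf_event_attr. It is not the libbpf code itself: the helper names
fill_uprobe_attr() and open_uprobe_event() are made up for illustration,
and the PMU type is assumed to have been read beforehand from
/sys/bus/event_source/devices/uprobe/type.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: describe a uprobe at `offset` inside `binary_path`.
 * `pmu_type` is the dynamic "uprobe" PMU type (assumed to be read from
 * sysfs by the caller). The string must stay valid until the syscall.
 */
void fill_uprobe_attr(struct perf_event_attr *attr, uint32_t pmu_type,
                      const char *binary_path, uint64_t offset)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = pmu_type;
    /* config1 shares a union with uprobe_path: pointer to the path string */
    attr->config1 = (uint64_t)(uintptr_t)binary_path;
    /* config2 shares a union with probe_offset: offset within the binary */
    attr->config2 = offset;
}

/*
 * Hypothetical helper: open the event. pid == -1 means "all processes",
 * which requires a concrete CPU; otherwise follow the task on any CPU.
 * Returns a perf event fd, or -1 with errno set.
 */
int open_uprobe_event(struct perf_event_attr *attr, pid_t pid)
{
    int cpu = (pid == -1) ? 0 : -1;

    return (int)syscall(SYS_perf_event_open, attr, pid, cpu,
                        -1 /* group_fd */, PERF_FLAG_FD_CLOEXEC);
}
```

The BPF program is then attached to the returned perf event fd, the same
way as for a kprobe perf event, which is why BPF never needed to tell the
two apart.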

>
> > > > But yeah, the main question is whether there is something preventing
> > > > us from supporting multi-attach uprobe as well? It would be really
> > > > great for USDT use case.
> > >
> > > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > > which is faster than uprobes, we may be better to consider to use it.
> >
> > Any pointers? I'm not sure what "user-event" refers to.
>
> Here is the user-events series, which allows user program to define
> raw dynamic events and it can write raw event data directly from
> user space.
>
> https://lore.kernel.org/all/[email protected]/
>

Thanks for the link! I'll check it out.

> Thank you,
>
> --
> Masami Hiramatsu <[email protected]>

2022-02-19 15:29:12

by Alexei Starovoitov

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Thu, Feb 17, 2022 at 8:07 PM Masami Hiramatsu <[email protected]> wrote:
>
> On Thu, 17 Feb 2022 14:01:30 -0800
> Andrii Nakryiko <[email protected]> wrote:
>
>
> > > > Is there any chance to support this fast multi-attach for uprobe? If
> > > > yes, we might want to reuse the same link for both (so should we name
> > > > it more generically?
> > >
> > > There is no interface to do that but also there is no limitation to
> > > expand uprobes. For the kprobes, there are some limitations for the
> > > function entry because it needs to share the space with ftrace. So
> > > I introduced fprobe for easier to use.
> > >
> > > > on the other hand BPF program type for uprobe is
> > > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > > consistent with what we have today).
> > >
> > > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > > the target program.)
> > >
> >
> > We've been talking about sleepable uprobe programs, so we might need
> > to add uprobe-specific program type, probably. But historically, from
> > BPF point of view there was no difference between kprobe and uprobe
> > programs (in terms of how they are run and what's available to them).
> > From BPF point of view, it was just attaching BPF program to a
> > perf_event.
>
> Got it, so that will reuse the uprobe_events in ftrace. But I think
> the uprobe requires a "path" to the attached binary, how is it
> specified?
>
> > > > But yeah, the main question is whether there is something preventing
> > > > us from supporting multi-attach uprobe as well? It would be really
> > > > great for USDT use case.
> > >
> > > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > > which is faster than uprobes, we may be better to consider to use it.
> >
> > Any pointers? I'm not sure what "user-event" refers to.
>
> Here is the user-events series, which allows user program to define
> raw dynamic events and it can write raw event data directly from
> user space.
>
> https://lore.kernel.org/all/[email protected]/

Is this a way for user space to inject user bytes into kernel events?
What is the use case?

2022-02-21 09:17:37

Subject: Re: [PATCH 0/8] bpf: Add fprobe link

On Fri, 18 Feb 2022 18:10:08 -0800
Alexei Starovoitov <[email protected]> wrote:

> On Thu, Feb 17, 2022 at 8:07 PM Masami Hiramatsu <[email protected]> wrote:
> >
> > On Thu, 17 Feb 2022 14:01:30 -0800
> > Andrii Nakryiko <[email protected]> wrote:
> >
> >
> > > > > Is there any chance to support this fast multi-attach for uprobe? If
> > > > > yes, we might want to reuse the same link for both (so should we name
> > > > > it more generically?
> > > >
> > > > There is no interface to do that but also there is no limitation to
> > > > expand uprobes. For the kprobes, there are some limitations for the
> > > > function entry because it needs to share the space with ftrace. So
> > > > I introduced fprobe for easier to use.
> > > >
> > > > > on the other hand BPF program type for uprobe is
> > > > > BPF_PROG_TYPE_KPROBE anyway, so keeping it as "kprobe" also would be
> > > > > consistent with what we have today).
> > > >
> > > > Hmm, I'm not sure why BPF made such design choice... (Uprobe needs
> > > > the target program.)
> > > >
> > >
> > > We've been talking about sleepable uprobe programs, so we might need
> > > to add uprobe-specific program type, probably. But historically, from
> > > BPF point of view there was no difference between kprobe and uprobe
> > > programs (in terms of how they are run and what's available to them).
> > > From BPF point of view, it was just attaching BPF program to a
> > > perf_event.
> >
> > Got it, so that will reuse the uprobe_events in ftrace. But I think
> > the uprobe requires a "path" to the attached binary, how is it
> > specified?
> >
> > > > > But yeah, the main question is whether there is something preventing
> > > > > us from supporting multi-attach uprobe as well? It would be really
> > > > > great for USDT use case.
> > > >
> > > > Ah, for the USDT, it will be useful. But since now we will have "user-event"
> > > > which is faster than uprobes, we may be better to consider to use it.
> > >
> > > Any pointers? I'm not sure what "user-event" refers to.
> >
> > Here is the user-events series, which allows user program to define
> > raw dynamic events and it can write raw event data directly from
> > user space.
> >
> > https://lore.kernel.org/all/[email protected]/
>
> Is this a way for user space to inject user bytes into kernel events?

Yes, it is.

> What is the use case?

This is like trace_marker, but a more ftrace/perf-friendly version. trace_marker
can only send a user string, which the kernel cannot parse. Thus, the traced
data is shown in the trace buffer, but the event filter, event trigger,
histogram, etc. do not work with trace_marker.
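
For reference, the trace_marker side amounts to roughly the sketch below.
The helper name write_marker() is made up for illustration, and the usual
tracefs location /sys/kernel/tracing/trace_marker is an assumption about
where tracefs is mounted on a given system.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one free-form string to a trace_marker-style
 * file. The kernel stores the bytes as-is; no fields or types are declared,
 * so there is nothing for filters or histograms to key on.
 * Returns the number of bytes written, or -1 on error.
 */
long write_marker(const char *marker_path, const char *msg)
{
    int fd = open(marker_path, O_WRONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    n = write(fd, msg, strlen(msg));
    close(fd);
    return (long)n;
}
```

A caller would use something like
write_marker("/sys/kernel/tracing/trace_marker", "request 42 done"), and
the string shows up in the trace buffer only as opaque text.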

On the other hand, user-events allow user space to define new events with
various typed arguments, and the application can send the formatted raw
data to the kernel. Thus the kernel can apply event filters, event triggers, and
histograms to those events, the same as for other kernel-defined events.

This will help users push their own data as events to ftrace
and perf (and eBPF, I think) so that they can use those tracing tools to analyze
both their own events and kernel events. :-)

Thank you,

--
Masami Hiramatsu <[email protected]>

2022-02-22 12:57:51