When a user specifies symbols and cookies for the kprobe_multi link
interface, it's very likely the cookies will be misplaced and
returned for the wrong functions (via the get_attach_cookie helper).

The reason is that to resolve the provided functions we sort
them before passing them to ftrace_lookup_symbols, but we do
not apply the same sort to the cookie values.

Fix this by using the sort_r function with a custom swap callback
that swaps the cookie values as well.
Fixes: 0236fec57a15 ("bpf: Resolve symbols with ftrace_lookup_symbols for kprobe multi link")
Signed-off-by: Jiri Olsa <[email protected]>
---
kernel/trace/bpf_trace.c | 60 ++++++++++++++++++++++++++++++----------
1 file changed, 45 insertions(+), 15 deletions(-)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 7a13e6ac6327..88589d74a892 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2423,7 +2423,7 @@ kprobe_multi_link_handler(struct fprobe *fp, unsigned long entry_ip,
kprobe_multi_link_prog_run(link, entry_ip, regs);
}
-static int symbols_cmp(const void *a, const void *b)
+static int symbols_cmp_r(const void *a, const void *b, const void *priv)
{
const char **str_a = (const char **) a;
const char **str_b = (const char **) b;
@@ -2431,6 +2431,28 @@ static int symbols_cmp(const void *a, const void *b)
return strcmp(*str_a, *str_b);
}
+struct multi_symbols_sort {
+ const char **funcs;
+ u64 *cookies;
+};
+
+static void symbols_swap_r(void *a, void *b, int size, const void *priv)
+{
+ const struct multi_symbols_sort *data = priv;
+ const char **name_a = a, **name_b = b;
+
+ swap(*name_a, *name_b);
+
+ /* If defined, swap also related cookies. */
+ if (data->cookies) {
+ u64 *cookie_a, *cookie_b;
+
+ cookie_a = data->cookies + (name_a - data->funcs);
+ cookie_b = data->cookies + (name_b - data->funcs);
+ swap(*cookie_a, *cookie_b);
+ }
+}
+
int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
{
struct bpf_kprobe_multi_link *link = NULL;
@@ -2468,38 +2490,46 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
if (!addrs)
return -ENOMEM;
+ ucookies = u64_to_user_ptr(attr->link_create.kprobe_multi.cookies);
+ if (ucookies) {
+ cookies = kvmalloc_array(cnt, sizeof(*addrs), GFP_KERNEL);
+ if (!cookies) {
+ err = -ENOMEM;
+ goto error;
+ }
+ if (copy_from_user(cookies, ucookies, size)) {
+ err = -EFAULT;
+ goto error;
+ }
+ }
+
if (uaddrs) {
if (copy_from_user(addrs, uaddrs, size)) {
err = -EFAULT;
goto error;
}
} else {
+ struct multi_symbols_sort data = {
+ .cookies = cookies,
+ };
struct user_syms us;
err = copy_user_syms(&us, usyms, cnt);
if (err)
goto error;
- sort(us.syms, cnt, sizeof(*us.syms), symbols_cmp, NULL);
+ if (cookies)
+ data.funcs = us.syms;
+
+ sort_r(us.syms, cnt, sizeof(*us.syms), symbols_cmp_r,
+ symbols_swap_r, &data);
+
err = ftrace_lookup_symbols(us.syms, cnt, addrs);
free_user_syms(&us);
if (err)
goto error;
}
- ucookies = u64_to_user_ptr(attr->link_create.kprobe_multi.cookies);
- if (ucookies) {
- cookies = kvmalloc_array(cnt, sizeof(*addrs), GFP_KERNEL);
- if (!cookies) {
- err = -ENOMEM;
- goto error;
- }
- if (copy_from_user(cookies, ucookies, size)) {
- err = -EFAULT;
- goto error;
- }
- }
-
link = kzalloc(sizeof(*link), GFP_KERNEL);
if (!link) {
err = -ENOMEM;
--
2.35.3
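For context on how the cookies pair with symbols: here is a minimal
user-space sketch of the attach path (hypothetical symbol names, not part
of the patch; assumes libbpf's bpf_program__attach_kprobe_multi_opts API):

#include <bpf/libbpf.h>

/* a minimal sketch: attach one kprobe.multi program to two symbols
 * with per-symbol cookies; "ksys_read" and "ksys_write" are just
 * example names */
static struct bpf_link *attach_with_cookies(struct bpf_program *prog)
{
	const char *syms[] = { "ksys_read", "ksys_write" };
	__u64 cookies[] = { 1, 2 };
	LIBBPF_OPTS(bpf_kprobe_multi_opts, opts,
		.syms = syms,
		.cookies = cookies,
		.cnt = 2,
	);

	return bpf_program__attach_kprobe_multi_opts(prog, NULL, &opts);
}

In the BPF program, bpf_get_attach_cookie(ctx) should then return 1 when
the probe fires in ksys_read and 2 in ksys_write; without the swap
callback above, the kernel sorts the symbols but not the cookies, so the
values can come back attached to the wrong functions.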
On Mon, Jun 6, 2022 at 11:48 AM Jiri Olsa <[email protected]> wrote:
> [...]
It looks good, but something in this patch is causing a regression:
./test_progs -t kprobe_multi
test_kprobe_multi_test:PASS:load_kallsyms 0 nsec
#80/1 kprobe_multi_test/skel_api:OK
#80/2 kprobe_multi_test/link_api_addrs:OK
#80/3 kprobe_multi_test/link_api_syms:OK
#80/4 kprobe_multi_test/attach_api_pattern:OK
#80/5 kprobe_multi_test/attach_api_addrs:OK
#80/6 kprobe_multi_test/attach_api_syms:OK
#80/7 kprobe_multi_test/attach_api_fails:OK
test_bench_attach:PASS:get_syms 0 nsec
test_bench_attach:PASS:kprobe_multi_empty__open_and_load 0 nsec
libbpf: prog 'test_kprobe_empty': failed to attach: No such process
test_bench_attach:FAIL:bpf_program__attach_kprobe_multi_opts
unexpected error: -3
#80/8 kprobe_multi_test/bench_attach:FAIL
#80 kprobe_multi_test:FAIL
CI is unfortunately green, because we don't run it there:
#80/1 kprobe_multi_test/skel_api:OK
#80/2 kprobe_multi_test/link_api_addrs:OK
#80/3 kprobe_multi_test/link_api_syms:OK
#80/4 kprobe_multi_test/attach_api_pattern:OK
#80/5 kprobe_multi_test/attach_api_addrs:OK
#80/6 kprobe_multi_test/attach_api_syms:OK
#80/7 kprobe_multi_test/attach_api_fails:OK
#80 kprobe_multi_test:OK
On Tue, Jun 07, 2022 at 11:40:47AM -0700, Alexei Starovoitov wrote:
> [...]
> test_bench_attach:PASS:kprobe_multi_empty__open_and_load 0 nsec
> libbpf: prog 'test_kprobe_empty': failed to attach: No such process
> test_bench_attach:FAIL:bpf_program__attach_kprobe_multi_opts
> unexpected error: -3
> #80/8 kprobe_multi_test/bench_attach:FAIL
> #80 kprobe_multi_test:FAIL
looks like kallsyms search failed to find some symbol,
but I can't reproduce with:
./vmtest.sh -- ./test_progs -t kprobe_multi
can you share .config you used?
thanks,
jirka
On Tue, Jun 7, 2022 at 12:56 PM Jiri Olsa <[email protected]> wrote:
> [...]
>
> can you share .config you used?
I don't think it's config related.
Patch 2 is doing:
- if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
+ sym = bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp);
+ if (!sym)
+ return 0;
+
+ idx = sym - args->syms;
+ if (args->addrs[idx])
return 0;
addr = ftrace_location(addr);
if (!addr)
return 0;
- args->addrs[args->found++] = addr;
+ args->addrs[idx] = addr;
+ args->found++;
return args->found == args->cnt ? 1 : 0;
There are plenty of functions with the same name
in available_filter_functions. So

	if (args->addrs[idx])
		return 0;

triggers for a lot of them.
At the end args->found != args->cnt.
Here is trivial debug patch:
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 601ccf1b2f09..c567cf56cb57 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -8037,8 +8037,10 @@ static int kallsyms_callback(void *data, const char *name,
return 0;
idx = sym - args->syms;
- if (args->addrs[idx])
+ if (args->addrs[idx]) {
+ printk("idx %x name %s\n", idx, name);
return 0;
+ }
addr = ftrace_location(addr);
if (!addr)
@@ -8078,6 +8080,7 @@ int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *a
err = kallsyms_on_each_symbol(kallsyms_callback, &args);
if (err < 0)
return err;
+ printk("found %zd cnt %zd\n", args.found, args.cnt);
return args.found == args.cnt ? 0 : -ESRCH;
}
[ 13.096160] idx a500 name unregister_vclock
[ 13.096930] idx 82fb name pt_regs_offset
[ 13.106969] idx 92be name set_root
[ 13.107290] idx 4414 name event_function
[ 13.112570] idx 7d1d name phy_init
[ 13.114459] idx 7d13 name phy_exit
[ 13.114777] idx ab91 name watchdog
[ 13.115730] found 46921 cnt 47036
I don't understand how it works for you at all.
It passes in BPF CI only because we don't run
kprobe_multi_test/bench_attach there (yet).
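The collision is easy to see in isolation. A standalone toy program
(made-up names, nothing kernel-side) showing how a duplicated name makes
bsearch land on the same slot twice and leaves another slot unfilled:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int symbols_cmp(const void *a, const void *b)
{
	const char **str_a = (const char **) a;
	const char **str_b = (const char **) b;

	return strcmp(*str_a, *str_b);
}

int main(void)
{
	/* sorted lookup array with a duplicated name */
	const char *syms[] = { "bar", "foo", "foo" };
	unsigned long addrs[3] = {};
	size_t cnt = 3, found = 0, i;
	/* stand-in for the kallsyms walk, which also sees "foo" twice */
	const char *walk[] = { "bar", "foo", "foo" };

	for (i = 0; i < 3; i++) {
		const char *name = walk[i];
		const char **sym;
		size_t idx;

		sym = bsearch(&name, syms, cnt, sizeof(*syms), symbols_cmp);
		idx = sym - syms;
		if (addrs[idx])
			continue;	/* same slot hit again, skipped */
		addrs[idx] = 0x1000 + i; /* stand-in for ftrace_location() */
		found++;
	}
	/* prints "found 2 cnt 3": the second "foo" slot never gets an
	 * address, so the real code would fail with -ESRCH */
	printf("found %zu cnt %zu\n", found, cnt);
	return 0;
}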
On Tue, Jun 07, 2022 at 09:10:32PM -0700, Alexei Starovoitov wrote:
> [...]
>
> There are plenty of functions with the same name
> in available_filter_functions. So
>
> 	if (args->addrs[idx])
> 		return 0;
>
> triggers for a lot of them.
> At the end args->found != args->cnt.
there's code in get_syms (prog_tests/kprobe_multi_test.c)
that filters out duplicates
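roughly, the filtering boils down to something like this standalone
sketch (the actual get_syms uses libbpf's hashmap rather than
sort-and-unique, so this is just the idea, not the selftest code):

/* dedup a symbol list in place, reusing the symbols_cmp comparator
 * from the patch; needs <stdlib.h> and <string.h> */
static size_t unique_syms(const char **syms, size_t cnt)
{
	size_t i, out = 0;

	qsort(syms, cnt, sizeof(*syms), symbols_cmp);
	for (i = 0; i < cnt; i++) {
		if (out && !strcmp(syms[out - 1], syms[i]))
			continue;	/* drop the repeated name */
		syms[out++] = syms[i];
	}
	return out;	/* new count with duplicates removed */
}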
> [...]
>
> I don't understand how it works for you at all.
> It passes in BPF CI only because we don't run
> kprobe_multi_test/bench_attach there (yet).
reproduced after I updated the tree today... not sure why I did
not see it before, going to check
jirka
On Wed, Jun 08, 2022 at 09:59:41AM +0200, Jiri Olsa wrote:
> [...]
>
> reproduced after I updated the tree today... not sure why I did
> not see it before, going to check
ok, I'm not completely crazy ;-) it's the weak functions fix:

  b39181f7c690 ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function

I should have noticed this before (from the changelog):

  A worker thread is added at boot up to scan all the ftrace record entries,
  and will mark any that fail the FTRACE_MCOUNT_MAX_OFFSET test as disabled.

  They will still appear in the available_filter_functions file as:

  __ftrace_invalid_address___<invalid-offset>

  (showing the offset that caused it to be invalid).

Steven,
is there a reason to show '__ftrace_invalid_address___*' symbols in
available_filter_functions? it seems more like a debug message to me

I can easily filter them out, but my assumption was that any symbol
in available_filter_functions could be resolved in /proc/kallsyms

with the workaround patch below the bench test passes for me

thanks,
jirka
---
diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
index 586dc52d6fb9..88086ac23ef3 100644
--- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
@@ -364,6 +364,8 @@ static int get_syms(char ***symsp, size_t *cntp)
continue;
if (!strncmp(name, "rcu_", 4))
continue;
+ if (!strncmp(name, "__ftrace_invalid_address__", sizeof("__ftrace_invalid_address__") - 1))
+ continue;
err = hashmap__add(map, name, NULL);
if (err) {
free(name);
On Wed, 8 Jun 2022 11:57:48 +0200
Jiri Olsa <[email protected]> wrote:
> Steven,
> is there a reason to show '__ftrace_invalid_address___*' symbols in
> available_filter_functions? it seems more like a debug message to me
>
Yes, because set_ftrace_filter may be set by index. That is, if schedule is
the 43,245th entry in available_filter_functions, then you can do:
# echo 43245 > set_ftrace_filter
# cat set_ftrace_filter
schedule
That index must match the array index of the entries in the function list
internally. The reason for this is that entering a name is an O(n)
operation, where n is the number of functions in
available_filter_functions. If you want to enable half of those functions,
then it takes O(n^2) to do so.
I first implemented this trick to help with bisecting bad functions. That
is, every so often a function that should be annotated with notrace isn't,
and if it gets traced it causes the machine to reboot. To bisect this, I
would enable half the functions at a time and enable tracing to see if it
reboots or not. If it does, I know that one of the half I enabled is
the culprit; if not, it's in the other half. It would take over 5 minutes
to enable half the functions. Whereas the number trick took one second:
not only was it O(1) per function, but it did not need to do kallsyms
lookups either. It simply enabled the function at the index.
Later, libtracefs (used by trace-cmd and others) would allow regex(3)
enabling of functions. That is, it would search available_filter_functions
in user space, match them via normal regex, create an index of the
functions to know where they are, and then write in those numbers to enable
them. It's much faster than writing in strings.
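In user space the whole trick is just line counting. A minimal sketch
(assumes tracefs mounted at /sys/kernel/tracing; exact-name match only,
so module entries with a " [module]" suffix are not handled):

#include <stdio.h>
#include <string.h>

/* find the 1-based position of @func in available_filter_functions and
 * write that number into set_ftrace_filter, avoiding the O(n) per-name
 * lookup on the kernel side */
static int enable_by_index(const char *func)
{
	FILE *avail, *filter;
	char line[512];
	int idx = 0, ret = -1;

	avail = fopen("/sys/kernel/tracing/available_filter_functions", "r");
	if (!avail)
		return -1;
	filter = fopen("/sys/kernel/tracing/set_ftrace_filter", "w");
	if (!filter) {
		fclose(avail);
		return -1;
	}
	while (fgets(line, sizeof(line), avail)) {
		idx++;
		line[strcspn(line, "\n")] = '\0';
		if (!strcmp(line, func)) {
			fprintf(filter, "%d\n", idx);
			ret = 0;
			break;
		}
	}
	fclose(filter);
	fclose(avail);
	return ret;
}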
My original fix was to simply ignore those functions, but then it would
make the index no longer match what got set. I noticed this while writing
my slides for Kernel Recipes, and then fixed it.
The commit you mention above even states this:
__ftrace_invalid_address___<invalid-offset>
(showing the offset that caused it to be invalid).
This is required for tools that use libtracefs (like trace-cmd does) that
scan the available_filter_functions and enable set_ftrace_filter and
set_ftrace_notrace using indexes of the function listed in the file (this
is a speedup, as enabling thousands of files via names is an O(n^2)
operation and can take minutes to complete, where the indexing takes less
than a second).
In other words, having a placeholder is required to keep from breaking user
space.
-- Steve
On Wed, Jun 08, 2022 at 08:40:23AM -0400, Steven Rostedt wrote:
> [...]
>
> In other words, having a placeholder is required to keep from breaking user
> space.
ok, we'll have to work around that then
thanks,
jirka
On Wed, Jun 8, 2022 at 5:40 AM Steven Rostedt <[email protected]> wrote:
> [...]
>
> In other words, having a placeholder is required to keep from breaking user
> space.
Would it be possible to preprocess ftrace_pages to remove such invalid
records (so that by the time we have to report
available_filter_functions there are no invalid records)? Or is that
data read-only once the kernel is running?
On Wed, 8 Jun 2022 08:59:50 -0700
Andrii Nakryiko <[email protected]> wrote:
> Would it be possible to preprocess ftrace_pages to remove such invalid
> records (so that by the time we have to report
> available_filter_functions there are no invalid records)? Or is that
> data read-only once the kernel is running?
It's possible, but it would be time consuming (it would slow down boot) and
racy. In other words, I didn't feel it was worth it.

We can add it. How much of an issue is it to have these placeholders for
you? Currently, I only see it causing issues with tests. Is it really an
issue for real use cases?
-- Steve
On Wed, Jun 8, 2022 at 9:08 AM Steven Rostedt <[email protected]> wrote:
> [...]
>
> It's possible, but it would be time consuming (it would slow down boot) and
> racy. In other words, I didn't feel it was worth it.
>
> We can add it. How much of an issue is it to have these placeholders for
> you? Currently, I only see it causing issues with tests. Is it really an
> issue for real use cases?
I have a tool (retsnoop) that uses available_filter_functions, and I'll
have to update it to ignore such entries. It's a small inconvenience
once you know about this change, but multiply that across the many users
that rely on available_filter_functions for some sort of generic tooling
doing kprobes/tracing, and it adds up. So while it's not horrible,
ideally user-visible data shouldn't contain non-usable placeholders.

How much slowdown would you expect at boot? It's not clear to me what
would be racy about this boot-time preprocessing, but I believe you.

So in summary, it's not the end of the world, but as a user I'd prefer
not to know about this quirk, of course :)
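A hypothetical sketch of what that tooling workaround looks like: skip
the placeholders but keep counting them, so any index later written to
set_ftrace_filter still lines up with the kernel's internal list:

#include <stdio.h>
#include <string.h>

#define INVALID_PREFIX "__ftrace_invalid_address___"

static void walk_filter_functions(void)
{
	FILE *f = fopen("/sys/kernel/tracing/available_filter_functions", "r");
	char line[512];
	int idx = 0;

	if (!f)
		return;
	while (fgets(line, sizeof(line), f)) {
		idx++;	/* a placeholder still consumes an index */
		if (!strncmp(line, INVALID_PREFIX, sizeof(INVALID_PREFIX) - 1))
			continue;
		/* use (idx, line) for matching/attaching here */
	}
	fclose(f);
}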
>
> -- Steve
On Thu, Jun 09, 2022 at 11:32:59AM -0700, Andrii Nakryiko wrote:
> [...]
>
> So in summary, it's not the end of the world, but as a user I'd prefer
> not to know about this quirk, of course :)
ok, I'll resend with the test workaround
jirka