2024-01-23 20:44:53

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2] libbpf: Add some details for BTF parsing failures

As CONFIG_DEBUG_INFO_BTF is default off the existing "failed to find
valid kernel BTF" message makes diagnosing the kernel build issue some
what cryptic. Add a little more detail with the hope of helping users.

Before:
```
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'lock_contention_bpf'
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
```

After no access /sys/kernel/btf/vmlinux:
```
libbpf: Unable to access canonical vmlinux BTF from /sys/kernel/btf/vmlinux
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'lock_contention_bpf'
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
```

After no BTF /sys/kernel/btf/vmlinux:
```
libbpf: Failed to load vmlinux BTF from /sys/kernel/btf/vmlinux, was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'lock_contention_bpf'
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
```

Closes: https://lore.kernel.org/bpf/CAP-5=fU+DN_+Y=Y4gtELUsJxKNDDCOvJzPHvjUVaUoeFAzNnig@mail.gmail.com/
Signed-off-by: Ian Rogers <[email protected]>

---
v2. Try to address review comments from Andrii Nakryiko.
---
tools/lib/bpf/btf.c | 49 ++++++++++++++++++++++++++++++++-------------
1 file changed, 35 insertions(+), 14 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index ee95fd379d4d..d8a05dda0836 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -4920,16 +4920,25 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
return 0;
}

+static struct btf *btf__load_vmlinux_btf_path(const char *path)
+{
+ struct btf *btf;
+ int err;
+
+ btf = btf__parse(path, NULL);
+ err = libbpf_get_error(btf);
+ pr_debug("loading kernel BTF '%s': %d\n", path, err);
+ return err ? NULL : btf;
+}
+
/*
* Probe few well-known locations for vmlinux kernel image and try to load BTF
* data out of it to use for target BTF.
*/
struct btf *btf__load_vmlinux_btf(void)
{
+ /* fall back locations, trying to find vmlinux on disk */
const char *locations[] = {
- /* try canonical vmlinux BTF through sysfs first */
- "/sys/kernel/btf/vmlinux",
- /* fall back to trying to find vmlinux on disk otherwise */
"/boot/vmlinux-%1$s",
"/lib/modules/%1$s/vmlinux-%1$s",
"/lib/modules/%1$s/build/vmlinux",
@@ -4938,29 +4947,41 @@ struct btf *btf__load_vmlinux_btf(void)
"/usr/lib/debug/boot/vmlinux-%1$s.debug",
"/usr/lib/debug/lib/modules/%1$s/vmlinux",
};
- char path[PATH_MAX + 1];
+ const char *location;
struct utsname buf;
struct btf *btf;
- int i, err;
+ int i;

- uname(&buf);
+ /* try canonical vmlinux BTF through sysfs first */
+ location = "/sys/kernel/btf/vmlinux";
+ if (faccessat(AT_FDCWD, location, R_OK, AT_EACCESS) == 0) {
+ btf = btf__load_vmlinux_btf_path(location);
+ if (btf)
+ return btf;
+
+ pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
+ location);
+ } else
+ pr_warn("Unable to access canonical vmlinux BTF from %s\n", location);

+ uname(&buf);
for (i = 0; i < ARRAY_SIZE(locations); i++) {
- snprintf(path, PATH_MAX, locations[i], buf.release);
+ char path[PATH_MAX + 1];
+
+ snprintf(path, sizeof(path), locations[i], buf.release);

+ btf = btf__load_vmlinux_btf_path(path);
if (faccessat(AT_FDCWD, path, R_OK, AT_EACCESS))
continue;

- btf = btf__parse(path, NULL);
- err = libbpf_get_error(btf);
- pr_debug("loading kernel BTF '%s': %d\n", path, err);
- if (err)
- continue;
+ btf = btf__load_vmlinux_btf_path(location);
+ if (btf)
+ return btf;

- return btf;
+ pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
+ path);
}

- pr_warn("failed to find valid kernel BTF\n");
return libbpf_err_ptr(-ESRCH);
}

--
2.43.0.429.g432eaa2c6b-goog



2024-01-24 04:27:46

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH v2] libbpf: Add some details for BTF parsing failures

On Tue, Jan 23, 2024 at 12:44 PM Ian Rogers <[email protected]> wrote:
>
> As CONFIG_DEBUG_INFO_BTF is default off the existing "failed to find
> valid kernel BTF" message makes diagnosing the kernel build issue some
> what cryptic. Add a little more detail with the hope of helping users.
>
> Before:
> ```
> libbpf: failed to find valid kernel BTF
> libbpf: Error loading vmlinux BTF: -3
> libbpf: failed to load object 'lock_contention_bpf'
> libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> ```
>
> After no access /sys/kernel/btf/vmlinux:
> ```
> libbpf: Unable to access canonical vmlinux BTF from /sys/kernel/btf/vmlinux
> libbpf: Error loading vmlinux BTF: -3
> libbpf: failed to load object 'lock_contention_bpf'
> libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> ```
>
> After no BTF /sys/kernel/btf/vmlinux:
> ```
> libbpf: Failed to load vmlinux BTF from /sys/kernel/btf/vmlinux, was CONFIG_DEBUG_INFO_BTF enabled?
> libbpf: Error loading vmlinux BTF: -3
> libbpf: failed to load object 'lock_contention_bpf'
> libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> ```
>
> Closes: https://lore.kernel.org/bpf/CAP-5=fU+DN_+Y=Y4gtELUsJxKNDDCOvJzPHvjUVaUoeFAzNnig@mail.gmail.com/
> Signed-off-by: Ian Rogers <[email protected]>
>
> ---
> v2. Try to address review comments from Andrii Nakryiko.
> ---
> tools/lib/bpf/btf.c | 49 ++++++++++++++++++++++++++++++++-------------
> 1 file changed, 35 insertions(+), 14 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index ee95fd379d4d..d8a05dda0836 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -4920,16 +4920,25 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
> return 0;
> }
>
> +static struct btf *btf__load_vmlinux_btf_path(const char *path)

I don't think we need this helper, you literally call btf__parse() and
pr_debug(), that's all

> +{
> + struct btf *btf;
> + int err;
> +
> + btf = btf__parse(path, NULL);
> + err = libbpf_get_error(btf);

we should stop using libbpf_get_error, in libbpf v1.0+ it's best to do just

btf = btf__parse(path, NULL);
if (!btf) {
err = -errno;
pr_debug(...);
return NULL;
}

> + pr_debug("loading kernel BTF '%s': %d\n", path, err);
> + return err ? NULL : btf;
> +}
> +
> /*
> * Probe few well-known locations for vmlinux kernel image and try to load BTF
> * data out of it to use for target BTF.
> */
> struct btf *btf__load_vmlinux_btf(void)
> {
> + /* fall back locations, trying to find vmlinux on disk */
> const char *locations[] = {
> - /* try canonical vmlinux BTF through sysfs first */
> - "/sys/kernel/btf/vmlinux",
> - /* fall back to trying to find vmlinux on disk otherwise */
> "/boot/vmlinux-%1$s",
> "/lib/modules/%1$s/vmlinux-%1$s",
> "/lib/modules/%1$s/build/vmlinux",
> @@ -4938,29 +4947,41 @@ struct btf *btf__load_vmlinux_btf(void)
> "/usr/lib/debug/boot/vmlinux-%1$s.debug",
> "/usr/lib/debug/lib/modules/%1$s/vmlinux",
> };
> - char path[PATH_MAX + 1];
> + const char *location;
> struct utsname buf;
> struct btf *btf;
> - int i, err;
> + int i;
>
> - uname(&buf);
> + /* try canonical vmlinux BTF through sysfs first */
> + location = "/sys/kernel/btf/vmlinux";
> + if (faccessat(AT_FDCWD, location, R_OK, AT_EACCESS) == 0) {
> + btf = btf__load_vmlinux_btf_path(location);
> + if (btf)
> + return btf;
> +
> + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
> + location);

Mentioning CONFIG_DEBUG_INFO_BTF seems inappropriate here,
/sys/kernel/btf/vmlinux exists, we just failed to parse its data,
right? So it's not about CONFIG_DEBUG_INFO_BTF, we just don't support
something in BTF data. Just pr_warn("Failed to load vmlinux BTF from
%s: %d", location, err); should be good

> + } else
> + pr_warn("Unable to access canonical vmlinux BTF from %s\n", location);

here the question of CONFIG_DEBUG_INFO_BTF is more appropriate, if
/sys/kernel/btf/vmlinux (on modern enough kernels) is missing, then
CONFIG_DEBUG_INFO_BTF is missing, probably. But I'd emit this only
after trying all the fallback paths and not finding anything.

also stylistical nit: if one side of if has {}, the other has to have
{} as well, even if it's just one line

>
> + uname(&buf);
> for (i = 0; i < ARRAY_SIZE(locations); i++) {
> - snprintf(path, PATH_MAX, locations[i], buf.release);
> + char path[PATH_MAX + 1];
> +
> + snprintf(path, sizeof(path), locations[i], buf.release);
>
> + btf = btf__load_vmlinux_btf_path(path);
> if (faccessat(AT_FDCWD, path, R_OK, AT_EACCESS))
> continue;
>
> - btf = btf__parse(path, NULL);
> - err = libbpf_get_error(btf);
> - pr_debug("loading kernel BTF '%s': %d\n", path, err);
> - if (err)
> - continue;
> + btf = btf__load_vmlinux_btf_path(location);
> + if (btf)
> + return btf;
>
> - return btf;
> + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",

we should do better here as well. We should distinguish between "there
is vmlinux image, but it has no BTF" vs "there is no vmlinux image" vs
"vmlinux image is there, there is BTF, but we can't parse it". See
btf__parse(). We return -ENODATA if ELF doesn't have BTF, that's the
first situation. We can probably use faccessat() check for second
situation. Everything else can be reported as pr_debug() with location
(but still no CONFIG_DEBUG_INFO_BTF, it's meaningless for fallback BTF
locations)

> + path);
> }
>
> - pr_warn("failed to find valid kernel BTF\n");

and then here we can probably warn that we failed to find any kernel
BTF, and suggest CONFIG_DEBUG_INFO_BTF

> return libbpf_err_ptr(-ESRCH);
> }
>
> --
> 2.43.0.429.g432eaa2c6b-goog
>

2024-01-24 04:37:25

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v2] libbpf: Add some details for BTF parsing failures

On Tue, Jan 23, 2024 at 8:25 PM Andrii Nakryiko
<[email protected]> wrote:
>
> On Tue, Jan 23, 2024 at 12:44 PM Ian Rogers <[email protected]> wrote:
> >
> > As CONFIG_DEBUG_INFO_BTF is default off the existing "failed to find
> > valid kernel BTF" message makes diagnosing the kernel build issue some
> > what cryptic. Add a little more detail with the hope of helping users.
> >
> > Before:
> > ```
> > libbpf: failed to find valid kernel BTF
> > libbpf: Error loading vmlinux BTF: -3
> > libbpf: failed to load object 'lock_contention_bpf'
> > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > ```
> >
> > After no access /sys/kernel/btf/vmlinux:
> > ```
> > libbpf: Unable to access canonical vmlinux BTF from /sys/kernel/btf/vmlinux
> > libbpf: Error loading vmlinux BTF: -3
> > libbpf: failed to load object 'lock_contention_bpf'
> > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > ```
> >
> > After no BTF /sys/kernel/btf/vmlinux:
> > ```
> > libbpf: Failed to load vmlinux BTF from /sys/kernel/btf/vmlinux, was CONFIG_DEBUG_INFO_BTF enabled?
> > libbpf: Error loading vmlinux BTF: -3
> > libbpf: failed to load object 'lock_contention_bpf'
> > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > ```
> >
> > Closes: https://lore.kernel.org/bpf/CAP-5=fU+DN_+Y=Y4gtELUsJxKNDDCOvJzPHvjUVaUoeFAzNnig@mail.gmail.com/
> > Signed-off-by: Ian Rogers <[email protected]>
> >
> > ---
> > v2. Try to address review comments from Andrii Nakryiko.
> > ---
> > tools/lib/bpf/btf.c | 49 ++++++++++++++++++++++++++++++++-------------
> > 1 file changed, 35 insertions(+), 14 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index ee95fd379d4d..d8a05dda0836 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -4920,16 +4920,25 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
> > return 0;
> > }
> >
> > +static struct btf *btf__load_vmlinux_btf_path(const char *path)
>
> I don't think we need this helper, you literally call btf__parse() and
> pr_debug(), that's all
>
> > +{
> > + struct btf *btf;
> > + int err;
> > +
> > + btf = btf__parse(path, NULL);
> > + err = libbpf_get_error(btf);
>
> we should stop using libbpf_get_error, in libbpf v1.0+ it's best to do just
>
> btf = btf__parse(path, NULL);
> if (!btf) {
> err = -errno;
> pr_debug(...);
> return NULL;
> }
>
> > + pr_debug("loading kernel BTF '%s': %d\n", path, err);
> > + return err ? NULL : btf;
> > +}
> > +
> > /*
> > * Probe few well-known locations for vmlinux kernel image and try to load BTF
> > * data out of it to use for target BTF.
> > */
> > struct btf *btf__load_vmlinux_btf(void)
> > {
> > + /* fall back locations, trying to find vmlinux on disk */
> > const char *locations[] = {
> > - /* try canonical vmlinux BTF through sysfs first */
> > - "/sys/kernel/btf/vmlinux",
> > - /* fall back to trying to find vmlinux on disk otherwise */
> > "/boot/vmlinux-%1$s",
> > "/lib/modules/%1$s/vmlinux-%1$s",
> > "/lib/modules/%1$s/build/vmlinux",
> > @@ -4938,29 +4947,41 @@ struct btf *btf__load_vmlinux_btf(void)
> > "/usr/lib/debug/boot/vmlinux-%1$s.debug",
> > "/usr/lib/debug/lib/modules/%1$s/vmlinux",
> > };
> > - char path[PATH_MAX + 1];
> > + const char *location;
> > struct utsname buf;
> > struct btf *btf;
> > - int i, err;
> > + int i;
> >
> > - uname(&buf);
> > + /* try canonical vmlinux BTF through sysfs first */
> > + location = "/sys/kernel/btf/vmlinux";
> > + if (faccessat(AT_FDCWD, location, R_OK, AT_EACCESS) == 0) {
> > + btf = btf__load_vmlinux_btf_path(location);
> > + if (btf)
> > + return btf;
> > +
> > + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
> > + location);
>
> Mentioning CONFIG_DEBUG_INFO_BTF seems inappropriate here,
> /sys/kernel/btf/vmlinux exists, we just failed to parse its data,
> right? So it's not about CONFIG_DEBUG_INFO_BTF, we just don't support
> something in BTF data. Just pr_warn("Failed to load vmlinux BTF from
> %s: %d", location, err); should be good

I think that assumes a lot about a user, they understand what BTF
means, they know it is controlled by a kernel config option, and that
the config option needs to be overridden (as it is defaulted off) for
BTF to work. Given this escaped Raspberry Pi OS the potential for this
mistake seems high - hence wanting to highlight the config option.

> > + } else
> > + pr_warn("Unable to access canonical vmlinux BTF from %s\n", location);
>
> here the question of CONFIG_DEBUG_INFO_BTF is more appropriate, if
> /sys/kernel/btf/vmlinux (on modern enough kernels) is missing, then
> CONFIG_DEBUG_INFO_BTF is missing, probably. But I'd emit this only
> after trying all the fallback paths and not finding anything.
>
> also stylistical nit: if one side of if has {}, the other has to have
> {} as well, even if it's just one line
>
> >
> > + uname(&buf);
> > for (i = 0; i < ARRAY_SIZE(locations); i++) {
> > - snprintf(path, PATH_MAX, locations[i], buf.release);
> > + char path[PATH_MAX + 1];
> > +
> > + snprintf(path, sizeof(path), locations[i], buf.release);
> >
> > + btf = btf__load_vmlinux_btf_path(path);
> > if (faccessat(AT_FDCWD, path, R_OK, AT_EACCESS))
> > continue;
> >
> > - btf = btf__parse(path, NULL);
> > - err = libbpf_get_error(btf);
> > - pr_debug("loading kernel BTF '%s': %d\n", path, err);
> > - if (err)
> > - continue;
> > + btf = btf__load_vmlinux_btf_path(location);
> > + if (btf)
> > + return btf;
> >
> > - return btf;
> > + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
>
> we should do better here as well. We should distinguish between "there
> is vmlinux image, but it has no BTF" vs "there is no vmlinux image" vs
> "vmlinux image is there, there is BTF, but we can't parse it". See
> btf__parse(). We return -ENODATA if ELF doesn't have BTF, that's the
> first situation. We can probably use faccessat() check for second
> situation. Everything else can be reported as pr_debug() with location
> (but still no CONFIG_DEBUG_INFO_BTF, it's meaningless for fallback BTF
> locations)
>
> > + path);
> > }
> >
> > - pr_warn("failed to find valid kernel BTF\n");
>
> and then here we can probably warn that we failed to find any kernel
> BTF, and suggest CONFIG_DEBUG_INFO_BTF

Andrii, you've basically written this patch, can I pass this over to you?

Thanks,
Ian

> > return libbpf_err_ptr(-ESRCH);
> > }
> >
> > --
> > 2.43.0.429.g432eaa2c6b-goog
> >

2024-01-24 17:41:54

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH v2] libbpf: Add some details for BTF parsing failures

On Tue, Jan 23, 2024 at 8:37 PM Ian Rogers <[email protected]> wrote:
>
> On Tue, Jan 23, 2024 at 8:25 PM Andrii Nakryiko
> <[email protected]> wrote:
> >
> > On Tue, Jan 23, 2024 at 12:44 PM Ian Rogers <[email protected]> wrote:
> > >
> > > As CONFIG_DEBUG_INFO_BTF is default off the existing "failed to find
> > > valid kernel BTF" message makes diagnosing the kernel build issue some
> > > what cryptic. Add a little more detail with the hope of helping users.
> > >
> > > Before:
> > > ```
> > > libbpf: failed to find valid kernel BTF
> > > libbpf: Error loading vmlinux BTF: -3
> > > libbpf: failed to load object 'lock_contention_bpf'
> > > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > > ```
> > >
> > > After no access /sys/kernel/btf/vmlinux:
> > > ```
> > > libbpf: Unable to access canonical vmlinux BTF from /sys/kernel/btf/vmlinux
> > > libbpf: Error loading vmlinux BTF: -3
> > > libbpf: failed to load object 'lock_contention_bpf'
> > > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > > ```
> > >
> > > After no BTF /sys/kernel/btf/vmlinux:
> > > ```
> > > libbpf: Failed to load vmlinux BTF from /sys/kernel/btf/vmlinux, was CONFIG_DEBUG_INFO_BTF enabled?
> > > libbpf: Error loading vmlinux BTF: -3
> > > libbpf: failed to load object 'lock_contention_bpf'
> > > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > > ```
> > >
> > > Closes: https://lore.kernel.org/bpf/CAP-5=fU+DN_+Y=Y4gtELUsJxKNDDCOvJzPHvjUVaUoeFAzNnig@mail.gmail.com/
> > > Signed-off-by: Ian Rogers <[email protected]>
> > >
> > > ---
> > > v2. Try to address review comments from Andrii Nakryiko.
> > > ---
> > > tools/lib/bpf/btf.c | 49 ++++++++++++++++++++++++++++++++-------------
> > > 1 file changed, 35 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > index ee95fd379d4d..d8a05dda0836 100644
> > > --- a/tools/lib/bpf/btf.c
> > > +++ b/tools/lib/bpf/btf.c
> > > @@ -4920,16 +4920,25 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
> > > return 0;
> > > }
> > >
> > > +static struct btf *btf__load_vmlinux_btf_path(const char *path)
> >
> > I don't think we need this helper, you literally call btf__parse() and
> > pr_debug(), that's all
> >
> > > +{
> > > + struct btf *btf;
> > > + int err;
> > > +
> > > + btf = btf__parse(path, NULL);
> > > + err = libbpf_get_error(btf);
> >
> > we should stop using libbpf_get_error, in libbpf v1.0+ it's best to do just
> >
> > btf = btf__parse(path, NULL);
> > if (!btf) {
> > err = -errno;
> > pr_debug(...);
> > return NULL;
> > }
> >
> > > + pr_debug("loading kernel BTF '%s': %d\n", path, err);
> > > + return err ? NULL : btf;
> > > +}
> > > +
> > > /*
> > > * Probe few well-known locations for vmlinux kernel image and try to load BTF
> > > * data out of it to use for target BTF.
> > > */
> > > struct btf *btf__load_vmlinux_btf(void)
> > > {
> > > + /* fall back locations, trying to find vmlinux on disk */
> > > const char *locations[] = {
> > > - /* try canonical vmlinux BTF through sysfs first */
> > > - "/sys/kernel/btf/vmlinux",
> > > - /* fall back to trying to find vmlinux on disk otherwise */
> > > "/boot/vmlinux-%1$s",
> > > "/lib/modules/%1$s/vmlinux-%1$s",
> > > "/lib/modules/%1$s/build/vmlinux",
> > > @@ -4938,29 +4947,41 @@ struct btf *btf__load_vmlinux_btf(void)
> > > "/usr/lib/debug/boot/vmlinux-%1$s.debug",
> > > "/usr/lib/debug/lib/modules/%1$s/vmlinux",
> > > };
> > > - char path[PATH_MAX + 1];
> > > + const char *location;
> > > struct utsname buf;
> > > struct btf *btf;
> > > - int i, err;
> > > + int i;
> > >
> > > - uname(&buf);
> > > + /* try canonical vmlinux BTF through sysfs first */
> > > + location = "/sys/kernel/btf/vmlinux";
> > > + if (faccessat(AT_FDCWD, location, R_OK, AT_EACCESS) == 0) {
> > > + btf = btf__load_vmlinux_btf_path(location);
> > > + if (btf)
> > > + return btf;
> > > +
> > > + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
> > > + location);
> >
> > Mentioning CONFIG_DEBUG_INFO_BTF seems inappropriate here,
> > /sys/kernel/btf/vmlinux exists, we just failed to parse its data,
> > right? So it's not about CONFIG_DEBUG_INFO_BTF, we just don't support
> > something in BTF data. Just pr_warn("Failed to load vmlinux BTF from
> > %s: %d", location, err); should be good
>
> I think that assumes a lot about a user, they understand what BTF
> means, they know it is controlled by a kernel config option, and that
> the config option needs to be overridden (as it is defaulted off) for
> BTF to work. Given this escaped Raspberry Pi OS the potential for this
> mistake seems high - hence wanting to highlight the config option.

But there is nothing wrong with CONFIG_DEBUG_INFO_BTF, it is enabled,
and hence there is /sys/kernel/btf/vmlinux on the system. With
CONFIG_DEBUG_INFO_BTF suggestion you'll just lead users astray. What
am I missing?

>
> > > + } else
> > > + pr_warn("Unable to access canonical vmlinux BTF from %s\n", location);
> >
> > here the question of CONFIG_DEBUG_INFO_BTF is more appropriate, if
> > /sys/kernel/btf/vmlinux (on modern enough kernels) is missing, then
> > CONFIG_DEBUG_INFO_BTF is missing, probably. But I'd emit this only
> > after trying all the fallback paths and not finding anything.
> >
> > also stylistical nit: if one side of if has {}, the other has to have
> > {} as well, even if it's just one line
> >
> > >
> > > + uname(&buf);
> > > for (i = 0; i < ARRAY_SIZE(locations); i++) {
> > > - snprintf(path, PATH_MAX, locations[i], buf.release);
> > > + char path[PATH_MAX + 1];
> > > +
> > > + snprintf(path, sizeof(path), locations[i], buf.release);
> > >
> > > + btf = btf__load_vmlinux_btf_path(path);
> > > if (faccessat(AT_FDCWD, path, R_OK, AT_EACCESS))
> > > continue;
> > >
> > > - btf = btf__parse(path, NULL);
> > > - err = libbpf_get_error(btf);
> > > - pr_debug("loading kernel BTF '%s': %d\n", path, err);
> > > - if (err)
> > > - continue;
> > > + btf = btf__load_vmlinux_btf_path(location);
> > > + if (btf)
> > > + return btf;
> > >
> > > - return btf;
> > > + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
> >
> > we should do better here as well. We should distinguish between "there
> > is vmlinux image, but it has no BTF" vs "there is no vmlinux image" vs
> > "vmlinux image is there, there is BTF, but we can't parse it". See
> > btf__parse(). We return -ENODATA if ELF doesn't have BTF, that's the
> > first situation. We can probably use faccessat() check for second
> > situation. Everything else can be reported as pr_debug() with location
> > (but still no CONFIG_DEBUG_INFO_BTF, it's meaningless for fallback BTF
> > locations)
> >
> > > + path);
> > > }
> > >
> > > - pr_warn("failed to find valid kernel BTF\n");
> >
> > and then here we can probably warn that we failed to find any kernel
> > BTF, and suggest CONFIG_DEBUG_INFO_BTF
>
> Andrii, you've basically written this patch, can I pass this over to you?

I think it would be great if you can finish thi, thanks.

>
> Thanks,
> Ian
>
> > > return libbpf_err_ptr(-ESRCH);
> > > }
> > >
> > > --
> > > 2.43.0.429.g432eaa2c6b-goog
> > >

2024-01-24 18:18:41

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v2] libbpf: Add some details for BTF parsing failures

On Wed, Jan 24, 2024 at 9:40 AM Andrii Nakryiko
<[email protected]> wrote:
>
> On Tue, Jan 23, 2024 at 8:37 PM Ian Rogers <[email protected]> wrote:
> >
> > On Tue, Jan 23, 2024 at 8:25 PM Andrii Nakryiko
> > <[email protected]> wrote:
> > >
> > > On Tue, Jan 23, 2024 at 12:44 PM Ian Rogers <[email protected]> wrote:
> > > >
> > > > As CONFIG_DEBUG_INFO_BTF is default off the existing "failed to find
> > > > valid kernel BTF" message makes diagnosing the kernel build issue some
> > > > what cryptic. Add a little more detail with the hope of helping users.
> > > >
> > > > Before:
> > > > ```
> > > > libbpf: failed to find valid kernel BTF
> > > > libbpf: Error loading vmlinux BTF: -3
> > > > libbpf: failed to load object 'lock_contention_bpf'
> > > > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > > > ```
> > > >
> > > > After no access /sys/kernel/btf/vmlinux:
> > > > ```
> > > > libbpf: Unable to access canonical vmlinux BTF from /sys/kernel/btf/vmlinux
> > > > libbpf: Error loading vmlinux BTF: -3
> > > > libbpf: failed to load object 'lock_contention_bpf'
> > > > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > > > ```
> > > >
> > > > After no BTF /sys/kernel/btf/vmlinux:
> > > > ```
> > > > libbpf: Failed to load vmlinux BTF from /sys/kernel/btf/vmlinux, was CONFIG_DEBUG_INFO_BTF enabled?
> > > > libbpf: Error loading vmlinux BTF: -3
> > > > libbpf: failed to load object 'lock_contention_bpf'
> > > > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > > > ```
> > > >
> > > > Closes: https://lore.kernel.org/bpf/CAP-5=fU+DN_+Y=Y4gtELUsJxKNDDCOvJzPHvjUVaUoeFAzNnig@mail.gmail.com/
> > > > Signed-off-by: Ian Rogers <[email protected]>
> > > >
> > > > ---
> > > > v2. Try to address review comments from Andrii Nakryiko.
> > > > ---
> > > > tools/lib/bpf/btf.c | 49 ++++++++++++++++++++++++++++++++-------------
> > > > 1 file changed, 35 insertions(+), 14 deletions(-)
> > > >
> > > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > > index ee95fd379d4d..d8a05dda0836 100644
> > > > --- a/tools/lib/bpf/btf.c
> > > > +++ b/tools/lib/bpf/btf.c
> > > > @@ -4920,16 +4920,25 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
> > > > return 0;
> > > > }
> > > >
> > > > +static struct btf *btf__load_vmlinux_btf_path(const char *path)
> > >
> > > I don't think we need this helper, you literally call btf__parse() and
> > > pr_debug(), that's all
> > >
> > > > +{
> > > > + struct btf *btf;
> > > > + int err;
> > > > +
> > > > + btf = btf__parse(path, NULL);
> > > > + err = libbpf_get_error(btf);
> > >
> > > we should stop using libbpf_get_error, in libbpf v1.0+ it's best to do just
> > >
> > > btf = btf__parse(path, NULL);
> > > if (!btf) {
> > > err = -errno;
> > > pr_debug(...);
> > > return NULL;
> > > }
> > >
> > > > + pr_debug("loading kernel BTF '%s': %d\n", path, err);
> > > > + return err ? NULL : btf;
> > > > +}
> > > > +
> > > > /*
> > > > * Probe few well-known locations for vmlinux kernel image and try to load BTF
> > > > * data out of it to use for target BTF.
> > > > */
> > > > struct btf *btf__load_vmlinux_btf(void)
> > > > {
> > > > + /* fall back locations, trying to find vmlinux on disk */
> > > > const char *locations[] = {
> > > > - /* try canonical vmlinux BTF through sysfs first */
> > > > - "/sys/kernel/btf/vmlinux",
> > > > - /* fall back to trying to find vmlinux on disk otherwise */
> > > > "/boot/vmlinux-%1$s",
> > > > "/lib/modules/%1$s/vmlinux-%1$s",
> > > > "/lib/modules/%1$s/build/vmlinux",
> > > > @@ -4938,29 +4947,41 @@ struct btf *btf__load_vmlinux_btf(void)
> > > > "/usr/lib/debug/boot/vmlinux-%1$s.debug",
> > > > "/usr/lib/debug/lib/modules/%1$s/vmlinux",
> > > > };
> > > > - char path[PATH_MAX + 1];
> > > > + const char *location;
> > > > struct utsname buf;
> > > > struct btf *btf;
> > > > - int i, err;
> > > > + int i;
> > > >
> > > > - uname(&buf);
> > > > + /* try canonical vmlinux BTF through sysfs first */
> > > > + location = "/sys/kernel/btf/vmlinux";
> > > > + if (faccessat(AT_FDCWD, location, R_OK, AT_EACCESS) == 0) {
> > > > + btf = btf__load_vmlinux_btf_path(location);
> > > > + if (btf)
> > > > + return btf;
> > > > +
> > > > + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
> > > > + location);
> > >
> > > Mentioning CONFIG_DEBUG_INFO_BTF seems inappropriate here,
> > > /sys/kernel/btf/vmlinux exists, we just failed to parse its data,
> > > right? So it's not about CONFIG_DEBUG_INFO_BTF, we just don't support
> > > something in BTF data. Just pr_warn("Failed to load vmlinux BTF from
> > > %s: %d", location, err); should be good
> >
> > I think that assumes a lot about a user, they understand what BTF
> > means, they know it is controlled by a kernel config option, and that
> > the config option needs to be overridden (as it is defaulted off) for
> > BTF to work. Given this escaped Raspberry Pi OS the potential for this
> > mistake seems high - hence wanting to highlight the config option.
>
> But there is nothing wrong with CONFIG_DEBUG_INFO_BTF, it is enabled,
> and hence there is /sys/kernel/btf/vmlinux on the system. With
> CONFIG_DEBUG_INFO_BTF suggestion you'll just lead users astray. What
> am I missing?

Okay, so can we warn about CONFIG_DEBUG_INFO_BTF if
/sys/kernel/btf/vmlinux isn't accessible? I'd like to clear up
confusion over permissions, generally not an issue as these tools tend
to get run as root, and a missing kernel configuration - the issue
with Raspberry Pi OS.

Thanks,
Ian

> >
> > > > + } else
> > > > + pr_warn("Unable to access canonical vmlinux BTF from %s\n", location);
> > >
> > > here the question of CONFIG_DEBUG_INFO_BTF is more appropriate, if
> > > /sys/kernel/btf/vmlinux (on modern enough kernels) is missing, then
> > > CONFIG_DEBUG_INFO_BTF is missing, probably. But I'd emit this only
> > > after trying all the fallback paths and not finding anything.
> > >
> > > also stylistical nit: if one side of if has {}, the other has to have
> > > {} as well, even if it's just one line
> > >
> > > >
> > > > + uname(&buf);
> > > > for (i = 0; i < ARRAY_SIZE(locations); i++) {
> > > > - snprintf(path, PATH_MAX, locations[i], buf.release);
> > > > + char path[PATH_MAX + 1];
> > > > +
> > > > + snprintf(path, sizeof(path), locations[i], buf.release);
> > > >
> > > > + btf = btf__load_vmlinux_btf_path(path);
> > > > if (faccessat(AT_FDCWD, path, R_OK, AT_EACCESS))
> > > > continue;
> > > >
> > > > - btf = btf__parse(path, NULL);
> > > > - err = libbpf_get_error(btf);
> > > > - pr_debug("loading kernel BTF '%s': %d\n", path, err);
> > > > - if (err)
> > > > - continue;
> > > > + btf = btf__load_vmlinux_btf_path(location);
> > > > + if (btf)
> > > > + return btf;
> > > >
> > > > - return btf;
> > > > + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
> > >
> > > we should do better here as well. We should distinguish between "there
> > > is vmlinux image, but it has no BTF" vs "there is no vmlinux image" vs
> > > "vmlinux image is there, there is BTF, but we can't parse it". See
> > > btf__parse(). We return -ENODATA if ELF doesn't have BTF, that's the
> > > first situation. We can probably use faccessat() check for second
> > > situation. Everything else can be reported as pr_debug() with location
> > > (but still no CONFIG_DEBUG_INFO_BTF, it's meaningless for fallback BTF
> > > locations)
> > >
> > > > + path);
> > > > }
> > > >
> > > > - pr_warn("failed to find valid kernel BTF\n");
> > >
> > > and then here we can probably warn that we failed to find any kernel
> > > BTF, and suggest CONFIG_DEBUG_INFO_BTF
> >
> > Andrii, you've basically written this patch, can I pass this over to you?
>
> I think it would be great if you can finish thi, thanks.
>
> >
> > Thanks,
> > Ian
> >
> > > > return libbpf_err_ptr(-ESRCH);
> > > > }
> > > >
> > > > --
> > > > 2.43.0.429.g432eaa2c6b-goog
> > > >

2024-01-24 19:36:11

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH v2] libbpf: Add some details for BTF parsing failures

On Wed, Jan 24, 2024 at 10:18 AM Ian Rogers <[email protected]> wrote:
>
> On Wed, Jan 24, 2024 at 9:40 AM Andrii Nakryiko
> <[email protected]> wrote:
> >
> > On Tue, Jan 23, 2024 at 8:37 PM Ian Rogers <[email protected]> wrote:
> > >
> > > On Tue, Jan 23, 2024 at 8:25 PM Andrii Nakryiko
> > > <[email protected]> wrote:
> > > >
> > > > On Tue, Jan 23, 2024 at 12:44 PM Ian Rogers <irogers@googlecom> wrote:
> > > > >
> > > > > As CONFIG_DEBUG_INFO_BTF is default off the existing "failed to find
> > > > > valid kernel BTF" message makes diagnosing the kernel build issue some
> > > > > what cryptic. Add a little more detail with the hope of helping users.
> > > > >
> > > > > Before:
> > > > > ```
> > > > > libbpf: failed to find valid kernel BTF
> > > > > libbpf: Error loading vmlinux BTF: -3
> > > > > libbpf: failed to load object 'lock_contention_bpf'
> > > > > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > > > > ```
> > > > >
> > > > > After no access /sys/kernel/btf/vmlinux:
> > > > > ```
> > > > > libbpf: Unable to access canonical vmlinux BTF from /sys/kernel/btf/vmlinux
> > > > > libbpf: Error loading vmlinux BTF: -3
> > > > > libbpf: failed to load object 'lock_contention_bpf'
> > > > > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > > > > ```
> > > > >
> > > > > After no BTF /sys/kernel/btf/vmlinux:
> > > > > ```
> > > > > libbpf: Failed to load vmlinux BTF from /sys/kernel/btf/vmlinux, was CONFIG_DEBUG_INFO_BTF enabled?
> > > > > libbpf: Error loading vmlinux BTF: -3
> > > > > libbpf: failed to load object 'lock_contention_bpf'
> > > > > libbpf: failed to load BPF skeleton 'lock_contention_bpf': -3
> > > > > ```
> > > > >
> > > > > Closes: https://lore.kernel.org/bpf/CAP-5=fU+DN_+Y=Y4gtELUsJxKNDDCOvJzPHvjUVaUoeFAzNnig@mail.gmail.com/
> > > > > Signed-off-by: Ian Rogers <[email protected]>
> > > > >
> > > > > ---
> > > > > v2. Try to address review comments from Andrii Nakryiko.
> > > > > ---
> > > > > tools/lib/bpf/btf.c | 49 ++++++++++++++++++++++++++++++++-------------
> > > > > 1 file changed, 35 insertions(+), 14 deletions(-)
> > > > >
> > > > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > > > > index ee95fd379d4d..d8a05dda0836 100644
> > > > > --- a/tools/lib/bpf/btf.c
> > > > > +++ b/tools/lib/bpf/btf.c
> > > > > @@ -4920,16 +4920,25 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
> > > > > return 0;
> > > > > }
> > > > >
> > > > > +static struct btf *btf__load_vmlinux_btf_path(const char *path)
> > > >
> > > > I don't think we need this helper, you literally call btf__parse() and
> > > > pr_debug(), that's all
> > > >
> > > > > +{
> > > > > + struct btf *btf;
> > > > > + int err;
> > > > > +
> > > > > + btf = btf__parse(path, NULL);
> > > > > + err = libbpf_get_error(btf);
> > > >
> > > > we should stop using libbpf_get_error, in libbpf v1.0+ it's best to do just
> > > >
> > > > btf = btf__parse(path, NULL);
> > > > if (!btf) {
> > > > err = -errno;
> > > > pr_debug(...);
> > > > return NULL;
> > > > }
> > > >
> > > > > + pr_debug("loading kernel BTF '%s': %d\n", path, err);
> > > > > + return err ? NULL : btf;
> > > > > +}
> > > > > +
> > > > > /*
> > > > > * Probe few well-known locations for vmlinux kernel image and try to load BTF
> > > > > * data out of it to use for target BTF.
> > > > > */
> > > > > struct btf *btf__load_vmlinux_btf(void)
> > > > > {
> > > > > + /* fall back locations, trying to find vmlinux on disk */
> > > > > const char *locations[] = {
> > > > > - /* try canonical vmlinux BTF through sysfs first */
> > > > > - "/sys/kernel/btf/vmlinux",
> > > > > - /* fall back to trying to find vmlinux on disk otherwise */
> > > > > "/boot/vmlinux-%1$s",
> > > > > "/lib/modules/%1$s/vmlinux-%1$s",
> > > > > "/lib/modules/%1$s/build/vmlinux",
> > > > > @@ -4938,29 +4947,41 @@ struct btf *btf__load_vmlinux_btf(void)
> > > > > "/usr/lib/debug/boot/vmlinux-%1$s.debug",
> > > > > "/usr/lib/debug/lib/modules/%1$s/vmlinux",
> > > > > };
> > > > > - char path[PATH_MAX + 1];
> > > > > + const char *location;
> > > > > struct utsname buf;
> > > > > struct btf *btf;
> > > > > - int i, err;
> > > > > + int i;
> > > > >
> > > > > - uname(&buf);
> > > > > + /* try canonical vmlinux BTF through sysfs first */
> > > > > + location = "/sys/kernel/btf/vmlinux";
> > > > > + if (faccessat(AT_FDCWD, location, R_OK, AT_EACCESS) == 0) {
> > > > > + btf = btf__load_vmlinux_btf_path(location);
> > > > > + if (btf)
> > > > > + return btf;
> > > > > +
> > > > > + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
> > > > > + location);
> > > >
> > > > Mentioning CONFIG_DEBUG_INFO_BTF seems inappropriate here,
> > > > /sys/kernel/btf/vmlinux exists, we just failed to parse its data,
> > > > right? So it's not about CONFIG_DEBUG_INFO_BTF, we just don't support
> > > > something in BTF data. Just pr_warn("Failed to load vmlinux BTF from
> > > > %s: %d", location, err); should be good
> > >
> > > I think that assumes a lot about a user, they understand what BTF
> > > means, they know it is controlled by a kernel config option, and that
> > > the config option needs to be overridden (as it is defaulted off) for
> > > BTF to work. Given this escaped Raspberry Pi OS the potential for this
> > > mistake seems high - hence wanting to highlight the config option.
> >
> > But there is nothing wrong with CONFIG_DEBUG_INFO_BTF, it is enabled,
> > and hence there is /sys/kernel/btf/vmlinux on the system. With
> > CONFIG_DEBUG_INFO_BTF suggestion you'll just lead users astray. What
> > am I missing?
>
> Okay, so can we warn about CONFIG_DEBUG_INFO_BTF if
> /sys/kernel/btf/vmlinux isn't accessible? I'd like to clear up
> confusion over permissions, generally not an issue as these tools tend
> to get run as root, and a missing kernel configuration - the issue
> with Raspberry Pi OS.

Why not do two checks. One faccessat(F_OK) (whether file exists), and
if not, report CONFIG_DEBUG_INFO_BTF. And then separately R_OK access,
and report permissions problems. Everything else will be a failure to
parse BTF.

>
> Thanks,
> Ian
>
> > >
> > > > > + } else
> > > > > + pr_warn("Unable to access canonical vmlinux BTF from %s\n", location);
> > > >
> > > > here the question of CONFIG_DEBUG_INFO_BTF is more appropriate, if
> > > > /sys/kernel/btf/vmlinux (on modern enough kernels) is missing, then
> > > > CONFIG_DEBUG_INFO_BTF is missing, probably. But I'd emit this only
> > > > after trying all the fallback paths and not finding anything.
> > > >
> > > > also stylistical nit: if one side of if has {}, the other has to have
> > > > {} as well, even if it's just one line
> > > >
> > > > >
> > > > > + uname(&buf);
> > > > > for (i = 0; i < ARRAY_SIZE(locations); i++) {
> > > > > - snprintf(path, PATH_MAX, locations[i], buf.release);
> > > > > + char path[PATH_MAX + 1];
> > > > > +
> > > > > + snprintf(path, sizeof(path), locations[i], buf.release);
> > > > >
> > > > > + btf = btf__load_vmlinux_btf_path(path);
> > > > > if (faccessat(AT_FDCWD, path, R_OK, AT_EACCESS))
> > > > > continue;
> > > > >
> > > > > - btf = btf__parse(path, NULL);
> > > > > - err = libbpf_get_error(btf);
> > > > > - pr_debug("loading kernel BTF '%s': %d\n", path, err);
> > > > > - if (err)
> > > > > - continue;
> > > > > + btf = btf__load_vmlinux_btf_path(location);
> > > > > + if (btf)
> > > > > + return btf;
> > > > >
> > > > > - return btf;
> > > > > + pr_warn("Failed to load vmlinux BTF from %s, was CONFIG_DEBUG_INFO_BTF enabled?\n",
> > > >
> > > > we should do better here as well. We should distinguish between "there
> > > > is vmlinux image, but it has no BTF" vs "there is no vmlinux image" vs
> > > > "vmlinux image is there, there is BTF, but we can't parse it". See
> > > > btf__parse(). We return -ENODATA if ELF doesn't have BTF, that's the
> > > > first situation. We can probably use faccessat() check for second
> > > > situation. Everything else can be reported as pr_debug() with location
> > > > (but still no CONFIG_DEBUG_INFO_BTF, it's meaningless for fallback BTF
> > > > locations)
> > > >
> > > > > + path);
> > > > > }
> > > > >
> > > > > - pr_warn("failed to find valid kernel BTF\n");
> > > >
> > > > and then here we can probably warn that we failed to find any kernel
> > > > BTF, and suggest CONFIG_DEBUG_INFO_BTF
> > >
> > > Andrii, you've basically written this patch, can I pass this over to you?
> >
> > I think it would be great if you can finish thi, thanks.
> >
> > >
> > > Thanks,
> > > Ian
> > >
> > > > > return libbpf_err_ptr(-ESRCH);
> > > > > }
> > > > >
> > > > > --
> > > > > 2.43.0.429.g432eaa2c6b-goog
> > > > >