2020-11-10 01:24:21

by Andrii Nakryiko

[permalink] [raw]
Subject: [PATCH v4 bpf-next 0/5] Integrate kernel module BTF support

This patch set adds BTF generation for kernel modules using a compact split
BTF approach. Respective patches have all the details.

Kernel module BTFs rely on pahole's split BTF support, which is added in [0]
and will be available starting from v1.19. Support for it is detected
automatically during kernel build time.

This patch set implements in-kernel support for split BTF loading and
validation. It also extends GET_OBJ_INFO API for BTFs to return BTF's module
name and a flag whether BTF itself is in-kernel or user-provided. vmlinux BTF
is also exposed to user-space through the same BTF object iteration APIs.

Follow up patch set will utilize the fact that vmlinux and module BTFs now
have associated ID to provide ability to attach BPF fentry/fexit/etc programs
to functions defined in kernel modules.

bpftool is also extended to show module/vmlinux BTF's name.

[0] https://patchwork.kernel.org/project/netdevbpf/list/?series=378699&state=*

v3->v4:
- copy_to_user() on ENOSPC in btf_get_info_by_fd() (Martin);
v2->v3:
- get rid of unnecessary gotos (Song);
v2->v1:
- drop WARNs, add fewer pr_warn()'s instead (Greg);
- properly initialize sysfs binary attribute structure (Greg);
- add __maybe_unused to any_section_objs, used conditionally by module BTF;
rfc->v1:
- CONFIG_DEBUG_INFO_BTF_MODULES is derived automatically (Alexei);
- vmlinux BTF now has explicit "vmlinux" name (Alexei);
- added sysfs ABI documentation for /sys/kernel/btf/<module> (Greg).

Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>

Andrii Nakryiko (5):
bpf: add in-kernel split BTF support
bpf: assign ID to vmlinux BTF and return extra info for BTF in
GET_OBJ_INFO
kbuild: build kernel module BTFs if BTF is enabled and pahole supports
it
bpf: load and verify kernel module BTFs
tools/bpftool: add support for in-kernel and named BTF in `btf show`

Documentation/ABI/testing/sysfs-kernel-btf | 8 +
include/linux/bpf.h | 2 +
include/linux/module.h | 4 +
include/uapi/linux/bpf.h | 3 +
kernel/bpf/btf.c | 406 ++++++++++++++++++---
kernel/bpf/sysfs_btf.c | 2 +-
kernel/module.c | 32 ++
lib/Kconfig.debug | 9 +
scripts/Makefile.modfinal | 20 +-
tools/bpf/bpftool/btf.c | 28 +-
tools/include/uapi/linux/bpf.h | 3 +
11 files changed, 459 insertions(+), 58 deletions(-)

--
2.24.1


2020-11-10 01:25:09

by Andrii Nakryiko

[permalink] [raw]
Subject: [PATCH v4 bpf-next 2/5] bpf: assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO

Allocate ID for vmlinux BTF. This makes it visible when iterating over all BTF
objects in the system. To allow distinguishing vmlinux BTF (and later kernel
module BTF) from user-provided BTFs, expose extra kernel_btf flag, as well as
BTF name ("vmlinux" for vmlinux BTF, will equal to module's name for module
BTF). We might want to later allow specifying BTF name for user-provided BTFs
as well, if that makes sense. But currently this is reserved only for
in-kernel BTFs.

Having in-kernel BTFs exposed IDs will allow to extend BPF APIs that require
in-kernel BTF type with ability to specify BTF types from kernel modules, not
just vmlinux BTF. This will be implemented in a follow up patch set for
fentry/fexit/fmod_ret/lsm/etc.

Signed-off-by: Andrii Nakryiko <[email protected]>
---
include/uapi/linux/bpf.h | 3 +++
kernel/bpf/btf.c | 43 +++++++++++++++++++++++++++++++---
tools/include/uapi/linux/bpf.h | 3 +++
3 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 9879d6793e90..162999b12790 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4466,6 +4466,9 @@ struct bpf_btf_info {
__aligned_u64 btf;
__u32 btf_size;
__u32 id;
+ __aligned_u64 name;
+ __u32 name_len;
+ __u32 kernel_btf;
} __attribute__((aligned(8)));

struct bpf_link_info {
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 727c1c27053f..856585db7aa7 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -214,6 +214,8 @@ struct btf {
struct btf *base_btf;
u32 start_id; /* first type ID in this BTF (0 for base BTF) */
u32 start_str_off; /* first string offset (0 for base BTF) */
+ char name[MODULE_NAME_LEN];
+ bool kernel_btf;
};

enum verifier_phase {
@@ -4429,6 +4431,8 @@ struct btf *btf_parse_vmlinux(void)

btf->data = __start_BTF;
btf->data_size = __stop_BTF - __start_BTF;
+ btf->kernel_btf = true;
+ snprintf(btf->name, sizeof(btf->name), "vmlinux");

err = btf_parse_hdr(env);
if (err)
@@ -4454,8 +4458,13 @@ struct btf *btf_parse_vmlinux(void)

bpf_struct_ops_init(btf, log);

- btf_verifier_env_free(env);
refcount_set(&btf->refcnt, 1);
+
+ err = btf_alloc_id(btf);
+ if (err)
+ goto errout;
+
+ btf_verifier_env_free(env);
return btf;

errout:
@@ -5553,7 +5562,9 @@ int btf_get_info_by_fd(const struct btf *btf,
struct bpf_btf_info info;
u32 info_copy, btf_copy;
void __user *ubtf;
- u32 uinfo_len;
+ char __user *uname;
+ u32 uinfo_len, uname_len, name_len;
+ int ret = 0;

uinfo = u64_to_user_ptr(attr->info.info);
uinfo_len = attr->info.info_len;
@@ -5570,11 +5581,37 @@ int btf_get_info_by_fd(const struct btf *btf,
return -EFAULT;
info.btf_size = btf->data_size;

+ info.kernel_btf = btf->kernel_btf;
+
+ uname = u64_to_user_ptr(info.name);
+ uname_len = info.name_len;
+ if (!uname ^ !uname_len)
+ return -EINVAL;
+
+ name_len = strlen(btf->name);
+ info.name_len = name_len;
+
+ if (uname) {
+ if (uname_len >= name_len + 1) {
+ if (copy_to_user(uname, btf->name, name_len + 1))
+ return -EFAULT;
+ } else {
+ char zero = '\0';
+
+ if (copy_to_user(uname, btf->name, uname_len - 1))
+ return -EFAULT;
+ if (put_user(zero, uname + uname_len - 1))
+ return -EFAULT;
+ /* let user-space know about too short buffer */
+ ret = -ENOSPC;
+ }
+ }
+
if (copy_to_user(uinfo, &info, info_copy) ||
put_user(info_copy, &uattr->info.info_len))
return -EFAULT;

- return 0;
+ return ret;
}

int btf_get_fd_by_id(u32 id)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 9879d6793e90..162999b12790 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4466,6 +4466,9 @@ struct bpf_btf_info {
__aligned_u64 btf;
__u32 btf_size;
__u32 id;
+ __aligned_u64 name;
+ __u32 name_len;
+ __u32 kernel_btf;
} __attribute__((aligned(8)));

struct bpf_link_info {
--
2.24.1

2020-11-10 01:25:14

by Andrii Nakryiko

[permalink] [raw]
Subject: [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support

Adjust in-kernel BTF implementation to support a split BTF mode of operation.
Changes are mostly mirroring libbpf split BTF changes, with the exception of
start_id being 0 for in-kernel implementation due to simpler read-only mode.

Otherwise, for split BTF logic, most of the logic of jumping to base BTF,
where necessary, is encapsulated in few helper functions. Type numbering and
string offset in a split BTF are logically continuing where base BTF ends, so
most of the high-level logic is kept without changes.

Type verification and size resolution is only doing an added resolution of new
split BTF types and relies on already cached size and type resolution results
in the base BTF.

Signed-off-by: Andrii Nakryiko <[email protected]>
---
kernel/bpf/btf.c | 171 +++++++++++++++++++++++++++++++++--------------
1 file changed, 119 insertions(+), 52 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 6324de8c59f7..727c1c27053f 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -203,12 +203,17 @@ struct btf {
const char *strings;
void *nohdr_data;
struct btf_header hdr;
- u32 nr_types;
+ u32 nr_types; /* includes VOID for base BTF */
u32 types_size;
u32 data_size;
refcount_t refcnt;
u32 id;
struct rcu_head rcu;
+
+ /* split BTF support */
+ struct btf *base_btf;
+ u32 start_id; /* first type ID in this BTF (0 for base BTF) */
+ u32 start_str_off; /* first string offset (0 for base BTF) */
};

enum verifier_phase {
@@ -449,14 +454,27 @@ static bool btf_type_is_datasec(const struct btf_type *t)
return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
}

+static u32 btf_nr_types_total(const struct btf *btf)
+{
+ u32 total = 0;
+
+ while (btf) {
+ total += btf->nr_types;
+ btf = btf->base_btf;
+ }
+
+ return total;
+}
+
s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
{
const struct btf_type *t;
const char *tname;
- u32 i;
+ u32 i, total;

- for (i = 1; i <= btf->nr_types; i++) {
- t = btf->types[i];
+ total = btf_nr_types_total(btf);
+ for (i = 1; i < total; i++) {
+ t = btf_type_by_id(btf, i);
if (BTF_INFO_KIND(t->info) != kind)
continue;

@@ -599,8 +617,14 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)

static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
{
- return BTF_STR_OFFSET_VALID(offset) &&
- offset < btf->hdr.str_len;
+ if (!BTF_STR_OFFSET_VALID(offset))
+ return false;
+
+ while (offset < btf->start_str_off)
+ btf = btf->base_btf;
+
+ offset -= btf->start_str_off;
+ return offset < btf->hdr.str_len;
}

static bool __btf_name_char_ok(char c, bool first, bool dot_ok)
@@ -614,10 +638,22 @@ static bool __btf_name_char_ok(char c, bool first, bool dot_ok)
return true;
}

+static const char *btf_str_by_offset(const struct btf *btf, u32 offset)
+{
+ while (offset < btf->start_str_off)
+ btf = btf->base_btf;
+
+ offset -= btf->start_str_off;
+ if (offset < btf->hdr.str_len)
+ return &btf->strings[offset];
+
+ return NULL;
+}
+
static bool __btf_name_valid(const struct btf *btf, u32 offset, bool dot_ok)
{
/* offset must be valid */
- const char *src = &btf->strings[offset];
+ const char *src = btf_str_by_offset(btf, offset);
const char *src_limit;

if (!__btf_name_char_ok(*src, true, dot_ok))
@@ -650,27 +686,28 @@ static bool btf_name_valid_section(const struct btf *btf, u32 offset)

static const char *__btf_name_by_offset(const struct btf *btf, u32 offset)
{
+ const char *name;
+
if (!offset)
return "(anon)";
- else if (offset < btf->hdr.str_len)
- return &btf->strings[offset];
- else
- return "(invalid-name-offset)";
+
+ name = btf_str_by_offset(btf, offset);
+ return name ?: "(invalid-name-offset)";
}

const char *btf_name_by_offset(const struct btf *btf, u32 offset)
{
- if (offset < btf->hdr.str_len)
- return &btf->strings[offset];
-
- return NULL;
+ return btf_str_by_offset(btf, offset);
}

const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id)
{
- if (type_id > btf->nr_types)
- return NULL;
+ while (type_id < btf->start_id)
+ btf = btf->base_btf;

+ type_id -= btf->start_id;
+ if (type_id >= btf->nr_types)
+ return NULL;
return btf->types[type_id];
}

@@ -1390,17 +1427,13 @@ static int btf_add_type(struct btf_verifier_env *env, struct btf_type *t)
{
struct btf *btf = env->btf;

- /* < 2 because +1 for btf_void which is always in btf->types[0].
- * btf_void is not accounted in btf->nr_types because btf_void
- * does not come from the BTF file.
- */
- if (btf->types_size - btf->nr_types < 2) {
+ if (btf->types_size == btf->nr_types) {
/* Expand 'types' array */

struct btf_type **new_types;
u32 expand_by, new_size;

- if (btf->types_size == BTF_MAX_TYPE) {
+ if (btf->start_id + btf->types_size == BTF_MAX_TYPE) {
btf_verifier_log(env, "Exceeded max num of types");
return -E2BIG;
}
@@ -1414,18 +1447,23 @@ static int btf_add_type(struct btf_verifier_env *env, struct btf_type *t)
if (!new_types)
return -ENOMEM;

- if (btf->nr_types == 0)
- new_types[0] = &btf_void;
- else
+ if (btf->nr_types == 0) {
+ if (!btf->base_btf) {
+ /* lazily init VOID type */
+ new_types[0] = &btf_void;
+ btf->nr_types++;
+ }
+ } else {
memcpy(new_types, btf->types,
- sizeof(*btf->types) * (btf->nr_types + 1));
+ sizeof(*btf->types) * btf->nr_types);
+ }

kvfree(btf->types);
btf->types = new_types;
btf->types_size = new_size;
}

- btf->types[++(btf->nr_types)] = t;
+ btf->types[btf->nr_types++] = t;

return 0;
}
@@ -1498,18 +1536,17 @@ static int env_resolve_init(struct btf_verifier_env *env)
u32 *resolved_ids = NULL;
u8 *visit_states = NULL;

- /* +1 for btf_void */
- resolved_sizes = kvcalloc(nr_types + 1, sizeof(*resolved_sizes),
+ resolved_sizes = kvcalloc(nr_types, sizeof(*resolved_sizes),
GFP_KERNEL | __GFP_NOWARN);
if (!resolved_sizes)
goto nomem;

- resolved_ids = kvcalloc(nr_types + 1, sizeof(*resolved_ids),
+ resolved_ids = kvcalloc(nr_types, sizeof(*resolved_ids),
GFP_KERNEL | __GFP_NOWARN);
if (!resolved_ids)
goto nomem;

- visit_states = kvcalloc(nr_types + 1, sizeof(*visit_states),
+ visit_states = kvcalloc(nr_types, sizeof(*visit_states),
GFP_KERNEL | __GFP_NOWARN);
if (!visit_states)
goto nomem;
@@ -1561,21 +1598,27 @@ static bool env_type_is_resolve_sink(const struct btf_verifier_env *env,
static bool env_type_is_resolved(const struct btf_verifier_env *env,
u32 type_id)
{
- return env->visit_states[type_id] == RESOLVED;
+ /* base BTF types should be resolved by now */
+ if (type_id < env->btf->start_id)
+ return true;
+
+ return env->visit_states[type_id - env->btf->start_id] == RESOLVED;
}

static int env_stack_push(struct btf_verifier_env *env,
const struct btf_type *t, u32 type_id)
{
+ const struct btf *btf = env->btf;
struct resolve_vertex *v;

if (env->top_stack == MAX_RESOLVE_DEPTH)
return -E2BIG;

- if (env->visit_states[type_id] != NOT_VISITED)
+ if (type_id < btf->start_id
+ || env->visit_states[type_id - btf->start_id] != NOT_VISITED)
return -EEXIST;

- env->visit_states[type_id] = VISITED;
+ env->visit_states[type_id - btf->start_id] = VISITED;

v = &env->stack[env->top_stack++];
v->t = t;
@@ -1605,6 +1648,7 @@ static void env_stack_pop_resolved(struct btf_verifier_env *env,
u32 type_id = env->stack[--(env->top_stack)].type_id;
struct btf *btf = env->btf;

+ type_id -= btf->start_id; /* adjust to local type id */
btf->resolved_sizes[type_id] = resolved_size;
btf->resolved_ids[type_id] = resolved_type_id;
env->visit_states[type_id] = RESOLVED;
@@ -1709,14 +1753,30 @@ btf_resolve_size(const struct btf *btf, const struct btf_type *type,
return __btf_resolve_size(btf, type, type_size, NULL, NULL, NULL, NULL);
}

+static u32 btf_resolved_type_id(const struct btf *btf, u32 type_id)
+{
+ while (type_id < btf->start_id)
+ btf = btf->base_btf;
+
+ return btf->resolved_ids[type_id - btf->start_id];
+}
+
/* The input param "type_id" must point to a needs_resolve type */
static const struct btf_type *btf_type_id_resolve(const struct btf *btf,
u32 *type_id)
{
- *type_id = btf->resolved_ids[*type_id];
+ *type_id = btf_resolved_type_id(btf, *type_id);
return btf_type_by_id(btf, *type_id);
}

+static u32 btf_resolved_type_size(const struct btf *btf, u32 type_id)
+{
+ while (type_id < btf->start_id)
+ btf = btf->base_btf;
+
+ return btf->resolved_sizes[type_id - btf->start_id];
+}
+
const struct btf_type *btf_type_id_size(const struct btf *btf,
u32 *type_id, u32 *ret_size)
{
@@ -1731,7 +1791,7 @@ const struct btf_type *btf_type_id_size(const struct btf *btf,
if (btf_type_has_size(size_type)) {
size = size_type->size;
} else if (btf_type_is_array(size_type)) {
- size = btf->resolved_sizes[size_type_id];
+ size = btf_resolved_type_size(btf, size_type_id);
} else if (btf_type_is_ptr(size_type)) {
size = sizeof(void *);
} else {
@@ -1739,14 +1799,14 @@ const struct btf_type *btf_type_id_size(const struct btf *btf,
!btf_type_is_var(size_type)))
return NULL;

- size_type_id = btf->resolved_ids[size_type_id];
+ size_type_id = btf_resolved_type_id(btf, size_type_id);
size_type = btf_type_by_id(btf, size_type_id);
if (btf_type_nosize_or_null(size_type))
return NULL;
else if (btf_type_has_size(size_type))
size = size_type->size;
else if (btf_type_is_array(size_type))
- size = btf->resolved_sizes[size_type_id];
+ size = btf_resolved_type_size(btf, size_type_id);
else if (btf_type_is_ptr(size_type))
size = sizeof(void *);
else
@@ -3798,7 +3858,7 @@ static int btf_check_all_metas(struct btf_verifier_env *env)
cur = btf->nohdr_data + hdr->type_off;
end = cur + hdr->type_len;

- env->log_type_id = 1;
+ env->log_type_id = btf->base_btf ? btf->start_id : 1;
while (cur < end) {
struct btf_type *t = cur;
s32 meta_size;
@@ -3825,8 +3885,8 @@ static bool btf_resolve_valid(struct btf_verifier_env *env,
return false;

if (btf_type_is_struct(t) || btf_type_is_datasec(t))
- return !btf->resolved_ids[type_id] &&
- !btf->resolved_sizes[type_id];
+ return !btf_resolved_type_id(btf, type_id) &&
+ !btf_resolved_type_size(btf, type_id);

if (btf_type_is_modifier(t) || btf_type_is_ptr(t) ||
btf_type_is_var(t)) {
@@ -3846,7 +3906,7 @@ static bool btf_resolve_valid(struct btf_verifier_env *env,
elem_type = btf_type_id_size(btf, &elem_type_id, &elem_size);
return elem_type && !btf_type_is_modifier(elem_type) &&
(array->nelems * elem_size ==
- btf->resolved_sizes[type_id]);
+ btf_resolved_type_size(btf, type_id));
}

return false;
@@ -3888,7 +3948,8 @@ static int btf_resolve(struct btf_verifier_env *env,
static int btf_check_all_types(struct btf_verifier_env *env)
{
struct btf *btf = env->btf;
- u32 type_id;
+ const struct btf_type *t;
+ u32 type_id, i;
int err;

err = env_resolve_init(env);
@@ -3896,8 +3957,9 @@ static int btf_check_all_types(struct btf_verifier_env *env)
return err;

env->phase++;
- for (type_id = 1; type_id <= btf->nr_types; type_id++) {
- const struct btf_type *t = btf_type_by_id(btf, type_id);
+ for (i = btf->base_btf ? 0 : 1; i < btf->nr_types; i++) {
+ type_id = btf->start_id + i;
+ t = btf_type_by_id(btf, type_id);

env->log_type_id = type_id;
if (btf_type_needs_resolve(t) &&
@@ -3934,7 +3996,7 @@ static int btf_parse_type_sec(struct btf_verifier_env *env)
return -EINVAL;
}

- if (!hdr->type_len) {
+ if (!env->btf->base_btf && !hdr->type_len) {
btf_verifier_log(env, "No type found");
return -EINVAL;
}
@@ -3961,13 +4023,18 @@ static int btf_parse_str_sec(struct btf_verifier_env *env)
return -EINVAL;
}

- if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_NAME_OFFSET ||
- start[0] || end[-1]) {
+ btf->strings = start;
+
+ if (btf->base_btf && !hdr->str_len)
+ return 0;
+ if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_NAME_OFFSET || end[-1]) {
+ btf_verifier_log(env, "Invalid string section");
+ return -EINVAL;
+ }
+ if (!btf->base_btf && start[0]) {
btf_verifier_log(env, "Invalid string section");
return -EINVAL;
}
-
- btf->strings = start;

return 0;
}
@@ -4908,7 +4975,7 @@ static int __get_type_size(struct btf *btf, u32 btf_id,
while (t && btf_type_is_modifier(t))
t = btf_type_by_id(btf, t->type);
if (!t) {
- *bad_type = btf->types[0];
+ *bad_type = btf_type_by_id(btf, 0);
return -EINVAL;
}
if (btf_type_is_ptr(t))
--
2.24.1

2020-11-10 01:25:47

by Andrii Nakryiko

[permalink] [raw]
Subject: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs

Add kernel module listener that will load/validate and unload module BTF.
Module BTFs gets ID generated for them, which makes it possible to iterate
them with existing BTF iteration API. They are given their respective module's
names, which will get reported through GET_OBJ_INFO API. They are also marked
as in-kernel BTFs for tooling to distinguish them from user-provided BTFs.

Also, similarly to vmlinux BTF, kernel module BTFs are exposed through
sysfs as /sys/kernel/btf/<module-name>. This is convenient for user-space
tools to inspect module BTF contents and dump their types with existing tools:

[vmuser@archvm bpf]$ ls -la /sys/kernel/btf
total 0
drwxr-xr-x 2 root root 0 Nov 4 19:46 .
drwxr-xr-x 13 root root 0 Nov 4 19:46 ..

...

-r--r--r-- 1 root root 888 Nov 4 19:46 irqbypass
-r--r--r-- 1 root root 100225 Nov 4 19:46 kvm
-r--r--r-- 1 root root 35401 Nov 4 19:46 kvm_intel
-r--r--r-- 1 root root 120 Nov 4 19:46 pcspkr
-r--r--r-- 1 root root 399 Nov 4 19:46 serio_raw
-r--r--r-- 1 root root 4094095 Nov 4 19:46 vmlinux

Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
---
Documentation/ABI/testing/sysfs-kernel-btf | 8 +
include/linux/bpf.h | 2 +
include/linux/module.h | 4 +
kernel/bpf/btf.c | 194 +++++++++++++++++++++
kernel/bpf/sysfs_btf.c | 2 +-
kernel/module.c | 32 ++++
6 files changed, 241 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-btf b/Documentation/ABI/testing/sysfs-kernel-btf
index 2c9744b2cd59..fe96efdc9b6c 100644
--- a/Documentation/ABI/testing/sysfs-kernel-btf
+++ b/Documentation/ABI/testing/sysfs-kernel-btf
@@ -15,3 +15,11 @@ Description:
information with description of all internal kernel types. See
Documentation/bpf/btf.rst for detailed description of format
itself.
+
+What: /sys/kernel/btf/<module-name>
+Date: Nov 2020
+KernelVersion: 5.11
+Contact: [email protected]
+Description:
+ Read-only binary attribute exposing kernel module's BTF type
+ information as an add-on to the kernel's BTF (/sys/kernel/btf/vmlinux).
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 73d5381a5d5c..581b2a2e78eb 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -36,9 +36,11 @@ struct seq_operations;
struct bpf_iter_aux_info;
struct bpf_local_storage;
struct bpf_local_storage_map;
+struct kobject;

extern struct idr btf_idr;
extern spinlock_t btf_idr_lock;
+extern struct kobject *btf_kobj;

typedef int (*bpf_iter_init_seq_priv_t)(void *private_data,
struct bpf_iter_aux_info *aux);
diff --git a/include/linux/module.h b/include/linux/module.h
index a29187f7c360..20fce258ffba 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -475,6 +475,10 @@ struct module {
unsigned int num_bpf_raw_events;
struct bpf_raw_event_map *bpf_raw_events;
#endif
+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
+ unsigned int btf_data_size;
+ void *btf_data;
+#endif
#ifdef CONFIG_JUMP_LABEL
struct jump_entry *jump_entries;
unsigned int num_jump_entries;
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 856585db7aa7..0f1fd2669d69 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -22,6 +22,8 @@
#include <linux/skmsg.h>
#include <linux/perf_event.h>
#include <linux/bsearch.h>
+#include <linux/kobject.h>
+#include <linux/sysfs.h>
#include <net/sock.h>

/* BTF (BPF Type Format) is the meta data format which describes
@@ -4476,6 +4478,75 @@ struct btf *btf_parse_vmlinux(void)
return ERR_PTR(err);
}

+static struct btf *btf_parse_module(const char *module_name, const void *data, unsigned int data_size)
+{
+ struct btf_verifier_env *env = NULL;
+ struct bpf_verifier_log *log;
+ struct btf *btf = NULL, *base_btf;
+ int err;
+
+ base_btf = bpf_get_btf_vmlinux();
+ if (IS_ERR(base_btf))
+ return base_btf;
+ if (!base_btf)
+ return ERR_PTR(-EINVAL);
+
+ env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN);
+ if (!env)
+ return ERR_PTR(-ENOMEM);
+
+ log = &env->log;
+ log->level = BPF_LOG_KERNEL;
+
+ btf = kzalloc(sizeof(*btf), GFP_KERNEL | __GFP_NOWARN);
+ if (!btf) {
+ err = -ENOMEM;
+ goto errout;
+ }
+ env->btf = btf;
+
+ btf->base_btf = base_btf;
+ btf->start_id = base_btf->nr_types;
+ btf->start_str_off = base_btf->hdr.str_len;
+ btf->kernel_btf = true;
+ snprintf(btf->name, sizeof(btf->name), "%s", module_name);
+
+ btf->data = kvmalloc(data_size, GFP_KERNEL | __GFP_NOWARN);
+ if (!btf->data) {
+ err = -ENOMEM;
+ goto errout;
+ }
+ memcpy(btf->data, data, data_size);
+ btf->data_size = data_size;
+
+ err = btf_parse_hdr(env);
+ if (err)
+ goto errout;
+
+ btf->nohdr_data = btf->data + btf->hdr.hdr_len;
+
+ err = btf_parse_str_sec(env);
+ if (err)
+ goto errout;
+
+ err = btf_check_all_metas(env);
+ if (err)
+ goto errout;
+
+ btf_verifier_env_free(env);
+ refcount_set(&btf->refcnt, 1);
+ return btf;
+
+errout:
+ btf_verifier_env_free(env);
+ if (btf) {
+ kvfree(btf->data);
+ kvfree(btf->types);
+ kfree(btf);
+ }
+ return ERR_PTR(err);
+}
+
struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog)
{
struct bpf_prog *tgt_prog = prog->aux->dst_prog;
@@ -5651,3 +5722,126 @@ bool btf_id_set_contains(const struct btf_id_set *set, u32 id)
{
return bsearch(&id, set->ids, set->cnt, sizeof(u32), btf_id_cmp_func) != NULL;
}
+
+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
+struct btf_module {
+ struct list_head list;
+ struct module *module;
+ struct btf *btf;
+ struct bin_attribute *sysfs_attr;
+};
+
+static LIST_HEAD(btf_modules);
+static DEFINE_MUTEX(btf_module_mutex);
+
+static ssize_t
+btf_module_read(struct file *file, struct kobject *kobj,
+ struct bin_attribute *bin_attr,
+ char *buf, loff_t off, size_t len)
+{
+ const struct btf *btf = bin_attr->private;
+
+ memcpy(buf, btf->data + off, len);
+ return len;
+}
+
+static int btf_module_notify(struct notifier_block *nb, unsigned long op,
+ void *module)
+{
+ struct btf_module *btf_mod, *tmp;
+ struct module *mod = module;
+ struct btf *btf;
+ int err = 0;
+
+ if (mod->btf_data_size == 0 ||
+ (op != MODULE_STATE_COMING && op != MODULE_STATE_GOING))
+ goto out;
+
+ switch (op) {
+ case MODULE_STATE_COMING:
+ btf_mod = kzalloc(sizeof(*btf_mod), GFP_KERNEL);
+ if (!btf_mod) {
+ err = -ENOMEM;
+ goto out;
+ }
+ btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size);
+ if (IS_ERR(btf)) {
+ pr_warn("failed to validate module [%s] BTF: %ld\n",
+ mod->name, PTR_ERR(btf));
+ kfree(btf_mod);
+ err = PTR_ERR(btf);
+ goto out;
+ }
+ err = btf_alloc_id(btf);
+ if (err) {
+ btf_free(btf);
+ kfree(btf_mod);
+ goto out;
+ }
+
+ mutex_lock(&btf_module_mutex);
+ btf_mod->module = module;
+ btf_mod->btf = btf;
+ list_add(&btf_mod->list, &btf_modules);
+ mutex_unlock(&btf_module_mutex);
+
+ if (IS_ENABLED(CONFIG_SYSFS)) {
+ struct bin_attribute *attr;
+
+ attr = kzalloc(sizeof(*attr), GFP_KERNEL);
+ if (!attr)
+ goto out;
+
+ sysfs_bin_attr_init(attr);
+ attr->attr.name = btf->name;
+ attr->attr.mode = 0444;
+ attr->size = btf->data_size;
+ attr->private = btf;
+ attr->read = btf_module_read;
+
+ err = sysfs_create_bin_file(btf_kobj, attr);
+ if (err) {
+ pr_warn("failed to register module [%s] BTF in sysfs: %d\n",
+ mod->name, err);
+ kfree(attr);
+ err = 0;
+ goto out;
+ }
+
+ btf_mod->sysfs_attr = attr;
+ }
+
+ break;
+ case MODULE_STATE_GOING:
+ mutex_lock(&btf_module_mutex);
+ list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) {
+ if (btf_mod->module != module)
+ continue;
+
+ list_del(&btf_mod->list);
+ if (btf_mod->sysfs_attr)
+ sysfs_remove_bin_file(btf_kobj, btf_mod->sysfs_attr);
+ btf_put(btf_mod->btf);
+ kfree(btf_mod->sysfs_attr);
+ kfree(btf_mod);
+ break;
+ }
+ mutex_unlock(&btf_module_mutex);
+ break;
+ }
+out:
+ return notifier_from_errno(err);
+}
+
+static struct notifier_block btf_module_nb = {
+ .notifier_call = btf_module_notify,
+};
+
+static int __init btf_module_init(void)
+{
+ register_module_notifier(&btf_module_nb);
+ return 0;
+}
+
+fs_initcall(btf_module_init);
+#endif /* CONFIG_DEBUG_INFO_BTF_MODULES */
diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c
index 11b3380887fa..ef6911aee3bb 100644
--- a/kernel/bpf/sysfs_btf.c
+++ b/kernel/bpf/sysfs_btf.c
@@ -26,7 +26,7 @@ static struct bin_attribute bin_attr_btf_vmlinux __ro_after_init = {
.read = btf_vmlinux_read,
};

-static struct kobject *btf_kobj;
+struct kobject *btf_kobj;

static int __init btf_vmlinux_init(void)
{
diff --git a/kernel/module.c b/kernel/module.c
index a4fa44a652a7..f2996b02ab2e 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info,
return (void *)info->sechdrs[sec].sh_addr;
}

+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */
+static unsigned int find_any_sec(const struct load_info *info, const char *name)
+{
+ unsigned int i;
+
+ for (i = 1; i < info->hdr->e_shnum; i++) {
+ Elf_Shdr *shdr = &info->sechdrs[i];
+ if (strcmp(info->secstrings + shdr->sh_name, name) == 0)
+ return i;
+ }
+ return 0;
+}
+
+/*
+ * Find a module section, or NULL. Fill in number of "objects" in section.
+ * Ignores SHF_ALLOC flag.
+ */
+static __maybe_unused void *any_section_objs(const struct load_info *info,
+ const char *name,
+ size_t object_size,
+ unsigned int *num)
+{
+ unsigned int sec = find_any_sec(info, name);
+
+ /* Section 0 has sh_addr 0 and sh_size 0. */
+ *num = info->sechdrs[sec].sh_size / object_size;
+ return (void *)info->sechdrs[sec].sh_addr;
+}
+
/* Provided by the linker */
extern const struct kernel_symbol __start___ksymtab[];
extern const struct kernel_symbol __stop___ksymtab[];
@@ -3250,6 +3279,9 @@ static int find_module_sections(struct module *mod, struct load_info *info)
sizeof(*mod->bpf_raw_events),
&mod->num_bpf_raw_events);
#endif
+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
+ mod->btf_data = any_section_objs(info, ".BTF", 1, &mod->btf_data_size);
+#endif
#ifdef CONFIG_JUMP_LABEL
mod->jump_entries = section_objs(info, "__jump_table",
sizeof(*mod->jump_entries),
--
2.24.1

2020-11-10 17:54:38

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support



> On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <[email protected]> wrote:
>
> Adjust in-kernel BTF implementation to support a split BTF mode of operation.
> Changes are mostly mirroring libbpf split BTF changes, with the exception of
> start_id being 0 for in-kernel implementation due to simpler read-only mode.
>
> Otherwise, for split BTF logic, most of the logic of jumping to base BTF,
> where necessary, is encapsulated in few helper functions. Type numbering and
> string offset in a split BTF are logically continuing where base BTF ends, so
> most of the high-level logic is kept without changes.
>
> Type verification and size resolution is only doing an added resolution of new
> split BTF types and relies on already cached size and type resolution results
> in the base BTF.
>
> Signed-off-by: Andrii Nakryiko <[email protected]>
> ---
> kernel/bpf/btf.c | 171 +++++++++++++++++++++++++++++++++--------------
> 1 file changed, 119 insertions(+), 52 deletions(-)
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 6324de8c59f7..727c1c27053f 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -203,12 +203,17 @@ struct btf {
> const char *strings;
> void *nohdr_data;
> struct btf_header hdr;
> - u32 nr_types;
> + u32 nr_types; /* includes VOID for base BTF */
> u32 types_size;
> u32 data_size;
> refcount_t refcnt;
> u32 id;
> struct rcu_head rcu;
> +
> + /* split BTF support */
> + struct btf *base_btf;
> + u32 start_id; /* first type ID in this BTF (0 for base BTF) */
> + u32 start_str_off; /* first string offset (0 for base BTF) */
> };
>
> enum verifier_phase {
> @@ -449,14 +454,27 @@ static bool btf_type_is_datasec(const struct btf_type *t)
> return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
> }
>
> +static u32 btf_nr_types_total(const struct btf *btf)
> +{
> + u32 total = 0;
> +
> + while (btf) {
> + total += btf->nr_types;
> + btf = btf->base_btf;
> + }
> +
> + return total;
> +}
> +
> s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> {
> const struct btf_type *t;
> const char *tname;
> - u32 i;
> + u32 i, total;
>
> - for (i = 1; i <= btf->nr_types; i++) {
> - t = btf->types[i];
> + total = btf_nr_types_total(btf);
> + for (i = 1; i < total; i++) {
> + t = btf_type_by_id(btf, i);
> if (BTF_INFO_KIND(t->info) != kind)
> continue;
>
> @@ -599,8 +617,14 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
>
> static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
> {
> - return BTF_STR_OFFSET_VALID(offset) &&
> - offset < btf->hdr.str_len;
> + if (!BTF_STR_OFFSET_VALID(offset))
> + return false;
> +
> + while (offset < btf->start_str_off)
> + btf = btf->base_btf;

Do we need "if (!btf) return false;" in the while loop? (and some other loops below)

[...]

2020-11-10 18:35:58

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support

On Tue, Nov 10, 2020 at 9:50 AM Song Liu <[email protected]> wrote:
>
>
>
> > On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <[email protected]> wrote:
> >
> > Adjust in-kernel BTF implementation to support a split BTF mode of operation.
> > Changes are mostly mirroring libbpf split BTF changes, with the exception of
> > start_id being 0 for in-kernel implementation due to simpler read-only mode.
> >
> > Otherwise, for split BTF logic, most of the logic of jumping to base BTF,
> > where necessary, is encapsulated in few helper functions. Type numbering and
> > string offset in a split BTF are logically continuing where base BTF ends, so
> > most of the high-level logic is kept without changes.
> >
> > Type verification and size resolution is only doing an added resolution of new
> > split BTF types and relies on already cached size and type resolution results
> > in the base BTF.
> >
> > Signed-off-by: Andrii Nakryiko <[email protected]>
> > ---
> > kernel/bpf/btf.c | 171 +++++++++++++++++++++++++++++++++--------------
> > 1 file changed, 119 insertions(+), 52 deletions(-)
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 6324de8c59f7..727c1c27053f 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -203,12 +203,17 @@ struct btf {
> > const char *strings;
> > void *nohdr_data;
> > struct btf_header hdr;
> > - u32 nr_types;
> > + u32 nr_types; /* includes VOID for base BTF */
> > u32 types_size;
> > u32 data_size;
> > refcount_t refcnt;
> > u32 id;
> > struct rcu_head rcu;
> > +
> > + /* split BTF support */
> > + struct btf *base_btf;
> > + u32 start_id; /* first type ID in this BTF (0 for base BTF) */
> > + u32 start_str_off; /* first string offset (0 for base BTF) */
> > };
> >
> > enum verifier_phase {
> > @@ -449,14 +454,27 @@ static bool btf_type_is_datasec(const struct btf_type *t)
> > return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
> > }
> >
> > +static u32 btf_nr_types_total(const struct btf *btf)
> > +{
> > + u32 total = 0;
> > +
> > + while (btf) {
> > + total += btf->nr_types;
> > + btf = btf->base_btf;
> > + }
> > +
> > + return total;
> > +}
> > +
> > s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
> > {
> > const struct btf_type *t;
> > const char *tname;
> > - u32 i;
> > + u32 i, total;
> >
> > - for (i = 1; i <= btf->nr_types; i++) {
> > - t = btf->types[i];
> > + total = btf_nr_types_total(btf);
> > + for (i = 1; i < total; i++) {
> > + t = btf_type_by_id(btf, i);
> > if (BTF_INFO_KIND(t->info) != kind)
> > continue;
> >
> > @@ -599,8 +617,14 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
> >
> > static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
> > {
> > - return BTF_STR_OFFSET_VALID(offset) &&
> > - offset < btf->hdr.str_len;
> > + if (!BTF_STR_OFFSET_VALID(offset))
> > + return false;
> > +
> > + while (offset < btf->start_str_off)
> > + btf = btf->base_btf;
>
> Do we need "if (!btf) return false;" in the while loop? (and some other loops below)

No, because for base btf start_str_off and start_type_id are always
zero, so loop condition is always false.

>
> [...]
>

2020-11-10 20:21:21

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v4 bpf-next 1/5] bpf: add in-kernel split BTF support



> On Nov 10, 2020, at 10:31 AM, Andrii Nakryiko <[email protected]> wrote:
>
> On Tue, Nov 10, 2020 at 9:50 AM Song Liu <[email protected]> wrote:
>>
>>
>>
>>> On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <[email protected]> wrote:
>>>
>>> Adjust in-kernel BTF implementation to support a split BTF mode of operation.
>>> Changes are mostly mirroring libbpf split BTF changes, with the exception of
>>> start_id being 0 for in-kernel implementation due to simpler read-only mode.
>>>
>>> Otherwise, for split BTF logic, most of the logic of jumping to base BTF,
>>> where necessary, is encapsulated in few helper functions. Type numbering and
>>> string offset in a split BTF are logically continuing where base BTF ends, so
>>> most of the high-level logic is kept without changes.
>>>
>>> Type verification and size resolution is only doing an added resolution of new
>>> split BTF types and relies on already cached size and type resolution results
>>> in the base BTF.
>>>
>>> Signed-off-by: Andrii Nakryiko <[email protected]>
>>> ---
>>> kernel/bpf/btf.c | 171 +++++++++++++++++++++++++++++++++--------------
>>> 1 file changed, 119 insertions(+), 52 deletions(-)
>>>
>>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>>> index 6324de8c59f7..727c1c27053f 100644
>>> --- a/kernel/bpf/btf.c
>>> +++ b/kernel/bpf/btf.c
>>> @@ -203,12 +203,17 @@ struct btf {
>>> const char *strings;
>>> void *nohdr_data;
>>> struct btf_header hdr;
>>> - u32 nr_types;
>>> + u32 nr_types; /* includes VOID for base BTF */
>>> u32 types_size;
>>> u32 data_size;
>>> refcount_t refcnt;
>>> u32 id;
>>> struct rcu_head rcu;
>>> +
>>> + /* split BTF support */
>>> + struct btf *base_btf;
>>> + u32 start_id; /* first type ID in this BTF (0 for base BTF) */
>>> + u32 start_str_off; /* first string offset (0 for base BTF) */
>>> };
>>>
>>> enum verifier_phase {
>>> @@ -449,14 +454,27 @@ static bool btf_type_is_datasec(const struct btf_type *t)
>>> return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
>>> }
>>>
>>> +static u32 btf_nr_types_total(const struct btf *btf)
>>> +{
>>> + u32 total = 0;
>>> +
>>> + while (btf) {
>>> + total += btf->nr_types;
>>> + btf = btf->base_btf;
>>> + }
>>> +
>>> + return total;
>>> +}
>>> +
>>> s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
>>> {
>>> const struct btf_type *t;
>>> const char *tname;
>>> - u32 i;
>>> + u32 i, total;
>>>
>>> - for (i = 1; i <= btf->nr_types; i++) {
>>> - t = btf->types[i];
>>> + total = btf_nr_types_total(btf);
>>> + for (i = 1; i < total; i++) {
>>> + t = btf_type_by_id(btf, i);
>>> if (BTF_INFO_KIND(t->info) != kind)
>>> continue;
>>>
>>> @@ -599,8 +617,14 @@ static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
>>>
>>> static bool btf_name_offset_valid(const struct btf *btf, u32 offset)
>>> {
>>> - return BTF_STR_OFFSET_VALID(offset) &&
>>> - offset < btf->hdr.str_len;
>>> + if (!BTF_STR_OFFSET_VALID(offset))
>>> + return false;
>>> +
>>> + while (offset < btf->start_str_off)
>>> + btf = btf->base_btf;
>>
>> Do we need "if (!btf) return false;" in the while loop? (and some other loops below)
>
> No, because for base btf start_str_off and start_type_id are always
> zero, so loop condition is always false.

Ah, I misread the code. Thanks for the explanation.

Acked-by: Song Liu <[email protected]>

2020-11-10 20:26:48

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH v4 bpf-next 2/5] bpf: assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO



> On Nov 9, 2020, at 5:19 PM, Andrii Nakryiko <[email protected]> wrote:
>
> Allocate ID for vmlinux BTF. This makes it visible when iterating over all BTF
> objects in the system. To allow distinguishing vmlinux BTF (and later kernel
> module BTF) from user-provided BTFs, expose extra kernel_btf flag, as well as
> BTF name ("vmlinux" for vmlinux BTF, will equal to module's name for module
> BTF). We might want to later allow specifying BTF name for user-provided BTFs
> as well, if that makes sense. But currently this is reserved only for
> in-kernel BTFs.
>
> Having in-kernel BTFs exposed IDs will allow to extend BPF APIs that require
> in-kernel BTF type with ability to specify BTF types from kernel modules, not
> just vmlinux BTF. This will be implemented in a follow up patch set for
> fentry/fexit/fmod_ret/lsm/etc.
>
> Signed-off-by: Andrii Nakryiko <[email protected]>

Acked-by: Song Liu <[email protected]>

2020-11-11 07:14:58

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs

Hi Andrii,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url: https://github.com/0day-ci/linux/commits/Andrii-Nakryiko/Integrate-kernel-module-BTF-support/20201110-095309
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: mips-randconfig-r002-20201110 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 4d81c8adb6ed9840257f6cb6b93f60856d422a15)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install mips cross compiling tool for clang build
# apt-get install binutils-mips-linux-gnu
# https://github.com/0day-ci/linux/commit/dcd763b7808fdc01ebf70bbe07ba92388df4d20d
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Andrii-Nakryiko/Integrate-kernel-module-BTF-support/20201110-095309
git checkout dcd763b7808fdc01ebf70bbe07ba92388df4d20d
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=mips

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> kernel/bpf/btf.c:4481:20: warning: unused function 'btf_parse_module'
static struct btf char const void unsigned int data_size)
^
fatal error: error in backend: Nested variants found in inline asm string: ' .set push
.set mips64r2
.if ( 0x00 ) != -1)) 0x00 ) != -1)) : ($( static struct ftrace_branch_data __attribute__((__aligned__(4))) __attribute__((__section__("_ftrace_branch"))) __if_trace = $( .func = __func__, .file = "arch/mips/include/asm/atomic.h", .line = 154, $); 0x00 ) != -1)) : $))) ) && ( 0 ); .set push; .set mips64r2; .rept 1; sync 0x00; .endr; .set pop; .else; ; .endif
1: ll $1, $2 # atomic_fetch_sub
subu $0, $1, $3
sc $0, $2
beqz $0, 1b
.set pop
move $0, $1
'
clang-12: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 12.0.0 (git://gitmirror/llvm_project 874b0a0b9db93f5d3350ffe6b5efda2d908415d0)
Target: mipsel-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/cross/clang-874b0a0b9d/bin
clang-12: note: diagnostic msg:
Makefile arch drivers include kernel mm scripts source usr

vim +/btf_parse_module +4481 kernel/bpf/btf.c

4480
> 4481 static struct btf *btf_parse_module(const char *module_name, const void *data, unsigned int data_size)
4482 {
4483 struct btf_verifier_env *env = NULL;
4484 struct bpf_verifier_log *log;
4485 struct btf *btf = NULL, *base_btf;
4486 int err;
4487
4488 base_btf = bpf_get_btf_vmlinux();
4489 if (IS_ERR(base_btf))
4490 return base_btf;
4491 if (!base_btf)
4492 return ERR_PTR(-EINVAL);
4493
4494 env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN);
4495 if (!env)
4496 return ERR_PTR(-ENOMEM);
4497
4498 log = &env->log;
4499 log->level = BPF_LOG_KERNEL;
4500
4501 btf = kzalloc(sizeof(*btf), GFP_KERNEL | __GFP_NOWARN);
4502 if (!btf) {
4503 err = -ENOMEM;
4504 goto errout;
4505 }
4506 env->btf = btf;
4507
4508 btf->base_btf = base_btf;
4509 btf->start_id = base_btf->nr_types;
4510 btf->start_str_off = base_btf->hdr.str_len;
4511 btf->kernel_btf = true;
4512 snprintf(btf->name, sizeof(btf->name), "%s", module_name);
4513
4514 btf->data = kvmalloc(data_size, GFP_KERNEL | __GFP_NOWARN);
4515 if (!btf->data) {
4516 err = -ENOMEM;
4517 goto errout;
4518 }
4519 memcpy(btf->data, data, data_size);
4520 btf->data_size = data_size;
4521
4522 err = btf_parse_hdr(env);
4523 if (err)
4524 goto errout;
4525
4526 btf->nohdr_data = btf->data + btf->hdr.hdr_len;
4527
4528 err = btf_parse_str_sec(env);
4529 if (err)
4530 goto errout;
4531
4532 err = btf_check_all_metas(env);
4533 if (err)
4534 goto errout;
4535
4536 btf_verifier_env_free(env);
4537 refcount_set(&btf->refcnt, 1);
4538 return btf;
4539
4540 errout:
4541 btf_verifier_env_free(env);
4542 if (btf) {
4543 kvfree(btf->data);
4544 kvfree(btf->types);
4545 kfree(btf);
4546 }
4547 return ERR_PTR(err);
4548 }
4549

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (4.58 kB)
.config.gz (30.17 kB)
Download all attachments

2020-11-11 10:15:14

by Jessica Yu

[permalink] [raw]
Subject: Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs

+++ Andrii Nakryiko [09/11/20 17:19 -0800]:
[snipped]
>diff --git a/kernel/module.c b/kernel/module.c
>index a4fa44a652a7..f2996b02ab2e 100644
>--- a/kernel/module.c
>+++ b/kernel/module.c
>@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info,
> return (void *)info->sechdrs[sec].sh_addr;
> }
>
>+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */
>+static unsigned int find_any_sec(const struct load_info *info, const char *name)
>+{
>+ unsigned int i;
>+
>+ for (i = 1; i < info->hdr->e_shnum; i++) {
>+ Elf_Shdr *shdr = &info->sechdrs[i];
>+ if (strcmp(info->secstrings + shdr->sh_name, name) == 0)
>+ return i;
>+ }
>+ return 0;
>+}
>+
>+/*
>+ * Find a module section, or NULL. Fill in number of "objects" in section.
>+ * Ignores SHF_ALLOC flag.
>+ */
>+static __maybe_unused void *any_section_objs(const struct load_info *info,
>+ const char *name,
>+ size_t object_size,
>+ unsigned int *num)
>+{
>+ unsigned int sec = find_any_sec(info, name);
>+
>+ /* Section 0 has sh_addr 0 and sh_size 0. */
>+ *num = info->sechdrs[sec].sh_size / object_size;
>+ return (void *)info->sechdrs[sec].sh_addr;
>+}
>+

Hm, I see this patchset has already been applied to bpf-next, but I
guess that doesn't preclude any follow-up patches :-)

I am not a huge fan of the code duplication here, and also the fact
that they're only called in one place. any_section_objs() and
find_any_sec() are pretty much identical to section_objs() and
find_sec(), other than the fact the former drops the SHF_ALLOC check.

Moreover, since it appears that the ".BTF" section is not marked
SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer
after the module is done loading and the module's load_info has been
deallocated, since SHF_ALLOC sections are not allocated nor copied to
the module's final location in memory.

Why not simply mark the ".BTF" section in the module SHF_ALLOC? We
already do some sh_flags rewriting in rewrite_section_headers(). Then
the module loader knows to keep the section in memory and you can use
section_objs(). And since the .BTF section stays in module memory,
that might save you the memcpy() to btf->data in btf_parse_module()
(unless that is still needed for some reason).

Thanks,

Jessica

> /* Provided by the linker */
> extern const struct kernel_symbol __start___ksymtab[];
> extern const struct kernel_symbol __stop___ksymtab[];
>@@ -3250,6 +3279,9 @@ static int find_module_sections(struct module *mod, struct load_info *info)
> sizeof(*mod->bpf_raw_events),
> &mod->num_bpf_raw_events);
> #endif
>+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
>+ mod->btf_data = any_section_objs(info, ".BTF", 1, &mod->btf_data_size);
>+#endif
> #ifdef CONFIG_JUMP_LABEL
> mod->jump_entries = section_objs(info, "__jump_table",
> sizeof(*mod->jump_entries),
>--
>2.24.1
>

2020-11-11 20:14:30

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs

On Wed, Nov 11, 2020 at 2:13 AM Jessica Yu <[email protected]> wrote:
>
> +++ Andrii Nakryiko [09/11/20 17:19 -0800]:
> [snipped]
> >diff --git a/kernel/module.c b/kernel/module.c
> >index a4fa44a652a7..f2996b02ab2e 100644
> >--- a/kernel/module.c
> >+++ b/kernel/module.c
> >@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info,
> > return (void *)info->sechdrs[sec].sh_addr;
> > }
> >
> >+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */
> >+static unsigned int find_any_sec(const struct load_info *info, const char *name)
> >+{
> >+ unsigned int i;
> >+
> >+ for (i = 1; i < info->hdr->e_shnum; i++) {
> >+ Elf_Shdr *shdr = &info->sechdrs[i];
> >+ if (strcmp(info->secstrings + shdr->sh_name, name) == 0)
> >+ return i;
> >+ }
> >+ return 0;
> >+}
> >+
> >+/*
> >+ * Find a module section, or NULL. Fill in number of "objects" in section.
> >+ * Ignores SHF_ALLOC flag.
> >+ */
> >+static __maybe_unused void *any_section_objs(const struct load_info *info,
> >+ const char *name,
> >+ size_t object_size,
> >+ unsigned int *num)
> >+{
> >+ unsigned int sec = find_any_sec(info, name);
> >+
> >+ /* Section 0 has sh_addr 0 and sh_size 0. */
> >+ *num = info->sechdrs[sec].sh_size / object_size;
> >+ return (void *)info->sechdrs[sec].sh_addr;
> >+}
> >+
>
> Hm, I see this patchset has already been applied to bpf-next, but I
> guess that doesn't preclude any follow-up patches :-)

Of course!

>
> I am not a huge fan of the code duplication here, and also the fact
> that they're only called in one place. any_section_objs() and
> find_any_sec() are pretty much identical to section_objs() and
> find_sec(), other than the fact the former drops the SHF_ALLOC check.

Right, but the alternative was to add a new flag to existing
section_objs() and find_sec() functions, which would cause much more
code churn for no good reason (besides saving some trivial code
duplication). And those true/false flags are harder to read in code
anyways.

>
> Moreover, since it appears that the ".BTF" section is not marked
> SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer
> after the module is done loading and the module's load_info has been
> deallocated, since SHF_ALLOC sections are not allocated nor copied to
> the module's final location in memory.

I can make sure that we also reset the btf_data pointer back to NULL,
if that's a big concern.

>
> Why not simply mark the ".BTF" section in the module SHF_ALLOC? We
> already do some sh_flags rewriting in rewrite_section_headers(). Then
> the module loader knows to keep the section in memory and you can use
> section_objs(). And since the .BTF section stays in module memory,
> that might save you the memcpy() to btf->data in btf_parse_module()
> (unless that is still needed for some reason).

Wasn't aware about rewrite_section_headers() manipulations. Are you
suggesting to just add SHF_ALLOC there for the .BTF section from the
kernel side? I guess that would work, but won't avoid memory copy (so
actually would waste kernel memory, if I understand correctly). The
reason being that the module's BTF is registered as an independently
ref-counted BTF object, which could be held past the kernel module
being unloaded. So I can't directly reference module's .BTF data
anyways.

Also, marking .BTF with SHF_ALLOC with pahole or objcopy tool actually
might generate warnings because SHF_ALLOC sections need to be
allocated to data segments, which neither of those tools know how to
do, it requires a linker support. We do that for vmlinux with extra
linker script logic, but for kernel modules we don't have and probably
don't want to do that.

So in the end, the cleanest approach still seems like not doing
SHF_ALLOC but allowing "capturing" .BTF data with an extra helper.

>
> Thanks,
>
> Jessica
>
> > /* Provided by the linker */
> > extern const struct kernel_symbol __start___ksymtab[];
> > extern const struct kernel_symbol __stop___ksymtab[];
> >@@ -3250,6 +3279,9 @@ static int find_module_sections(struct module *mod, struct load_info *info)
> > sizeof(*mod->bpf_raw_events),
> > &mod->num_bpf_raw_events);
> > #endif
> >+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
> >+ mod->btf_data = any_section_objs(info, ".BTF", 1, &mod->btf_data_size);
> >+#endif
> > #ifdef CONFIG_JUMP_LABEL
> > mod->jump_entries = section_objs(info, "__jump_table",
> > sizeof(*mod->jump_entries),
> >--
> >2.24.1
> >

2020-11-13 10:34:22

by Jessica Yu

[permalink] [raw]
Subject: Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs

+++ Andrii Nakryiko [11/11/20 12:11 -0800]:
>On Wed, Nov 11, 2020 at 2:13 AM Jessica Yu <[email protected]> wrote:
>>
>> +++ Andrii Nakryiko [09/11/20 17:19 -0800]:
>> [snipped]
>> >diff --git a/kernel/module.c b/kernel/module.c
>> >index a4fa44a652a7..f2996b02ab2e 100644
>> >--- a/kernel/module.c
>> >+++ b/kernel/module.c
>> >@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info,
>> > return (void *)info->sechdrs[sec].sh_addr;
>> > }
>> >
>> >+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */
>> >+static unsigned int find_any_sec(const struct load_info *info, const char *name)
>> >+{
>> >+ unsigned int i;
>> >+
>> >+ for (i = 1; i < info->hdr->e_shnum; i++) {
>> >+ Elf_Shdr *shdr = &info->sechdrs[i];
>> >+ if (strcmp(info->secstrings + shdr->sh_name, name) == 0)
>> >+ return i;
>> >+ }
>> >+ return 0;
>> >+}
>> >+
>> >+/*
>> >+ * Find a module section, or NULL. Fill in number of "objects" in section.
>> >+ * Ignores SHF_ALLOC flag.
>> >+ */
>> >+static __maybe_unused void *any_section_objs(const struct load_info *info,
>> >+ const char *name,
>> >+ size_t object_size,
>> >+ unsigned int *num)
>> >+{
>> >+ unsigned int sec = find_any_sec(info, name);
>> >+
>> >+ /* Section 0 has sh_addr 0 and sh_size 0. */
>> >+ *num = info->sechdrs[sec].sh_size / object_size;
>> >+ return (void *)info->sechdrs[sec].sh_addr;
>> >+}
>> >+
>>
>> Hm, I see this patchset has already been applied to bpf-next, but I
>> guess that doesn't preclude any follow-up patches :-)
>
>Of course!
>
>>
>> I am not a huge fan of the code duplication here, and also the fact
>> that they're only called in one place. any_section_objs() and
>> find_any_sec() are pretty much identical to section_objs() and
>> find_sec(), other than the fact the former drops the SHF_ALLOC check.
>
>Right, but the alternative was to add a new flag to existing
>section_objs() and find_sec() functions, which would cause much more
>code churn for no good reason (besides saving some trivial code
>duplication). And those true/false flags are harder to read in code
>anyways.

That's true, all fair points. I thought there was the possibility to
avoid the code duplication if .BTF were also set to SHF_ALLOC, but I
see for reasons you explained below it is more trouble than it's worth.

>>
>> Moreover, since it appears that the ".BTF" section is not marked
>> SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer
>> after the module is done loading and the module's load_info has been
>> deallocated, since SHF_ALLOC sections are not allocated nor copied to
>> the module's final location in memory.
>
>I can make sure that we also reset the btf_data pointer back to NULL,
>if that's a big concern.

It's not a terribly huge concern, since mod->btf_data is only accessed
in the btf coming notifier at the moment, but it's probably best to at
least not advertise it as a valid pointer anymore after the module is
done loading. We do some pointer and section size cleanup at the end
of do_init_module() for sections that are deallocated at the end of
module load (starting where init_layout.base is reset to NULL),
we could just tack on mod->btf_data = NULL there as well.

>>
>> Why not simply mark the ".BTF" section in the module SHF_ALLOC? We
>> already do some sh_flags rewriting in rewrite_section_headers(). Then
>> the module loader knows to keep the section in memory and you can use
>> section_objs(). And since the .BTF section stays in module memory,
>> that might save you the memcpy() to btf->data in btf_parse_module()
>> (unless that is still needed for some reason).
>
>Wasn't aware about rewrite_section_headers() manipulations. Are you
>suggesting to just add SHF_ALLOC there for the .BTF section from the
>kernel side? I guess that would work, but won't avoid memory copy (so
>actually would waste kernel memory, if I understand correctly). The
>reason being that the module's BTF is registered as an independently
>ref-counted BTF object, which could be held past the kernel module
>being unloaded. So I can't directly reference module's .BTF data
>anyways.

Ah OK, I was not aware that the section could be held past the module
being unloaded. Then yeah, it would be a memory waste to keep them in
memory if they are being memcpy'd anyway. Thanks for clarifying!

Jessica

2020-11-13 18:56:36

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH v4 bpf-next 4/5] bpf: load and verify kernel module BTFs

On Fri, Nov 13, 2020 at 2:32 AM Jessica Yu <[email protected]> wrote:
>
> +++ Andrii Nakryiko [11/11/20 12:11 -0800]:
> >On Wed, Nov 11, 2020 at 2:13 AM Jessica Yu <[email protected]> wrote:
> >>
> >> +++ Andrii Nakryiko [09/11/20 17:19 -0800]:
> >> [snipped]
> >> >diff --git a/kernel/module.c b/kernel/module.c
> >> >index a4fa44a652a7..f2996b02ab2e 100644
> >> >--- a/kernel/module.c
> >> >+++ b/kernel/module.c
> >> >@@ -380,6 +380,35 @@ static void *section_objs(const struct load_info *info,
> >> > return (void *)info->sechdrs[sec].sh_addr;
> >> > }
> >> >
> >> >+/* Find a module section: 0 means not found. Ignores SHF_ALLOC flag. */
> >> >+static unsigned int find_any_sec(const struct load_info *info, const char *name)
> >> >+{
> >> >+ unsigned int i;
> >> >+
> >> >+ for (i = 1; i < info->hdr->e_shnum; i++) {
> >> >+ Elf_Shdr *shdr = &info->sechdrs[i];
> >> >+ if (strcmp(info->secstrings + shdr->sh_name, name) == 0)
> >> >+ return i;
> >> >+ }
> >> >+ return 0;
> >> >+}
> >> >+
> >> >+/*
> >> >+ * Find a module section, or NULL. Fill in number of "objects" in section.
> >> >+ * Ignores SHF_ALLOC flag.
> >> >+ */
> >> >+static __maybe_unused void *any_section_objs(const struct load_info *info,
> >> >+ const char *name,
> >> >+ size_t object_size,
> >> >+ unsigned int *num)
> >> >+{
> >> >+ unsigned int sec = find_any_sec(info, name);
> >> >+
> >> >+ /* Section 0 has sh_addr 0 and sh_size 0. */
> >> >+ *num = info->sechdrs[sec].sh_size / object_size;
> >> >+ return (void *)info->sechdrs[sec].sh_addr;
> >> >+}
> >> >+
> >>
> >> Hm, I see this patchset has already been applied to bpf-next, but I
> >> guess that doesn't preclude any follow-up patches :-)
> >
> >Of course!
> >
> >>
> >> I am not a huge fan of the code duplication here, and also the fact
> >> that they're only called in one place. any_section_objs() and
> >> find_any_sec() are pretty much identical to section_objs() and
> >> find_sec(), other than the fact the former drops the SHF_ALLOC check.
> >
> >Right, but the alternative was to add a new flag to existing
> >section_objs() and find_sec() functions, which would cause much more
> >code churn for no good reason (besides saving some trivial code
> >duplication). And those true/false flags are harder to read in code
> >anyways.
>
> That's true, all fair points. I thought there was the possibility to
> avoid the code duplication if .BTF were also set to SHF_ALLOC, but I
> see for reasons you explained below it is more trouble than it's worth.
>
> >>
> >> Moreover, since it appears that the ".BTF" section is not marked
> >> SHF_ALLOC, I think this will leave mod->btf_data as a dangling pointer
> >> after the module is done loading and the module's load_info has been
> >> deallocated, since SHF_ALLOC sections are not allocated nor copied to
> >> the module's final location in memory.
> >
> >I can make sure that we also reset the btf_data pointer back to NULL,
> >if that's a big concern.
>
> It's not a terribly huge concern, since mod->btf_data is only accessed
> in the btf coming notifier at the moment, but it's probably best to at
> least not advertise it as a valid pointer anymore after the module is
> done loading. We do some pointer and section size cleanup at the end
> of do_init_module() for sections that are deallocated at the end of
> module load (starting where init_layout.base is reset to NULL),
> we could just tack on mod->btf_data = NULL there as well.

Sounds good, I'll send a follow up patch. Thanks!

>
> >>
> >> Why not simply mark the ".BTF" section in the module SHF_ALLOC? We
> >> already do some sh_flags rewriting in rewrite_section_headers(). Then
> >> the module loader knows to keep the section in memory and you can use
> >> section_objs(). And since the .BTF section stays in module memory,
> >> that might save you the memcpy() to btf->data in btf_parse_module()
> >> (unless that is still needed for some reason).
> >
> >Wasn't aware about rewrite_section_headers() manipulations. Are you
> >suggesting to just add SHF_ALLOC there for the .BTF section from the
> >kernel side? I guess that would work, but won't avoid memory copy (so
> >actually would waste kernel memory, if I understand correctly). The
> >reason being that the module's BTF is registered as an independently
> >ref-counted BTF object, which could be held past the kernel module
> >being unloaded. So I can't directly reference module's .BTF data
> >anyways.
>
> Ah OK, I was not aware that the section could be held past the module
> being unloaded. Then yeah, it would be a memory waste to keep them in
> memory if they are being memcpy'd anyway. Thanks for clarifying!
>
> Jessica