2020-07-30 14:08:54

by KP Singh

Subject: [PATCH bpf-next v7 0/7] Generalizing bpf_local_storage

From: KP Singh <[email protected]>


# v6 -> v7

- Updated the series to use Martin's POC patch:

https://lore.kernel.org/bpf/[email protected]/

I added a Co-developed-by: tag but would need Martin's Signed-off-by
(I was not sure of the procedure here).

- Rebase.

# v5 -> v6

- Fixed a build warning.
- Rebase.

# v4 -> v5

- Split non-functional changes into separate commits.
- Updated the cache macros to be simpler.
- Fixed some bugs noticed by Martin.
- Updated the userspace map functions to use an fd for lookups, updates
and deletes.
- Rebase.

# v3 -> v4

- Fixed a missing include of bpf_sk_storage.h in bpf_sk_storage.c.
- Fixed some functions that were not marked as static which led to
W=1 compilation warnings.

# v2 -> v3

* Restructured the code as per Martin's suggestions:
- Common functionality in bpf_local_storage.c
- bpf_sk_storage functionality remains in net/core/bpf_sk_storage.c.
- bpf_inode_storage is kept separate as it is enabled only with
CONFIG_BPF_LSM.
* A separate cache for inode and sk storage with macros to define it.
* Use the ops style approach as suggested by Martin instead of the
enum + switch style.
* Added the inode map to bpftool bash completion and docs.
* Rebase and indentation fixes.

# v1 -> v2

* Use the security blob pointer instead of dedicated member in
struct inode.
* Better code re-use as suggested by Alexei.
* Dropped the inode count arithmetic as pointed out by Alexei.
* Minor bug fixes and rebase.

bpf_sk_storage can already be used by some BPF program types to annotate
socket objects. These annotations are managed with the life-cycle of the
object (i.e. freed when the object is freed), which makes BPF programs
much simpler and less prone to errors and leaks.

This patch series:

* Generalizes the bpf_sk_storage infrastructure to allow easy
implementation of local storage for other objects
* Implements local storage for inodes
* Makes both bpf_{sk, inode}_storage available to LSM programs.

Local storage is safe to use in LSM programs as the attachment sites are
limited and the owning object won't be freed. However, this is not the
case for tracing, where usage is expected to follow a whitelist-based
approach similar to the d_path helper
(https://lore.kernel.org/bpf/[email protected]).

Access to local storage would allow LSM programs to implement stateful
detections, such as detecting the unlink of a running executable, as in
the examples shared as part of the KRSI series:
https://lore.kernel.org/bpf/[email protected]/
and
https://github.com/sinkap/linux-krsi/blob/patch/v1/examples/samples/bpf/lsm_detect_exec_unlink.c
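
As a rough illustration (not part of this series; the map and section
names are made up for the sketch), an LSM program could mark the inode
of an executed binary and reject unlinking it while marked:

#include "vmlinux.h"
#include <errno.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct {
        __uint(type, BPF_MAP_TYPE_INODE_STORAGE);
        __uint(map_flags, BPF_F_NO_PREALLOC);
        __type(key, int);
        __type(value, __u32);
} exec_inodes SEC(".maps");

SEC("lsm/bprm_committed_creds")
int BPF_PROG(mark_exec, struct linux_binprm *bprm)
{
        __u32 *flag;

        /* Create a zero-initialized entry owned by the executable's inode. */
        flag = bpf_inode_storage_get(&exec_inodes, bprm->file->f_inode, 0,
                                     BPF_LOCAL_STORAGE_GET_F_CREATE);
        if (flag)
                *flag = 1;
        return 0;
}

SEC("lsm/inode_unlink")
int BPF_PROG(deny_unlink, struct inode *dir, struct dentry *victim)
{
        __u32 *flag;

        /* Lookup only: no BPF_LOCAL_STORAGE_GET_F_CREATE here. */
        flag = bpf_inode_storage_get(&exec_inodes, victim->d_inode, 0, 0);
        if (flag && *flag)
                return -EPERM;
        return 0;
}

char _license[] SEC("license") = "GPL";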


KP Singh (7):
A purely mechanical change to split the renaming from the actual
generalization.
bpf: Generalize caching for sk_storage.
bpf: Generalize bpf_sk_storage
bpf: Split bpf_local_storage to bpf_sk_storage
bpf: Implement bpf_local_storage for inodes
bpf: Allow local storage to be used from LSM programs
bpf: Add selftests for local_storage

include/linux/bpf.h | 9 +
include/linux/bpf_local_storage.h | 173 ++++
include/linux/bpf_lsm.h | 21 +
include/linux/bpf_types.h | 3 +
include/net/bpf_sk_storage.h | 13 +
include/net/sock.h | 4 +-
include/uapi/linux/bpf.h | 54 +-
kernel/bpf/Makefile | 2 +
kernel/bpf/bpf_inode_storage.c | 254 ++++++
kernel/bpf/bpf_local_storage.c | 600 +++++++++++++
kernel/bpf/bpf_lsm.c | 21 +-
kernel/bpf/syscall.c | 3 +-
kernel/bpf/verifier.c | 10 +
net/core/bpf_sk_storage.c | 827 +++---------------
security/bpf/hooks.c | 7 +
.../bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
tools/bpf/bpftool/bash-completion/bpftool | 3 +-
tools/bpf/bpftool/map.c | 3 +-
tools/include/uapi/linux/bpf.h | 54 +-
tools/lib/bpf/libbpf_probes.c | 5 +-
.../bpf/prog_tests/test_local_storage.c | 60 ++
.../selftests/bpf/progs/local_storage.c | 136 +++
22 files changed, 1553 insertions(+), 711 deletions(-)
create mode 100644 include/linux/bpf_local_storage.h
create mode 100644 kernel/bpf/bpf_inode_storage.c
create mode 100644 kernel/bpf/bpf_local_storage.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_local_storage.c
create mode 100644 tools/testing/selftests/bpf/progs/local_storage.c

--
2.28.0.rc0.142.g3c755180ce-goog


2020-07-30 14:08:56

by KP Singh

Subject: [PATCH bpf-next v7 5/7] bpf: Implement bpf_local_storage for inodes

From: KP Singh <[email protected]>

Similar to bpf_local_storage for sockets, add local storage for inodes.
The life-cycle of the storage is managed with the life-cycle of the
inode, i.e. the storage is destroyed along with the owning inode.

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
security blob, which is now stackable and can co-exist with other LSMs.
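
From userspace, the map is keyed by an open file descriptor rather than
a stable inode identifier: bpf_fd_inode_storage_{lookup,update,delete}_elem
below resolve the fd to its f_inode via fcheck(). A minimal sketch of a
lookup (the pin path and value type are illustrative only):

#include <bpf/bpf.h>
#include <fcntl.h>
#include <stdio.h>
#include <linux/types.h>

int main(void)
{
        __u32 value;
        int map_fd, file_fd, err;

        map_fd = bpf_obj_get("/sys/fs/bpf/exec_inodes"); /* assumed pin path */
        file_fd = open("/usr/bin/true", O_RDONLY);
        if (map_fd < 0 || file_fd < 0)
                return 1;

        /* The "key" is the fd itself; the kernel resolves it to f_inode. */
        err = bpf_map_lookup_elem(map_fd, &file_fd, &value);
        if (!err)
                printf("inode storage value: %u\n", value);
        return err ? 1 : 0;
}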

Signed-off-by: KP Singh <[email protected]>
---
include/linux/bpf_local_storage.h | 10 +
include/linux/bpf_lsm.h | 21 ++
include/linux/bpf_types.h | 3 +
include/uapi/linux/bpf.h | 38 +++
kernel/bpf/Makefile | 1 +
kernel/bpf/bpf_inode_storage.c | 254 ++++++++++++++++++
kernel/bpf/syscall.c | 3 +-
kernel/bpf/verifier.c | 10 +
security/bpf/hooks.c | 7 +
.../bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
tools/bpf/bpftool/bash-completion/bpftool | 3 +-
tools/bpf/bpftool/map.c | 3 +-
tools/include/uapi/linux/bpf.h | 38 +++
tools/lib/bpf/libbpf_probes.c | 5 +-
14 files changed, 392 insertions(+), 6 deletions(-)
create mode 100644 kernel/bpf/bpf_inode_storage.c

diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index 3685f681a7fc..75c76847fad5 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -160,4 +160,14 @@ struct bpf_local_storage_data *
bpf_local_storage_update(void *owner, struct bpf_map *map, void *value,
u64 map_flags);

+#ifdef CONFIG_BPF_LSM
+extern const struct bpf_func_proto bpf_inode_storage_get_proto;
+extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
+void bpf_inode_storage_free(struct inode *inode);
+#else
+static inline void bpf_inode_storage_free(struct inode *inode)
+{
+}
+#endif /* CONFIG_BPF_LSM */
+
#endif /* _BPF_LOCAL_STORAGE_H */
diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
index af74712af585..d0683ada1e49 100644
--- a/include/linux/bpf_lsm.h
+++ b/include/linux/bpf_lsm.h
@@ -17,9 +17,24 @@
#include <linux/lsm_hook_defs.h>
#undef LSM_HOOK

+struct bpf_storage_blob {
+ struct bpf_local_storage __rcu *storage;
+};
+
+extern struct lsm_blob_sizes bpf_lsm_blob_sizes;
+
int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
const struct bpf_prog *prog);

+static inline struct bpf_storage_blob *bpf_inode(
+ const struct inode *inode)
+{
+ if (unlikely(!inode->i_security))
+ return NULL;
+
+ return inode->i_security + bpf_lsm_blob_sizes.lbs_inode;
+}
+
#else /* !CONFIG_BPF_LSM */

static inline int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
@@ -28,6 +43,12 @@ static inline int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
return -EOPNOTSUPP;
}

+static inline struct bpf_storage_blob *bpf_inode(
+ const struct inode *inode)
+{
+ return NULL;
+}
+
#endif /* CONFIG_BPF_LSM */

#endif /* _LINUX_BPF_LSM_H */
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index a52a5688418e..2e6f568377f1 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -107,6 +107,9 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_SK_STORAGE, sk_storage_map_ops)
BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKMAP, sock_map_ops)
BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKHASH, sock_hash_ops)
#endif
+#ifdef CONFIG_BPF_LSM
+BPF_MAP_TYPE(BPF_MAP_TYPE_INODE_STORAGE, inode_storage_map_ops)
+#endif
BPF_MAP_TYPE(BPF_MAP_TYPE_CPUMAP, cpu_map_ops)
#if defined(CONFIG_XDP_SOCKETS)
BPF_MAP_TYPE(BPF_MAP_TYPE_XSKMAP, xsk_map_ops)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index cd5a47a74533..d3617f1ce3fc 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -148,6 +148,7 @@ enum bpf_map_type {
BPF_MAP_TYPE_DEVMAP_HASH,
BPF_MAP_TYPE_STRUCT_OPS,
BPF_MAP_TYPE_RINGBUF,
+ BPF_MAP_TYPE_INODE_STORAGE,
};

/* Note that tracing related programs such as
@@ -3389,6 +3390,41 @@ union bpf_attr {
* A non-negative value equal to or less than *size* on success,
* or a negative error in case of failure.
*
+ * void *bpf_inode_storage_get(struct bpf_map *map, void *inode, void *value, u64 flags)
+ * Description
+ * Get a bpf_local_storage from an *inode*.
+ *
+ * Logically, it could be thought of as getting the value from
+ * a *map* with *inode* as the **key**. From this
+ * perspective, the usage is not much different from
+ * **bpf_map_lookup_elem**\ (*map*, **&**\ *inode*) except that this
+ * helper enforces that the key must be an inode and the map must
+ * be a **BPF_MAP_TYPE_INODE_STORAGE**.
+ *
+ * Underneath, the value is stored locally at *inode* instead of
+ * the *map*. The *map* is used as the bpf-local-storage
+ * "type". The bpf-local-storage "type" (i.e. the *map*) is
+ * searched against all bpf_local_storage residing at *inode*.
+ *
+ * An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be
+ * used such that a new bpf_local_storage will be
+ * created if one does not exist. *value* can be used
+ * together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify
+ * the initial value of a bpf_local_storage. If *value* is
+ * **NULL**, the new bpf_local_storage will be zero initialized.
+ * Return
+ * A bpf_local_storage pointer is returned on success.
+ *
+ * **NULL** if not found or there was an error in adding
+ * a new bpf_local_storage.
+ *
+ * int bpf_inode_storage_delete(struct bpf_map *map, void *inode)
+ * Description
+ * Delete a bpf_local_storage from an *inode*.
+ * Return
+ * 0 on success.
+ *
+ * **-ENOENT** if the bpf_local_storage cannot be found.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -3533,6 +3569,8 @@ union bpf_attr {
FN(skc_to_tcp_request_sock), \
FN(skc_to_udp6_sock), \
FN(get_task_stack), \
+ FN(inode_storage_get), \
+ FN(inode_storage_delete), \
/* */

/* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index a9a147e18600..25f40588dfbf 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -5,6 +5,7 @@ CFLAGS_core.o += $(call cc-disable-warning, override-init)
obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o bpf_iter.o map_iter.o task_iter.o prog_iter.o
obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o
obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
+obj-$(CONFIG_BPF_LSM) += bpf_inode_storage.o
obj-$(CONFIG_BPF_SYSCALL) += disasm.o
obj-$(CONFIG_BPF_JIT) += trampoline.o
obj-$(CONFIG_BPF_SYSCALL) += btf.o
diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c
new file mode 100644
index 000000000000..16c133f36d7e
--- /dev/null
+++ b/kernel/bpf/bpf_inode_storage.c
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2019 Facebook
+ * Copyright 2020 Google LLC.
+ */
+
+#include <linux/rculist.h>
+#include <linux/list.h>
+#include <linux/hash.h>
+#include <linux/types.h>
+#include <linux/spinlock.h>
+#include <linux/bpf.h>
+#include <linux/bpf_local_storage.h>
+#include <net/sock.h>
+#include <uapi/linux/sock_diag.h>
+#include <uapi/linux/btf.h>
+#include <linux/bpf_lsm.h>
+#include <linux/btf_ids.h>
+#include <linux/fdtable.h>
+
+DEFINE_BPF_STORAGE_CACHE(inode_cache);
+
+static struct bpf_local_storage __rcu **
+inode_storage_ptr(void *owner)
+{
+ struct inode *inode = owner;
+ struct bpf_storage_blob *bsb;
+ bsb = bpf_inode(inode);
+ if (!bsb)
+ return NULL;
+ return &bsb->storage;
+}
+
+static struct bpf_local_storage_data *inode_storage_lookup(struct inode *inode,
+ struct bpf_map *map,
+ bool cacheit_lockit)
+{
+ struct bpf_local_storage *inode_storage;
+ struct bpf_local_storage_map *smap;
+ struct bpf_storage_blob *bsb;
+
+ bsb = bpf_inode(inode);
+ if (!bsb)
+ return ERR_PTR(-ENOENT);
+
+ inode_storage = rcu_dereference(bsb->storage);
+ if (!inode_storage)
+ return NULL;
+
+ smap = (struct bpf_local_storage_map *)map;
+ return bpf_local_storage_lookup(inode_storage, smap, cacheit_lockit);
+}
+
+void bpf_inode_storage_free(struct inode *inode)
+{
+ struct bpf_local_storage_elem *selem;
+ struct bpf_local_storage *local_storage;
+ bool free_inode_storage = false;
+ struct bpf_storage_blob *bsb;
+ struct hlist_node *n;
+
+ bsb = bpf_inode(inode);
+ if (!bsb)
+ return;
+
+ rcu_read_lock();
+
+ local_storage = rcu_dereference(bsb->storage);
+ if (!local_storage) {
+ rcu_read_unlock();
+ return;
+ }
+
+ /* Neither the bpf_prog nor the bpf-map's syscall
+ * could be modifying the local_storage->list now.
+ * Thus, no elem can be added-to or deleted-from the
+ * local_storage->list by the bpf_prog or by the bpf-map's syscall.
+ *
+ * It is racing with bpf_local_storage_map_free() alone
+ * when unlinking elem from the local_storage->list and
+ * the map's bucket->list.
+ */
+ raw_spin_lock_bh(&local_storage->lock);
+ hlist_for_each_entry_safe(selem, n, &local_storage->list, snode) {
+ /* Always unlink from map before unlinking from
+ * local_storage.
+ */
+ bpf_selem_unlink_map(selem);
+ free_inode_storage =
+ bpf_selem_unlink_storage(local_storage, selem, false);
+ }
+ raw_spin_unlock_bh(&local_storage->lock);
+ rcu_read_unlock();
+
+ /* free_inode_storage should always be true as long as
+ * local_storage->list was non-empty.
+ */
+ if (free_inode_storage)
+ kfree_rcu(local_storage, rcu);
+}
+
+
+static void *bpf_fd_inode_storage_lookup_elem(struct bpf_map *map, void *key)
+{
+ struct bpf_local_storage_data *sdata;
+ struct file *f;
+ int fd;
+
+ fd = *(int *)key;
+ f = fcheck(fd);
+ if (!f)
+ return ERR_PTR(-EINVAL);
+
+ sdata = inode_storage_lookup(f->f_inode, map, true);
+ return sdata ? sdata->data : NULL;
+}
+
+static int bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key,
+ void *value, u64 map_flags)
+{
+ struct bpf_local_storage_data *sdata;
+ struct file *f;
+ int fd;
+
+ fd = *(int *)key;
+ f = fcheck(fd);
+ if (!f)
+ return -EINVAL;
+
+ sdata = bpf_local_storage_update(f->f_inode, map, value, map_flags);
+ return PTR_ERR_OR_ZERO(sdata);
+}
+
+static int inode_storage_delete(struct inode *inode, struct bpf_map *map)
+{
+ struct bpf_local_storage_data *sdata;
+
+ sdata = inode_storage_lookup(inode, map, false);
+ if (!sdata)
+ return -ENOENT;
+
+ bpf_selem_unlink(SELEM(sdata));
+
+ return 0;
+}
+
+static int bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key)
+{
+ struct file *f;
+ int fd;
+
+ fd = *(int *)key;
+ f = fcheck(fd);
+ if (!f)
+ return -EINVAL;
+
+ return inode_storage_delete(f->f_inode, map);
+}
+
+BPF_CALL_4(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode,
+ void *, value, u64, flags)
+{
+ struct bpf_local_storage_data *sdata;
+
+ if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE))
+ return (unsigned long)NULL;
+
+ sdata = inode_storage_lookup(inode, map, true);
+ if (sdata)
+ return (unsigned long)sdata->data;
+
+ if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) {
+ sdata = bpf_local_storage_update(inode, map, value,
+ BPF_NOEXIST);
+ return IS_ERR(sdata) ? (unsigned long)NULL :
+ (unsigned long)sdata->data;
+ }
+
+ return (unsigned long)NULL;
+}
+
+BPF_CALL_2(bpf_inode_storage_delete,
+ struct bpf_map *, map, struct inode *, inode)
+{
+ return inode_storage_delete(inode, map);
+}
+
+static int notsupp_get_next_key(struct bpf_map *map, void *key,
+ void *next_key)
+{
+ return -ENOTSUPP;
+}
+
+static struct bpf_map *inode_storage_map_alloc(union bpf_attr *attr)
+{
+ struct bpf_local_storage_map *smap;
+
+ smap = bpf_local_storage_map_alloc(attr);
+ if (IS_ERR(smap))
+ return ERR_CAST(smap);
+
+ smap->cache_idx = bpf_local_storage_cache_idx_get(&inode_cache);
+ return &smap->map;
+}
+
+static void inode_storage_map_free(struct bpf_map *map)
+{
+ struct bpf_local_storage_map *smap;
+
+ smap = (struct bpf_local_storage_map *)map;
+ bpf_local_storage_cache_idx_free(&inode_cache, smap->cache_idx);
+ bpf_local_storage_map_free(smap);
+}
+
+static int inode_storage_map_btf_id;
+const struct bpf_map_ops inode_storage_map_ops = {
+ .map_alloc_check = bpf_local_storage_map_alloc_check,
+ .map_alloc = inode_storage_map_alloc,
+ .map_free = inode_storage_map_free,
+ .map_get_next_key = notsupp_get_next_key,
+ .map_lookup_elem = bpf_fd_inode_storage_lookup_elem,
+ .map_update_elem = bpf_fd_inode_storage_update_elem,
+ .map_delete_elem = bpf_fd_inode_storage_delete_elem,
+ .map_check_btf = bpf_local_storage_map_check_btf,
+ .map_btf_name = "bpf_local_storage_map",
+ .map_btf_id = &inode_storage_map_btf_id,
+ .map_owner_storage_ptr = inode_storage_ptr,
+};
+
+BTF_ID_LIST(bpf_inode_storage_get_btf_ids)
+BTF_ID(struct, inode)
+
+const struct bpf_func_proto bpf_inode_storage_get_proto = {
+ .func = bpf_inode_storage_get,
+ .gpl_only = false,
+ .ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
+ .arg1_type = ARG_CONST_MAP_PTR,
+ .arg2_type = ARG_PTR_TO_BTF_ID,
+ .arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL,
+ .arg4_type = ARG_ANYTHING,
+ .btf_id = bpf_inode_storage_get_btf_ids,
+};
+
+BTF_ID_LIST(bpf_inode_storage_delete_btf_ids)
+BTF_ID(struct, inode)
+
+const struct bpf_func_proto bpf_inode_storage_delete_proto = {
+ .func = bpf_inode_storage_delete,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_CONST_MAP_PTR,
+ .arg2_type = ARG_PTR_TO_BTF_ID,
+ .btf_id = bpf_inode_storage_delete_btf_ids,
+};
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index cd3d599e9e90..78795cd50b45 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -768,7 +768,8 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
if (map->map_type != BPF_MAP_TYPE_HASH &&
map->map_type != BPF_MAP_TYPE_ARRAY &&
map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
- map->map_type != BPF_MAP_TYPE_SK_STORAGE)
+ map->map_type != BPF_MAP_TYPE_SK_STORAGE &&
+ map->map_type != BPF_MAP_TYPE_INODE_STORAGE)
return -ENOTSUPP;
if (map->spin_lock_off + sizeof(struct bpf_spin_lock) >
map->value_size) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b6ccfce3bf4c..c48378afbd95 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4242,6 +4242,11 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
func_id != BPF_FUNC_sk_storage_delete)
goto error;
break;
+ case BPF_MAP_TYPE_INODE_STORAGE:
+ if (func_id != BPF_FUNC_inode_storage_get &&
+ func_id != BPF_FUNC_inode_storage_delete)
+ goto error;
+ break;
default:
break;
}
@@ -4315,6 +4320,11 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
if (map->map_type != BPF_MAP_TYPE_SK_STORAGE)
goto error;
break;
+ case BPF_FUNC_inode_storage_get:
+ case BPF_FUNC_inode_storage_delete:
+ if (map->map_type != BPF_MAP_TYPE_INODE_STORAGE)
+ goto error;
+ break;
default:
break;
}
diff --git a/security/bpf/hooks.c b/security/bpf/hooks.c
index 32d32d485451..35f9b19259e5 100644
--- a/security/bpf/hooks.c
+++ b/security/bpf/hooks.c
@@ -3,6 +3,7 @@
/*
* Copyright (C) 2020 Google LLC.
*/
+#include <linux/bpf_local_storage.h>
#include <linux/lsm_hooks.h>
#include <linux/bpf_lsm.h>

@@ -11,6 +12,7 @@ static struct security_hook_list bpf_lsm_hooks[] __lsm_ro_after_init = {
LSM_HOOK_INIT(NAME, bpf_lsm_##NAME),
#include <linux/lsm_hook_defs.h>
#undef LSM_HOOK
+ LSM_HOOK_INIT(inode_free_security, bpf_inode_storage_free),
};

static int __init bpf_lsm_init(void)
@@ -20,7 +22,12 @@ static int __init bpf_lsm_init(void)
return 0;
}

+struct lsm_blob_sizes bpf_lsm_blob_sizes __lsm_ro_after_init = {
+ .lbs_inode = sizeof(struct bpf_storage_blob),
+};
+
DEFINE_LSM(bpf) = {
.name = "bpf",
.init = bpf_lsm_init,
+ .blobs = &bpf_lsm_blob_sizes
};
diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst
index 41e2a74252d0..083db6c2fc67 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-map.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst
@@ -49,7 +49,7 @@ MAP COMMANDS
| | **lru_percpu_hash** | **lpm_trie** | **array_of_maps** | **hash_of_maps**
| | **devmap** | **devmap_hash** | **sockmap** | **cpumap** | **xskmap** | **sockhash**
| | **cgroup_storage** | **reuseport_sockarray** | **percpu_cgroup_storage**
-| | **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** }
+| | **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** | **inode_storage** }

DESCRIPTION
===========
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index 257fa310ea2b..45a93ed5ed89 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -704,7 +704,8 @@ _bpftool()
lru_percpu_hash lpm_trie array_of_maps \
hash_of_maps devmap devmap_hash sockmap cpumap \
xskmap sockhash cgroup_storage reuseport_sockarray \
- percpu_cgroup_storage queue stack' -- \
+ percpu_cgroup_storage queue stack sk_storage \
+ struct_ops inode_storage' -- \
"$cur" ) )
return 0
;;
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index 3a27d31a1856..bc0071228f88 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -50,6 +50,7 @@ const char * const map_type_name[] = {
[BPF_MAP_TYPE_SK_STORAGE] = "sk_storage",
[BPF_MAP_TYPE_STRUCT_OPS] = "struct_ops",
[BPF_MAP_TYPE_RINGBUF] = "ringbuf",
+ [BPF_MAP_TYPE_INODE_STORAGE] = "inode_storage",
};

const size_t map_type_name_size = ARRAY_SIZE(map_type_name);
@@ -1442,7 +1443,7 @@ static int do_help(int argc, char **argv)
" lru_percpu_hash | lpm_trie | array_of_maps | hash_of_maps |\n"
" devmap | devmap_hash | sockmap | cpumap | xskmap | sockhash |\n"
" cgroup_storage | reuseport_sockarray | percpu_cgroup_storage |\n"
- " queue | stack | sk_storage | struct_ops | ringbuf }\n"
+ " queue | stack | sk_storage | struct_ops | ringbuf | inode_storage }\n"
" " HELP_SPEC_OPTIONS "\n"
"",
bin_name, argv[-2]);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index cd5a47a74533..d3617f1ce3fc 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -148,6 +148,7 @@ enum bpf_map_type {
BPF_MAP_TYPE_DEVMAP_HASH,
BPF_MAP_TYPE_STRUCT_OPS,
BPF_MAP_TYPE_RINGBUF,
+ BPF_MAP_TYPE_INODE_STORAGE,
};

/* Note that tracing related programs such as
@@ -3389,6 +3390,41 @@ union bpf_attr {
* A non-negative value equal to or less than *size* on success,
* or a negative error in case of failure.
*
+ * void *bpf_inode_storage_get(struct bpf_map *map, void *inode, void *value, u64 flags)
+ * Description
+ * Get a bpf_local_storage from an *inode*.
+ *
+ * Logically, it could be thought of as getting the value from
+ * a *map* with *inode* as the **key**. From this
+ * perspective, the usage is not much different from
+ * **bpf_map_lookup_elem**\ (*map*, **&**\ *inode*) except that this
+ * helper enforces that the key must be an inode and the map must
+ * be a **BPF_MAP_TYPE_INODE_STORAGE**.
+ *
+ * Underneath, the value is stored locally at *inode* instead of
+ * the *map*. The *map* is used as the bpf-local-storage
+ * "type". The bpf-local-storage "type" (i.e. the *map*) is
+ * searched against all bpf_local_storage residing at *inode*.
+ *
+ * An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be
+ * used such that a new bpf_local_storage will be
+ * created if one does not exist. *value* can be used
+ * together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify
+ * the initial value of a bpf_local_storage. If *value* is
+ * **NULL**, the new bpf_local_storage will be zero initialized.
+ * Return
+ * A bpf_local_storage pointer is returned on success.
+ *
+ * **NULL** if not found or there was an error in adding
+ * a new bpf_local_storage.
+ *
+ * int bpf_inode_storage_delete(struct bpf_map *map, void *inode)
+ * Description
+ * Delete a bpf_local_storage from an *inode*.
+ * Return
+ * 0 on success.
+ *
+ * **-ENOENT** if the bpf_local_storage cannot be found.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -3533,6 +3569,8 @@ union bpf_attr {
FN(skc_to_tcp_request_sock), \
FN(skc_to_udp6_sock), \
FN(get_task_stack), \
+ FN(inode_storage_get), \
+ FN(inode_storage_delete), \
/* */

/* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index 5a3d3f078408..daaad635d0ed 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -173,7 +173,7 @@ int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
return btf_fd;
}

-static int load_sk_storage_btf(void)
+static int load_local_storage_btf(void)
{
const char strs[] = "\0bpf_spin_lock\0val\0cnt\0l";
/* struct bpf_spin_lock {
@@ -232,12 +232,13 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
key_size = 0;
break;
case BPF_MAP_TYPE_SK_STORAGE:
+ case BPF_MAP_TYPE_INODE_STORAGE:
btf_key_type_id = 1;
btf_value_type_id = 3;
value_size = 8;
max_entries = 0;
map_flags = BPF_F_NO_PREALLOC;
- btf_fd = load_sk_storage_btf();
+ btf_fd = load_local_storage_btf();
if (btf_fd < 0)
return false;
break;
--
2.28.0.rc0.142.g3c755180ce-goog

2020-07-30 14:11:23

by KP Singh

Subject: [PATCH bpf-next v7 1/7] A purely mechanical change to split the renaming from the actual generalization.

From: KP Singh <[email protected]>

Flags/consts:

SK_STORAGE_CREATE_FLAG_MASK -> BPF_LOCAL_STORAGE_CREATE_FLAG_MASK
BPF_SK_STORAGE_CACHE_SIZE -> BPF_LOCAL_STORAGE_CACHE_SIZE
MAX_VALUE_SIZE -> BPF_LOCAL_STORAGE_MAX_VALUE_SIZE

Structs:

bucket -> bpf_local_storage_map_bucket
bpf_sk_storage_map -> bpf_local_storage_map
bpf_sk_storage_data -> bpf_local_storage_data
bpf_sk_storage_elem -> bpf_local_storage_elem
bpf_sk_storage -> bpf_local_storage

The "sk" member in bpf_local_storage is also updated to "owner"
in preparation for changing the type to void * in a subsequent patch.

Functions:

selem_linked_to_sk -> selem_linked_to_storage
selem_alloc -> bpf_selem_alloc
__selem_unlink_sk -> bpf_selem_unlink_storage
__selem_link_sk -> bpf_selem_link_storage
selem_unlink_sk -> __bpf_selem_unlink_storage
sk_storage_update -> bpf_local_storage_update
__sk_storage_lookup -> bpf_local_storage_lookup
bpf_sk_storage_map_free -> bpf_local_storage_map_free
bpf_sk_storage_map_alloc -> bpf_local_storage_map_alloc
bpf_sk_storage_map_alloc_check -> bpf_local_storage_map_alloc_check
bpf_sk_storage_map_check_btf -> bpf_local_storage_map_check_btf

Signed-off-by: KP Singh <[email protected]>
---
include/net/sock.h | 4 +-
net/core/bpf_sk_storage.c | 455 +++++++++++++++++++-------------------
2 files changed, 233 insertions(+), 226 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 2cc3ba667908..685aee71b91a 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -246,7 +246,7 @@ struct sock_common {
/* public: */
};

-struct bpf_sk_storage;
+struct bpf_local_storage;

/**
* struct sock - network layer representation of sockets
@@ -517,7 +517,7 @@ struct sock {
void (*sk_destruct)(struct sock *sk);
struct sock_reuseport __rcu *sk_reuseport_cb;
#ifdef CONFIG_BPF_SYSCALL
- struct bpf_sk_storage __rcu *sk_bpf_storage;
+ struct bpf_local_storage __rcu *sk_bpf_storage;
#endif
struct rcu_head sk_rcu;
};
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index d3377c90a291..a5cc218834ee 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -12,33 +12,32 @@
#include <uapi/linux/sock_diag.h>
#include <uapi/linux/btf.h>

-#define SK_STORAGE_CREATE_FLAG_MASK \
- (BPF_F_NO_PREALLOC | BPF_F_CLONE)
+#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK (BPF_F_NO_PREALLOC | BPF_F_CLONE)

-struct bucket {
+struct bpf_local_storage_map_bucket {
struct hlist_head list;
raw_spinlock_t lock;
};

-/* Thp map is not the primary owner of a bpf_sk_storage_elem.
- * Instead, the sk->sk_bpf_storage is.
+/* The map is not the primary owner of a bpf_local_storage_elem.
+ * Instead, the container object (e.g. sk->sk_bpf_storage) is.
*
- * The map (bpf_sk_storage_map) is for two purposes
- * 1. Define the size of the "sk local storage". It is
+ * The map (bpf_local_storage_map) is for two purposes
+ * 1. Define the size of the "local storage". It is
* the map's value_size.
*
* 2. Maintain a list to keep track of all elems such
* that they can be cleaned up during the map destruction.
*
* When a bpf local storage is being looked up for a
- * particular sk, the "bpf_map" pointer is actually used
+ * particular object, the "bpf_map" pointer is actually used
* as the "key" to search in the list of elem in
- * sk->sk_bpf_storage.
+ * the respective bpf_local_storage owned by the object.
*
- * Hence, consider sk->sk_bpf_storage is the mini-map
- * with the "bpf_map" pointer as the searching key.
+ * e.g. sk->sk_bpf_storage is the mini-map with the "bpf_map" pointer
+ * as the searching key.
*/
-struct bpf_sk_storage_map {
+struct bpf_local_storage_map {
struct bpf_map map;
/* Lookup elem does not require accessing the map.
*
@@ -46,55 +45,57 @@ struct bpf_sk_storage_map {
* link/unlink the elem from the map. Having
* multiple buckets to improve contention.
*/
- struct bucket *buckets;
+ struct bpf_local_storage_map_bucket *buckets;
u32 bucket_log;
u16 elem_size;
u16 cache_idx;
};

-struct bpf_sk_storage_data {
+struct bpf_local_storage_data {
/* smap is used as the searching key when looking up
- * from sk->sk_bpf_storage.
+ * from the object's bpf_local_storage.
*
* Put it in the same cacheline as the data to minimize
* the number of cachelines access during the cache hit case.
*/
- struct bpf_sk_storage_map __rcu *smap;
+ struct bpf_local_storage_map __rcu *smap;
u8 data[] __aligned(8);
};

-/* Linked to bpf_sk_storage and bpf_sk_storage_map */
-struct bpf_sk_storage_elem {
- struct hlist_node map_node; /* Linked to bpf_sk_storage_map */
- struct hlist_node snode; /* Linked to bpf_sk_storage */
- struct bpf_sk_storage __rcu *sk_storage;
+/* Linked to bpf_local_storage and bpf_local_storage_map */
+struct bpf_local_storage_elem {
+ struct hlist_node map_node; /* Linked to bpf_local_storage_map */
+ struct hlist_node snode; /* Linked to bpf_local_storage */
+ struct bpf_local_storage __rcu *local_storage;
struct rcu_head rcu;
/* 8 bytes hole */
* The data is stored in another cacheline to minimize
* the number of cachelines access during a cache hit.
*/
- struct bpf_sk_storage_data sdata ____cacheline_aligned;
+ struct bpf_local_storage_data sdata ____cacheline_aligned;
};

-#define SELEM(_SDATA) container_of((_SDATA), struct bpf_sk_storage_elem, sdata)
+#define SELEM(_SDATA) \
+ container_of((_SDATA), struct bpf_local_storage_elem, sdata)
#define SDATA(_SELEM) (&(_SELEM)->sdata)
-#define BPF_SK_STORAGE_CACHE_SIZE 16
+#define BPF_LOCAL_STORAGE_CACHE_SIZE 16

static DEFINE_SPINLOCK(cache_idx_lock);
-static u64 cache_idx_usage_counts[BPF_SK_STORAGE_CACHE_SIZE];
+static u64 cache_idx_usage_counts[BPF_LOCAL_STORAGE_CACHE_SIZE];

-struct bpf_sk_storage {
- struct bpf_sk_storage_data __rcu *cache[BPF_SK_STORAGE_CACHE_SIZE];
- struct hlist_head list; /* List of bpf_sk_storage_elem */
- struct sock *sk; /* The sk that owns the the above "list" of
- * bpf_sk_storage_elem.
+struct bpf_local_storage {
+ struct bpf_local_storage_data __rcu *cache[BPF_LOCAL_STORAGE_CACHE_SIZE];
+ struct hlist_head list; /* List of bpf_local_storage_elem */
+ struct sock *owner; /* The object that owns the above "list" of
+ * bpf_local_storage_elem.
*/
struct rcu_head rcu;
raw_spinlock_t lock; /* Protect adding/removing from the "list" */
};

-static struct bucket *select_bucket(struct bpf_sk_storage_map *smap,
- struct bpf_sk_storage_elem *selem)
+static struct bpf_local_storage_map_bucket *
+select_bucket(struct bpf_local_storage_map *smap,
+ struct bpf_local_storage_elem *selem)
{
return &smap->buckets[hash_ptr(selem, smap->bucket_log)];
}
@@ -111,21 +112,21 @@ static int omem_charge(struct sock *sk, unsigned int size)
return -ENOMEM;
}

-static bool selem_linked_to_sk(const struct bpf_sk_storage_elem *selem)
+static bool selem_linked_to_storage(const struct bpf_local_storage_elem *selem)
{
return !hlist_unhashed(&selem->snode);
}

-static bool selem_linked_to_map(const struct bpf_sk_storage_elem *selem)
+static bool selem_linked_to_map(const struct bpf_local_storage_elem *selem)
{
return !hlist_unhashed(&selem->map_node);
}

-static struct bpf_sk_storage_elem *selem_alloc(struct bpf_sk_storage_map *smap,
- struct sock *sk, void *value,
- bool charge_omem)
+static struct bpf_local_storage_elem *
+bpf_selem_alloc(struct bpf_local_storage_map *smap, struct sock *sk,
+ void *value, bool charge_omem)
{
- struct bpf_sk_storage_elem *selem;
+ struct bpf_local_storage_elem *selem;

if (charge_omem && omem_charge(sk, smap->elem_size))
return NULL;
@@ -147,85 +148,86 @@ static struct bpf_sk_storage_elem *selem_alloc(struct bpf_sk_storage_map *smap,
* The caller must ensure selem->smap is still valid to be
* dereferenced for its smap->elem_size and smap->cache_idx.
*/
-static bool __selem_unlink_sk(struct bpf_sk_storage *sk_storage,
- struct bpf_sk_storage_elem *selem,
- bool uncharge_omem)
+static bool bpf_selem_unlink_storage(struct bpf_local_storage *local_storage,
+ struct bpf_local_storage_elem *selem,
+ bool uncharge_omem)
{
- struct bpf_sk_storage_map *smap;
- bool free_sk_storage;
+ struct bpf_local_storage_map *smap;
+ bool free_local_storage;
struct sock *sk;

smap = rcu_dereference(SDATA(selem)->smap);
- sk = sk_storage->sk;
+ sk = local_storage->owner;

/* All uncharging on sk->sk_omem_alloc must be done first.
- * sk may be freed once the last selem is unlinked from sk_storage.
+ * sk may be freed once the last selem is unlinked from local_storage.
*/
if (uncharge_omem)
atomic_sub(smap->elem_size, &sk->sk_omem_alloc);

- free_sk_storage = hlist_is_singular_node(&selem->snode,
- &sk_storage->list);
- if (free_sk_storage) {
- atomic_sub(sizeof(struct bpf_sk_storage), &sk->sk_omem_alloc);
- sk_storage->sk = NULL;
+ free_local_storage = hlist_is_singular_node(&selem->snode,
+ &local_storage->list);
+ if (free_local_storage) {
+ atomic_sub(sizeof(struct bpf_local_storage), &sk->sk_omem_alloc);
+ local_storage->owner = NULL;
/* After this RCU_INIT, sk may be freed and cannot be used */
RCU_INIT_POINTER(sk->sk_bpf_storage, NULL);

- /* sk_storage is not freed now. sk_storage->lock is
- * still held and raw_spin_unlock_bh(&sk_storage->lock)
+ /* local_storage is not freed now. local_storage->lock is
+ * still held and raw_spin_unlock_bh(&local_storage->lock)
* will be done by the caller.
*
* Although the unlock will be done under
* rcu_read_lock(), it is more intuitive to
- * read if kfree_rcu(sk_storage, rcu) is done
- * after the raw_spin_unlock_bh(&sk_storage->lock).
+ * read if kfree_rcu(local_storage, rcu) is done
+ * after the raw_spin_unlock_bh(&local_storage->lock).
*
- * Hence, a "bool free_sk_storage" is returned
+ * Hence, a "bool free_local_storage" is returned
* to the caller which then calls the kfree_rcu()
* after unlock.
*/
}
hlist_del_init_rcu(&selem->snode);
- if (rcu_access_pointer(sk_storage->cache[smap->cache_idx]) ==
+ if (rcu_access_pointer(local_storage->cache[smap->cache_idx]) ==
SDATA(selem))
- RCU_INIT_POINTER(sk_storage->cache[smap->cache_idx], NULL);
+ RCU_INIT_POINTER(local_storage->cache[smap->cache_idx], NULL);

kfree_rcu(selem, rcu);

- return free_sk_storage;
+ return free_local_storage;
}

-static void selem_unlink_sk(struct bpf_sk_storage_elem *selem)
+static void __bpf_selem_unlink_storage(struct bpf_local_storage_elem *selem)
{
- struct bpf_sk_storage *sk_storage;
- bool free_sk_storage = false;
+ struct bpf_local_storage *local_storage;
+ bool free_local_storage = false;

- if (unlikely(!selem_linked_to_sk(selem)))
+ if (unlikely(!selem_linked_to_storage(selem)))
/* selem has already been unlinked from sk */
return;

- sk_storage = rcu_dereference(selem->sk_storage);
- raw_spin_lock_bh(&sk_storage->lock);
- if (likely(selem_linked_to_sk(selem)))
- free_sk_storage = __selem_unlink_sk(sk_storage, selem, true);
- raw_spin_unlock_bh(&sk_storage->lock);
+ local_storage = rcu_dereference(selem->local_storage);
+ raw_spin_lock_bh(&local_storage->lock);
+ if (likely(selem_linked_to_storage(selem)))
+ free_local_storage =
+ bpf_selem_unlink_storage(local_storage, selem, true);
+ raw_spin_unlock_bh(&local_storage->lock);

- if (free_sk_storage)
- kfree_rcu(sk_storage, rcu);
+ if (free_local_storage)
+ kfree_rcu(local_storage, rcu);
}

-static void __selem_link_sk(struct bpf_sk_storage *sk_storage,
- struct bpf_sk_storage_elem *selem)
+static void bpf_selem_link_storage(struct bpf_local_storage *local_storage,
+ struct bpf_local_storage_elem *selem)
{
- RCU_INIT_POINTER(selem->sk_storage, sk_storage);
- hlist_add_head(&selem->snode, &sk_storage->list);
+ RCU_INIT_POINTER(selem->local_storage, local_storage);
+ hlist_add_head(&selem->snode, &local_storage->list);
}

-static void selem_unlink_map(struct bpf_sk_storage_elem *selem)
+static void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
{
- struct bpf_sk_storage_map *smap;
- struct bucket *b;
+ struct bpf_local_storage_map *smap;
+ struct bpf_local_storage_map_bucket *b;

if (unlikely(!selem_linked_to_map(selem)))
/* selem has already been unlinked from smap */
@@ -239,10 +241,10 @@ static void selem_unlink_map(struct bpf_sk_storage_elem *selem)
raw_spin_unlock_bh(&b->lock);
}

-static void selem_link_map(struct bpf_sk_storage_map *smap,
- struct bpf_sk_storage_elem *selem)
+static void bpf_selem_link_map(struct bpf_local_storage_map *smap,
+ struct bpf_local_storage_elem *selem)
{
- struct bucket *b = select_bucket(smap, selem);
+ struct bpf_local_storage_map_bucket *b = select_bucket(smap, selem);

raw_spin_lock_bh(&b->lock);
RCU_INIT_POINTER(SDATA(selem)->smap, smap);
@@ -250,31 +252,31 @@ static void selem_link_map(struct bpf_sk_storage_map *smap,
raw_spin_unlock_bh(&b->lock);
}

-static void selem_unlink(struct bpf_sk_storage_elem *selem)
+static void bpf_selem_unlink(struct bpf_local_storage_elem *selem)
{
- /* Always unlink from map before unlinking from sk_storage
+ /* Always unlink from map before unlinking from local_storage
* because selem will be freed after successfully unlinked from
- * the sk_storage.
+ * the local_storage.
*/
- selem_unlink_map(selem);
- selem_unlink_sk(selem);
+ bpf_selem_unlink_map(selem);
+ __bpf_selem_unlink_storage(selem);
}

-static struct bpf_sk_storage_data *
-__sk_storage_lookup(struct bpf_sk_storage *sk_storage,
- struct bpf_sk_storage_map *smap,
- bool cacheit_lockit)
+static struct bpf_local_storage_data *
+bpf_local_storage_lookup(struct bpf_local_storage *local_storage,
+ struct bpf_local_storage_map *smap,
+ bool cacheit_lockit)
{
- struct bpf_sk_storage_data *sdata;
- struct bpf_sk_storage_elem *selem;
+ struct bpf_local_storage_data *sdata;
+ struct bpf_local_storage_elem *selem;

/* Fast path (cache hit) */
- sdata = rcu_dereference(sk_storage->cache[smap->cache_idx]);
+ sdata = rcu_dereference(local_storage->cache[smap->cache_idx]);
if (sdata && rcu_access_pointer(sdata->smap) == smap)
return sdata;

/* Slow path (cache miss) */
- hlist_for_each_entry_rcu(selem, &sk_storage->list, snode)
+ hlist_for_each_entry_rcu(selem, &local_storage->list, snode)
if (rcu_access_pointer(SDATA(selem)->smap) == smap)
break;

@@ -286,33 +288,33 @@ __sk_storage_lookup(struct bpf_sk_storage *sk_storage,
/* spinlock is needed to avoid racing with the
* parallel delete. Otherwise, publishing an already
* deleted sdata to the cache will become a use-after-free
- * problem in the next __sk_storage_lookup().
+ * problem in the next bpf_local_storage_lookup().
*/
- raw_spin_lock_bh(&sk_storage->lock);
- if (selem_linked_to_sk(selem))
- rcu_assign_pointer(sk_storage->cache[smap->cache_idx],
+ raw_spin_lock_bh(&local_storage->lock);
+ if (selem_linked_to_storage(selem))
+ rcu_assign_pointer(local_storage->cache[smap->cache_idx],
sdata);
- raw_spin_unlock_bh(&sk_storage->lock);
+ raw_spin_unlock_bh(&local_storage->lock);
}

return sdata;
}

-static struct bpf_sk_storage_data *
+static struct bpf_local_storage_data *
sk_storage_lookup(struct sock *sk, struct bpf_map *map, bool cacheit_lockit)
{
- struct bpf_sk_storage *sk_storage;
- struct bpf_sk_storage_map *smap;
+ struct bpf_local_storage *sk_storage;
+ struct bpf_local_storage_map *smap;

sk_storage = rcu_dereference(sk->sk_bpf_storage);
if (!sk_storage)
return NULL;

- smap = (struct bpf_sk_storage_map *)map;
- return __sk_storage_lookup(sk_storage, smap, cacheit_lockit);
+ smap = (struct bpf_local_storage_map *)map;
+ return bpf_local_storage_lookup(sk_storage, smap, cacheit_lockit);
}

-static int check_flags(const struct bpf_sk_storage_data *old_sdata,
+static int check_flags(const struct bpf_local_storage_data *old_sdata,
u64 map_flags)
{
if (old_sdata && (map_flags & ~BPF_F_LOCK) == BPF_NOEXIST)
@@ -327,10 +329,10 @@ static int check_flags(const struct bpf_sk_storage_data *old_sdata,
}

static int sk_storage_alloc(struct sock *sk,
- struct bpf_sk_storage_map *smap,
- struct bpf_sk_storage_elem *first_selem)
+ struct bpf_local_storage_map *smap,
+ struct bpf_local_storage_elem *first_selem)
{
- struct bpf_sk_storage *prev_sk_storage, *sk_storage;
+ struct bpf_local_storage *prev_sk_storage, *sk_storage;
int err;

err = omem_charge(sk, sizeof(*sk_storage));
@@ -344,10 +346,10 @@ static int sk_storage_alloc(struct sock *sk,
}
INIT_HLIST_HEAD(&sk_storage->list);
raw_spin_lock_init(&sk_storage->lock);
- sk_storage->sk = sk;
+ sk_storage->owner = sk;

- __selem_link_sk(sk_storage, first_selem);
- selem_link_map(smap, first_selem);
+ bpf_selem_link_storage(sk_storage, first_selem);
+ bpf_selem_link_map(smap, first_selem);
/* Publish sk_storage to sk. sk->sk_lock cannot be acquired.
* Hence, atomic ops is used to set sk->sk_bpf_storage
* from NULL to the newly allocated sk_storage ptr.
@@ -357,10 +359,10 @@ static int sk_storage_alloc(struct sock *sk,
* the sk->sk_bpf_storage, the sk_storage->lock must
* be held before setting sk->sk_bpf_storage to NULL.
*/
- prev_sk_storage = cmpxchg((struct bpf_sk_storage **)&sk->sk_bpf_storage,
+ prev_sk_storage = cmpxchg((struct bpf_local_storage **)&sk->sk_bpf_storage,
NULL, sk_storage);
if (unlikely(prev_sk_storage)) {
- selem_unlink_map(first_selem);
+ bpf_selem_unlink_map(first_selem);
err = -EAGAIN;
goto uncharge;

@@ -387,15 +389,14 @@ static int sk_storage_alloc(struct sock *sk,
* Otherwise, it will become a leak (and other memory issues
* during map destruction).
*/
-static struct bpf_sk_storage_data *sk_storage_update(struct sock *sk,
- struct bpf_map *map,
- void *value,
- u64 map_flags)
+static struct bpf_local_storage_data *
+bpf_local_storage_update(struct sock *sk, struct bpf_map *map, void *value,
+ u64 map_flags)
{
- struct bpf_sk_storage_data *old_sdata = NULL;
- struct bpf_sk_storage_elem *selem;
- struct bpf_sk_storage *sk_storage;
- struct bpf_sk_storage_map *smap;
+ struct bpf_local_storage_data *old_sdata = NULL;
+ struct bpf_local_storage_elem *selem;
+ struct bpf_local_storage *local_storage;
+ struct bpf_local_storage_map *smap;
int err;

/* BPF_EXIST and BPF_NOEXIST cannot be both set */
@@ -404,15 +405,15 @@ static struct bpf_sk_storage_data *sk_storage_update(struct sock *sk,
unlikely((map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
return ERR_PTR(-EINVAL);

- smap = (struct bpf_sk_storage_map *)map;
- sk_storage = rcu_dereference(sk->sk_bpf_storage);
- if (!sk_storage || hlist_empty(&sk_storage->list)) {
- /* Very first elem for this sk */
+ smap = (struct bpf_local_storage_map *)map;
+ local_storage = rcu_dereference(sk->sk_bpf_storage);
+ if (!local_storage || hlist_empty(&local_storage->list)) {
+ /* Very first elem for this object */
err = check_flags(NULL, map_flags);
if (err)
return ERR_PTR(err);

- selem = selem_alloc(smap, sk, value, true);
+ selem = bpf_selem_alloc(smap, sk, value, true);
if (!selem)
return ERR_PTR(-ENOMEM);

@@ -428,25 +429,26 @@ static struct bpf_sk_storage_data *sk_storage_update(struct sock *sk,

if ((map_flags & BPF_F_LOCK) && !(map_flags & BPF_NOEXIST)) {
/* Hoping to find an old_sdata to do inline update
- * such that it can avoid taking the sk_storage->lock
+ * such that it can avoid taking the local_storage->lock
* and changing the lists.
*/
- old_sdata = __sk_storage_lookup(sk_storage, smap, false);
+ old_sdata =
+ bpf_local_storage_lookup(local_storage, smap, false);
err = check_flags(old_sdata, map_flags);
if (err)
return ERR_PTR(err);
- if (old_sdata && selem_linked_to_sk(SELEM(old_sdata))) {
+ if (old_sdata && selem_linked_to_storage(SELEM(old_sdata))) {
copy_map_value_locked(map, old_sdata->data,
value, false);
return old_sdata;
}
}

- raw_spin_lock_bh(&sk_storage->lock);
+ raw_spin_lock_bh(&local_storage->lock);

- /* Recheck sk_storage->list under sk_storage->lock */
- if (unlikely(hlist_empty(&sk_storage->list))) {
- /* A parallel del is happening and sk_storage is going
+ /* Recheck local_storage->list under local_storage->lock */
+ if (unlikely(hlist_empty(&local_storage->list))) {
+ /* A parallel del is happening and local_storage is going
* away. It has just been checked before, so very
* unlikely. Return instead of retry to keep things
* simple.
@@ -455,7 +457,7 @@ static struct bpf_sk_storage_data *sk_storage_update(struct sock *sk,
goto unlock_err;
}

- old_sdata = __sk_storage_lookup(sk_storage, smap, false);
+ old_sdata = bpf_local_storage_lookup(local_storage, smap, false);
err = check_flags(old_sdata, map_flags);
if (err)
goto unlock_err;
@@ -466,50 +468,51 @@ static struct bpf_sk_storage_data *sk_storage_update(struct sock *sk,
goto unlock;
}

- /* sk_storage->lock is held. Hence, we are sure
+ /* local_storage->lock is held. Hence, we are sure
* we can unlink and uncharge the old_sdata successfully
* later. Hence, instead of charging the new selem now
* and then uncharge the old selem later (which may cause
* a potential but unnecessary charge failure), avoid taking
* a charge at all here (the "!old_sdata" check) and the
- * old_sdata will not be uncharged later during __selem_unlink_sk().
+ * old_sdata will not be uncharged later during
+ * bpf_selem_unlink_storage().
*/
- selem = selem_alloc(smap, sk, value, !old_sdata);
+ selem = bpf_selem_alloc(smap, sk, value, !old_sdata);
if (!selem) {
err = -ENOMEM;
goto unlock_err;
}

/* First, link the new selem to the map */
- selem_link_map(smap, selem);
+ bpf_selem_link_map(smap, selem);

- /* Second, link (and publish) the new selem to sk_storage */
- __selem_link_sk(sk_storage, selem);
+ /* Second, link (and publish) the new selem to local_storage */
+ bpf_selem_link_storage(local_storage, selem);

/* Third, remove old selem, SELEM(old_sdata) */
if (old_sdata) {
- selem_unlink_map(SELEM(old_sdata));
- __selem_unlink_sk(sk_storage, SELEM(old_sdata), false);
+ bpf_selem_unlink_map(SELEM(old_sdata));
+ bpf_selem_unlink_storage(local_storage, SELEM(old_sdata), false);
}

unlock:
- raw_spin_unlock_bh(&sk_storage->lock);
+ raw_spin_unlock_bh(&local_storage->lock);
return SDATA(selem);

unlock_err:
- raw_spin_unlock_bh(&sk_storage->lock);
+ raw_spin_unlock_bh(&local_storage->lock);
return ERR_PTR(err);
}

static int sk_storage_delete(struct sock *sk, struct bpf_map *map)
{
- struct bpf_sk_storage_data *sdata;
+ struct bpf_local_storage_data *sdata;

sdata = sk_storage_lookup(sk, map, false);
if (!sdata)
return -ENOENT;

- selem_unlink(SELEM(sdata));
+ bpf_selem_unlink(SELEM(sdata));

return 0;
}
@@ -521,7 +524,7 @@ static u16 cache_idx_get(void)

spin_lock(&cache_idx_lock);

- for (i = 0; i < BPF_SK_STORAGE_CACHE_SIZE; i++) {
+ for (i = 0; i < BPF_LOCAL_STORAGE_CACHE_SIZE; i++) {
if (cache_idx_usage_counts[i] < min_usage) {
min_usage = cache_idx_usage_counts[i];
res = i;
@@ -548,8 +551,8 @@ static void cache_idx_free(u16 idx)
/* Called by __sk_destruct() & bpf_sk_storage_clone() */
void bpf_sk_storage_free(struct sock *sk)
{
- struct bpf_sk_storage_elem *selem;
- struct bpf_sk_storage *sk_storage;
+ struct bpf_local_storage_elem *selem;
+ struct bpf_local_storage *sk_storage;
bool free_sk_storage = false;
struct hlist_node *n;

@@ -574,8 +577,9 @@ void bpf_sk_storage_free(struct sock *sk)
/* Always unlink from map before unlinking from
* sk_storage.
*/
- selem_unlink_map(selem);
- free_sk_storage = __selem_unlink_sk(sk_storage, selem, true);
+ bpf_selem_unlink_map(selem);
+ free_sk_storage =
+ bpf_selem_unlink_storage(sk_storage, selem, true);
}
raw_spin_unlock_bh(&sk_storage->lock);
rcu_read_unlock();
@@ -584,14 +588,14 @@ void bpf_sk_storage_free(struct sock *sk)
kfree_rcu(sk_storage, rcu);
}

-static void bpf_sk_storage_map_free(struct bpf_map *map)
+static void bpf_local_storage_map_free(struct bpf_map *map)
{
- struct bpf_sk_storage_elem *selem;
- struct bpf_sk_storage_map *smap;
- struct bucket *b;
+ struct bpf_local_storage_elem *selem;
+ struct bpf_local_storage_map *smap;
+ struct bpf_local_storage_map_bucket *b;
unsigned int i;

- smap = (struct bpf_sk_storage_map *)map;
+ smap = (struct bpf_local_storage_map *)map;

cache_idx_free(smap->cache_idx);

@@ -615,10 +619,10 @@ static void bpf_sk_storage_map_free(struct bpf_map *map)

rcu_read_lock();
/* No one is adding to b->list now */
- while ((selem = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(&b->list)),
- struct bpf_sk_storage_elem,
- map_node))) {
- selem_unlink(selem);
+ while ((selem = hlist_entry_safe(
+ rcu_dereference_raw(hlist_first_rcu(&b->list)),
+ struct bpf_local_storage_elem, map_node))) {
+ bpf_selem_unlink(selem);
cond_resched_rcu();
}
rcu_read_unlock();
@@ -631,7 +635,7 @@ static void bpf_sk_storage_map_free(struct bpf_map *map)
*
* However, the bpf_sk_storage_free() still needs to access
* the smap->elem_size to do the uncharging in
- * __selem_unlink_sk().
+ * bpf_selem_unlink_storage().
*
* Hence, wait another rcu grace period for the
* bpf_sk_storage_free() to finish.
@@ -645,14 +649,15 @@ static void bpf_sk_storage_map_free(struct bpf_map *map)
/* U16_MAX is much more than enough for sk local storage
* considering a tcp_sock is ~2k.
*/
-#define MAX_VALUE_SIZE \
+#define BPF_LOCAL_STORAGE_MAX_VALUE_SIZE \
min_t(u32, \
- (KMALLOC_MAX_SIZE - MAX_BPF_STACK - sizeof(struct bpf_sk_storage_elem)), \
- (U16_MAX - sizeof(struct bpf_sk_storage_elem)))
+ (KMALLOC_MAX_SIZE - MAX_BPF_STACK - \
+ sizeof(struct bpf_local_storage_elem)), \
+ (U16_MAX - sizeof(struct bpf_local_storage_elem)))

-static int bpf_sk_storage_map_alloc_check(union bpf_attr *attr)
+static int bpf_local_storage_map_alloc_check(union bpf_attr *attr)
{
- if (attr->map_flags & ~SK_STORAGE_CREATE_FLAG_MASK ||
+ if (attr->map_flags & ~BPF_LOCAL_STORAGE_CREATE_FLAG_MASK ||
!(attr->map_flags & BPF_F_NO_PREALLOC) ||
attr->max_entries ||
attr->key_size != sizeof(int) || !attr->value_size ||
@@ -663,15 +668,15 @@ static int bpf_sk_storage_map_alloc_check(union bpf_attr *attr)
if (!bpf_capable())
return -EPERM;

- if (attr->value_size > MAX_VALUE_SIZE)
+ if (attr->value_size > BPF_LOCAL_STORAGE_MAX_VALUE_SIZE)
return -E2BIG;

return 0;
}

-static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
+static struct bpf_map *bpf_local_storage_map_alloc(union bpf_attr *attr)
{
- struct bpf_sk_storage_map *smap;
+ struct bpf_local_storage_map *smap;
unsigned int i;
u32 nbuckets;
u64 cost;
@@ -707,7 +712,7 @@ static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr)
raw_spin_lock_init(&smap->buckets[i].lock);
}

- smap->elem_size = sizeof(struct bpf_sk_storage_elem) + attr->value_size;
+ smap->elem_size = sizeof(struct bpf_local_storage_elem) + attr->value_size;
smap->cache_idx = cache_idx_get();

return &smap->map;
@@ -719,10 +724,10 @@ static int notsupp_get_next_key(struct bpf_map *map, void *key,
return -ENOTSUPP;
}

-static int bpf_sk_storage_map_check_btf(const struct bpf_map *map,
- const struct btf *btf,
- const struct btf_type *key_type,
- const struct btf_type *value_type)
+static int bpf_local_storage_map_check_btf(const struct bpf_map *map,
+ const struct btf *btf,
+ const struct btf_type *key_type,
+ const struct btf_type *value_type)
{
u32 int_data;

@@ -738,7 +743,7 @@ static int bpf_sk_storage_map_check_btf(const struct bpf_map *map,

static void *bpf_fd_sk_storage_lookup_elem(struct bpf_map *map, void *key)
{
- struct bpf_sk_storage_data *sdata;
+ struct bpf_local_storage_data *sdata;
struct socket *sock;
int fd, err;

@@ -756,14 +761,15 @@ static void *bpf_fd_sk_storage_lookup_elem(struct bpf_map *map, void *key)
static int bpf_fd_sk_storage_update_elem(struct bpf_map *map, void *key,
void *value, u64 map_flags)
{
- struct bpf_sk_storage_data *sdata;
+ struct bpf_local_storage_data *sdata;
struct socket *sock;
int fd, err;

fd = *(int *)key;
sock = sockfd_lookup(fd, &err);
if (sock) {
- sdata = sk_storage_update(sock->sk, map, value, map_flags);
+ sdata = bpf_local_storage_update(sock->sk, map, value,
+ map_flags);
sockfd_put(sock);
return PTR_ERR_OR_ZERO(sdata);
}
@@ -787,14 +793,14 @@ static int bpf_fd_sk_storage_delete_elem(struct bpf_map *map, void *key)
return err;
}

-static struct bpf_sk_storage_elem *
+static struct bpf_local_storage_elem *
bpf_sk_storage_clone_elem(struct sock *newsk,
- struct bpf_sk_storage_map *smap,
- struct bpf_sk_storage_elem *selem)
+ struct bpf_local_storage_map *smap,
+ struct bpf_local_storage_elem *selem)
{
- struct bpf_sk_storage_elem *copy_selem;
+ struct bpf_local_storage_elem *copy_selem;

- copy_selem = selem_alloc(smap, newsk, NULL, true);
+ copy_selem = bpf_selem_alloc(smap, newsk, NULL, true);
if (!copy_selem)
return NULL;

@@ -810,9 +816,9 @@ bpf_sk_storage_clone_elem(struct sock *newsk,

int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
{
- struct bpf_sk_storage *new_sk_storage = NULL;
- struct bpf_sk_storage *sk_storage;
- struct bpf_sk_storage_elem *selem;
+ struct bpf_local_storage *new_sk_storage = NULL;
+ struct bpf_local_storage *sk_storage;
+ struct bpf_local_storage_elem *selem;
int ret = 0;

RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
@@ -824,8 +830,8 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
goto out;

hlist_for_each_entry_rcu(selem, &sk_storage->list, snode) {
- struct bpf_sk_storage_elem *copy_selem;
- struct bpf_sk_storage_map *smap;
+ struct bpf_local_storage_elem *copy_selem;
+ struct bpf_local_storage_map *smap;
struct bpf_map *map;

smap = rcu_dereference(SDATA(selem)->smap);
@@ -849,8 +855,8 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
}

if (new_sk_storage) {
- selem_link_map(smap, copy_selem);
- __selem_link_sk(new_sk_storage, copy_selem);
+ bpf_selem_link_map(smap, copy_selem);
+ bpf_selem_link_storage(new_sk_storage, copy_selem);
} else {
ret = sk_storage_alloc(newsk, smap, copy_selem);
if (ret) {
@@ -861,7 +867,8 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
goto out;
}

- new_sk_storage = rcu_dereference(copy_selem->sk_storage);
+ new_sk_storage =
+ rcu_dereference(copy_selem->local_storage);
}
bpf_map_put(map);
}
@@ -879,7 +886,7 @@ int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
void *, value, u64, flags)
{
- struct bpf_sk_storage_data *sdata;
+ struct bpf_local_storage_data *sdata;

if (flags > BPF_SK_STORAGE_GET_F_CREATE)
return (unsigned long)NULL;
@@ -895,7 +902,7 @@ BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
* destruction).
*/
refcount_inc_not_zero(&sk->sk_refcnt)) {
- sdata = sk_storage_update(sk, map, value, BPF_NOEXIST);
+ sdata = bpf_local_storage_update(sk, map, value, BPF_NOEXIST);
/* sk must be a fullsock (guaranteed by verifier),
* so sock_gen_put() is unnecessary.
*/
@@ -922,15 +929,15 @@ BPF_CALL_2(bpf_sk_storage_delete, struct bpf_map *, map, struct sock *, sk)

static int sk_storage_map_btf_id;
const struct bpf_map_ops sk_storage_map_ops = {
- .map_alloc_check = bpf_sk_storage_map_alloc_check,
- .map_alloc = bpf_sk_storage_map_alloc,
- .map_free = bpf_sk_storage_map_free,
+ .map_alloc_check = bpf_local_storage_map_alloc_check,
+ .map_alloc = bpf_local_storage_map_alloc,
+ .map_free = bpf_local_storage_map_free,
.map_get_next_key = notsupp_get_next_key,
.map_lookup_elem = bpf_fd_sk_storage_lookup_elem,
.map_update_elem = bpf_fd_sk_storage_update_elem,
.map_delete_elem = bpf_fd_sk_storage_delete_elem,
- .map_check_btf = bpf_sk_storage_map_check_btf,
- .map_btf_name = "bpf_sk_storage_map",
+ .map_check_btf = bpf_local_storage_map_check_btf,
+ .map_btf_name = "bpf_local_storage_map",
.map_btf_id = &sk_storage_map_btf_id,
};

@@ -1022,7 +1029,7 @@ bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs)
u32 nr_maps = 0;
int rem, err;

- /* bpf_sk_storage_map is currently limited to CAP_SYS_ADMIN as
+ /* bpf_local_storage_map is currently limited to CAP_SYS_ADMIN as
* the map_alloc_check() side also does.
*/
if (!bpf_capable())
@@ -1072,13 +1079,13 @@ bpf_sk_storage_diag_alloc(const struct nlattr *nla_stgs)
}
EXPORT_SYMBOL_GPL(bpf_sk_storage_diag_alloc);

-static int diag_get(struct bpf_sk_storage_data *sdata, struct sk_buff *skb)
+static int diag_get(struct bpf_local_storage_data *sdata, struct sk_buff *skb)
{
struct nlattr *nla_stg, *nla_value;
- struct bpf_sk_storage_map *smap;
+ struct bpf_local_storage_map *smap;

/* It cannot exceed max nlattr's payload */
- BUILD_BUG_ON(U16_MAX - NLA_HDRLEN < MAX_VALUE_SIZE);
+ BUILD_BUG_ON(U16_MAX - NLA_HDRLEN < BPF_LOCAL_STORAGE_MAX_VALUE_SIZE);

nla_stg = nla_nest_start(skb, SK_DIAG_BPF_STORAGE);
if (!nla_stg)
@@ -1114,9 +1121,9 @@ static int bpf_sk_storage_diag_put_all(struct sock *sk, struct sk_buff *skb,
{
/* stg_array_type (e.g. INET_DIAG_BPF_SK_STORAGES) */
unsigned int diag_size = nla_total_size(0);
- struct bpf_sk_storage *sk_storage;
- struct bpf_sk_storage_elem *selem;
- struct bpf_sk_storage_map *smap;
+ struct bpf_local_storage *sk_storage;
+ struct bpf_local_storage_elem *selem;
+ struct bpf_local_storage_map *smap;
struct nlattr *nla_stgs;
unsigned int saved_len;
int err = 0;
@@ -1169,8 +1176,8 @@ int bpf_sk_storage_diag_put(struct bpf_sk_storage_diag *diag,
{
/* stg_array_type (e.g. INET_DIAG_BPF_SK_STORAGES) */
unsigned int diag_size = nla_total_size(0);
- struct bpf_sk_storage *sk_storage;
- struct bpf_sk_storage_data *sdata;
+ struct bpf_local_storage *sk_storage;
+ struct bpf_local_storage_data *sdata;
struct nlattr *nla_stgs;
unsigned int saved_len;
int err = 0;
@@ -1197,8 +1204,8 @@ int bpf_sk_storage_diag_put(struct bpf_sk_storage_diag *diag,

saved_len = skb->len;
for (i = 0; i < diag->nr_maps; i++) {
- sdata = __sk_storage_lookup(sk_storage,
- (struct bpf_sk_storage_map *)diag->maps[i],
+ sdata = bpf_local_storage_lookup(sk_storage,
+ (struct bpf_local_storage_map *)diag->maps[i],
false);

if (!sdata)
@@ -1235,19 +1242,19 @@ struct bpf_iter_seq_sk_storage_map_info {
unsigned skip_elems;
};

-static struct bpf_sk_storage_elem *
+static struct bpf_local_storage_elem *
bpf_sk_storage_map_seq_find_next(struct bpf_iter_seq_sk_storage_map_info *info,
- struct bpf_sk_storage_elem *prev_selem)
+ struct bpf_local_storage_elem *prev_selem)
{
- struct bpf_sk_storage *sk_storage;
- struct bpf_sk_storage_elem *selem;
+ struct bpf_local_storage *sk_storage;
+ struct bpf_local_storage_elem *selem;
u32 skip_elems = info->skip_elems;
- struct bpf_sk_storage_map *smap;
+ struct bpf_local_storage_map *smap;
u32 bucket_id = info->bucket_id;
u32 i, count, n_buckets;
- struct bucket *b;
+ struct bpf_local_storage_map_bucket *b;

- smap = (struct bpf_sk_storage_map *)info->map;
+ smap = (struct bpf_local_storage_map *)info->map;
n_buckets = 1U << smap->bucket_log;
if (bucket_id >= n_buckets)
return NULL;
@@ -1257,7 +1264,7 @@ bpf_sk_storage_map_seq_find_next(struct bpf_iter_seq_sk_storage_map_info *info,
count = 0;
while (selem) {
selem = hlist_entry_safe(selem->map_node.next,
- struct bpf_sk_storage_elem, map_node);
+ struct bpf_local_storage_elem, map_node);
if (!selem) {
/* not found, unlock and go to the next bucket */
b = &smap->buckets[bucket_id++];
@@ -1265,7 +1272,7 @@ bpf_sk_storage_map_seq_find_next(struct bpf_iter_seq_sk_storage_map_info *info,
skip_elems = 0;
break;
}
- sk_storage = rcu_dereference_raw(selem->sk_storage);
+ sk_storage = rcu_dereference_raw(selem->local_storage);
if (sk_storage) {
info->skip_elems = skip_elems + count;
return selem;
@@ -1278,7 +1285,7 @@ bpf_sk_storage_map_seq_find_next(struct bpf_iter_seq_sk_storage_map_info *info,
raw_spin_lock_bh(&b->lock);
count = 0;
hlist_for_each_entry(selem, &b->list, map_node) {
- sk_storage = rcu_dereference_raw(selem->sk_storage);
+ sk_storage = rcu_dereference_raw(selem->local_storage);
if (sk_storage && count >= skip_elems) {
info->bucket_id = i;
info->skip_elems = count;
@@ -1297,7 +1304,7 @@ bpf_sk_storage_map_seq_find_next(struct bpf_iter_seq_sk_storage_map_info *info,

static void *bpf_sk_storage_map_seq_start(struct seq_file *seq, loff_t *pos)
{
- struct bpf_sk_storage_elem *selem;
+ struct bpf_local_storage_elem *selem;

selem = bpf_sk_storage_map_seq_find_next(seq->private, NULL);
if (!selem)
@@ -1330,11 +1337,11 @@ DEFINE_BPF_ITER_FUNC(bpf_sk_storage_map, struct bpf_iter_meta *meta,
void *value)

static int __bpf_sk_storage_map_seq_show(struct seq_file *seq,
- struct bpf_sk_storage_elem *selem)
+ struct bpf_local_storage_elem *selem)
{
struct bpf_iter_seq_sk_storage_map_info *info = seq->private;
struct bpf_iter__bpf_sk_storage_map ctx = {};
- struct bpf_sk_storage *sk_storage;
+ struct bpf_local_storage *sk_storage;
struct bpf_iter_meta meta;
struct bpf_prog *prog;
int ret = 0;
@@ -1345,8 +1352,8 @@ static int __bpf_sk_storage_map_seq_show(struct seq_file *seq,
ctx.meta = &meta;
ctx.map = info->map;
if (selem) {
- sk_storage = rcu_dereference_raw(selem->sk_storage);
- ctx.sk = sk_storage->sk;
+ sk_storage = rcu_dereference_raw(selem->local_storage);
+ ctx.sk = sk_storage->owner;
ctx.value = SDATA(selem)->data;
}
ret = bpf_iter_run_prog(prog, &ctx);
@@ -1363,13 +1370,13 @@ static int bpf_sk_storage_map_seq_show(struct seq_file *seq, void *v)
static void bpf_sk_storage_map_seq_stop(struct seq_file *seq, void *v)
{
struct bpf_iter_seq_sk_storage_map_info *info = seq->private;
- struct bpf_sk_storage_map *smap;
- struct bucket *b;
+ struct bpf_local_storage_map *smap;
+ struct bpf_local_storage_map_bucket *b;

if (!v) {
(void)__bpf_sk_storage_map_seq_show(seq, v);
} else {
- smap = (struct bpf_sk_storage_map *)info->map;
+ smap = (struct bpf_local_storage_map *)info->map;
b = &smap->buckets[info->bucket_id];
raw_spin_unlock_bh(&b->lock);
}
--
2.28.0.rc0.142.g3c755180ce-goog

2020-07-30 14:12:02

by KP Singh

[permalink] [raw]
Subject: [PATCH bpf-next v7 2/7] bpf: Generalize caching for sk_storage.

From: KP Singh <[email protected]>

Provide the ability to define local storage caches on a per-object
type basis. The caches and caching indices for different objects should
not be intermixed, as suggested in:

https://lore.kernel.org/bpf/[email protected]/

"Caching a sk-storage at idx=0 of a sk should not stop an
inode-storage to be cached at the same idx of a inode."

Signed-off-by: KP Singh <[email protected]>
---
include/net/bpf_sk_storage.h | 19 +++++++++++++++++++
net/core/bpf_sk_storage.c | 31 +++++++++++++++----------------
2 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/include/net/bpf_sk_storage.h b/include/net/bpf_sk_storage.h
index 5036c94c0503..950c5aaba15e 100644
--- a/include/net/bpf_sk_storage.h
+++ b/include/net/bpf_sk_storage.h
@@ -3,6 +3,9 @@
#ifndef _BPF_SK_STORAGE_H
#define _BPF_SK_STORAGE_H

+#include <linux/types.h>
+#include <linux/spinlock.h>
+
struct sock;

void bpf_sk_storage_free(struct sock *sk);
@@ -15,6 +18,22 @@ struct sk_buff;
struct nlattr;
struct sock;

+#define BPF_LOCAL_STORAGE_CACHE_SIZE 16
+
+struct bpf_local_storage_cache {
+ spinlock_t idx_lock;
+ u64 idx_usage_counts[BPF_LOCAL_STORAGE_CACHE_SIZE];
+};
+
+#define DEFINE_BPF_STORAGE_CACHE(name) \
+static struct bpf_local_storage_cache name = { \
+ .idx_lock = __SPIN_LOCK_UNLOCKED(name.idx_lock), \
+}
+
+u16 bpf_local_storage_cache_idx_get(struct bpf_local_storage_cache *cache);
+void bpf_local_storage_cache_idx_free(struct bpf_local_storage_cache *cache,
+ u16 idx);
+
#ifdef CONFIG_BPF_SYSCALL
int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk);
struct bpf_sk_storage_diag *
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index a5cc218834ee..99dde7b74767 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -14,6 +14,8 @@

#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK (BPF_F_NO_PREALLOC | BPF_F_CLONE)

+DEFINE_BPF_STORAGE_CACHE(sk_cache);
+
struct bpf_local_storage_map_bucket {
struct hlist_head list;
raw_spinlock_t lock;
@@ -78,10 +80,6 @@ struct bpf_local_storage_elem {
#define SELEM(_SDATA) \
container_of((_SDATA), struct bpf_local_storage_elem, sdata)
#define SDATA(_SELEM) (&(_SELEM)->sdata)
-#define BPF_LOCAL_STORAGE_CACHE_SIZE 16
-
-static DEFINE_SPINLOCK(cache_idx_lock);
-static u64 cache_idx_usage_counts[BPF_LOCAL_STORAGE_CACHE_SIZE];

struct bpf_local_storage {
struct bpf_local_storage_data __rcu *cache[BPF_LOCAL_STORAGE_CACHE_SIZE];
@@ -517,16 +515,16 @@ static int sk_storage_delete(struct sock *sk, struct bpf_map *map)
return 0;
}

-static u16 cache_idx_get(void)
+u16 bpf_local_storage_cache_idx_get(struct bpf_local_storage_cache *cache)
{
u64 min_usage = U64_MAX;
u16 i, res = 0;

- spin_lock(&cache_idx_lock);
+ spin_lock(&cache->idx_lock);

for (i = 0; i < BPF_LOCAL_STORAGE_CACHE_SIZE; i++) {
- if (cache_idx_usage_counts[i] < min_usage) {
- min_usage = cache_idx_usage_counts[i];
+ if (cache->idx_usage_counts[i] < min_usage) {
+ min_usage = cache->idx_usage_counts[i];
res = i;

/* Found a free cache_idx */
@@ -534,18 +532,19 @@ static u16 cache_idx_get(void)
break;
}
}
- cache_idx_usage_counts[res]++;
+ cache->idx_usage_counts[res]++;

- spin_unlock(&cache_idx_lock);
+ spin_unlock(&cache->idx_lock);

return res;
}

-static void cache_idx_free(u16 idx)
+void bpf_local_storage_cache_idx_free(struct bpf_local_storage_cache *cache,
+ u16 idx)
{
- spin_lock(&cache_idx_lock);
- cache_idx_usage_counts[idx]--;
- spin_unlock(&cache_idx_lock);
+ spin_lock(&cache->idx_lock);
+ cache->idx_usage_counts[idx]--;
+ spin_unlock(&cache->idx_lock);
}

/* Called by __sk_destruct() & bpf_sk_storage_clone() */
@@ -597,7 +596,7 @@ static void bpf_local_storage_map_free(struct bpf_map *map)

smap = (struct bpf_local_storage_map *)map;

- cache_idx_free(smap->cache_idx);
+ bpf_local_storage_cache_idx_free(&sk_cache, smap->cache_idx);

/* Note that this map might be concurrently cloned from
* bpf_sk_storage_clone. Wait for any existing bpf_sk_storage_clone
@@ -713,7 +712,7 @@ static struct bpf_map *bpf_local_storage_map_alloc(union bpf_attr *attr)
}

smap->elem_size = sizeof(struct bpf_local_storage_elem) + attr->value_size;
- smap->cache_idx = cache_idx_get();
+ smap->cache_idx = bpf_local_storage_cache_idx_get(&sk_cache);

return &smap->map;
}
--
2.28.0.rc0.142.g3c755180ce-goog

2020-07-31 01:12:42

by Martin KaFai Lau

[permalink] [raw]
Subject: Re: [PATCH bpf-next v7 5/7] bpf: Implement bpf_local_storage for inodes

On Thu, Jul 30, 2020 at 04:07:14PM +0200, KP Singh wrote:
> From: KP Singh <[email protected]>
>
> Similar to bpf_local_storage for sockets, add local storage for inodes.
> The life-cycle of the storage is managed with the life-cycle of the
> inode, i.e. the storage is destroyed along with the owning inode.
>
> The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
> security blob, which is now stackable and can co-exist with other LSMs.
>

[ ... ]

> +static int bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key,
> + void *value, u64 map_flags)
> +{
> + struct bpf_local_storage_data *sdata;
> + struct file *f;
> + int fd;
> +
> + fd = *(int *)key;
> + f = fcheck(fd);
> + if (!f)
> + return -EINVAL;
> +
> + sdata = bpf_local_storage_update(f->f_inode, map, value, map_flags);
n00b question: the inode will not be going away here (e.g. if another
task calls close(fd))? And there is no chance that bpf_local_storage_update()
will be adding new storage after bpf_inode_storage_free() has been called?

A few comments will be useful here.

> + return PTR_ERR_OR_ZERO(sdata);
> +}
> +
> +static int inode_storage_delete(struct inode *inode, struct bpf_map *map)
> +{
> + struct bpf_local_storage_data *sdata;
> +
> + sdata = inode_storage_lookup(inode, map, false);
> + if (!sdata)
> + return -ENOENT;
> +
> + bpf_selem_unlink(SELEM(sdata));
> +
> + return 0;
> +}
> +
> +static int bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key)
> +{
> + struct file *f;
> + int fd;
> +
> + fd = *(int *)key;
> + f = fcheck(fd);
> + if (!f)
> + return -EINVAL;
> +
> + return inode_storage_delete(f->f_inode, map);
> +}
> +
> +BPF_CALL_4(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode,
> + void *, value, u64, flags)
> +{
> + struct bpf_local_storage_data *sdata;
> +
> + if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE))
> + return (unsigned long)NULL;
> +
> + sdata = inode_storage_lookup(inode, map, true);
> + if (sdata)
> + return (unsigned long)sdata->data;
> +
> + if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) {
> + sdata = bpf_local_storage_update(inode, map, value,
> + BPF_NOEXIST);
The same question here

> + return IS_ERR(sdata) ? (unsigned long)NULL :
> + (unsigned long)sdata->data;
> + }
> +
> + return (unsigned long)NULL;
> +}
> +
> +BPF_CALL_2(bpf_inode_storage_delete,
> + struct bpf_map *, map, struct inode *, inode)
> +{
> + return inode_storage_delete(inode, map);
> +}
> +
> +static int notsupp_get_next_key(struct bpf_map *map, void *key,
> + void *next_key)
> +{
> + return -ENOTSUPP;
> +}
> +
> +static struct bpf_map *inode_storage_map_alloc(union bpf_attr *attr)
> +{
> + struct bpf_local_storage_map *smap;
> +
> + smap = bpf_local_storage_map_alloc(attr);
> + if (IS_ERR(smap))
> + return ERR_CAST(smap);
> +
> + smap->cache_idx = bpf_local_storage_cache_idx_get(&inode_cache);
> + return &smap->map;
> +}
> +
> +static void inode_storage_map_free(struct bpf_map *map)
> +{
> + struct bpf_local_storage_map *smap;
> +
> + smap = (struct bpf_local_storage_map *)map;
> + bpf_local_storage_cache_idx_free(&inode_cache, smap->cache_idx);
> + bpf_local_storage_map_free(smap);
> +}
> +
> +static int sk_storage_map_btf_id;
> +const struct bpf_map_ops inode_storage_map_ops = {
> + .map_alloc_check = bpf_local_storage_map_alloc_check,
> + .map_alloc = inode_storage_map_alloc,
> + .map_free = inode_storage_map_free,
> + .map_get_next_key = notsupp_get_next_key,
> + .map_lookup_elem = bpf_fd_inode_storage_lookup_elem,
> + .map_update_elem = bpf_fd_inode_storage_update_elem,
> + .map_delete_elem = bpf_fd_inode_storage_delete_elem,
> + .map_check_btf = bpf_local_storage_map_check_btf,
> + .map_btf_name = "bpf_local_storage_map",
> + .map_btf_id = &sk_storage_map_btf_id,
> + .map_owner_storage_ptr = inode_storage_ptr,
> +};
> +
> +BTF_ID_LIST(bpf_inode_storage_get_btf_ids)
> +BTF_ID(struct, inode)
The ARG_PTR_TO_BTF_ID is the second arg instead of the first
arg in bpf_inode_storage_get_proto.
Does it just happen to work here without needing BTF_ID_UNUSED?

> +
> +const struct bpf_func_proto bpf_inode_storage_get_proto = {
> + .func = bpf_inode_storage_get,
> + .gpl_only = false,
> + .ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
> + .arg1_type = ARG_CONST_MAP_PTR,
> + .arg2_type = ARG_PTR_TO_BTF_ID,
> + .arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL,
> + .arg4_type = ARG_ANYTHING,
> + .btf_id = bpf_inode_storage_get_btf_ids,
> +};
> +
> +BTF_ID_LIST(bpf_inode_storage_delete_btf_ids)
> +BTF_ID(struct, inode)
> +
> +const struct bpf_func_proto bpf_inode_storage_delete_proto = {
> + .func = bpf_inode_storage_delete,
> + .gpl_only = false,
> + .ret_type = RET_INTEGER,
> + .arg1_type = ARG_CONST_MAP_PTR,
> + .arg2_type = ARG_PTR_TO_BTF_ID,
> + .btf_id = bpf_inode_storage_delete_btf_ids,
> +};

2020-07-31 12:09:28

by KP Singh

[permalink] [raw]
Subject: Re: [PATCH bpf-next v7 5/7] bpf: Implement bpf_local_storage for inodes



On 31.07.20 03:08, Martin KaFai Lau wrote:
> On Thu, Jul 30, 2020 at 04:07:14PM +0200, KP Singh wrote:
>> From: KP Singh <[email protected]>
>>
>> Similar to bpf_local_storage for sockets, add local storage for inodes.
>> The life-cycle of the storage is managed with the life-cycle of the
>> inode, i.e. the storage is destroyed along with the owning inode.
>>
>> The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
>> security blob, which is now stackable and can co-exist with other LSMs.
>>
>
> [ ... ]
>
>> +static int bpf_fd_inode_storage_update_elem(struct bpf_map *map, void *key,
>> + void *value, u64 map_flags)
>> +{
>> + struct bpf_local_storage_data *sdata;
>> + struct file *f;
>> + int fd;
>> +
>> + fd = *(int *)key;
>> + f = fcheck(fd);
>> + if (!f)
>> + return -EINVAL;
>> +
>> + sdata = bpf_local_storage_update(f->f_inode, map, value, map_flags);
> n00b question: the inode will not be going away here (e.g. if another
> task calls close(fd))? And there is no chance that bpf_local_storage_update()
> will be adding new storage after bpf_inode_storage_free() has been called?

Good catch, I think we need to guard this update by grabbing a reference
to the file here.
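
A minimal sketch of what I have in mind (assuming fget_raw()/fput() are
the right primitives here; the exact variant still needs a closer look):

  fd = *(int *)key;
  f = fget_raw(fd);
  if (!f)
      return -EBADF;

  /* the file reference keeps f->f_inode alive across the update */
  sdata = bpf_local_storage_update(f->f_inode, map, value, map_flags);
  fput(f);
  return PTR_ERR_OR_ZERO(sdata);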

>
> A few comments will be useful here.
>
>> + return PTR_ERR_OR_ZERO(sdata);
>> +}
>> +
>> +static int inode_storage_delete(struct inode *inode, struct bpf_map *map)
>> +{
>> + struct bpf_local_storage_data *sdata;
>> +
>> + sdata = inode_storage_lookup(inode, map, false);
>> + if (!sdata)
>> + return -ENOENT;
>> +
>> + bpf_selem_unlink(SELEM(sdata));
>> +
>> + return 0;
>> +}
>> +
>> +static int bpf_fd_inode_storage_delete_elem(struct bpf_map *map, void *key)
>> +{
>> + struct file *f;
>> + int fd;
>> +
>> + fd = *(int *)key;
>> + f = fcheck(fd);
>> + if (!f)
>> + return -EINVAL;
>> +
>> + return inode_storage_delete(f->f_inode, map);
>> +}
>> +
>> +BPF_CALL_4(bpf_inode_storage_get, struct bpf_map *, map, struct inode *, inode,
>> + void *, value, u64, flags)
>> +{
>> + struct bpf_local_storage_data *sdata;
>> +
>> + if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE))
>> + return (unsigned long)NULL;
>> +
>> + sdata = inode_storage_lookup(inode, map, true);
>> + if (sdata)
>> + return (unsigned long)sdata->data;
>> +
>> + if (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) {
>> + sdata = bpf_local_storage_update(inode, map, value,
>> + BPF_NOEXIST);
> The same question here

This is slightly different: the helper gets called from the BPF program.
We only allow this from LSM hooks, which have better guarantees w.r.t.
the lifetime of the object than tracing programs do.

I will add a comment that explains this. Once we have sleepable BPF we can
also grab a reference to the inode here.
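
Something along these lines (a sketch of the comment, not final wording):

  /* This helper must only be called from LSM hooks, where the inode is
   * guaranteed to be alive for the duration of the hook, so the update
   * cannot race with bpf_inode_storage_free().
   */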

>
>> + return IS_ERR(sdata) ? (unsigned long)NULL :
>> + (unsigned long)sdata->data;
>> + }
>> +
>> + return (unsigned long)NULL;
>> +}
>> +
>> +BPF_CALL_2(bpf_inode_storage_delete,
>> + struct bpf_map *, map, struct inode *, inode)
>> +{
>> + return inode_storage_delete(inode, map);
>> +}
>> +
>> +static int notsupp_get_next_key(struct bpf_map *map, void *key,
>> + void *next_key)
>> +{
>> + return -ENOTSUPP;
>> +}
>> +
>> +static struct bpf_map *inode_storage_map_alloc(union bpf_attr *attr)
>> +{
>> + struct bpf_local_storage_map *smap;
>> +
>> + smap = bpf_local_storage_map_alloc(attr);
>> + if (IS_ERR(smap))
>> + return ERR_CAST(smap);
>> +
>> + smap->cache_idx = bpf_local_storage_cache_idx_get(&inode_cache);
>> + return &smap->map;
>> +}
>> +
>> +static void inode_storage_map_free(struct bpf_map *map)
>> +{
>> + struct bpf_local_storage_map *smap;
>> +
>> + smap = (struct bpf_local_storage_map *)map;
>> + bpf_local_storage_cache_idx_free(&inode_cache, smap->cache_idx);
>> + bpf_local_storage_map_free(smap);
>> +}
>> +
>> +static int sk_storage_map_btf_id;

This name needs to be fixed as well.

>> +const struct bpf_map_ops inode_storage_map_ops = {
>> + .map_alloc_check = bpf_local_storage_map_alloc_check,
>> + .map_alloc = inode_storage_map_alloc,
>> + .map_free = inode_storage_map_free,
>> + .map_get_next_key = notsupp_get_next_key,
>> + .map_lookup_elem = bpf_fd_inode_storage_lookup_elem,
>> + .map_update_elem = bpf_fd_inode_storage_update_elem,
>> + .map_delete_elem = bpf_fd_inode_storage_delete_elem,
>> + .map_check_btf = bpf_local_storage_map_check_btf,
>> + .map_btf_name = "bpf_local_storage_map",
>> + .map_btf_id = &sk_storage_map_btf_id,
>> + .map_owner_storage_ptr = inode_storage_ptr,
>> +};
>> +
>> +BTF_ID_LIST(bpf_inode_storage_get_btf_ids)
>> +BTF_ID(struct, inode)
> The ARG_PTR_TO_BTF_ID is the second arg instead of the first
> arg in bpf_inode_storage_get_proto.
> Does it just happen to work here without needing BTF_ID_UNUSED?


Yeah, I was also surprised that this worked, so I did some debugging:


diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
index 9cd1428c7199..95e84bcf1a74 100644
--- a/kernel/bpf/bpf_lsm.c
+++ b/kernel/bpf/bpf_lsm.c
@@ -52,6 +52,8 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
switch (func_id) {
case BPF_FUNC_inode_storage_get:
+ pr_err("btf_ids[0]=%d\n", bpf_inode_storage_get_proto.btf_id[0]);
+ pr_err("btf_ids[1]=%d\n", bpf_inode_storage_get_proto.btf_id[1]);
return &bpf_inode_storage_get_proto;
case BPF_FUNC_inode_storage_delete:
return &bpf_inode_storage_delete_proto;

./test_progs -t test_local_storage

[ 21.694473] btf_ids[0]=915
[ 21.694974] btf_ids[1]=915

bpftool btf dump file /sys/kernel/btf/vmlinux | grep "STRUCT 'inode'"
[915] STRUCT 'inode' size=984 vlen=48

So it seems like btf_id[0] and btf_id[1] are set to the BTF ID
for inode. Now I think this might just be a coincidence as
the next helper (bpf_inode_storage_delete)
also has a BTF argument of type inode.

And sure enough, if I call bpf_inode_storage_delete from my selftests
program, it does not load:

arg#0 type is not a struct
Unrecognized arg#0 type PTR
; int BPF_PROG(unlink_hook, struct inode *dir, struct dentry *victim)
0: (79) r6 = *(u64 *)(r1 +8)
func 'bpf_lsm_inode_unlink' arg1 has btf_id 912 type STRUCT 'dentry'
; __u32 pid = bpf_get_current_pid_tgid() >> 32;

[...]

So the BTF_ID_UNUSED is actually needed here. I also added a call to
bpf_inode_storage_delete in my test to catch any issues with it.
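
For reference, the fixed list then looks like this (entry 0 covers the
map argument, entry 1 the inode argument):

  BTF_ID_LIST(bpf_inode_storage_get_btf_ids)
  BTF_ID_UNUSED
  BTF_ID(struct, inode)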

After adding BTF_ID_UNUSED, the result is what we expect:

./test_progs -t test_local_storage
[ 20.577223] btf_ids[0]=0
[ 20.577702] btf_ids[1]=915

Thanks for noticing this!

- KP

>
>> +
>> +const struct bpf_func_proto bpf_inode_storage_get_proto = {
>> + .func = bpf_inode_storage_get,
>> + .gpl_only = false,
>> + .ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
>> + .arg1_type = ARG_CONST_MAP_PTR,
>> + .arg2_type = ARG_PTR_TO_BTF_ID,
>> + .arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL,
>> + .arg4_type = ARG_ANYTHING,
>> + .btf_id = bpf_inode_storage_get_btf_ids,
>> +};
>> +
>> +BTF_ID_LIST(bpf_inode_storage_delete_btf_ids)
>> +BTF_ID(struct, inode)
>> +
>> +const struct bpf_func_proto bpf_inode_storage_delete_proto = {
>> + .func = bpf_inode_storage_delete,
>> + .gpl_only = false,
>> + .ret_type = RET_INTEGER,
>> + .arg1_type = ARG_CONST_MAP_PTR,
>> + .arg2_type = ARG_PTR_TO_BTF_ID,
>> + .btf_id = bpf_inode_storage_delete_btf_ids,
>> +};

2020-07-31 19:04:38

by Martin KaFai Lau

[permalink] [raw]
Subject: Re: [PATCH bpf-next v7 5/7] bpf: Implement bpf_local_storage for inodes

On Fri, Jul 31, 2020 at 02:08:55PM +0200, KP Singh wrote:
[ ... ]
> >> +const struct bpf_map_ops inode_storage_map_ops = {
> >> + .map_alloc_check = bpf_local_storage_map_alloc_check,
> >> + .map_alloc = inode_storage_map_alloc,
> >> + .map_free = inode_storage_map_free,
> >> + .map_get_next_key = notsupp_get_next_key,
> >> + .map_lookup_elem = bpf_fd_inode_storage_lookup_elem,
> >> + .map_update_elem = bpf_fd_inode_storage_update_elem,
> >> + .map_delete_elem = bpf_fd_inode_storage_delete_elem,
> >> + .map_check_btf = bpf_local_storage_map_check_btf,
> >> + .map_btf_name = "bpf_local_storage_map",
> >> + .map_btf_id = &sk_storage_map_btf_id,
> >> + .map_owner_storage_ptr = inode_storage_ptr,
> >> +};
> >> +
> >> +BTF_ID_LIST(bpf_inode_storage_get_btf_ids)
> >> +BTF_ID(struct, inode)
> > The ARG_PTR_TO_BTF_ID is the second arg instead of the first
> > arg in bpf_inode_storage_get_proto.
> > Does it just happen to work here without needing BTF_ID_UNUSED?
>
>
> Yeah, I was also surprised that this worked, so I did some debugging:
>
>
> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index 9cd1428c7199..95e84bcf1a74 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -52,6 +52,8 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> {
> switch (func_id) {
> case BPF_FUNC_inode_storage_get:
> + pr_err("btf_ids[0]=%d\n", bpf_inode_storage_get_proto.btf_id[0]);
> + pr_err("btf_ids[1]=%d\n", bpf_inode_storage_get_proto.btf_id[1]);
> return &bpf_inode_storage_get_proto;
> case BPF_FUNC_inode_storage_delete:
> return &bpf_inode_storage_delete_proto;
>
> ./test_progs -t test_local_storage
>
> [ 21.694473] btf_ids[0]=915
> [ 21.694974] btf_ids[1]=915
>
> bpftool btf dump file /sys/kernel/btf/vmlinux | grep "STRUCT 'inode'"
> [915] STRUCT 'inode' size=984 vlen=48
>
> So it seems like btf_id[0] and btf_id[1] are set to the BTF ID
> for inode. Now I think this might just be a coincidence as
> the next helper (bpf_inode_storage_delete)
> also has a BTF argument of type inode.
It seems the next BTF_ID_LIST(bpf_inode_storage_delete_btf_ids)
is not needed because they are the same. I think one
BTF_ID_LIST(bpf_inode_btf_ids) can be used for both helpers.
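
Roughly (a sketch, reusing the shared name suggested above):

  BTF_ID_LIST(bpf_inode_btf_ids)
  BTF_ID_UNUSED
  BTF_ID(struct, inode)

with both bpf_inode_storage_get_proto and bpf_inode_storage_delete_proto
pointing their .btf_id at bpf_inode_btf_ids.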

>
> And sure enough, if I call bpf_inode_storage_delete from my selftests
> program, it does not load:
>
> arg#0 type is not a struct
> Unrecognized arg#0 type PTR
> ; int BPF_PROG(unlink_hook, struct inode *dir, struct dentry *victim)
> 0: (79) r6 = *(u64 *)(r1 +8)
> func 'bpf_lsm_inode_unlink' arg1 has btf_id 912 type STRUCT 'dentry'
> ; __u32 pid = bpf_get_current_pid_tgid() >> 32;
>
> [...]
>
> So the BTF_ID_UNUSED is actually needed here. I also added a call to
> bpf_inode_storage_delete in my test to catch any issues with it.
>
> After adding BTF_ID_UNUSED, the result is what we expect:
>
> ./test_progs -t test_local_storage
> [ 20.577223] btf_ids[0]=0
> [ 20.577702] btf_ids[1]=915
>
> Thanks for noticing this!
>
> - KP
>
> >
> >> +
> >> +const struct bpf_func_proto bpf_inode_storage_get_proto = {
> >> + .func = bpf_inode_storage_get,
> >> + .gpl_only = false,
> >> + .ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
> >> + .arg1_type = ARG_CONST_MAP_PTR,
> >> + .arg2_type = ARG_PTR_TO_BTF_ID,
> >> + .arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL,
> >> + .arg4_type = ARG_ANYTHING,
> >> + .btf_id = bpf_inode_storage_get_btf_ids,
> >> +};
> >> +
> >> +BTF_ID_LIST(bpf_inode_storage_delete_btf_ids)
> >> +BTF_ID(struct, inode)
> >> +
> >> +const struct bpf_func_proto bpf_inode_storage_delete_proto = {
> >> + .func = bpf_inode_storage_delete,
> >> + .gpl_only = false,
> >> + .ret_type = RET_INTEGER,
> >> + .arg1_type = ARG_CONST_MAP_PTR,
> >> + .arg2_type = ARG_PTR_TO_BTF_ID,
> >> + .btf_id = bpf_inode_storage_delete_btf_ids,
> >> +};

2020-08-03 15:42:51

by KP Singh

[permalink] [raw]
Subject: Re: [PATCH bpf-next v7 5/7] bpf: Implement bpf_local_storage for inodes



On 31.07.20 21:02, Martin KaFai Lau wrote:
> On Fri, Jul 31, 2020 at 02:08:55PM +0200, KP Singh wrote:
> [ ... ]
>>>> +const struct bpf_map_ops inode_storage_map_ops = {

[...]

>>
>> bpftool btf dump file /sys/kernel/btf/vmlinux | grep "STRUCT 'inode'"
>> [915] STRUCT 'inode' size=984 vlen=48
>>
>> So it seems like btf_id[0] and btf_id[1] are set to the BTF ID
>> for inode. Now I think this might just be a coincidence as
>> the next helper (bpf_inode_storage_delete)
>> also has a BTF argument of type inode.
> It seems the next BTF_ID_LIST(bpf_inode_storage_delete_btf_ids)
> is not needed because they are the same. I think one
> BTF_ID_LIST(bpf_inode_btf_ids) can be used for both helpers.
>

Cool, yeah. I have fixed it, and also for the sock helpers. Will send
out a new series.

- KP

>>
>> And sure enough, if I call bpf_inode_storage_delete from my selftests
>> program, it does not load:

[...]

>> ./test_progs -t test_local_storage
>> [ 20.577223] btf_ids[0]=0
>> [ 20.577702] btf_ids[1]=915
>>
>> Thanks for noticing this!
>>
>> - KP
>>
>>>
>>>> +
>>>> +const struct bpf_func_proto bpf_inode_storage_get_proto = {
>>>> + .func = bpf_inode_storage_get,
>>>> + .gpl_only = false,
>>>> + .ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
>>>> + .arg1_type = ARG_CONST_MAP_PTR,
>>>> + .arg2_type = ARG_PTR_TO_BTF_ID,
>>>> + .arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL,
>>>> + .arg4_type = ARG_ANYTHING,
>>>> + .btf_id = bpf_inode_storage_get_btf_ids,
>>>> +};
>>>> +
>>>> +BTF_ID_LIST(bpf_inode_storage_delete_btf_ids)
>>>> +BTF_ID(struct, inode)
>>>> +
>>>> +const struct bpf_func_proto bpf_inode_storage_delete_proto = {
>>>> + .func = bpf_inode_storage_delete,
>>>> + .gpl_only = false,
>>>> + .ret_type = RET_INTEGER,
>>>> + .arg1_type = ARG_CONST_MAP_PTR,
>>>> + .arg2_type = ARG_PTR_TO_BTF_ID,
>>>> + .btf_id = bpf_inode_storage_delete_btf_ids,
>>>> +};