2022-08-19 23:29:07

by Daniel Xu

[permalink] [raw]
Subject: [PATCH bpf-next v3 0/5] Support direct writes to nf_conn:mark

Support direct writes to nf_conn:mark from TC and XDP prog types. This
is useful when applications want to store per-connection metadata. This
is also particularly useful for applications that run both bpf and
iptables/nftables because the latter can trivially access this metadata.

One example use case would be if a bpf prog is responsible for advanced
packet classification and iptables/nftables is later used for routing
due to pre-existing/legacy code.

Past discussion:
- v2: https://lore.kernel.org/bpf/CAP01T74Sgn354dXGiFWFryu4vg+o8b9s9La1d9zEbC4LGvH4qg@mail.gmail.com/T/
- v1: https://lore.kernel.org/bpf/[email protected]/

Changes since v2:
- Remove use of NOT_INIT for btf_struct_access write path
- Disallow nf_conn writing when nf_conntrack module not loaded
- Support writing to nf_conn___init:mark

Changes since v1:
- Add unimplemented stub for when !CONFIG_BPF_SYSCALL

Daniel Xu (5):
bpf: Remove duplicate PTR_TO_BTF_ID RO check
bpf: Add stub for btf_struct_access()
bpf: Use 0 instead of NOT_INIT for btf_struct_access() writes
bpf: Add support for writing to nf_conn:mark
selftests/bpf: Add tests for writing to nf_conn:mark

include/linux/bpf.h | 9 +++
include/net/netfilter/nf_conntrack_bpf.h | 13 ++++
kernel/bpf/verifier.c | 3 -
net/core/filter.c | 50 +++++++++++++++
net/ipv4/bpf_tcp_ca.c | 2 +-
net/netfilter/nf_conntrack_bpf.c | 64 ++++++++++++++++++-
net/netfilter/nf_conntrack_core.c | 1 +
.../testing/selftests/bpf/prog_tests/bpf_nf.c | 2 +
.../testing/selftests/bpf/progs/test_bpf_nf.c | 9 ++-
.../selftests/bpf/progs/test_bpf_nf_fail.c | 14 ++++
10 files changed, 160 insertions(+), 7 deletions(-)

--
2.37.1


2022-08-20 00:06:31

by Daniel Xu

[permalink] [raw]
Subject: [PATCH bpf-next v3 3/5] bpf: Use 0 instead of NOT_INIT for btf_struct_access() writes

Returning a bpf_reg_type only makes sense in the context of a BPF_READ.
For writes, prefer to explicitly return 0 for clarity.

Note that is non-functional change as it just so happened that NOT_INIT
== 0.

Signed-off-by: Daniel Xu <[email protected]>
---
net/ipv4/bpf_tcp_ca.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
index 85a9e500c42d..6da16ae6a962 100644
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -124,7 +124,7 @@ static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log,
return -EACCES;
}

- return NOT_INIT;
+ return 0;
}

BPF_CALL_2(bpf_tcp_send_ack, struct tcp_sock *, tp, u32, rcv_nxt)
--
2.37.1

2022-08-20 00:06:45

by Daniel Xu

[permalink] [raw]
Subject: [PATCH bpf-next v3 1/5] bpf: Remove duplicate PTR_TO_BTF_ID RO check

Since commit 27ae7997a661 ("bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS")
there has existed bpf_verifier_ops:btf_struct_access. When
btf_struct_access is _unset_ for a prog type, the verifier runs the
default implementation, which is to enforce read only:

if (env->ops->btf_struct_access) {
[...]
} else {
if (atype != BPF_READ) {
verbose(env, "only read is supported\n");
return -EACCES;
}

[...]
}

When btf_struct_access is _set_, the expectation is that
btf_struct_access has full control over accesses, including if writes
are allowed.

Rather than carve out an exception for each prog type that may write to
BTF ptrs, delete the redundant check and give full control to
btf_struct_access.

Signed-off-by: Daniel Xu <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>
---
kernel/bpf/verifier.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2c1f8069f7b7..ca2311bf0cfd 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13474,9 +13474,6 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
insn->code = BPF_LDX | BPF_PROBE_MEM |
BPF_SIZE((insn)->code);
env->prog->aux->num_exentries++;
- } else if (resolve_prog_type(env->prog) != BPF_PROG_TYPE_STRUCT_OPS) {
- verbose(env, "Writes through BTF pointers are not allowed\n");
- return -EINVAL;
}
continue;
default:
--
2.37.1

2022-08-20 00:06:58

by Daniel Xu

[permalink] [raw]
Subject: [PATCH bpf-next v3 5/5] selftests/bpf: Add tests for writing to nf_conn:mark

Add a simple extension to the existing selftest to write to
nf_conn:mark. Also add a failure test for writing to unsupported field.

Signed-off-by: Daniel Xu <[email protected]>
---
tools/testing/selftests/bpf/prog_tests/bpf_nf.c | 2 ++
tools/testing/selftests/bpf/progs/test_bpf_nf.c | 9 +++++++--
.../testing/selftests/bpf/progs/test_bpf_nf_fail.c | 14 ++++++++++++++
3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
index 544bf90ac2a7..ab9117ae7545 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
@@ -17,6 +17,7 @@ struct {
{ "set_status_after_insert", "kernel function bpf_ct_set_status args#0 expected pointer to STRUCT nf_conn___init but" },
{ "change_timeout_after_alloc", "kernel function bpf_ct_change_timeout args#0 expected pointer to STRUCT nf_conn but" },
{ "change_status_after_alloc", "kernel function bpf_ct_change_status args#0 expected pointer to STRUCT nf_conn but" },
+ { "write_not_allowlisted_field", "no write support to nf_conn at off" },
};

enum {
@@ -113,6 +114,7 @@ static void test_bpf_nf_ct(int mode)
ASSERT_LE(skel->bss->test_delta_timeout, 10, "Test for max ct timeout update");
/* expected status is IPS_SEEN_REPLY */
ASSERT_EQ(skel->bss->test_status, 2, "Test for ct status update ");
+ ASSERT_EQ(skel->bss->test_insert_lookup_mark, 77, "Test for insert and lookup mark value");
ASSERT_EQ(skel->data->test_exist_lookup, 0, "Test existing connection lookup");
ASSERT_EQ(skel->bss->test_exist_lookup_mark, 43, "Test existing connection lookup ctmark");
end:
diff --git a/tools/testing/selftests/bpf/progs/test_bpf_nf.c b/tools/testing/selftests/bpf/progs/test_bpf_nf.c
index 2722441850cc..b5e7079701e8 100644
--- a/tools/testing/selftests/bpf/progs/test_bpf_nf.c
+++ b/tools/testing/selftests/bpf/progs/test_bpf_nf.c
@@ -23,6 +23,7 @@ int test_insert_entry = -EAFNOSUPPORT;
int test_succ_lookup = -ENOENT;
u32 test_delta_timeout = 0;
u32 test_status = 0;
+u32 test_insert_lookup_mark = 0;
__be32 saddr = 0;
__be16 sport = 0;
__be32 daddr = 0;
@@ -144,6 +145,7 @@ nf_ct_test(struct nf_conn *(*lookup_fn)(void *, struct bpf_sock_tuple *, u32,

bpf_ct_set_timeout(ct, 10000);
bpf_ct_set_status(ct, IPS_CONFIRMED);
+ ct->mark = 77;

ct_ins = bpf_ct_insert_entry(ct);
if (ct_ins) {
@@ -157,6 +159,7 @@ nf_ct_test(struct nf_conn *(*lookup_fn)(void *, struct bpf_sock_tuple *, u32,
test_delta_timeout = ct_lk->timeout - bpf_jiffies64();
test_delta_timeout /= CONFIG_HZ;
test_status = IPS_SEEN_REPLY;
+ test_insert_lookup_mark = ct_lk->mark;
bpf_ct_change_status(ct_lk, IPS_SEEN_REPLY);
bpf_ct_release(ct_lk);
test_succ_lookup = 0;
@@ -175,8 +178,10 @@ nf_ct_test(struct nf_conn *(*lookup_fn)(void *, struct bpf_sock_tuple *, u32,
sizeof(opts_def));
if (ct) {
test_exist_lookup = 0;
- if (ct->mark == 42)
- test_exist_lookup_mark = 43;
+ if (ct->mark == 42) {
+ ct->mark++;
+ test_exist_lookup_mark = ct->mark;
+ }
bpf_ct_release(ct);
} else {
test_exist_lookup = opts_def.error;
diff --git a/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c b/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c
index bf79af15c808..0e4759ab38ff 100644
--- a/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c
+++ b/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c
@@ -69,6 +69,20 @@ int lookup_insert(struct __sk_buff *ctx)
return 0;
}

+SEC("?tc")
+int write_not_allowlisted_field(struct __sk_buff *ctx)
+{
+ struct bpf_ct_opts___local opts = {};
+ struct bpf_sock_tuple tup = {};
+ struct nf_conn *ct;
+
+ ct = bpf_skb_ct_lookup(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+ if (!ct)
+ return 0;
+ ct->status = 0xF00;
+ return 0;
+}
+
SEC("?tc")
int set_timeout_after_insert(struct __sk_buff *ctx)
{
--
2.37.1

2022-08-20 00:08:17

by Daniel Xu

[permalink] [raw]
Subject: [PATCH bpf-next v3 4/5] bpf: Add support for writing to nf_conn:mark

Support direct writes to nf_conn:mark from TC and XDP prog types. This
is useful when applications want to store per-connection metadata. This
is also particularly useful for applications that run both bpf and
iptables/nftables because the latter can trivially access this metadata.

One example use case would be if a bpf prog is responsible for advanced
packet classification and iptables/nftables is later used for routing
due to pre-existing/legacy code.

Signed-off-by: Daniel Xu <[email protected]>
---
include/net/netfilter/nf_conntrack_bpf.h | 13 +++++
net/core/filter.c | 50 ++++++++++++++++++
net/netfilter/nf_conntrack_bpf.c | 64 +++++++++++++++++++++++-
net/netfilter/nf_conntrack_core.c | 1 +
4 files changed, 127 insertions(+), 1 deletion(-)

diff --git a/include/net/netfilter/nf_conntrack_bpf.h b/include/net/netfilter/nf_conntrack_bpf.h
index a473b56842c5..4ef89ee5b5a9 100644
--- a/include/net/netfilter/nf_conntrack_bpf.h
+++ b/include/net/netfilter/nf_conntrack_bpf.h
@@ -3,13 +3,22 @@
#ifndef _NF_CONNTRACK_BPF_H
#define _NF_CONNTRACK_BPF_H

+#include <linux/bpf.h>
#include <linux/btf.h>
#include <linux/kconfig.h>

+extern int (*nf_conntrack_btf_struct_access)(struct bpf_verifier_log *log,
+ const struct btf *btf,
+ const struct btf_type *t, int off,
+ int size, enum bpf_access_type atype,
+ u32 *next_btf_id,
+ enum bpf_type_flag *flag);
+
#if (IS_BUILTIN(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) || \
(IS_MODULE(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES))

extern int register_nf_conntrack_bpf(void);
+extern void cleanup_nf_conntrack_bpf(void);

#else

@@ -18,6 +27,10 @@ static inline int register_nf_conntrack_bpf(void)
return 0;
}

+static inline void cleanup_nf_conntrack_bpf(void)
+{
+}
+
#endif

#endif /* _NF_CONNTRACK_BPF_H */
diff --git a/net/core/filter.c b/net/core/filter.c
index 1acfaffeaf32..e5f48e6030b7 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -18,6 +18,7 @@
*/

#include <linux/atomic.h>
+#include <linux/bpf_verifier.h>
#include <linux/module.h>
#include <linux/types.h>
#include <linux/mm.h>
@@ -55,6 +56,7 @@
#include <net/sock_reuseport.h>
#include <net/busy_poll.h>
#include <net/tcp.h>
+#include <net/netfilter/nf_conntrack_bpf.h>
#include <net/xfrm.h>
#include <net/udp.h>
#include <linux/bpf_trace.h>
@@ -8628,6 +8630,32 @@ static bool tc_cls_act_is_valid_access(int off, int size,
return bpf_skb_is_valid_access(off, size, type, prog, info);
}

+typedef int (*btf_struct_access_t)(struct bpf_verifier_log *log,
+ const struct btf *btf,
+ const struct btf_type *t, int off, int size,
+ enum bpf_access_type atype,
+ u32 *next_btf_id, enum bpf_type_flag *flag);
+
+static int tc_cls_act_btf_struct_access(struct bpf_verifier_log *log,
+ const struct btf *btf,
+ const struct btf_type *t, int off,
+ int size, enum bpf_access_type atype,
+ u32 *next_btf_id,
+ enum bpf_type_flag *flag)
+{
+ btf_struct_access_t sa;
+
+ if (atype == BPF_READ)
+ return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
+ flag);
+
+ sa = READ_ONCE(nf_conntrack_btf_struct_access);
+ if (sa)
+ return sa(log, btf, t, off, size, atype, next_btf_id, flag);
+
+ return -EACCES;
+}
+
static bool __is_valid_xdp_access(int off, int size)
{
if (off < 0 || off >= sizeof(struct xdp_md))
@@ -8687,6 +8715,26 @@ void bpf_warn_invalid_xdp_action(struct net_device *dev, struct bpf_prog *prog,
}
EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action);

+static int xdp_btf_struct_access(struct bpf_verifier_log *log,
+ const struct btf *btf,
+ const struct btf_type *t, int off,
+ int size, enum bpf_access_type atype,
+ u32 *next_btf_id,
+ enum bpf_type_flag *flag)
+{
+ btf_struct_access_t sa;
+
+ if (atype == BPF_READ)
+ return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
+ flag);
+
+ sa = READ_ONCE(nf_conntrack_btf_struct_access);
+ if (sa)
+ return sa(log, btf, t, off, size, atype, next_btf_id, flag);
+
+ return -EACCES;
+}
+
static bool sock_addr_is_valid_access(int off, int size,
enum bpf_access_type type,
const struct bpf_prog *prog,
@@ -10581,6 +10629,7 @@ const struct bpf_verifier_ops tc_cls_act_verifier_ops = {
.convert_ctx_access = tc_cls_act_convert_ctx_access,
.gen_prologue = tc_cls_act_prologue,
.gen_ld_abs = bpf_gen_ld_abs,
+ .btf_struct_access = tc_cls_act_btf_struct_access,
};

const struct bpf_prog_ops tc_cls_act_prog_ops = {
@@ -10592,6 +10641,7 @@ const struct bpf_verifier_ops xdp_verifier_ops = {
.is_valid_access = xdp_is_valid_access,
.convert_ctx_access = xdp_convert_ctx_access,
.gen_prologue = bpf_noop_prologue,
+ .btf_struct_access = xdp_btf_struct_access,
};

const struct bpf_prog_ops xdp_prog_ops = {
diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c
index 1cd87b28c9b0..a346b561981a 100644
--- a/net/netfilter/nf_conntrack_bpf.c
+++ b/net/netfilter/nf_conntrack_bpf.c
@@ -6,6 +6,7 @@
* are exposed through to BPF programs is explicitly unstable.
*/

+#include <linux/bpf_verifier.h>
#include <linux/bpf.h>
#include <linux/btf.h>
#include <linux/types.h>
@@ -15,6 +16,13 @@
#include <net/netfilter/nf_conntrack_bpf.h>
#include <net/netfilter/nf_conntrack_core.h>

+int (*nf_conntrack_btf_struct_access)(struct bpf_verifier_log *log,
+ const struct btf *btf,
+ const struct btf_type *t, int off,
+ int size, enum bpf_access_type atype,
+ u32 *next_btf_id,
+ enum bpf_type_flag *flag);
+
/* bpf_ct_opts - Options for CT lookup helpers
*
* Members:
@@ -184,6 +192,51 @@ static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
return ct;
}

+BTF_ID_LIST(btf_nf_conn_ids)
+BTF_ID(struct, nf_conn)
+BTF_ID(struct, nf_conn___init)
+
+/* Check writes into `struct nf_conn` */
+static int _nf_conntrack_btf_struct_access(struct bpf_verifier_log *log,
+ const struct btf *btf,
+ const struct btf_type *t, int off,
+ int size, enum bpf_access_type atype,
+ u32 *next_btf_id,
+ enum bpf_type_flag *flag)
+{
+ const struct btf_type *ncit = btf_type_by_id(btf, btf_nf_conn_ids[1]);
+ const struct btf_type *nct = btf_type_by_id(btf, btf_nf_conn_ids[0]);
+ size_t end;
+
+ if (t != nct && t != ncit) {
+ bpf_log(log, "only read is supported\n");
+ return -EACCES;
+ }
+
+ /* `struct nf_conn` and `struct nf_conn___init` have the same layout
+ * so we are safe to simply merge offset checks here
+ */
+ switch (off) {
+#if defined(CONFIG_NF_CONNTRACK_MARK)
+ case offsetof(struct nf_conn, mark):
+ end = offsetofend(struct nf_conn, mark);
+ break;
+#endif
+ default:
+ bpf_log(log, "no write support to nf_conn at off %d\n", off);
+ return -EACCES;
+ }
+
+ if (off + size > end) {
+ bpf_log(log,
+ "write access at off %d with size %d beyond the member of nf_conn ended at %zu\n",
+ off, size, end);
+ return -EACCES;
+ }
+
+ return 0;
+}
+
__diag_push();
__diag_ignore_all("-Wmissing-prototypes",
"Global functions as their definitions will be in nf_conntrack BTF");
@@ -449,5 +502,14 @@ int register_nf_conntrack_bpf(void)
int ret;

ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &nf_conntrack_kfunc_set);
- return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &nf_conntrack_kfunc_set);
+ ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &nf_conntrack_kfunc_set);
+ if (!ret)
+ WRITE_ONCE(nf_conntrack_btf_struct_access, _nf_conntrack_btf_struct_access);
+
+ return ret;
+}
+
+void cleanup_nf_conntrack_bpf(void)
+{
+ WRITE_ONCE(nf_conntrack_btf_struct_access, NULL);
}
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index da65c6e8eeeb..0195f60fc43b 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -2512,6 +2512,7 @@ static int kill_all(struct nf_conn *i, void *data)

void nf_conntrack_cleanup_start(void)
{
+ cleanup_nf_conntrack_bpf();
conntrack_gc_work.exiting = true;
}

--
2.37.1

2022-08-20 00:18:16

by Kumar Kartikeya Dwivedi

[permalink] [raw]
Subject: Re: [PATCH bpf-next v3 4/5] bpf: Add support for writing to nf_conn:mark

On Sat, 20 Aug 2022 at 01:23, Daniel Xu <[email protected]> wrote:
>
> Support direct writes to nf_conn:mark from TC and XDP prog types. This
> is useful when applications want to store per-connection metadata. This
> is also particularly useful for applications that run both bpf and
> iptables/nftables because the latter can trivially access this metadata.
>
> One example use case would be if a bpf prog is responsible for advanced
> packet classification and iptables/nftables is later used for routing
> due to pre-existing/legacy code.
>
> Signed-off-by: Daniel Xu <[email protected]>
> ---
> include/net/netfilter/nf_conntrack_bpf.h | 13 +++++
> net/core/filter.c | 50 ++++++++++++++++++
> net/netfilter/nf_conntrack_bpf.c | 64 +++++++++++++++++++++++-
> net/netfilter/nf_conntrack_core.c | 1 +
> 4 files changed, 127 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/netfilter/nf_conntrack_bpf.h b/include/net/netfilter/nf_conntrack_bpf.h
> index a473b56842c5..4ef89ee5b5a9 100644
> --- a/include/net/netfilter/nf_conntrack_bpf.h
> +++ b/include/net/netfilter/nf_conntrack_bpf.h
> @@ -3,13 +3,22 @@
> #ifndef _NF_CONNTRACK_BPF_H
> #define _NF_CONNTRACK_BPF_H
>
> +#include <linux/bpf.h>
> #include <linux/btf.h>
> #include <linux/kconfig.h>
>
> +extern int (*nf_conntrack_btf_struct_access)(struct bpf_verifier_log *log,
> + const struct btf *btf,
> + const struct btf_type *t, int off,
> + int size, enum bpf_access_type atype,
> + u32 *next_btf_id,
> + enum bpf_type_flag *flag);
> +
> #if (IS_BUILTIN(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) || \
> (IS_MODULE(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES))
>
> extern int register_nf_conntrack_bpf(void);
> +extern void cleanup_nf_conntrack_bpf(void);
>
> #else
>
> @@ -18,6 +27,10 @@ static inline int register_nf_conntrack_bpf(void)
> return 0;
> }
>
> +static inline void cleanup_nf_conntrack_bpf(void)
> +{
> +}
> +
> #endif
>
> #endif /* _NF_CONNTRACK_BPF_H */
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 1acfaffeaf32..e5f48e6030b7 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -18,6 +18,7 @@
> */
>
> #include <linux/atomic.h>
> +#include <linux/bpf_verifier.h>
> #include <linux/module.h>
> #include <linux/types.h>
> #include <linux/mm.h>
> @@ -55,6 +56,7 @@
> #include <net/sock_reuseport.h>
> #include <net/busy_poll.h>
> #include <net/tcp.h>
> +#include <net/netfilter/nf_conntrack_bpf.h>
> #include <net/xfrm.h>
> #include <net/udp.h>
> #include <linux/bpf_trace.h>
> @@ -8628,6 +8630,32 @@ static bool tc_cls_act_is_valid_access(int off, int size,
> return bpf_skb_is_valid_access(off, size, type, prog, info);
> }
>
> +typedef int (*btf_struct_access_t)(struct bpf_verifier_log *log,
> + const struct btf *btf,
> + const struct btf_type *t, int off, int size,
> + enum bpf_access_type atype,
> + u32 *next_btf_id, enum bpf_type_flag *flag);
> +
> +static int tc_cls_act_btf_struct_access(struct bpf_verifier_log *log,
> + const struct btf *btf,
> + const struct btf_type *t, int off,
> + int size, enum bpf_access_type atype,
> + u32 *next_btf_id,
> + enum bpf_type_flag *flag)
> +{
> + btf_struct_access_t sa;
> +
> + if (atype == BPF_READ)
> + return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
> + flag);
> +
> + sa = READ_ONCE(nf_conntrack_btf_struct_access);

This looks unsafe. How do you prevent this race?

CPU 0 CPU 1
sa = READ_ONCE(nf_ct_bsa);

delete_module("nf_conntrack", ..);

WRITE_ONCE(nf_ct_bsa, NULL);
// finishes
successfully
if (sa)
return sa(...); // oops

i.e. what keeps the module alive while we execute its callback?

Using a mutex is one way (as I suggested previously), either you
acquire it before unload, or after. If after, you see cb as NULL,
otherwise if unload is triggered concurrently it waits to acquire the
mutex held by us. Unsetting the cb would be the first thing the module
would do.

You can also hold a module reference, but then you must verify it is
nf_conntrack's BTF before using btf_try_get_module.
But _something_ needs to be done to prevent the module from going away
while we execute its code.

> + if (sa)
> + return sa(log, btf, t, off, size, atype, next_btf_id, flag);
> +
> + return -EACCES;
> +}
> +
> [...]

2022-08-20 00:33:17

by Daniel Xu

[permalink] [raw]
Subject: Re: [PATCH bpf-next v3 4/5] bpf: Add support for writing to nf_conn:mark

Hi Kumar,

On Sat, Aug 20, 2022 at 01:46:04AM +0200, Kumar Kartikeya Dwivedi wrote:
> On Sat, 20 Aug 2022 at 01:23, Daniel Xu <[email protected]> wrote:
[...]
> > +static int tc_cls_act_btf_struct_access(struct bpf_verifier_log *log,
> > + const struct btf *btf,
> > + const struct btf_type *t, int off,
> > + int size, enum bpf_access_type atype,
> > + u32 *next_btf_id,
> > + enum bpf_type_flag *flag)
> > +{
> > + btf_struct_access_t sa;
> > +
> > + if (atype == BPF_READ)
> > + return btf_struct_access(log, btf, t, off, size, atype, next_btf_id,
> > + flag);
> > +
> > + sa = READ_ONCE(nf_conntrack_btf_struct_access);
>
> This looks unsafe. How do you prevent this race?
>
> CPU 0 CPU 1
> sa = READ_ONCE(nf_ct_bsa);
>
> delete_module("nf_conntrack", ..);
>
> WRITE_ONCE(nf_ct_bsa, NULL);
> // finishes
> successfully
> if (sa)
> return sa(...); // oops
>
> i.e. what keeps the module alive while we execute its callback?
>
> Using a mutex is one way (as I suggested previously), either you
> acquire it before unload, or after. If after, you see cb as NULL,
> otherwise if unload is triggered concurrently it waits to acquire the
> mutex held by us. Unsetting the cb would be the first thing the module
> would do.
>
> You can also hold a module reference, but then you must verify it is
> nf_conntrack's BTF before using btf_try_get_module.
> But _something_ needs to be done to prevent the module from going away
> while we execute its code.

I think I somehow convinced myself that nf_conntrack_core.o is always
compiled in. Due to some of the garbage collection semantics I saw in
the code.

Lemme take a closer look (for learning I guess). Mutex is probably
safest bet.

[...]

Thanks,
Daniel

2022-08-20 03:53:25

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH bpf-next v3 4/5] bpf: Add support for writing to nf_conn:mark

Hi Daniel,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url: https://github.com/intel-lab-lkp/linux/commits/Daniel-Xu/bpf-Remove-duplicate-PTR_TO_BTF_ID-RO-check/20220820-082411
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: um-i386_defconfig
compiler: gcc-11 (Debian 11.3.0-5) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/20062077235a94dd0b856204b6dbddefe8342f01
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Daniel-Xu/bpf-Remove-duplicate-PTR_TO_BTF_ID-RO-check/20220820-082411
git checkout 20062077235a94dd0b856204b6dbddefe8342f01
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=um SUBARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

/usr/bin/ld: warning: arch/x86/um/checksum_32.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms1 has a LOAD segment with RWX permissions
/usr/bin/ld: net/core/filter.o: in function `tc_cls_act_btf_struct_access':
>> filter.c:(.text+0x920): undefined reference to `nf_conntrack_btf_struct_access'
collect2: error: ld returned 1 exit status

--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (1.62 kB)
config (42.08 kB)
Download all attachments

2022-08-20 04:56:49

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH bpf-next v3 4/5] bpf: Add support for writing to nf_conn:mark

Hi Daniel,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url: https://github.com/intel-lab-lkp/linux/commits/Daniel-Xu/bpf-Remove-duplicate-PTR_TO_BTF_ID-RO-check/20220820-082411
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
config: nios2-randconfig-r005-20220820 (https://download.01.org/0day-ci/archive/20220820/[email protected]/config)
compiler: nios2-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/20062077235a94dd0b856204b6dbddefe8342f01
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Daniel-Xu/bpf-Remove-duplicate-PTR_TO_BTF_ID-RO-check/20220820-082411
git checkout 20062077235a94dd0b856204b6dbddefe8342f01
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=nios2 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

nios2-linux-ld: net/core/filter.o: in function `tc_cls_act_btf_struct_access':
filter.c:(.text+0xd54): undefined reference to `nf_conntrack_btf_struct_access'
>> nios2-linux-ld: filter.c:(.text+0xd58): undefined reference to `nf_conntrack_btf_struct_access'

--
0-DAY CI Kernel Test Service
https://01.org/lkp