2022-12-24 00:23:45

by Quentin Deslandes

Subject: [PATCH bpf-next v3 00/16] bpfilter

The patchset is based on the patches from David S. Miller [1],
Daniel Borkmann [2], and Dmitrii Banshchikov [3].

The main goal of the patchset is to prepare bpfilter for
iptables' configuration blob parsing and code generation.

The patchset introduces data structures and code for matches,
targets, rules and tables. Besides that, code generation
is introduced.

The first version of the code generation supports only "inline"
mode: all chains and their rules emit their instructions linearly,
one after another.
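
A minimal sketch of what inline emission could look like, assuming a
made-up instruction format. struct demo_insn, struct demo_rule,
demo_emit_rule() and demo_emit_chain() are hypothetical names for this
illustration and are not the codegen API of the series. Each rule appends
its matches and its verdict to one flat program, then back-patches the
forward jump taken when a match fails:

```c
#include <assert.h>
#include <stddef.h>

enum demo_op { DEMO_MATCH, DEMO_VERDICT };

/* Hypothetical instruction. Jump offsets are relative to the next
 * instruction, as in BPF: target = pc + 1 + off. */
struct demo_insn {
	enum demo_op op;
	int off; /* DEMO_MATCH: forward jump taken on mismatch */
	int arg; /* DEMO_MATCH: match id; DEMO_VERDICT: verdict */
};

struct demo_rule {
	int n_matches;
	int verdict;
};

/* Append @rule to @prog at index @len. Returns the new length, or -1 if
 * the rule does not fit in @cap instructions. */
static int demo_emit_rule(struct demo_insn *prog, int len, int cap,
			  const struct demo_rule *rule)
{
	int first = len;

	if (rule->n_matches < 0 || cap - len < rule->n_matches + 1)
		return -1;

	for (int i = 0; i < rule->n_matches; i++)
		prog[len++] = (struct demo_insn){ .op = DEMO_MATCH, .arg = i };
	prog[len++] = (struct demo_insn){ .op = DEMO_VERDICT,
					  .arg = rule->verdict };

	/* Back-patch: a failed match skips to the first instruction
	 * after this rule, i.e. the start of the next rule. */
	for (int i = first; i < len - 1; i++)
		prog[i].off = len - i - 1;

	return len;
}

/* Emit every rule of a chain, one after another, into a single program. */
static int demo_emit_chain(struct demo_insn *prog, int cap,
			   const struct demo_rule *rules, int n_rules)
{
	int len = 0;

	assert(prog && rules);

	for (int i = 0; i < n_rules && len >= 0; i++)
		len = demo_emit_rule(prog, len, cap, &rules[i]);

	return len;
}
```

Because everything lands in one linear program, a mismatch only ever needs
a forward jump to the next rule, and no subprogram call is involved.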

Things that are not implemented yet:
1) The process of switching from the previous BPF programs to the
new set isn't atomic.
2) No support for device ifindex - it's hardcoded.
3) No helper subprog for counters update.

Another problem is using iptables' blobs for tests and filter
table initialization. While it saves lines, something more
maintainable should be done here.

The plan for the next iteration:
1) Add a helper program for counters update
2) Handle ifindex

Patches 1/2 add definitions of the used types.
Patch 3 adds logging to bpfilter.
Patch 4 adds an associative map.
Patch 5 adds the runtime context structure.
Patches 6/7 add the code generation infrastructure and the TC code generator.
Patches 8/9/10/11/12 add code for matches, targets, rules and tables.
Patch 13 adds code generation for tables.
Patch 14 handles hooked setsockopt(2) calls.
Patch 15 adds the filter table.
Patch 16 uses the prepared code in main().

Due to poor hardware availability on my side, I've not been able to
benchmark those changes. I plan to get some numbers for the next iteration.

The FORWARD filter chain is now supported; however, it is attached to
TC INGRESS along with the INPUT filter chain, because XDP does not support
attaching multiple programs. I could generate a single program
out of both the INPUT and FORWARD chains, but that would still prevent
another BPF program from being attached to the interface. Whether both
programs can be attached to XDP while still allowing other programs to be
attached requires more investigation. In the meantime,
INPUT and FORWARD filtering is supported using TC.

Most of the code in this series was written by Dmitrii Banshchikov;
my changes are limited to v3. I've tried to reflect this fact in the
commits by adding 'Co-developed-by:' and 'Signed-off-by:' for Dmitrii.
Please tell me if this was done the wrong way.

v2 -> v3
Chains:
* Add support for FORWARD filter chain.
* Add generation of BPF bytecode to assess whether a packet should be
forwarded or not, using bpf_fib_lookup().
* Allow for multiple programs to be attached to TC.
* Allow for multiple TC hooks to be used.
Code generation:
* Remove duplicated BPF bytecode generation.
* Fix a bug regarding jump offset during generation.
* Remove support for XDP from the series, as it's not currently
used.
Table:
* Add new filter_table_update_counters() virtual call. It updates
the table's counter stored in the ipt_entry structure. This way,
when iptables tries to fetch the values of the counters, bpfilter only
has to copy the ipt_entry cached in the table structure.
Logging:
* Refactor logging primitives.
Sockopts:
* Add support for userspace counters querying.
Rule:
* Store the rule's index inside struct rule, to ease counters'
map usage.
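
The counters items above can be pictured with the following sketch. It is
an illustration under assumptions, not the code from the series: struct
demo_counters, struct demo_rule and demo_update_counters() are
hypothetical names, and a plain array stands in for the BPF counters map.
Each rule carries the index of its counters, so refreshing the cached
entries is a single pass:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct bpfilter_ipt_counters. */
struct demo_counters {
	uint64_t packet_cnt;
	uint64_t byte_cnt;
};

/* Hypothetical, reduced rule: only what the counters update needs. */
struct demo_rule {
	uint32_t index;               /* slot of this rule in the counters map */
	struct demo_counters *cached; /* counters inside the cached ipt_entry */
};

/* Copy the live counters of every rule into its cached entry.
 * @map stands in for the BPF map; real code would look values up in it.
 * Returns 0 on success, -1 if a rule index is out of range. */
static int demo_update_counters(struct demo_rule *rules, size_t n_rules,
				const struct demo_counters *map,
				size_t map_len)
{
	assert(rules && map);

	for (size_t i = 0; i < n_rules; i++) {
		if (rules[i].index >= map_len)
			return -1;
		*rules[i].cached = map[rules[i].index];
	}

	return 0;
}
```

Once the cached entries are refreshed this way, answering a counters query
from iptables is a plain copy of the cached ipt_entry data.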

v1 -> v2
Maps:
* Use map_upsert instead of separate map_insert and map_update
Matches:
* Add a new virtual call - gen_inline. The call is used for
inline generating of a rule's match.
Targets:
* Add a new virtual call - gen_inline. The call is used for inline
generating of a rule's target.
Rules:
* Add code generation for rules
Table:
* Add struct table_ops
* Add map for table_ops
* Add filter table
* Reorganize the way filter table is initialized
Sockopts:
* Install/uninstall BPF programs while handling
IPT_SO_SET_REPLACE
Code generation:
* Add first version of the code generation
Dependencies:
* Add libbpf

v0 -> v1
IO:
* Use ssize_t in pvm_read, pvm_write for total_bytes
* Move IO functions into sockopt.c and main.c
Logging:
* Use LOGLEVEL_EMERG, LOGLEVEL_NOTICE, LOGLEVEL_DEBUG
while logging to /dev/kmsg
* Prepend log message with <n> where n is log level
* Conditionally enable BFLOG_DEBUG messages
* Merge bflog.{h,c} into context.h
Matches:
* Reorder fields in struct match_ops for tight packing
* Get rid of struct match_ops_map
* Rename udp_match_ops to xt_udp
* Use XT_ALIGN macro
* Store payload size in match size
* Move udp match routines into a separate file
Targets:
* Reorder fields in struct target_ops for tight packing
* Get rid of struct target_ops_map
* Add comments for convert_verdict function
Rules:
* Add validation
Tables:
* Combine table_map and table_list into table_index
* Add validation
Sockopts:
* Handle IPT_SO_GET_REVISION_TARGET

1. https://lore.kernel.org/patchwork/patch/902785/
2. https://lore.kernel.org/patchwork/patch/902783/
3. https://kernel.ubuntu.com/~cking/stress-ng/stress-ng.pdf

Quentin Deslandes (16):
bpfilter: add types for usermode helper
tools: add bpfilter usermode helper header
bpfilter: add logging facility
bpfilter: add map container
bpfilter: add runtime context
bpfilter: add BPF bytecode generation infrastructure
bpfilter: add support for TC bytecode generation
bpfilter: add match structure
bpfilter: add support for src/dst addr and ports
bpfilter: add target structure
bpfilter: add rule structure
bpfilter: add table structure
bpfilter: add table code generation
bpfilter: add setsockopt() support
bpfilter: add filter table
bpfilter: handle setsockopt() calls

include/uapi/linux/bpfilter.h | 154 +++
net/bpfilter/Makefile | 16 +-
net/bpfilter/codegen.c | 1040 +++++++++++++++++
net/bpfilter/codegen.h | 183 +++
net/bpfilter/context.c | 168 +++
net/bpfilter/context.h | 24 +
net/bpfilter/filter-table.c | 344 ++++++
net/bpfilter/filter-table.h | 18 +
net/bpfilter/logger.c | 52 +
net/bpfilter/logger.h | 80 ++
net/bpfilter/main.c | 132 ++-
net/bpfilter/map-common.c | 51 +
net/bpfilter/map-common.h | 19 +
net/bpfilter/match.c | 55 +
net/bpfilter/match.h | 37 +
net/bpfilter/rule.c | 286 +++++
net/bpfilter/rule.h | 37 +
net/bpfilter/sockopt.c | 533 +++++++++
net/bpfilter/sockopt.h | 15 +
net/bpfilter/table.c | 391 +++++++
net/bpfilter/table.h | 59 +
net/bpfilter/target.c | 203 ++++
net/bpfilter/target.h | 57 +
net/bpfilter/xt_udp.c | 111 ++
tools/include/uapi/linux/bpfilter.h | 175 +++
.../testing/selftests/bpf/bpfilter/.gitignore | 8 +
tools/testing/selftests/bpf/bpfilter/Makefile | 57 +
.../selftests/bpf/bpfilter/bpfilter_util.h | 80 ++
.../selftests/bpf/bpfilter/test_codegen.c | 338 ++++++
.../testing/selftests/bpf/bpfilter/test_map.c | 63 +
.../selftests/bpf/bpfilter/test_match.c | 69 ++
.../selftests/bpf/bpfilter/test_rule.c | 56 +
.../selftests/bpf/bpfilter/test_target.c | 83 ++
.../selftests/bpf/bpfilter/test_xt_udp.c | 48 +
34 files changed, 4999 insertions(+), 43 deletions(-)
create mode 100644 net/bpfilter/codegen.c
create mode 100644 net/bpfilter/codegen.h
create mode 100644 net/bpfilter/context.c
create mode 100644 net/bpfilter/context.h
create mode 100644 net/bpfilter/filter-table.c
create mode 100644 net/bpfilter/filter-table.h
create mode 100644 net/bpfilter/logger.c
create mode 100644 net/bpfilter/logger.h
create mode 100644 net/bpfilter/map-common.c
create mode 100644 net/bpfilter/map-common.h
create mode 100644 net/bpfilter/match.c
create mode 100644 net/bpfilter/match.h
create mode 100644 net/bpfilter/rule.c
create mode 100644 net/bpfilter/rule.h
create mode 100644 net/bpfilter/sockopt.c
create mode 100644 net/bpfilter/sockopt.h
create mode 100644 net/bpfilter/table.c
create mode 100644 net/bpfilter/table.h
create mode 100644 net/bpfilter/target.c
create mode 100644 net/bpfilter/target.h
create mode 100644 net/bpfilter/xt_udp.c
create mode 100644 tools/include/uapi/linux/bpfilter.h
create mode 100644 tools/testing/selftests/bpf/bpfilter/.gitignore
create mode 100644 tools/testing/selftests/bpf/bpfilter/Makefile
create mode 100644 tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
create mode 100644 tools/testing/selftests/bpf/bpfilter/test_codegen.c
create mode 100644 tools/testing/selftests/bpf/bpfilter/test_map.c
create mode 100644 tools/testing/selftests/bpf/bpfilter/test_match.c
create mode 100644 tools/testing/selftests/bpf/bpfilter/test_rule.c
create mode 100644 tools/testing/selftests/bpf/bpfilter/test_target.c
create mode 100644 tools/testing/selftests/bpf/bpfilter/test_xt_udp.c

--
2.38.1


2022-12-24 00:23:46

by Quentin Deslandes

Subject: [PATCH bpf-next v3 01/16] bpfilter: add types for usermode helper

Add the definitions that mirror the existing iptables ABI. They are
needed by the usermode helper.
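
For illustration only, here is how a blob laid out this way can be walked.
The sketch assumes a reduced, hypothetical struct demo_entry that keeps
only the two offsets of struct bpfilter_ipt_entry; demo_count_entries()
and demo_walk() are likewise made-up names. Each entry's next_offset gives
the distance to the following entry, so the walk is a bounds-checked loop:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct bpfilter_ipt_entry, reduced to the
 * two offsets needed to walk a blob. */
struct demo_entry {
	uint16_t target_offset; /* from entry start to its target */
	uint16_t next_offset;   /* from entry start to the next entry */
};

/* Count the entries in a blob of @size bytes. Returns -1 if an entry is
 * malformed or overruns the blob. */
static int demo_count_entries(const unsigned char *blob, size_t size)
{
	size_t off = 0;
	int n = 0;

	assert(blob);

	while (off < size) {
		struct demo_entry e;

		if (size - off < sizeof(e))
			return -1;
		/* memcpy avoids unaligned reads from the raw blob. */
		memcpy(&e, blob + off, sizeof(e));

		if (e.next_offset < sizeof(e) ||
		    e.target_offset < sizeof(e) ||
		    e.target_offset > e.next_offset ||
		    e.next_offset > size - off)
			return -1;

		off += e.next_offset;
		n++;
	}

	return n;
}

/* Build a two-entry blob (8 bytes, then 16 bytes) and walk it. */
static int demo_walk(void)
{
	unsigned char blob[24] = { 0 };
	struct demo_entry e = { .target_offset = 4, .next_offset = 8 };

	memcpy(blob, &e, sizeof(e));
	e.next_offset = 16;
	memcpy(blob + 8, &e, sizeof(e));

	return demo_count_entries(blob, sizeof(blob));
}
```

The offset checks are needed because the blob comes from userspace and
cannot be trusted to be well-formed.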

Co-developed-by: Dmitrii Banshchikov <[email protected]>
Signed-off-by: Dmitrii Banshchikov <[email protected]>
Signed-off-by: Quentin Deslandes <[email protected]>
---
include/uapi/linux/bpfilter.h | 154 ++++++++++++++++++++++++++++++++++
1 file changed, 154 insertions(+)

diff --git a/include/uapi/linux/bpfilter.h b/include/uapi/linux/bpfilter.h
index cbc1f5813f50..295fd9caa3c8 100644
--- a/include/uapi/linux/bpfilter.h
+++ b/include/uapi/linux/bpfilter.h
@@ -3,6 +3,10 @@
#define _UAPI_LINUX_BPFILTER_H

#include <linux/if.h>
+#include <linux/const.h>
+
+#define BPFILTER_STANDARD_TARGET ""
+#define BPFILTER_ERROR_TARGET "ERROR"

enum {
BPFILTER_IPT_SO_SET_REPLACE = 64,
@@ -18,4 +22,154 @@ enum {
BPFILTER_IPT_GET_MAX,
};

+enum {
+ BPFILTER_XT_TABLE_MAXNAMELEN = 32,
+ BPFILTER_FUNCTION_MAXNAMELEN = 30,
+ BPFILTER_EXTENSION_MAXNAMELEN = 29,
+};
+
+enum {
+ BPFILTER_NF_DROP = 0,
+ BPFILTER_NF_ACCEPT = 1,
+ BPFILTER_NF_STOLEN = 2,
+ BPFILTER_NF_QUEUE = 3,
+ BPFILTER_NF_REPEAT = 4,
+ BPFILTER_NF_STOP = 5,
+ BPFILTER_NF_MAX_VERDICT = BPFILTER_NF_STOP,
+ BPFILTER_RETURN = (-BPFILTER_NF_REPEAT - 1),
+};
+
+enum {
+ BPFILTER_INET_HOOK_PRE_ROUTING = 0,
+ BPFILTER_INET_HOOK_LOCAL_IN = 1,
+ BPFILTER_INET_HOOK_FORWARD = 2,
+ BPFILTER_INET_HOOK_LOCAL_OUT = 3,
+ BPFILTER_INET_HOOK_POST_ROUTING = 4,
+ BPFILTER_INET_HOOK_MAX,
+};
+
+enum {
+ BPFILTER_IPT_F_MASK = 0x03,
+ BPFILTER_IPT_INV_MASK = 0x7f
+};
+
+struct bpfilter_ipt_match {
+ union {
+ struct {
+ __u16 match_size;
+ char name[BPFILTER_EXTENSION_MAXNAMELEN];
+ __u8 revision;
+ } user;
+ struct {
+ __u16 match_size;
+ void *match;
+ } kernel;
+ __u16 match_size;
+ } u;
+ unsigned char data[];
+};
+
+struct bpfilter_ipt_target {
+ union {
+ struct {
+ __u16 target_size;
+ char name[BPFILTER_EXTENSION_MAXNAMELEN];
+ __u8 revision;
+ } user;
+ struct {
+ __u16 target_size;
+ void *target;
+ } kernel;
+ __u16 target_size;
+ } u;
+ unsigned char data[];
+};
+
+struct bpfilter_ipt_standard_target {
+ struct bpfilter_ipt_target target;
+ int verdict;
+};
+
+struct bpfilter_ipt_error_target {
+ struct bpfilter_ipt_target target;
+ char error_name[BPFILTER_FUNCTION_MAXNAMELEN];
+};
+
+struct bpfilter_ipt_get_info {
+ char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+ __u32 valid_hooks;
+ __u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+ __u32 underflow[BPFILTER_INET_HOOK_MAX];
+ __u32 num_entries;
+ __u32 size;
+};
+
+struct bpfilter_ipt_counters {
+ __u64 packet_cnt;
+ __u64 byte_cnt;
+};
+
+struct bpfilter_ipt_counters_info {
+ char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+ __u32 num_counters;
+ struct bpfilter_ipt_counters counters[];
+};
+
+struct bpfilter_ipt_get_revision {
+ char name[BPFILTER_EXTENSION_MAXNAMELEN];
+ __u8 revision;
+};
+
+struct bpfilter_ipt_ip {
+ __u32 src;
+ __u32 dst;
+ __u32 src_mask;
+ __u32 dst_mask;
+ char in_iface[IFNAMSIZ];
+ char out_iface[IFNAMSIZ];
+ __u8 in_iface_mask[IFNAMSIZ];
+ __u8 out_iface_mask[IFNAMSIZ];
+ __u16 protocol;
+ __u8 flags;
+ __u8 invflags;
+};
+
+struct bpfilter_ipt_entry {
+ struct bpfilter_ipt_ip ip;
+ __u32 bfcache;
+ __u16 target_offset;
+ __u16 next_offset;
+ __u32 comefrom;
+ struct bpfilter_ipt_counters counters;
+ __u8 elems[];
+};
+
+struct bpfilter_ipt_standard_entry {
+ struct bpfilter_ipt_entry entry;
+ struct bpfilter_ipt_standard_target target;
+};
+
+struct bpfilter_ipt_error_entry {
+ struct bpfilter_ipt_entry entry;
+ struct bpfilter_ipt_error_target target;
+};
+
+struct bpfilter_ipt_get_entries {
+ char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+ __u32 size;
+ struct bpfilter_ipt_entry entries[];
+};
+
+struct bpfilter_ipt_replace {
+ char name[BPFILTER_XT_TABLE_MAXNAMELEN];
+ __u32 valid_hooks;
+ __u32 num_entries;
+ __u32 size;
+ __u32 hook_entry[BPFILTER_INET_HOOK_MAX];
+ __u32 underflow[BPFILTER_INET_HOOK_MAX];
+ __u32 num_counters;
+ struct bpfilter_ipt_counters *cntrs;
+ struct bpfilter_ipt_entry entries[];
+};
+
#endif /* _UAPI_LINUX_BPFILTER_H */
--
2.38.1

2022-12-24 03:29:13

by Quentin Deslandes

Subject: [PATCH bpf-next v3 03/16] bpfilter: add logging facility

bpfilter will log to /dev/kmsg by default. Four different log levels are
available. BFLOG_EMERG() will exit the usermode helper after logging.
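
A short sketch of the line format the macros produce, assuming a
simplified, hypothetical DEMO_LOG_FMT() that formats into a buffer instead
of writing to the log file and never calls exit(). The leading "<n>" is
the syslog priority that /dev/kmsg reads from the start of each written
message:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <syslog.h>

/* Hypothetical stand-in for _BFLOG_IMPL: same "<level>bpfilter: " prefix,
 * but the output goes to a caller-provided buffer. */
#define DEMO_LOG_FMT(buf, len, level, fmt, ...)                        \
	snprintf((buf), (len), "<%d>bpfilter: " fmt "\n", (level),     \
		 ##__VA_ARGS__)

/* Format one notice-level message into @out.
 * Returns 0 on success, -1 if the message was truncated or failed. */
static int demo_log_line(char *out, size_t len)
{
	int r;

	assert(out && len > 0);

	r = DEMO_LOG_FMT(out, len, LOG_KERN | LOG_NOTICE,
			 "loaded %d rules", 3);
	if (r < 0 || (size_t)r >= len)
		return -1;

	return 0;
}
```

With LOG_KERN being 0 and LOG_NOTICE being 5, the formatted line is
"<5>bpfilter: loaded 3 rules" followed by a newline.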

Signed-off-by: Quentin Deslandes <[email protected]>
---
net/bpfilter/Makefile | 2 +-
net/bpfilter/logger.c | 52 ++++++++++++++++++++++++++++
net/bpfilter/logger.h | 80 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 133 insertions(+), 1 deletion(-)
create mode 100644 net/bpfilter/logger.c
create mode 100644 net/bpfilter/logger.h

diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index cdac82b8c53a..8d9c726ba1a5 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -4,7 +4,7 @@
#

userprogs := bpfilter_umh
-bpfilter_umh-objs := main.o
+bpfilter_umh-objs := main.o logger.o
userccflags += -I $(srctree)/tools/include/ -I $(srctree)/tools/include/uapi

ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git a/net/bpfilter/logger.c b/net/bpfilter/logger.c
new file mode 100644
index 000000000000..c256bfef7e6c
--- /dev/null
+++ b/net/bpfilter/logger.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022 Meta Platforms, Inc. and affiliates.
+ */
+
+#include "logger.h"
+
+#include <errno.h>
+
+static const char *log_file_path = "/dev/kmsg";
+static FILE *log_file;
+
+int logger_init(void)
+{
+ if (log_file)
+ return 0;
+
+ log_file = fopen(log_file_path, "w");
+ if (!log_file)
+ return -errno;
+
+ if (setvbuf(log_file, 0, _IOLBF, 0))
+ return -errno;
+
+ return 0;
+}
+
+void logger_set_file(FILE *file)
+{
+ log_file = file;
+}
+
+FILE *logger_get_file(void)
+{
+ return log_file;
+}
+
+int logger_clean(void)
+{
+ int r;
+
+ if (!log_file)
+ return 0;
+
+ r = fclose(log_file);
+ if (r == EOF)
+ return -errno;
+
+ log_file = NULL;
+
+ return 0;
+}
diff --git a/net/bpfilter/logger.h b/net/bpfilter/logger.h
new file mode 100644
index 000000000000..c44739ec0069
--- /dev/null
+++ b/net/bpfilter/logger.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2022 Meta Platforms, Inc. and affiliates.
+ */
+
+#ifndef NET_BPFILTER_LOGGER_H
+#define NET_BPFILTER_LOGGER_H
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <syslog.h>
+
+#define _BFLOG_IMPL(level, fmt, ...) \
+ do { \
+ typeof(level) __level = level; \
+ if (logger_get_file()) { \
+ fprintf(logger_get_file(), "<%d>bpfilter: " fmt "\n", \
+ (__level), ##__VA_ARGS__); \
+ } \
+ if ((__level) == LOG_EMERG) \
+ exit(EXIT_FAILURE); \
+ } while (0)
+
+#define BFLOG_EMERG(fmt, ...) \
+ _BFLOG_IMPL(LOG_KERN | LOG_EMERG, fmt, ##__VA_ARGS__)
+#define BFLOG_ERR(fmt, ...) \
+ _BFLOG_IMPL(LOG_KERN | LOG_ERR, fmt, ##__VA_ARGS__)
+#define BFLOG_NOTICE(fmt, ...) \
+ _BFLOG_IMPL(LOG_KERN | LOG_NOTICE, fmt, ##__VA_ARGS__)
+
+#ifdef DEBUG
+#define BFLOG_DBG(fmt, ...) _BFLOG_IMPL(LOG_KERN | LOG_DEBUG, fmt, ##__VA_ARGS__)
+#else
+#define BFLOG_DBG(fmt, ...)
+#endif
+
+#define STRERR(v) strerror(abs(v))
+
+/**
+ * logger_init() - Initialise logging facility.
+ *
+ * This function is used to open a file to write logs to (see @log_file_path).
+ * It must be called before using any logging macro, otherwise log messages
+ * will be discarded.
+ *
+ * Return: 0 on success, negative errno value on error.
+ */
+int logger_init(void);
+
+/**
+ * logger_set_file() - Set the FILE pointer to use to log messages.
+ * @file: new FILE * to the log file.
+ *
+ * This function won't check whether the FILE pointer is valid, nor whether
+ * a file is already opened, this is the responsibility of the caller. Once
+ * logger_set_file() returns, all new log messages will be printed to the
+ * FILE * provided.
+ */
+void logger_set_file(FILE *file);
+
+/**
+ * logger_get_file() - Returns a FILE * pointer to the log file.
+ *
+ * Return: pointer to the file to log to (as a FILE *), or NULL if the file
+ * is not valid.
+ */
+FILE *logger_get_file(void);
+
+/**
+ * logger_clean() - Close the log file.
+ *
+ * On success, the log file pointer will be NULL. If the function fails,
+ * the log file pointer remains unchanged and the file should be considered open.
+ *
+ * Return: 0 on success, negative errno value on error.
+ */
+int logger_clean(void);
+
+#endif // NET_BPFILTER_LOGGER_H
--
2.38.1