From: Quentin Deslandes
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Mykola Lysenko, Shuah Khan, Dmitrii Banshchikov
Subject: [PATCH bpf-next v3 00/16] bpfilter
Date: Sat, 24 Dec 2022 00:40:08 +0100
Message-ID: <20221223234127.474463-1-qde@naccy.de>
X-Mailing-List: linux-kernel@vger.kernel.org

The patchset is based on the patches from David S. Miller [1],
Daniel Borkmann [2], and Dmitrii Banshchikov [3].

The main goal of the patchset is to prepare bpfilter for iptables'
configuration blob parsing and code generation.

The patchset introduces data structures and code for matches, targets,
rules and tables. Besides that, code generation is introduced.

The first version of the code generation supports only "inline" mode:
all chains and their rules emit their instructions linearly, one after
another.

Things that are not implemented yet:
  1) The process of switching from the previous BPF programs to the new
     set isn't atomic.
  2) No support of device ifindex - it's hardcoded
  3) No helper subprog for counters update

Another problem is using iptables' blobs for tests and filter table
initialization. While it saves lines, something more maintainable should
be done here.

The plan for the next iteration:
  1) Add a helper program for counters update
  2) Handle ifindex

Patches 1/2 add definitions of the used types.
Patch 3 adds logging to bpfilter.
Patch 4 adds an associative map.
Patch 5 adds runtime context structure.
Patches 6/7 add code generation infrastructure and TC code generator.
Patches 8/9/10/11/12 add code for matches, targets, rules and table.
Patch 13 adds code generation for table.
Patch 14 handles hooked setsockopt(2) calls.
Patch 15 adds filter table.
Patch 16 uses prepared code in main().

Due to poor hardware availability on my side, I've not been able to
benchmark those changes. I plan to get some numbers for the next iteration.

FORWARD filter chain is now supported, however, it's attached to
TC INGRESS along with INPUT filter chain. This is because XDP does not
support attaching multiple programs. I could generate a single program
out of both INPUT and FORWARD chains, but that would prevent another
BPF program from being attached to the interface anyway. If a solution
exists to attach both those programs to XDP while allowing for other
programs to be attached, it requires more investigation. In the meantime,
INPUT and FORWARD filtering is supported using TC.

Most of the code in this series was written by Dmitrii Banshchikov;
my changes are limited to v3. I've tried to reflect this fact in the
commits by adding 'Co-developed-by:' and 'Signed-off-by:' for Dmitrii,
please tell me if this was done the wrong way.

v2 -> v3
Chains:
  * Add support for FORWARD filter chain.
  * Add generation of BPF bytecode to assess whether a packet should be
    forwarded or not, using bpf_fib_lookup().
  * Allow for multiple programs to be attached to TC.
  * Allow for multiple TC hooks to be used.
Code generation:
  * Remove duplicated BPF bytecode generation.
  * Fix a bug regarding jump offset during generation.
  * Remove support for XDP from the series, as it's not currently
    used.
Table:
  * Add new filter_table_update_counters() virtual call. It updates
    the table's counter stored in the ipt_entry structure. This way,
    when iptables tries to fetch the values of the counters, bpfilter only
    has to copy the ipt_entry cached in the table structure.
Logging:
  * Refactor logging primitives.
Sockopts:
  * Add support for userspace counters querying.
Rule:
  * Store the rule's index inside struct rule, to ease counters' map
    usage.

v1 -> v2
Maps:
  * Use map_upsert instead of separate map_insert and map_update
Matches:
  * Add a new virtual call - gen_inline. The call is used for
    inline generating of a rule's match.
Targets:
  * Add a new virtual call - gen_inline. The call is used for inline
    generating of a rule's target.
Rules:
  * Add code generation for rules
Table:
  * Add struct table_ops
  * Add map for table_ops
  * Add filter table
  * Reorganize the way filter table is initialized
Sockopts:
  * Install/uninstall BPF programs while handling
    IPT_SO_SET_REPLACE
Code generation:
  * Add first version of the code generation
Dependencies:
  * Add libbpf

v0 -> v1
IO:
  * Use ssize_t in pvm_read, pvm_write for total_bytes
  * Move IO functions into sockopt.c and main.c
Logging:
  * Use LOGLEVEL_EMERG, LOGLEVEL_NOTICE, LOGLEVEL_DEBUG
    while logging to /dev/kmsg
  * Prepend log message with <n> where n is log level
  * Conditionally enable BFLOG_DEBUG messages
  * Merge bflog.{h,c} into context.h
Matches:
  * Reorder fields in struct match_ops for tight packing
  * Get rid of struct match_ops_map
  * Rename udp_match_ops to xt_udp
  * Use XT_ALIGN macro
  * Store payload size in match size
  * Move udp match routines into a separate file
Targets:
  * Reorder fields in struct target_ops for tight packing
  * Get rid of struct target_ops_map
  * Add comments for convert_verdict function
Rules:
  * Add validation
Tables:
  * Combine table_map and table_list into table_index
  * Add validation
Sockopts:
  * Handle IPT_SO_GET_REVISION_TARGET

1. https://lore.kernel.org/patchwork/patch/902785/
2. https://lore.kernel.org/patchwork/patch/902783/
3. https://kernel.ubuntu.com/~cking/stress-ng/stress-ng.pdf

Quentin Deslandes (16):
  bpfilter: add types for usermode helper
  tools: add bpfilter usermode helper header
  bpfilter: add logging facility
  bpfilter: add map container
  bpfilter: add runtime context
  bpfilter: add BPF bytecode generation infrastructure
  bpfilter: add support for TC bytecode generation
  bpfilter: add match structure
  bpfilter: add support for src/dst addr and ports
  bpfilter: add target structure
  bpfilter: add rule structure
  bpfilter: add table structure
  bpfilter: add table code generation
  bpfilter: add setsockopt() support
  bpfilter: add filter table
  bpfilter: handle setsockopt() calls

 include/uapi/linux/bpfilter.h                 |  154 +++
 net/bpfilter/Makefile                         |   16 +-
 net/bpfilter/codegen.c                        | 1040 +++++++++++++++++
 net/bpfilter/codegen.h                        |  183 +++
 net/bpfilter/context.c                        |  168 +++
 net/bpfilter/context.h                        |   24 +
 net/bpfilter/filter-table.c                   |  344 ++++++
 net/bpfilter/filter-table.h                   |   18 +
 net/bpfilter/logger.c                         |   52 +
 net/bpfilter/logger.h                         |   80 ++
 net/bpfilter/main.c                           |  132 ++-
 net/bpfilter/map-common.c                     |   51 +
 net/bpfilter/map-common.h                     |   19 +
 net/bpfilter/match.c                          |   55 +
 net/bpfilter/match.h                          |   37 +
 net/bpfilter/rule.c                           |  286 +++++
 net/bpfilter/rule.h                           |   37 +
 net/bpfilter/sockopt.c                        |  533 +++++++++
 net/bpfilter/sockopt.h                        |   15 +
 net/bpfilter/table.c                          |  391 +++++++
 net/bpfilter/table.h                          |   59 +
 net/bpfilter/target.c                         |  203 ++++
 net/bpfilter/target.h                         |   57 +
 net/bpfilter/xt_udp.c                         |  111 ++
 tools/include/uapi/linux/bpfilter.h           |  175 +++
 .../testing/selftests/bpf/bpfilter/.gitignore |    8 +
 tools/testing/selftests/bpf/bpfilter/Makefile |   57 +
 .../selftests/bpf/bpfilter/bpfilter_util.h    |   80 ++
 .../selftests/bpf/bpfilter/test_codegen.c     |  338 ++++++
 .../testing/selftests/bpf/bpfilter/test_map.c |   63 +
 .../selftests/bpf/bpfilter/test_match.c       |   69 ++
 .../selftests/bpf/bpfilter/test_rule.c        |   56 +
 .../selftests/bpf/bpfilter/test_target.c      |   83 ++
 .../selftests/bpf/bpfilter/test_xt_udp.c      |   48 +
 34 files changed, 4999 insertions(+), 43 deletions(-)
 create mode 100644 net/bpfilter/codegen.c
 create mode 100644 net/bpfilter/codegen.h
 create mode 100644 net/bpfilter/context.c
 create mode 100644 net/bpfilter/context.h
 create mode 100644 net/bpfilter/filter-table.c
 create mode 100644 net/bpfilter/filter-table.h
 create mode 100644 net/bpfilter/logger.c
 create mode 100644 net/bpfilter/logger.h
 create mode 100644 net/bpfilter/map-common.c
 create mode 100644 net/bpfilter/map-common.h
 create mode 100644 net/bpfilter/match.c
 create mode 100644 net/bpfilter/match.h
 create mode 100644 net/bpfilter/rule.c
 create mode 100644 net/bpfilter/rule.h
 create mode 100644 net/bpfilter/sockopt.c
 create mode 100644 net/bpfilter/sockopt.h
 create mode 100644 net/bpfilter/table.c
 create mode 100644 net/bpfilter/table.h
 create mode 100644 net/bpfilter/target.c
 create mode 100644 net/bpfilter/target.h
 create mode 100644 net/bpfilter/xt_udp.c
 create mode 100644 tools/include/uapi/linux/bpfilter.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/.gitignore
 create mode 100644 tools/testing/selftests/bpf/bpfilter/Makefile
 create mode 100644 tools/testing/selftests/bpf/bpfilter/bpfilter_util.h
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_codegen.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_map.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_match.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_rule.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_target.c
 create mode 100644 tools/testing/selftests/bpf/bpfilter/test_xt_udp.c

-- 
2.38.1
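
Illustration (not part of the series): the "inline" code generation mode
and the jump-offset fix mentioned in the changelog may be easier to picture
with a toy model. The sketch below is only an assumption about the general
shape of the technique. It does not use the real BPF instruction encoding or
any bpfilter function; every name in it (toy_insn, toy_rule, toy_gen_inline,
toy_run) is hypothetical. Each rule emits a match followed by a verdict into
one flat program, and the match's forward jump is patched once the rule's
length is known.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set, NOT the real BPF encoding. */
enum toy_op { TOY_JNE_PORT, TOY_RET };
enum { TOY_DROP = 0, TOY_ACCEPT = 1 };

struct toy_insn {
	enum toy_op op;
	uint16_t imm;	/* port for TOY_JNE_PORT, verdict for TOY_RET */
	int16_t off;	/* forward jump, relative to the next instruction */
};

/* Hypothetical rule: match on destination port, then return a verdict. */
struct toy_rule {
	uint16_t dport;
	uint16_t verdict;
};

/*
 * "Inline" generation: every rule appends its instructions to the same
 * flat buffer, in order. Returns the number of instructions emitted.
 */
static size_t toy_gen_inline(const struct toy_rule *rules, size_t n_rules,
			     uint16_t policy, struct toy_insn *out, size_t cap)
{
	size_t len = 0;

	for (size_t i = 0; i < n_rules; i++) {
		assert(len + 2 <= cap);

		/* Match: jump over the rest of the rule on mismatch. */
		size_t jmp = len++;
		out[jmp] = (struct toy_insn){ TOY_JNE_PORT, rules[i].dport, 0 };

		/* Target: return the rule's verdict. */
		out[len++] = (struct toy_insn){ TOY_RET, rules[i].verdict, 0 };

		/* Fix up the jump now that the rule's end is known. */
		out[jmp].off = (int16_t)(len - jmp - 1);
	}

	/* No rule matched: fall through to the chain policy. */
	assert(len + 1 <= cap);
	out[len++] = (struct toy_insn){ TOY_RET, policy, 0 };

	return len;
}

/* Tiny interpreter, only there to check the generated program. */
static uint16_t toy_run(const struct toy_insn *prog, size_t len, uint16_t dport)
{
	size_t pc = 0;

	while (pc < len) {
		const struct toy_insn *insn = &prog[pc];

		if (insn->op == TOY_RET)
			return insn->imm;

		/* TOY_JNE_PORT: skip the rule body if the port differs. */
		pc += 1 + (insn->imm != dport ? (size_t)insn->off : 0);
	}

	return TOY_DROP;
}
```

With two rules (drop port 22, accept port 80) and an ACCEPT policy, this
emits five instructions, and each match jumps over exactly one instruction.
Computing the offset relative to the instruction following the jump is the
kind of detail where an off-by-one can slip in.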