2024-05-09 15:49:53

by Matthieu Baerts (NGI0)

[permalink] [raw]
Subject: [PATCH bpf-next v2 0/2] selftests/bpf: new MPTCP subflow subtest

In this series from Geliang, modifying MPTCP BPF selftests, we have:

- A new MPTCP subflow BPF program setting socket options per subflow: it
looks better to have this old test program in the BPF selftests to
track regressions and to serve as example.

Note: Nicolas is no longer working for Tessares, but he did this work
while working for them, and his email address is no longer available.

- A new MPTCP BPF subtest validating this new BPF program.

Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
---
Changes in v2:
- Previous patches 1/4 and 2/4 have been dropped from this series:
- 1/4: "selftests/bpf: Handle SIGINT when creating netns":
- A new version, more generic and no longer specific to MPTCP BPF
selftest will be sent later, as part of a new series. (Alexei)
- 2/4: "selftests/bpf: Add RUN_MPTCP_TEST macro":
- Removed, not to hide helper functions in macros. (Alexei)
- The commit message of patch 1/2 has been clarified to avoid some
possible confusions spot by Alexei.
- Link to v1: https://lore.kernel.org/r/20240507-upstream-bpf-next-20240506-mptcp-subflow-test-v1-0-e2bcbdf49857@kernel.org

---
Geliang Tang (1):
selftests/bpf: Add mptcp subflow subtest

Nicolas Rybowski (1):
selftests/bpf: Add mptcp subflow example

tools/testing/selftests/bpf/prog_tests/mptcp.c | 109 ++++++++++++++++++++++
tools/testing/selftests/bpf/progs/mptcp_subflow.c | 70 ++++++++++++++
2 files changed, 179 insertions(+)
---
base-commit: 009367099eb61a4fc2af44d4eb06b6b4de7de6db
change-id: 20240506-upstream-bpf-next-20240506-mptcp-subflow-test-faef6654bfa3

Best regards,
--
Matthieu Baerts (NGI0) <[email protected]>



2024-05-09 15:50:14

by Matthieu Baerts (NGI0)

[permalink] [raw]
Subject: [PATCH bpf-next v2 1/2] selftests/bpf: Add mptcp subflow example

From: Nicolas Rybowski <[email protected]>

Move Nicolas' patch into bpf selftests directory. This example adds a
different mark (SO_MARK) on each subflow, and changes the TCP CC only on
the first subflow.

From the userspace, an application can do a setsockopt() on an MPTCP
socket, and typically the same value will be propagated to all subflows
(paths). If someone wants to have different values per subflow, the
recommended way is to use BPF. So it is good to add such example here,
and make sure there is no regressions.

This example shows how it is possible to:

Identify the parent msk of an MPTCP subflow.
Put different sockopt for each subflow of a same MPTCP connection.

Here especially, two different behaviours are implemented:

A socket mark (SOL_SOCKET SO_MARK) is put on each subflow of a same
MPTCP connection. The order of creation of the current subflow defines
its mark. The TCP CC algorithm of the very first subflow of an MPTCP
connection is set to "reno".

This is just to show it is possible to identify an MPTCP connection, and
set socket options, from different SOL levels, per subflow. "reno" has
been picked because it is built-in and usually not set as default one.
It is easy to verify with 'ss' that these modifications have been
applied correctly. That's what the next patch is going to do.

Nicolas' code comes from:

commit 4d120186e4d6 ("bpf:examples: update mptcp_set_mark_kern.c")

from the MPTCP repo https://github.com/multipath-tcp/mptcp_net-next (the
"scripts" branch), and it has been adapted by Geliang.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/76
Co-developed-by: Geliang Tang <[email protected]>
Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Nicolas Rybowski <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
---
Notes:
- v1 -> v2:
- The commit message has been updated: why setting multiple socket
options, why reno, the verification is done in the next patch
(different author). (Alexei)
---
tools/testing/selftests/bpf/progs/mptcp_subflow.c | 70 +++++++++++++++++++++++
1 file changed, 70 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/mptcp_subflow.c b/tools/testing/selftests/bpf/progs/mptcp_subflow.c
new file mode 100644
index 000000000000..de9dbba37133
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mptcp_subflow.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020, Tessares SA. */
+/* Copyright (c) 2024, Kylin Software */
+
+#include <sys/socket.h> // SOL_SOCKET, SO_MARK, ...
+#include <linux/tcp.h> // TCP_CONGESTION
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_tcp_helpers.h"
+
+char _license[] SEC("license") = "GPL";
+
+#ifndef SOL_TCP
+#define SOL_TCP 6
+#endif
+
+#ifndef TCP_CA_NAME_MAX
+#define TCP_CA_NAME_MAX 16
+#endif
+
+char cc[TCP_CA_NAME_MAX] = "reno";
+
+/* Associate a subflow counter to each token */
+struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __uint(key_size, sizeof(__u32));
+ __uint(value_size, sizeof(__u32));
+ __uint(max_entries, 100);
+} mptcp_sf SEC(".maps");
+
+SEC("sockops")
+int mptcp_subflow(struct bpf_sock_ops *skops)
+{
+ __u32 init = 1, key, mark, *cnt;
+ struct mptcp_sock *msk;
+ struct bpf_sock *sk;
+ int err;
+
+ if (skops->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
+ return 1;
+
+ sk = skops->sk;
+ if (!sk)
+ return 1;
+
+ msk = bpf_skc_to_mptcp_sock(sk);
+ if (!msk)
+ return 1;
+
+ key = msk->token;
+ cnt = bpf_map_lookup_elem(&mptcp_sf, &key);
+ if (cnt) {
+ /* A new subflow is added to an existing MPTCP connection */
+ __sync_fetch_and_add(cnt, 1);
+ mark = *cnt;
+ } else {
+ /* A new MPTCP connection is just initiated and this is its primary subflow */
+ bpf_map_update_elem(&mptcp_sf, &key, &init, BPF_ANY);
+ mark = init;
+ }
+
+ /* Set the mark of the subflow's socket based on appearance order */
+ err = bpf_setsockopt(skops, SOL_SOCKET, SO_MARK, &mark, sizeof(mark));
+ if (err < 0)
+ return 1;
+ if (mark == 1)
+ err = bpf_setsockopt(skops, SOL_TCP, TCP_CONGESTION, cc, TCP_CA_NAME_MAX);
+
+ return 1;
+}

--
2.43.0


2024-05-09 15:50:34

by Matthieu Baerts (NGI0)

[permalink] [raw]
Subject: [PATCH bpf-next v2 2/2] selftests/bpf: Add mptcp subflow subtest

From: Geliang Tang <[email protected]>

This patch adds a subtest named test_subflow to load and verify the newly
added mptcp subflow example in test_mptcp. Add a helper endpoint_init()
to add a new subflow endpoint. Add another helper ss_search() to verify the
fwmark and congestion values set by mptcp_subflow prog using setsockopts.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/76
Signed-off-by: Geliang Tang <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
---
tools/testing/selftests/bpf/prog_tests/mptcp.c | 109 +++++++++++++++++++++++++
1 file changed, 109 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
index 274d2e033e39..6039b0ff3801 100644
--- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
+++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
@@ -9,8 +9,12 @@
#include "network_helpers.h"
#include "mptcp_sock.skel.h"
#include "mptcpify.skel.h"
+#include "mptcp_subflow.skel.h"

#define NS_TEST "mptcp_ns"
+#define ADDR_1 "10.0.1.1"
+#define ADDR_2 "10.0.1.2"
+#define PORT_1 10001

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
@@ -340,10 +344,115 @@ static void test_mptcpify(void)
close(cgroup_fd);
}

+static int endpoint_init(char *flags)
+{
+ SYS(fail, "ip -net %s link add veth1 type veth peer name veth2", NS_TEST);
+ SYS(fail, "ip -net %s addr add %s/24 dev veth1", NS_TEST, ADDR_1);
+ SYS(fail, "ip -net %s link set dev veth1 up", NS_TEST);
+ SYS(fail, "ip -net %s addr add %s/24 dev veth2", NS_TEST, ADDR_2);
+ SYS(fail, "ip -net %s link set dev veth2 up", NS_TEST);
+ SYS(fail, "ip -net %s mptcp endpoint add %s %s", NS_TEST, ADDR_2, flags);
+
+ return 0;
+fail:
+ return -1;
+}
+
+static int _ss_search(char *src, char *dst, char *port, char *keyword)
+{
+ char cmd[128];
+ int n;
+
+ n = snprintf(cmd, sizeof(cmd),
+ "ip netns exec %s ss -Menita src %s dst %s %s %d | grep -q '%s'",
+ NS_TEST, src, dst, port, PORT_1, keyword);
+ if (n < 0 || n >= sizeof(cmd))
+ return -1;
+
+ return system(cmd);
+}
+
+static int ss_search(char *src, char *keyword)
+{
+ return _ss_search(src, ADDR_1, "dport", keyword);
+}
+
+static void run_subflow(char *new)
+{
+ int server_fd, client_fd, err;
+ char cc[TCP_CA_NAME_MAX];
+ socklen_t len = sizeof(cc);
+
+ server_fd = start_mptcp_server(AF_INET, ADDR_1, PORT_1, 0);
+ if (!ASSERT_GE(server_fd, 0, "start_mptcp_server"))
+ return;
+
+ client_fd = connect_to_fd(server_fd, 0);
+ if (!ASSERT_GE(client_fd, 0, "connect to fd"))
+ goto fail;
+
+ err = getsockopt(server_fd, SOL_TCP, TCP_CONGESTION, cc, &len);
+ if (!ASSERT_OK(err, "getsockopt(srv_fd, TCP_CONGESTION)"))
+ goto fail;
+
+ send_byte(client_fd);
+
+ ASSERT_OK(ss_search(ADDR_1, "fwmark:0x1"), "ss_search fwmark:0x1");
+ ASSERT_OK(ss_search(ADDR_2, "fwmark:0x2"), "ss_search fwmark:0x2");
+ ASSERT_OK(ss_search(ADDR_1, new), "ss_search new cc");
+ ASSERT_OK(ss_search(ADDR_2, cc), "ss_search default cc");
+
+ close(client_fd);
+fail:
+ close(server_fd);
+}
+
+static void test_subflow(void)
+{
+ int cgroup_fd, prog_fd, err;
+ struct mptcp_subflow *skel;
+ struct nstoken *nstoken;
+
+ cgroup_fd = test__join_cgroup("/mptcp_subflow");
+ if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup: mptcp_subflow"))
+ return;
+
+ skel = mptcp_subflow__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "skel_open_load: mptcp_subflow"))
+ goto close_cgroup;
+
+ err = mptcp_subflow__attach(skel);
+ if (!ASSERT_OK(err, "skel_attach: mptcp_subflow"))
+ goto skel_destroy;
+
+ prog_fd = bpf_program__fd(skel->progs.mptcp_subflow);
+ err = bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_SOCK_OPS, 0);
+ if (!ASSERT_OK(err, "prog_attach"))
+ goto skel_destroy;
+
+ nstoken = create_netns();
+ if (!ASSERT_OK_PTR(nstoken, "create_netns: mptcp_subflow"))
+ goto skel_destroy;
+
+ if (!ASSERT_OK(endpoint_init("subflow"), "endpoint_init"))
+ goto close_netns;
+
+ run_subflow(skel->data->cc);
+
+close_netns:
+ cleanup_netns(nstoken);
+skel_destroy:
+ mptcp_subflow__destroy(skel);
+close_cgroup:
+ close(cgroup_fd);
+}
+
void test_mptcp(void)
{
if (test__start_subtest("base"))
test_base();
if (test__start_subtest("mptcpify"))
test_mptcpify();
+ if (test__start_subtest("subflow"))
+ test_subflow();
}

--
2.43.0