2024-03-01 18:09:25

by Matthieu Baerts (NGI0)

[permalink] [raw]
Subject: [PATCH net-next 0/4] mptcp: add TCP_NOTSENT_LOWAT sockopt support

Patch 3 does the magic of adding TCP_NOTSENT_LOWAT support, all the
other ones are minor cleanup seen along when working on the new feature.

Note that this feature relies on the existing accounting for snd_nxt.
Such accounting is not 110% accurate as it tracks the most recent
sequence number queued to any subflow, and not the actual sequence
number sent on the wire. Paolo experimented a lot, trying to implement
the latter, and in the end it proved to be both "too complex" and "not
necessary".

The complexity raises from the need for additional lock and a lot of
refactoring to introduce such protections without adding significant
overhead. Additionally, snd_nxt is currently used and exposed with the
current semantic by the internal packet scheduling. Introducing a
different tracking will still require us to keep the old one.

More interestingly, a more accurate tracking could be not strictly
necessary: as the MPTCP socket enqueues data to the subflows only up to
the available send window, any enqueue data is sent on the wire
instantly, without any blocking operation short or a drop in the tx path
at the nft or TC layer.

Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
---
Paolo Abeni (4):
mptcp: cleanup writer wake-up
mptcp: avoid some duplicate code in socket option handling
mptcp: implement TCP_NOTSENT_LOWAT support
mptcp: cleanup SOL_TCP handling

net/mptcp/protocol.c | 54 ++++++++++++++++++++++++++-------------
net/mptcp/protocol.h | 42 +++++++++++++++++++++++--------
net/mptcp/sockopt.c | 71 +++++++++++++++++++++++-----------------------------
3 files changed, 101 insertions(+), 66 deletions(-)
---
base-commit: e960825709330cb199d209740326cec37e8c419d
change-id: 20240301-upstream-net-next-20240301-mptcp-tcp_notsent_lowat-770cab93d253

Best regards,
--
Matthieu Baerts (NGI0) <[email protected]>



2024-03-01 18:10:02

by Matthieu Baerts (NGI0)

[permalink] [raw]
Subject: [PATCH net-next 2/4] mptcp: avoid some duplicate code in socket option handling

From: Paolo Abeni <[email protected]>

The mptcp_get_int_option() helper is needless open-coded in a
couple of places, replace the duplicate code with the helper
call.

Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
---
net/mptcp/sockopt.c | 20 ++++++++------------
1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index da37e4541a5d..ac37f6c5e2ed 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -629,13 +629,11 @@ static int mptcp_setsockopt_sol_tcp_cork(struct mptcp_sock *msk, sockptr_t optva
{
struct mptcp_subflow_context *subflow;
struct sock *sk = (struct sock *)msk;
- int val;
+ int val, ret;

- if (optlen < sizeof(int))
- return -EINVAL;
-
- if (copy_from_sockptr(&val, optval, sizeof(val)))
- return -EFAULT;
+ ret = mptcp_get_int_option(msk, optval, optlen, &val);
+ if (ret)
+ return ret;

lock_sock(sk);
sockopt_seq_inc(msk);
@@ -659,13 +657,11 @@ static int mptcp_setsockopt_sol_tcp_nodelay(struct mptcp_sock *msk, sockptr_t op
{
struct mptcp_subflow_context *subflow;
struct sock *sk = (struct sock *)msk;
- int val;
+ int val, ret;

- if (optlen < sizeof(int))
- return -EINVAL;
-
- if (copy_from_sockptr(&val, optval, sizeof(val)))
- return -EFAULT;
+ ret = mptcp_get_int_option(msk, optval, optlen, &val);
+ if (ret)
+ return ret;

lock_sock(sk);
sockopt_seq_inc(msk);

--
2.43.0


2024-03-01 18:10:36

by Matthieu Baerts (NGI0)

[permalink] [raw]
Subject: [PATCH net-next 4/4] mptcp: cleanup SOL_TCP handling

From: Paolo Abeni <[email protected]>

Most TCP-level socket options get an integer from user space, and
set the corresponding field under the msk-level socket lock.

Reduce the code duplication moving such operations in the common code.

Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
---
net/mptcp/sockopt.c | 75 +++++++++++++++++++++--------------------------------
1 file changed, 30 insertions(+), 45 deletions(-)

diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index 1b38dac70719..dcd1c76d2a3b 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -624,18 +624,11 @@ static int mptcp_setsockopt_sol_tcp_congestion(struct mptcp_sock *msk, sockptr_t
return ret;
}

-static int mptcp_setsockopt_sol_tcp_cork(struct mptcp_sock *msk, sockptr_t optval,
- unsigned int optlen)
+static int __mptcp_setsockopt_sol_tcp_cork(struct mptcp_sock *msk, int val)
{
struct mptcp_subflow_context *subflow;
struct sock *sk = (struct sock *)msk;
- int val, ret;

- ret = mptcp_get_int_option(msk, optval, optlen, &val);
- if (ret)
- return ret;
-
- lock_sock(sk);
sockopt_seq_inc(msk);
msk->cork = !!val;
mptcp_for_each_subflow(msk, subflow) {
@@ -647,23 +640,15 @@ static int mptcp_setsockopt_sol_tcp_cork(struct mptcp_sock *msk, sockptr_t optva
}
if (!val)
mptcp_check_and_set_pending(sk);
- release_sock(sk);

return 0;
}

-static int mptcp_setsockopt_sol_tcp_nodelay(struct mptcp_sock *msk, sockptr_t optval,
- unsigned int optlen)
+static int __mptcp_setsockopt_sol_tcp_nodelay(struct mptcp_sock *msk, int val)
{
struct mptcp_subflow_context *subflow;
struct sock *sk = (struct sock *)msk;
- int val, ret;

- ret = mptcp_get_int_option(msk, optval, optlen, &val);
- if (ret)
- return ret;
-
- lock_sock(sk);
sockopt_seq_inc(msk);
msk->nodelay = !!val;
mptcp_for_each_subflow(msk, subflow) {
@@ -675,8 +660,6 @@ static int mptcp_setsockopt_sol_tcp_nodelay(struct mptcp_sock *msk, sockptr_t op
}
if (val)
mptcp_check_and_set_pending(sk);
- release_sock(sk);
-
return 0;
}

@@ -799,35 +782,10 @@ static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
int ret, val;

switch (optname) {
- case TCP_INQ:
- ret = mptcp_get_int_option(msk, optval, optlen, &val);
- if (ret)
- return ret;
- if (val < 0 || val > 1)
- return -EINVAL;
-
- lock_sock(sk);
- msk->recvmsg_inq = !!val;
- release_sock(sk);
- return 0;
case TCP_ULP:
return -EOPNOTSUPP;
- case TCP_NOTSENT_LOWAT:
- ret = mptcp_get_int_option(msk, optval, optlen, &val);
- if (ret)
- return ret;
-
- lock_sock(sk);
- WRITE_ONCE(msk->notsent_lowat, val);
- mptcp_write_space(sk);
- release_sock(sk);
- return 0;
case TCP_CONGESTION:
return mptcp_setsockopt_sol_tcp_congestion(msk, optval, optlen);
- case TCP_CORK:
- return mptcp_setsockopt_sol_tcp_cork(msk, optval, optlen);
- case TCP_NODELAY:
- return mptcp_setsockopt_sol_tcp_nodelay(msk, optval, optlen);
case TCP_DEFER_ACCEPT:
/* See tcp.c: TCP_DEFER_ACCEPT does not fail */
mptcp_setsockopt_first_sf_only(msk, SOL_TCP, optname, optval, optlen);
@@ -840,7 +798,34 @@ static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname,
optval, optlen);
}

- return -EOPNOTSUPP;
+ ret = mptcp_get_int_option(msk, optval, optlen, &val);
+ if (ret)
+ return ret;
+
+ lock_sock(sk);
+ switch (optname) {
+ case TCP_INQ:
+ if (val < 0 || val > 1)
+ ret = -EINVAL;
+ else
+ msk->recvmsg_inq = !!val;
+ break;
+ case TCP_NOTSENT_LOWAT:
+ WRITE_ONCE(msk->notsent_lowat, val);
+ mptcp_write_space(sk);
+ break;
+ case TCP_CORK:
+ ret = __mptcp_setsockopt_sol_tcp_cork(msk, val);
+ break;
+ case TCP_NODELAY:
+ ret = __mptcp_setsockopt_sol_tcp_nodelay(msk, val);
+ break;
+ default:
+ ret = -ENOPROTOOPT;
+ }
+
+ release_sock(sk);
+ return ret;
}

int mptcp_setsockopt(struct sock *sk, int level, int optname,

--
2.43.0


2024-03-04 11:00:39

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net-next 0/4] mptcp: add TCP_NOTSENT_LOWAT sockopt support

Hello:

This series was applied to netdev/net-next.git (main)
by David S. Miller <[email protected]>:

On Fri, 01 Mar 2024 18:43:43 +0100 you wrote:
> Patch 3 does the magic of adding TCP_NOTSENT_LOWAT support, all the
> other ones are minor cleanup seen along when working on the new feature.
>
> Note that this feature relies on the existing accounting for snd_nxt.
> Such accounting is not 110% accurate as it tracks the most recent
> sequence number queued to any subflow, and not the actual sequence
> number sent on the wire. Paolo experimented a lot, trying to implement
> the latter, and in the end it proved to be both "too complex" and "not
> necessary".
>
> [...]

Here is the summary with links:
- [net-next,1/4] mptcp: cleanup writer wake-up
https://git.kernel.org/netdev/net-next/c/037db6ea57da
- [net-next,2/4] mptcp: avoid some duplicate code in socket option handling
https://git.kernel.org/netdev/net-next/c/a74762675f70
- [net-next,3/4] mptcp: implement TCP_NOTSENT_LOWAT support
https://git.kernel.org/netdev/net-next/c/29b5e5ef8739
- [net-next,4/4] mptcp: cleanup SOL_TCP handling
https://git.kernel.org/netdev/net-next/c/7f71a337b515

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html