2023-08-03 16:13:06

by Gerd Bayer

[permalink] [raw]
Subject: [PATCH net v2 0/2] net/smc: Fix effective buffer size

Hi all,

commit 0227f058aa29 ("net/smc: Unbind r/w buffer size from clcsock
and make them tunable") started to derive the effective buffer size for
SMC connections inconsistently in case a TCP fallback was used and
memory consumption of SMC with the default settings was doubled when
a connection negotiated SMC. That was not what we want.

This series consolidates the resulting effective buffer size that is
used with SMC sockets, which is based on Jan Karcher's effort (see
[1]). For all TCP exchanges (in particular in case of a fall back when
no SMC connection was possible) the values from net.ipv4.tcp_[rw]mem
are used. If SMC succeeds in establishing a SMC connection, the newly
introduced values from net.smc.[rw]mem are used.

net.smc.[rw]mem is initialized to 64kB, respectively. Internal test
have show this to be a good compromise between throughput/latency
and memory consumption. Also net.smc.[rw]mem is now decoupled completely
from any tuning through net.ipv4.tcp_[rw]mem.

If a user chose to tune a socket's receive or send buffer size with
setsockopt, this tuning is now consistently applied to either fall-back
TCP or proper SMC connections over the socket.

Thanks,
Gerd

v1 - v2:
- In second patch, use sock_net() helper as suggested by Tony and demanded
by kernel test robot.

[1] https://lore.kernel.org/netdev/[email protected]


Gerd Bayer (2):
net/smc: Fix setsockopt and sysctl to specify same buffer size again
net/smc: Use correct buffer sizes when switching between TCP and SMC

net/smc/af_smc.c | 77 ++++++++++++++++++++++++++++++--------------
net/smc/smc.h | 2 +-
net/smc/smc_clc.c | 4 +--
net/smc/smc_core.c | 25 +++++++-------
net/smc/smc_sysctl.c | 10 ++++--
5 files changed, 76 insertions(+), 42 deletions(-)


base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
--
2.41.0



2023-08-03 16:16:34

by Gerd Bayer

[permalink] [raw]
Subject: [PATCH net v2 2/2] net/smc: Use correct buffer sizes when switching between TCP and SMC

Tuning of the effective buffer size through setsockopts was working for
SMC traffic only but not for TCP fall-back connections even before
commit 0227f058aa29 ("net/smc: Unbind r/w buffer size from clcsock and
make them tunable"). That change made it apparent that TCP fall-back
connections would use net.smc.[rw]mem as buffer size instead of
net.ipv4_tcp_[rw]mem.

Amend the code that copies attributes between the (TCP) clcsock and the
SMC socket and adjust buffer sizes appropriately:
- Copy over sk_userlocks so that both sockets agree on whether tuning
via setsockopt is active.
- When falling back to TCP use sk_sndbuf or sk_rcvbuf as specified with
setsockopt. Otherwise, use the sysctl value for TCP/IPv4.
- Likewise, use either values from setsockopt or from sysctl for SMC
(duplicated) on successful SMC connect.

In smc_tcp_listen_work() drop the explicit copy of buffer sizes as that
is taken care of by the attribute copy.

Fixes: 0227f058aa29 ("net/smc: Unbind r/w buffer size from clcsock and make them tunable")
Signed-off-by: Gerd Bayer <[email protected]>
Reviewed-by: Wenjia Zhang <[email protected]>
Reviewed-by: Tony Lu <[email protected]>
---
net/smc/af_smc.c | 73 +++++++++++++++++++++++++++++++++---------------
1 file changed, 51 insertions(+), 22 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 1fcf1e42474a..385e86bd6bdf 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -436,13 +436,60 @@ static int smc_bind(struct socket *sock, struct sockaddr *uaddr,
return rc;
}

+/* copy only relevant settings and flags of SOL_SOCKET level from smc to
+ * clc socket (since smc is not called for these options from net/core)
+ */
+
+#define SK_FLAGS_SMC_TO_CLC ((1UL << SOCK_URGINLINE) | \
+ (1UL << SOCK_KEEPOPEN) | \
+ (1UL << SOCK_LINGER) | \
+ (1UL << SOCK_BROADCAST) | \
+ (1UL << SOCK_TIMESTAMP) | \
+ (1UL << SOCK_DBG) | \
+ (1UL << SOCK_RCVTSTAMP) | \
+ (1UL << SOCK_RCVTSTAMPNS) | \
+ (1UL << SOCK_LOCALROUTE) | \
+ (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE) | \
+ (1UL << SOCK_RXQ_OVFL) | \
+ (1UL << SOCK_WIFI_STATUS) | \
+ (1UL << SOCK_NOFCS) | \
+ (1UL << SOCK_FILTER_LOCKED) | \
+ (1UL << SOCK_TSTAMP_NEW))
+
+/* if set, use value set by setsockopt() - else use IPv4 or SMC sysctl value */
+static void smc_adjust_sock_bufsizes(struct sock *nsk, struct sock *osk,
+ unsigned long mask)
+{
+ struct net *nnet = sock_net(nsk);
+
+ nsk->sk_userlocks = osk->sk_userlocks;
+ if (osk->sk_userlocks & SOCK_SNDBUF_LOCK) {
+ nsk->sk_sndbuf = osk->sk_sndbuf;
+ } else {
+ if (mask == SK_FLAGS_SMC_TO_CLC)
+ WRITE_ONCE(nsk->sk_sndbuf,
+ READ_ONCE(nnet->ipv4.sysctl_tcp_wmem[1]));
+ else
+ WRITE_ONCE(nsk->sk_sndbuf,
+ 2 * READ_ONCE(nnet->smc.sysctl_wmem));
+ }
+ if (osk->sk_userlocks & SOCK_RCVBUF_LOCK) {
+ nsk->sk_rcvbuf = osk->sk_rcvbuf;
+ } else {
+ if (mask == SK_FLAGS_SMC_TO_CLC)
+ WRITE_ONCE(nsk->sk_rcvbuf,
+ READ_ONCE(nnet->ipv4.sysctl_tcp_rmem[1]));
+ else
+ WRITE_ONCE(nsk->sk_rcvbuf,
+ 2 * READ_ONCE(nnet->smc.sysctl_rmem));
+ }
+}
+
static void smc_copy_sock_settings(struct sock *nsk, struct sock *osk,
unsigned long mask)
{
/* options we don't get control via setsockopt for */
nsk->sk_type = osk->sk_type;
- nsk->sk_sndbuf = osk->sk_sndbuf;
- nsk->sk_rcvbuf = osk->sk_rcvbuf;
nsk->sk_sndtimeo = osk->sk_sndtimeo;
nsk->sk_rcvtimeo = osk->sk_rcvtimeo;
nsk->sk_mark = osk->sk_mark;
@@ -453,26 +500,10 @@ static void smc_copy_sock_settings(struct sock *nsk, struct sock *osk,

nsk->sk_flags &= ~mask;
nsk->sk_flags |= osk->sk_flags & mask;
+
+ smc_adjust_sock_bufsizes(nsk, osk, mask);
}

-#define SK_FLAGS_SMC_TO_CLC ((1UL << SOCK_URGINLINE) | \
- (1UL << SOCK_KEEPOPEN) | \
- (1UL << SOCK_LINGER) | \
- (1UL << SOCK_BROADCAST) | \
- (1UL << SOCK_TIMESTAMP) | \
- (1UL << SOCK_DBG) | \
- (1UL << SOCK_RCVTSTAMP) | \
- (1UL << SOCK_RCVTSTAMPNS) | \
- (1UL << SOCK_LOCALROUTE) | \
- (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE) | \
- (1UL << SOCK_RXQ_OVFL) | \
- (1UL << SOCK_WIFI_STATUS) | \
- (1UL << SOCK_NOFCS) | \
- (1UL << SOCK_FILTER_LOCKED) | \
- (1UL << SOCK_TSTAMP_NEW))
-/* copy only relevant settings and flags of SOL_SOCKET level from smc to
- * clc socket (since smc is not called for these options from net/core)
- */
static void smc_copy_sock_settings_to_clc(struct smc_sock *smc)
{
smc_copy_sock_settings(smc->clcsock->sk, &smc->sk, SK_FLAGS_SMC_TO_CLC);
@@ -2479,8 +2510,6 @@ static void smc_tcp_listen_work(struct work_struct *work)
sock_hold(lsk); /* sock_put in smc_listen_work */
INIT_WORK(&new_smc->smc_listen_work, smc_listen_work);
smc_copy_sock_settings_to_smc(new_smc);
- new_smc->sk.sk_sndbuf = lsmc->sk.sk_sndbuf;
- new_smc->sk.sk_rcvbuf = lsmc->sk.sk_rcvbuf;
sock_hold(&new_smc->sk); /* sock_put in passive closing */
if (!queue_work(smc_hs_wq, &new_smc->smc_listen_work))
sock_put(&new_smc->sk);
--
2.41.0


2023-08-04 14:42:35

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH net v2 0/2] net/smc: Fix effective buffer size

On Thu, Aug 03, 2023 at 04:46:03PM +0200, Gerd Bayer wrote:
> Hi all,
>
> commit 0227f058aa29 ("net/smc: Unbind r/w buffer size from clcsock
> and make them tunable") started to derive the effective buffer size for
> SMC connections inconsistently in case a TCP fallback was used and
> memory consumption of SMC with the default settings was doubled when
> a connection negotiated SMC. That was not what we want.
>
> This series consolidates the resulting effective buffer size that is
> used with SMC sockets, which is based on Jan Karcher's effort (see
> [1]). For all TCP exchanges (in particular in case of a fall back when
> no SMC connection was possible) the values from net.ipv4.tcp_[rw]mem
> are used. If SMC succeeds in establishing a SMC connection, the newly
> introduced values from net.smc.[rw]mem are used.
>
> net.smc.[rw]mem is initialized to 64kB, respectively. Internal test
> have show this to be a good compromise between throughput/latency
> and memory consumption. Also net.smc.[rw]mem is now decoupled completely
> from any tuning through net.ipv4.tcp_[rw]mem.
>
> If a user chose to tune a socket's receive or send buffer size with
> setsockopt, this tuning is now consistently applied to either fall-back
> TCP or proper SMC connections over the socket.
>
> Thanks,
> Gerd
>
> v1 - v2:
> - In second patch, use sock_net() helper as suggested by Tony and demanded
> by kernel test robot.
>
> [1] https://lore.kernel.org/netdev/[email protected]

Hi Gerd,

unfortunately this patchset does not appear to apply to current 'net'.

Could you rebase and send a v3?

--
pw-bot: changes-requested


2023-08-04 17:39:50

by Gerd Bayer

[permalink] [raw]
Subject: Re: [PATCH net v2 0/2] net/smc: Fix effective buffer size

On Fri, 2023-08-04 at 16:17 +0200, Simon Horman wrote:
> On Thu, Aug 03, 2023 at 04:46:03PM +0200, Gerd Bayer wrote:
> > Hi all,
> >
[...]
> > v1 - v2:
> >  - In second patch, use sock_net() helper as suggested by Tony and
> > demanded
> >    by kernel test robot.
> >
> > [1]
> > https://lore.kernel.org/netdev/[email protected]
>
> Hi Gerd,
>
> unfortunately this patchset does not appear to apply to current
> 'net'.
>
> Could you rebase and send a v3?
>
Hi Simon,

sure, working on it.

Thanks,
Gerd