2023-11-21 02:04:15

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 0/7] TCP-AO fixes

Hi,

I've been working on TCP-AO key-rotation selftests and as a result
exercised some corner-cases that are not usually met in production.

Here are a bunch of semi-related fixes:
- Documentation typo (reported by Markus Elfring)
- Proper alignment for TCP-AO option in TCP header that has MAC length
of non 4 bytes (now a selftest with randomized maclen/algorithm/etc
passes)
- 3 uAPI restricting patches that disallow more things to userspace in
order to prevent it shooting itself in any parts of the body
- SNEs READ_ONCE()/WRITE_ONCE() that went missing by my human factor
- Avoid storing MAC length from SYN header as SYN-ACK will use
rnext_key.maclen (drops an extra check that fails on new selftests)

Please, consider applying/pulling.

The following changes since commit 98b1cc82c4affc16f5598d4fa14b1858671b2263:

Linux 6.7-rc2 (2023-11-19 15:02:14 -0800)

are available in the Git repository at:

[email protected]:0x7f454c46/linux.git tcp-ao-post-merge

for you to fetch changes up to 4555b5b8d11f4d19ef32a761e2d87dd378e9a435:

net/tcp: Don't store TCP-AO maclen on reqsk (2023-11-21 01:48:23 +0000)

----------------------------------------------------------------
Dmitry Safonov (7):
Documentation/tcp: Fix an obvious typo
net/tcp: Consistently align TCP-AO option in the header
net/tcp: Limit TCP_AO_REPAIR to non-listen sockets
net/tcp: Reset TCP-AO cached keys on listen() syscall
net/tcp: Don't add key with non-matching VRF on connected sockets
net/tcp: ACCESS_ONCE() on snd/rcv SNEs
net/tcp: Don't store TCP-AO maclen on reqsk

Thanks,
Dmitry

Cc: David Ahern <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Francesco Ruggeri <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Salam Noureddine <[email protected]>
Cc: Simon Horman <[email protected]>
Cc: [email protected]
Cc: [email protected]


Dmitry Safonov (7):
Documentation/tcp: Fix an obvious typo
net/tcp: Consistently align TCP-AO option in the header
net/tcp: Limit TCP_AO_REPAIR to non-listen sockets
net/tcp: Reset TCP-AO cached keys on listen() syscall
net/tcp: Don't add key with non-matching VRF on connected sockets
net/tcp: ACCESS_ONCE() on snd/rcv SNEs
net/tcp: Don't store TCP-AO maclen on reqsk

Documentation/networking/tcp_ao.rst | 2 +-
include/linux/tcp.h | 10 ++++------
include/net/tcp_ao.h | 11 +++++++++++
net/ipv4/af_inet.c | 1 +
net/ipv4/tcp.c | 6 ++++++
net/ipv4/tcp_ao.c | 29 +++++++++++++++++++++++------
net/ipv4/tcp_input.c | 9 +++++----
net/ipv4/tcp_ipv4.c | 4 ++--
net/ipv4/tcp_minisocks.c | 2 +-
net/ipv4/tcp_output.c | 15 ++++++---------
net/ipv6/tcp_ipv6.c | 2 +-
11 files changed, 61 insertions(+), 30 deletions(-)


base-commit: 98b1cc82c4affc16f5598d4fa14b1858671b2263
--
2.42.0


2023-11-21 02:04:16

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 7/7] net/tcp: Don't store TCP-AO maclen on reqsk

This extra check doesn't work for a handshake when SYN segment has
(current_key.maclen != rnext_key.maclen). It could be amended to
preserve rnext_key.maclen instead of current_key.maclen, but that
requires a lookup on listen socket.

Originally, this extra maclen check was introduced just because it was
cheap. Drop it and convert tcp_request_sock::maclen into boolean
tcp_request_sock::used_tcp_ao.

Fixes: 06b22ef29591 ("net/tcp: Wire TCP-AO to request sockets")
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/linux/tcp.h | 10 ++++------
net/ipv4/tcp_ao.c | 4 ++--
net/ipv4/tcp_input.c | 5 +++--
net/ipv4/tcp_output.c | 9 +++------
4 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 68f3d315d2e1..3af897b00920 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -155,6 +155,9 @@ struct tcp_request_sock {
bool req_usec_ts;
#if IS_ENABLED(CONFIG_MPTCP)
bool drop_req;
+#endif
+#ifdef CONFIG_TCP_AO
+ bool used_tcp_ao;
#endif
u32 txhash;
u32 rcv_isn;
@@ -169,7 +172,6 @@ struct tcp_request_sock {
#ifdef CONFIG_TCP_AO
u8 ao_keyid;
u8 ao_rcv_next;
- u8 maclen;
#endif
};

@@ -180,14 +182,10 @@ static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)

static inline bool tcp_rsk_used_ao(const struct request_sock *req)
{
- /* The real length of MAC is saved in the request socket,
- * signing anything with zero-length makes no sense, so here is
- * a little hack..
- */
#ifndef CONFIG_TCP_AO
return false;
#else
- return tcp_rsk(req)->maclen != 0;
+ return tcp_rsk(req)->used_tcp_ao;
#endif
}

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 9b7f1970c2e9..07221319e8c5 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -851,7 +851,7 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
const struct tcp_ao_hdr *aoh;
struct tcp_ao_key *key;

- treq->maclen = 0;
+ treq->used_tcp_ao = false;

if (tcp_parse_auth_options(th, NULL, &aoh) || !aoh)
return;
@@ -863,7 +863,7 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,

treq->ao_rcv_next = aoh->keyid;
treq->ao_keyid = aoh->rnext_keyid;
- treq->maclen = tcp_ao_maclen(key);
+ treq->used_tcp_ao = true;
}

static enum skb_drop_reason
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 78896c8be0d4..89cb6912dd91 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -7182,11 +7182,12 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh))
goto drop_and_release; /* Invalid TCP options */
if (aoh) {
- tcp_rsk(req)->maclen = aoh->length - sizeof(struct tcp_ao_hdr);
+ tcp_rsk(req)->used_tcp_ao = true;
tcp_rsk(req)->ao_rcv_next = aoh->keyid;
tcp_rsk(req)->ao_keyid = aoh->rnext_keyid;
+
} else {
- tcp_rsk(req)->maclen = 0;
+ tcp_rsk(req)->used_tcp_ao = false;
}
#endif
tcp_rsk(req)->snt_isn = isn;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 93eef1dbbc55..f5ef15e1d9ac 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3720,7 +3720,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
if (tcp_rsk_used_ao(req)) {
#ifdef CONFIG_TCP_AO
struct tcp_ao_key *ao_key = NULL;
- u8 maclen = tcp_rsk(req)->maclen;
u8 keyid = tcp_rsk(req)->ao_keyid;

ao_key = tcp_sk(sk)->af_specific->ao_lookup(sk, req_to_sk(req),
@@ -3730,13 +3729,11 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
* for another peer-matching key, but the peer has requested
* ao_keyid (RFC5925 RNextKeyID), so let's keep it simple here.
*/
- if (unlikely(!ao_key || tcp_ao_maclen(ao_key) != maclen)) {
- u8 key_maclen = ao_key ? tcp_ao_maclen(ao_key) : 0;
-
+ if (unlikely(!ao_key)) {
rcu_read_unlock();
kfree_skb(skb);
- net_warn_ratelimited("TCP-AO: the keyid %u with maclen %u|%u from SYN packet is not present - not sending SYNACK\n",
- keyid, maclen, key_maclen);
+ net_warn_ratelimited("TCP-AO: the keyid %u from SYN packet is not present - not sending SYNACK\n",
+ keyid);
return NULL;
}
key.ao_key = ao_key;
--
2.42.0

2023-11-21 02:04:17

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 2/7] net/tcp: Consistently align TCP-AO option in the header

Currently functions that pre-calculate TCP header options length use
unaligned TCP-AO header + MAC-length for skb reservation.
And the functions that actually write TCP-AO options into skb do align
the header. Nothing good can come out of this for ((maclen % 4) != 0).

Provide tcp_ao_len_aligned() helper and use it everywhere for TCP
header options space calculations.

Fixes: 1e03d32bea8e ("net/tcp: Add TCP-AO sign to outgoing packets")
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp_ao.h | 6 ++++++
net/ipv4/tcp_ao.c | 4 ++--
net/ipv4/tcp_ipv4.c | 4 ++--
net/ipv4/tcp_minisocks.c | 2 +-
net/ipv4/tcp_output.c | 6 +++---
net/ipv6/tcp_ipv6.c | 2 +-
6 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index b56be10838f0..647781080613 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -62,11 +62,17 @@ static inline int tcp_ao_maclen(const struct tcp_ao_key *key)
return key->maclen;
}

+/* Use tcp_ao_len_aligned() for TCP header calculations */
static inline int tcp_ao_len(const struct tcp_ao_key *key)
{
return tcp_ao_maclen(key) + sizeof(struct tcp_ao_hdr);
}

+static inline int tcp_ao_len_aligned(const struct tcp_ao_key *key)
+{
+ return round_up(tcp_ao_len(key), 4);
+}
+
static inline unsigned int tcp_ao_digest_size(struct tcp_ao_key *key)
{
return key->digest_size;
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 7696417d0640..c8be1d526eac 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -1100,7 +1100,7 @@ void tcp_ao_connect_init(struct sock *sk)
ao_info->current_key = key;
if (!ao_info->rnext_key)
ao_info->rnext_key = key;
- tp->tcp_header_len += tcp_ao_len(key);
+ tp->tcp_header_len += tcp_ao_len_aligned(key);

ao_info->lisn = htonl(tp->write_seq);
ao_info->snd_sne = 0;
@@ -1346,7 +1346,7 @@ static int tcp_ao_parse_crypto(struct tcp_ao_add *cmd, struct tcp_ao_key *key)
syn_tcp_option_space -= TCPOLEN_MSS_ALIGNED;
syn_tcp_option_space -= TCPOLEN_TSTAMP_ALIGNED;
syn_tcp_option_space -= TCPOLEN_WSCALE_ALIGNED;
- if (tcp_ao_len(key) > syn_tcp_option_space) {
+ if (tcp_ao_len_aligned(key) > syn_tcp_option_space) {
err = -EMSGSIZE;
goto err_kfree;
}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 5f693bbd578d..0c50c5a32b84 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -690,7 +690,7 @@ static bool tcp_v4_ao_sign_reset(const struct sock *sk, struct sk_buff *skb,

reply_options[0] = htonl((TCPOPT_AO << 24) | (tcp_ao_len(key) << 16) |
(aoh->rnext_keyid << 8) | keyid);
- arg->iov[0].iov_len += round_up(tcp_ao_len(key), 4);
+ arg->iov[0].iov_len += tcp_ao_len_aligned(key);
reply->doff = arg->iov[0].iov_len / 4;

if (tcp_ao_hash_hdr(AF_INET, (char *)&reply_options[1],
@@ -978,7 +978,7 @@ static void tcp_v4_send_ack(const struct sock *sk,
(tcp_ao_len(key->ao_key) << 16) |
(key->ao_key->sndid << 8) |
key->rcv_next);
- arg.iov[0].iov_len += round_up(tcp_ao_len(key->ao_key), 4);
+ arg.iov[0].iov_len += tcp_ao_len_aligned(key->ao_key);
rep.th.doff = arg.iov[0].iov_len / 4;

tcp_ao_hash_hdr(AF_INET, (char *)&rep.opt[offset],
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index a9807eeb311c..9e85f2a0bddd 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -615,7 +615,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
ao_key = treq->af_specific->ao_lookup(sk, req,
tcp_rsk(req)->ao_keyid, -1);
if (ao_key)
- newtp->tcp_header_len += tcp_ao_len(ao_key);
+ newtp->tcp_header_len += tcp_ao_len_aligned(ao_key);
#endif
if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len)
newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index eb13a55d660c..93eef1dbbc55 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -825,7 +825,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
timestamps = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps);
if (tcp_key_is_ao(key)) {
opts->options |= OPTION_AO;
- remaining -= tcp_ao_len(key->ao_key);
+ remaining -= tcp_ao_len_aligned(key->ao_key);
}
}

@@ -915,7 +915,7 @@ static unsigned int tcp_synack_options(const struct sock *sk,
ireq->tstamp_ok &= !ireq->sack_ok;
} else if (tcp_key_is_ao(key)) {
opts->options |= OPTION_AO;
- remaining -= tcp_ao_len(key->ao_key);
+ remaining -= tcp_ao_len_aligned(key->ao_key);
ireq->tstamp_ok &= !ireq->sack_ok;
}

@@ -982,7 +982,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
size += TCPOLEN_MD5SIG_ALIGNED;
} else if (tcp_key_is_ao(key)) {
opts->options |= OPTION_AO;
- size += tcp_ao_len(key->ao_key);
+ size += tcp_ao_len_aligned(key->ao_key);
}

if (likely(tp->rx_opt.tstamp_ok)) {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 937a02c2e534..8c6623496dd7 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -881,7 +881,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
if (tcp_key_is_md5(key))
tot_len += TCPOLEN_MD5SIG_ALIGNED;
if (tcp_key_is_ao(key))
- tot_len += tcp_ao_len(key->ao_key);
+ tot_len += tcp_ao_len_aligned(key->ao_key);

#ifdef CONFIG_MPTCP
if (rst && !tcp_key_is_md5(key)) {
--
2.42.0

2023-11-21 02:04:23

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 5/7] net/tcp: Don't add key with non-matching VRF on connected sockets

If the connection was established, don't allow adding TCP-AO keys that
don't match the peer. Currently, there are checks for ip-address
matching, but L3 index check is missing. Add it to restrict userspace
shooting itself somewhere.

Fixes: 248411b8cb89 ("net/tcp: Wire up l3index to TCP-AO")
Signed-off-by: Dmitry Safonov <[email protected]>
---
net/ipv4/tcp_ao.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 71c8c9c59642..122ff58168ee 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -1622,6 +1622,9 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family,
if (!dev || !l3index)
return -EINVAL;

+ if (!((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)))
+ return -EINVAL;
+
/* It's still possible to bind after adding keys or even
* re-bind to a different dev (with CAP_NET_RAW).
* So, no reason to return error here, rather try to be
--
2.42.0

2023-11-21 02:04:23

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 4/7] net/tcp: Reset TCP-AO cached keys on listen() syscall

TCP_LISTEN sockets are not connected to any peer, so having
current_key/rnext_key doesn't make sense.

The userspace may falter over this issue by setting current or rnext
TCP-AO key before listen() syscall. setsockopt(TCP_AO_DEL_KEY) doesn't
allow removing a key that is in use (in accordance to RFC 5925), so
it might be inconvenient to have keys that can be destroyed only with
listener socket.

Fixes: 4954f17ddefc ("net/tcp: Introduce TCP_AO setsockopt()s")
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp_ao.h | 5 +++++
net/ipv4/af_inet.c | 1 +
net/ipv4/tcp_ao.c | 14 ++++++++++++++
3 files changed, 20 insertions(+)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 647781080613..e36057ca5ed8 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -270,6 +270,7 @@ int tcp_v6_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key,
void tcp_ao_established(struct sock *sk);
void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb);
void tcp_ao_connect_init(struct sock *sk);
+void tcp_ao_listen(struct sock *sk);
void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
struct tcp_request_sock *treq,
unsigned short int family, int l3index);
@@ -330,6 +331,10 @@ static inline void tcp_ao_connect_init(struct sock *sk)
{
}

+static inline void tcp_ao_listen(struct sock *sk)
+{
+}
+
static inline int tcp_ao_get_mkts(struct sock *sk, sockptr_t optval, sockptr_t optlen)
{
return -ENOPROTOOPT;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index fb81de10d332..a08d1266344f 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -200,6 +200,7 @@ int __inet_listen_sk(struct sock *sk, int backlog)
* we can only allow the backlog to be adjusted.
*/
if (old_state != TCP_LISTEN) {
+ tcp_ao_listen(sk);
/* Enable TFO w/o requiring TCP_FASTOPEN socket option.
* Note that only TCP sockets (SOCK_STREAM) will reach here.
* Also fastopen backlog may already been set via the option
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index c8be1d526eac..71c8c9c59642 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -1052,6 +1052,20 @@ static int tcp_ao_cache_traffic_keys(const struct sock *sk,
return ret;
}

+void tcp_ao_listen(struct sock *sk)
+{
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct tcp_ao_info *ao_info;
+
+ ao_info = rcu_dereference_protected(tp->ao_info,
+ lockdep_sock_is_held(sk));
+ if (!ao_info)
+ return;
+
+ WRITE_ONCE(ao_info->current_key, NULL);
+ WRITE_ONCE(ao_info->rnext_key, NULL);
+}
+
void tcp_ao_connect_init(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
--
2.42.0

2023-11-21 02:04:24

by Dmitry Safonov

[permalink] [raw]
Subject: [PATCH 6/7] net/tcp: ACCESS_ONCE() on snd/rcv SNEs

SNEs need READ_ONCE()/WRITE_ONCE() for access as they can be written and
read at the same time.

This is actually a shame: I planned to send it in TCP-AO patches, but
it seems I've chosen a wrong commit to git-commit-fixup some time ago.
It ended up in a commit that adds a selftest. Human factor.

Fixes: 64382c71a557 ("net/tcp: Add TCP-AO SNE support")
Signed-off-by: Dmitry Safonov <[email protected]>
---
net/ipv4/tcp_ao.c | 4 ++--
net/ipv4/tcp_input.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 122ff58168ee..9b7f1970c2e9 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -956,8 +956,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
if (unlikely(th->syn && !th->ack))
goto verify_hash;

- sne = tcp_ao_compute_sne(info->rcv_sne, tcp_sk(sk)->rcv_nxt,
- ntohl(th->seq));
+ sne = tcp_ao_compute_sne(READ_ONCE(info->rcv_sne),
+ tcp_sk(sk)->rcv_nxt, ntohl(th->seq));
/* Established socket, traffic key are cached */
traffic_key = rcv_other_key(key);
err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index bcb55d98004c..78896c8be0d4 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3583,7 +3583,7 @@ static void tcp_snd_sne_update(struct tcp_sock *tp, u32 ack)
ao = rcu_dereference_protected(tp->ao_info,
lockdep_sock_is_held((struct sock *)tp));
if (ao && ack < tp->snd_una)
- ao->snd_sne++;
+ WRITE_ONCE(ao->snd_sne, ao->snd_sne + 1);
#endif
}

@@ -3609,7 +3609,7 @@ static void tcp_rcv_sne_update(struct tcp_sock *tp, u32 seq)
ao = rcu_dereference_protected(tp->ao_info,
lockdep_sock_is_held((struct sock *)tp));
if (ao && seq < tp->rcv_nxt)
- ao->rcv_sne++;
+ WRITE_ONCE(ao->rcv_sne, ao->rcv_sne + 1);
#endif
}

--
2.42.0

2023-11-21 08:09:23

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 6/7] net/tcp: ACCESS_ONCE() on snd/rcv SNEs

On Tue, Nov 21, 2023 at 3:01 AM Dmitry Safonov <[email protected]> wrote:
>
> SNEs need READ_ONCE()/WRITE_ONCE() for access as they can be written and
> read at the same time.
>
> This is actually a shame: I planned to send it in TCP-AO patches, but
> it seems I've chosen a wrong commit to git-commit-fixup some time ago.
> It ended up in a commit that adds a selftest. Human factor.
>
> Fixes: 64382c71a557 ("net/tcp: Add TCP-AO SNE support")
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> net/ipv4/tcp_ao.c | 4 ++--
> net/ipv4/tcp_input.c | 4 ++--
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
> index 122ff58168ee..9b7f1970c2e9 100644
> --- a/net/ipv4/tcp_ao.c
> +++ b/net/ipv4/tcp_ao.c
> @@ -956,8 +956,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
> if (unlikely(th->syn && !th->ack))
> goto verify_hash;
>
> - sne = tcp_ao_compute_sne(info->rcv_sne, tcp_sk(sk)->rcv_nxt,
> - ntohl(th->seq));
> + sne = tcp_ao_compute_sne(READ_ONCE(info->rcv_sne),
> + tcp_sk(sk)->rcv_nxt, ntohl(th->seq));


I think this is a wrong fix. Something is definitely fishy here.

Update side should only happen for an established socket ?

And the read side should have locked the socket before calling
tcp_inbound_ao_hash(),
otherwise reading other fields (like tcp_sk(sk)->rcv_nxt) would be racy anyway.


> /* Established socket, traffic key are cached */
> traffic_key = rcv_other_key(key);
> err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key,
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index bcb55d98004c..78896c8be0d4 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c

tcp_snd_sne_update() definitely only deals with full sockets
(TCP_AO_ESTABLISHED)

> @@ -3583,7 +3583,7 @@ static void tcp_snd_sne_update(struct tcp_sock *tp, u32 ack)
> ao = rcu_dereference_protected(tp->ao_info,
> lockdep_sock_is_held((struct sock *)tp));
> if (ao && ack < tp->snd_una)
> - ao->snd_sne++;
> + WRITE_ONCE(ao->snd_sne, ao->snd_sne + 1);
> #endif
> }
>
> @@ -3609,7 +3609,7 @@ static void tcp_rcv_sne_update(struct tcp_sock *tp, u32 seq)
> ao = rcu_dereference_protected(tp->ao_info,
> lockdep_sock_is_held((struct sock *)tp));
> if (ao && seq < tp->rcv_nxt)
> - ao->rcv_sne++;
> + WRITE_ONCE(ao->rcv_sne, ao->rcv_sne + 1);
> #endif
> }
>
> --
> 2.42.0
>

2023-11-21 08:14:11

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 7/7] net/tcp: Don't store TCP-AO maclen on reqsk

On Tue, Nov 21, 2023 at 3:01 AM Dmitry Safonov <[email protected]> wrote:
>
> This extra check doesn't work for a handshake when SYN segment has
> (current_key.maclen != rnext_key.maclen). It could be amended to
> preserve rnext_key.maclen instead of current_key.maclen, but that
> requires a lookup on listen socket.
>
> Originally, this extra maclen check was introduced just because it was
> cheap. Drop it and convert tcp_request_sock::maclen into boolean
> tcp_request_sock::used_tcp_ao.
>
> Fixes: 06b22ef29591 ("net/tcp: Wire TCP-AO to request sockets")
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> include/linux/tcp.h | 10 ++++------
> net/ipv4/tcp_ao.c | 4 ++--
> net/ipv4/tcp_input.c | 5 +++--
> net/ipv4/tcp_output.c | 9 +++------
> 4 files changed, 12 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 68f3d315d2e1..3af897b00920 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -155,6 +155,9 @@ struct tcp_request_sock {
> bool req_usec_ts;
> #if IS_ENABLED(CONFIG_MPTCP)
> bool drop_req;
> +#endif
> +#ifdef CONFIG_TCP_AO
> + bool used_tcp_ao;

Why adding another 8bit field here and creating a hole ?

> #endif
> u32 txhash;
> u32 rcv_isn;
> @@ -169,7 +172,6 @@ struct tcp_request_sock {
> #ifdef CONFIG_TCP_AO
> u8 ao_keyid;
> u8 ao_rcv_next;
> - u8 maclen;

Just rename maclen here to used_tcp_ao ?

> #endif
> };
>

2023-11-21 08:23:06

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 4/7] net/tcp: Reset TCP-AO cached keys on listen() syscall

On Tue, Nov 21, 2023 at 3:01 AM Dmitry Safonov <[email protected]> wrote:
>
> TCP_LISTEN sockets are not connected to any peer, so having
> current_key/rnext_key doesn't make sense.
>
> The userspace may falter over this issue by setting current or rnext
> TCP-AO key before listen() syscall. setsockopt(TCP_AO_DEL_KEY) doesn't
> allow removing a key that is in use (in accordance to RFC 5925), so
> it might be inconvenient to have keys that can be destroyed only with
> listener socket.

I think this is the wrong way to solve this issue. listen() should not
mess with anything else than socket state.

>
> Fixes: 4954f17ddefc ("net/tcp: Introduce TCP_AO setsockopt()s")
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> include/net/tcp_ao.h | 5 +++++
> net/ipv4/af_inet.c | 1 +
> net/ipv4/tcp_ao.c | 14 ++++++++++++++
> 3 files changed, 20 insertions(+)
>
> diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
> index 647781080613..e36057ca5ed8 100644
> --- a/include/net/tcp_ao.h
> +++ b/include/net/tcp_ao.h
> @@ -270,6 +270,7 @@ int tcp_v6_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key,
> void tcp_ao_established(struct sock *sk);
> void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb);
> void tcp_ao_connect_init(struct sock *sk);
> +void tcp_ao_listen(struct sock *sk);
> void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
> struct tcp_request_sock *treq,
> unsigned short int family, int l3index);
> @@ -330,6 +331,10 @@ static inline void tcp_ao_connect_init(struct sock *sk)
> {
> }
>
> +static inline void tcp_ao_listen(struct sock *sk)
> +{
> +}
> +
> static inline int tcp_ao_get_mkts(struct sock *sk, sockptr_t optval, sockptr_t optlen)
> {
> return -ENOPROTOOPT;
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index fb81de10d332..a08d1266344f 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -200,6 +200,7 @@ int __inet_listen_sk(struct sock *sk, int backlog)
> * we can only allow the backlog to be adjusted.
> */
> if (old_state != TCP_LISTEN) {
> + tcp_ao_listen(sk);

Ouch...

Please add your hook in tcp_disconnect() instead of this layering violation.

I think you missed the fact that applications can call listen(fd,
backlog) multiple times,
if they need to dynamically adjust backlog.

2023-11-22 01:01:28

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH 4/7] net/tcp: Reset TCP-AO cached keys on listen() syscall

On 11/21/23 08:18, Eric Dumazet wrote:
> On Tue, Nov 21, 2023 at 3:01 AM Dmitry Safonov <[email protected]> wrote:
>>
>> TCP_LISTEN sockets are not connected to any peer, so having
>> current_key/rnext_key doesn't make sense.
>>
>> The userspace may falter over this issue by setting current or rnext
>> TCP-AO key before listen() syscall. setsockopt(TCP_AO_DEL_KEY) doesn't
>> allow removing a key that is in use (in accordance to RFC 5925), so
>> it might be inconvenient to have keys that can be destroyed only with
>> listener socket.
>
> I think this is the wrong way to solve this issue. listen() should not
> mess with anything else than socket state.
>
>>
>> Fixes: 4954f17ddefc ("net/tcp: Introduce TCP_AO setsockopt()s")
>> Signed-off-by: Dmitry Safonov <[email protected]>
[..]
>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
>> index fb81de10d332..a08d1266344f 100644
>> --- a/net/ipv4/af_inet.c
>> +++ b/net/ipv4/af_inet.c
>> @@ -200,6 +200,7 @@ int __inet_listen_sk(struct sock *sk, int backlog)
>> * we can only allow the backlog to be adjusted.
>> */
>> if (old_state != TCP_LISTEN) {
>> + tcp_ao_listen(sk);
>
> Ouch...
>
> Please add your hook in tcp_disconnect() instead of this layering violation.
>
> I think you missed the fact that applications can call listen(fd,
> backlog) multiple times,
> if they need to dynamically adjust backlog.

Hmm, unsure, I've probably failed at describing the issue or failing to
understand your reply :-)

Let me try again:
1. sk = socket(AF_*, SOCK_STREAM, IPPROTO_TCP)
2. setsockopt(sk, TCP_AO_ADD_KEY, ...) - adding a key to use later
3. setsockopt(sk, IPPROTO_TCP, TCP_AO_INFO, set_current=1) - could be
done straight on adding a key at (2), but for an example, explicitely
4.a. connect(sk, peer) - all as expected, the current key will be the
one that is used for SYN (and ending ACK if the peer doesn't
request to switch)
4.b listen(sk, ...) - userspace shoots itself in foot: the current_key
has no usage on TCP_LISTEN, so it just "hangs" as a pointer until
the socket gets destroyed.

An alternative fix would be to make setsockopt(TCP_AO_DEL_KEY) remove a
key even if it's current_key on TCP_LISTEN, re-setting that to NULL.

Now as I described, somewhat feeling like the alternative fix sounds
better. Will proceed with that for v2.

Thanks,
Dmitry

2023-11-22 01:20:17

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH 7/7] net/tcp: Don't store TCP-AO maclen on reqsk

On 11/21/23 08:13, Eric Dumazet wrote:
> On Tue, Nov 21, 2023 at 3:01 AM Dmitry Safonov <[email protected]> wrote:
>>
>> This extra check doesn't work for a handshake when SYN segment has
>> (current_key.maclen != rnext_key.maclen). It could be amended to
>> preserve rnext_key.maclen instead of current_key.maclen, but that
>> requires a lookup on listen socket.
>>
>> Originally, this extra maclen check was introduced just because it was
>> cheap. Drop it and convert tcp_request_sock::maclen into boolean
>> tcp_request_sock::used_tcp_ao.
>>
>> Fixes: 06b22ef29591 ("net/tcp: Wire TCP-AO to request sockets")
>> Signed-off-by: Dmitry Safonov <[email protected]>
>> ---
>> include/linux/tcp.h | 10 ++++------
>> net/ipv4/tcp_ao.c | 4 ++--
>> net/ipv4/tcp_input.c | 5 +++--
>> net/ipv4/tcp_output.c | 9 +++------
>> 4 files changed, 12 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
>> index 68f3d315d2e1..3af897b00920 100644
>> --- a/include/linux/tcp.h
>> +++ b/include/linux/tcp.h
>> @@ -155,6 +155,9 @@ struct tcp_request_sock {
>> bool req_usec_ts;
>> #if IS_ENABLED(CONFIG_MPTCP)
>> bool drop_req;
>> +#endif
>> +#ifdef CONFIG_TCP_AO
>> + bool used_tcp_ao;
>
> Why adding another 8bit field here and creating a hole ?

Sorry about it, it seems that I

(1) checked with CONFIG_MPTCP=n and it seemed like a hole
(2) was planning to unify it with other booleans under bitfileds
(3) found that some bitfileds require protection as set not only
on initialization, so in the end it either should be flags+set_bit()
or something well-thought on (that separate bitfileds won't be
able to change at the same time)
(4) decided to focus on fixing the issue, rather than doing 2 things
with the same patch

Will fix it up for v2, thanks!

>
>> #endif
>> u32 txhash;
>> u32 rcv_isn;
>> @@ -169,7 +172,6 @@ struct tcp_request_sock {
>> #ifdef CONFIG_TCP_AO
>> u8 ao_keyid;
>> u8 ao_rcv_next;
>> - u8 maclen;
>
> Just rename maclen here to used_tcp_ao ?
>
>> #endif
>> };
>>

Thanks,
Dmitry