2023-11-24 00:27:50

by Dmitry Safonov

Subject: [PATCH v2 0/7] TCP-AO fixes

Hi,

Changes from v1:
- Use tcp_can_repair_sock() helper to limit TCP_AO_REPAIR (Eric)
- Instead of hooking the listen() syscall, allow removing current/rnext keys
on TCP_LISTEN (addressing Eric's objection)
- Add sne_lock to protect snd_sne/rcv_sne
- Don't move used_tcp_ao in struct tcp_request_sock (Eric)

I've been working on TCP-AO key-rotation selftests and as a result
exercised some corner-cases that are not usually met in production.

Here are a bunch of semi-related fixes:
- Documentation typo (reported by Markus Elfring)
- Proper alignment of the TCP-AO option in the TCP header when the MAC
length is not a multiple of 4 bytes (a selftest with randomized
maclen/algorithm/etc now passes)
- 3 uAPI-restricting patches that disallow more things to userspace in
order to prevent it from shooting itself in any part of the body
- READ_ONCE()/WRITE_ONCE() for SNEs that went missing due to my human error
- Avoid storing MAC length from SYN header as SYN-ACK will use
rnext_key.maclen (drops an extra check that fails on new selftests)

Please, consider applying/pulling.

The following changes since commit d3fa86b1a7b4cdc4367acacea16b72e0a200b3d7:

Merge tag 'net-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (2023-11-23 10:40:13 -0800)

are available in the Git repository at:

[email protected]:0x7f454c46/linux.git tcp-ao-post-merge-v2

for you to fetch changes up to c5e4cecfcdc7f996acae740812d9ab2ebcd90517:

net/tcp: Don't store TCP-AO maclen on reqsk (2023-11-23 20:54:54 +0000)

----------------------------------------------------------------

Thanks,
Dmitry

Cc: David Ahern <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Francesco Ruggeri <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Salam Noureddine <[email protected]>
Cc: Simon Horman <[email protected]>
Cc: [email protected]
Cc: [email protected]

Dmitry Safonov (7):
Documentation/tcp: Fix an obvious typo
net/tcp: Consistently align TCP-AO option in the header
net/tcp: Limit TCP_AO_REPAIR to non-listen sockets
net/tcp: Allow removing current/rnext TCP-AO keys on TCP_LISTEN sockets
net/tcp: Don't add key with non-matching VRF on connected sockets
net/tcp: Add sne_lock to access SNEs
net/tcp: Don't store TCP-AO maclen on reqsk

Documentation/networking/tcp_ao.rst | 2 +-
include/linux/tcp.h | 8 +---
include/net/tcp_ao.h | 8 +++-
net/ipv4/tcp.c | 6 +++
net/ipv4/tcp_ao.c | 57 +++++++++++++++++++----------
net/ipv4/tcp_input.c | 21 +++++++++--
net/ipv4/tcp_ipv4.c | 4 +-
net/ipv4/tcp_minisocks.c | 2 +-
net/ipv4/tcp_output.c | 15 +++-----
net/ipv6/tcp_ipv6.c | 2 +-
10 files changed, 81 insertions(+), 44 deletions(-)


base-commit: d3fa86b1a7b4cdc4367acacea16b72e0a200b3d7
--
2.43.0


2023-11-24 00:28:03

by Dmitry Safonov

Subject: [PATCH v2 1/7] Documentation/tcp: Fix an obvious typo

Yep, my VIM spellchecker is not good enough for typos like this one.

Fixes: 7fe0e38bb669 ("Documentation/tcp: Add TCP-AO documentation")
Cc: Jonathan Corbet <[email protected]>
Cc: [email protected]
Reported-by: Markus Elfring <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Dmitry Safonov <[email protected]>
---
Documentation/networking/tcp_ao.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/networking/tcp_ao.rst b/Documentation/networking/tcp_ao.rst
index cfa5bf1cc542..8a58321acce7 100644
--- a/Documentation/networking/tcp_ao.rst
+++ b/Documentation/networking/tcp_ao.rst
@@ -99,7 +99,7 @@ also [6.1]::
when it is no longer considered permitted.

Linux TCP-AO will try its best to prevent you from removing a key that's
-being used, considering it a key management failure. But sine keeping
+being used, considering it a key management failure. But since keeping
an outdated key may become a security issue and as a peer may
unintentionally prevent the removal of an old key by always setting
it as RNextKeyID - a forced key removal mechanism is provided, where
--
2.43.0

2023-11-24 00:28:12

by Dmitry Safonov

Subject: [PATCH v2 3/7] net/tcp: Limit TCP_AO_REPAIR to non-listen sockets

A listen socket is not an established TCP connection, so
setsockopt(TCP_AO_REPAIR) has no effect on it.

Restrict this uAPI on listen sockets.

Fixes: faadfaba5e01 ("net/tcp: Add TCP_AO_REPAIR")
Signed-off-by: Dmitry Safonov <[email protected]>
---
net/ipv4/tcp.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 53bcc17c91e4..b1fe4eb01829 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3594,6 +3594,10 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
break;

case TCP_AO_REPAIR:
+ if (!tcp_can_repair_sock(sk)) {
+ err = -EPERM;
+ break;
+ }
err = tcp_ao_set_repair(sk, optval, optlen);
break;
#ifdef CONFIG_TCP_AO
@@ -4293,6 +4297,8 @@ int do_tcp_getsockopt(struct sock *sk, int level,
}
#endif
case TCP_AO_REPAIR:
+ if (!tcp_can_repair_sock(sk))
+ return -EPERM;
return tcp_ao_get_repair(sk, optval, optlen);
case TCP_AO_GET_KEYS:
case TCP_AO_INFO: {
--
2.43.0

2023-11-24 00:28:15

by Dmitry Safonov

Subject: [PATCH v2 2/7] net/tcp: Consistently align TCP-AO option in the header

Currently, the functions that pre-calculate the TCP header options length
use the unaligned TCP-AO header + MAC length for skb reservation, while
the functions that actually write TCP-AO options into the skb do align
the header. Nothing good can come out of this for ((maclen % 4) != 0).

Provide tcp_ao_len_aligned() helper and use it everywhere for TCP
header options space calculations.
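
For illustration only, the mismatch can be reproduced in a small
userspace sketch. It is not the kernel code: AO_HDR_LEN and the demo_*
names are hypothetical, and the 4-byte option header (kind, length,
KeyID, RNextKeyID) is assumed from RFC 5925.

```c
#include <assert.h>

/* Assumption: the TCP-AO option header is 4 bytes
 * (kind, length, KeyID, RNextKeyID), per RFC 5925.
 */
#define AO_HDR_LEN 4u

/* Unaligned option length: header plus MAC, as carried in the
 * option's length field.
 */
static unsigned int demo_ao_len(unsigned int maclen)
{
	return AO_HDR_LEN + maclen;
}

/* Space the option really occupies in the TCP header once it is
 * padded to a 32-bit boundary.
 */
static unsigned int demo_ao_len_aligned(unsigned int maclen)
{
	unsigned int aligned = (demo_ao_len(maclen) + 3u) & ~3u;

	assert(aligned % 4 == 0);
	return aligned;
}

/* Padding bytes that the writer emits but an unaligned reservation
 * would not account for.
 */
static unsigned int demo_ao_padding(unsigned int maclen)
{
	return demo_ao_len_aligned(maclen) - demo_ao_len(maclen);
}
```

With a 13-byte MAC, demo_ao_len() gives 17 while demo_ao_len_aligned()
gives 20, so a reservation based on the former would be 3 bytes short
of what the writer emits. For a 12-byte MAC both agree at 16.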

Fixes: 1e03d32bea8e ("net/tcp: Add TCP-AO sign to outgoing packets")
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp_ao.h | 6 ++++++
net/ipv4/tcp_ao.c | 4 ++--
net/ipv4/tcp_ipv4.c | 4 ++--
net/ipv4/tcp_minisocks.c | 2 +-
net/ipv4/tcp_output.c | 6 +++---
net/ipv6/tcp_ipv6.c | 2 +-
6 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index b56be10838f0..647781080613 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -62,11 +62,17 @@ static inline int tcp_ao_maclen(const struct tcp_ao_key *key)
return key->maclen;
}

+/* Use tcp_ao_len_aligned() for TCP header calculations */
static inline int tcp_ao_len(const struct tcp_ao_key *key)
{
return tcp_ao_maclen(key) + sizeof(struct tcp_ao_hdr);
}

+static inline int tcp_ao_len_aligned(const struct tcp_ao_key *key)
+{
+ return round_up(tcp_ao_len(key), 4);
+}
+
static inline unsigned int tcp_ao_digest_size(struct tcp_ao_key *key)
{
return key->digest_size;
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 7696417d0640..c8be1d526eac 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -1100,7 +1100,7 @@ void tcp_ao_connect_init(struct sock *sk)
ao_info->current_key = key;
if (!ao_info->rnext_key)
ao_info->rnext_key = key;
- tp->tcp_header_len += tcp_ao_len(key);
+ tp->tcp_header_len += tcp_ao_len_aligned(key);

ao_info->lisn = htonl(tp->write_seq);
ao_info->snd_sne = 0;
@@ -1346,7 +1346,7 @@ static int tcp_ao_parse_crypto(struct tcp_ao_add *cmd, struct tcp_ao_key *key)
syn_tcp_option_space -= TCPOLEN_MSS_ALIGNED;
syn_tcp_option_space -= TCPOLEN_TSTAMP_ALIGNED;
syn_tcp_option_space -= TCPOLEN_WSCALE_ALIGNED;
- if (tcp_ao_len(key) > syn_tcp_option_space) {
+ if (tcp_ao_len_aligned(key) > syn_tcp_option_space) {
err = -EMSGSIZE;
goto err_kfree;
}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 5f693bbd578d..0c50c5a32b84 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -690,7 +690,7 @@ static bool tcp_v4_ao_sign_reset(const struct sock *sk, struct sk_buff *skb,

reply_options[0] = htonl((TCPOPT_AO << 24) | (tcp_ao_len(key) << 16) |
(aoh->rnext_keyid << 8) | keyid);
- arg->iov[0].iov_len += round_up(tcp_ao_len(key), 4);
+ arg->iov[0].iov_len += tcp_ao_len_aligned(key);
reply->doff = arg->iov[0].iov_len / 4;

if (tcp_ao_hash_hdr(AF_INET, (char *)&reply_options[1],
@@ -978,7 +978,7 @@ static void tcp_v4_send_ack(const struct sock *sk,
(tcp_ao_len(key->ao_key) << 16) |
(key->ao_key->sndid << 8) |
key->rcv_next);
- arg.iov[0].iov_len += round_up(tcp_ao_len(key->ao_key), 4);
+ arg.iov[0].iov_len += tcp_ao_len_aligned(key->ao_key);
rep.th.doff = arg.iov[0].iov_len / 4;

tcp_ao_hash_hdr(AF_INET, (char *)&rep.opt[offset],
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index a9807eeb311c..9e85f2a0bddd 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -615,7 +615,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
ao_key = treq->af_specific->ao_lookup(sk, req,
tcp_rsk(req)->ao_keyid, -1);
if (ao_key)
- newtp->tcp_header_len += tcp_ao_len(ao_key);
+ newtp->tcp_header_len += tcp_ao_len_aligned(ao_key);
#endif
if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len)
newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index eb13a55d660c..93eef1dbbc55 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -825,7 +825,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
timestamps = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_timestamps);
if (tcp_key_is_ao(key)) {
opts->options |= OPTION_AO;
- remaining -= tcp_ao_len(key->ao_key);
+ remaining -= tcp_ao_len_aligned(key->ao_key);
}
}

@@ -915,7 +915,7 @@ static unsigned int tcp_synack_options(const struct sock *sk,
ireq->tstamp_ok &= !ireq->sack_ok;
} else if (tcp_key_is_ao(key)) {
opts->options |= OPTION_AO;
- remaining -= tcp_ao_len(key->ao_key);
+ remaining -= tcp_ao_len_aligned(key->ao_key);
ireq->tstamp_ok &= !ireq->sack_ok;
}

@@ -982,7 +982,7 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
size += TCPOLEN_MD5SIG_ALIGNED;
} else if (tcp_key_is_ao(key)) {
opts->options |= OPTION_AO;
- size += tcp_ao_len(key->ao_key);
+ size += tcp_ao_len_aligned(key->ao_key);
}

if (likely(tp->rx_opt.tstamp_ok)) {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 937a02c2e534..8c6623496dd7 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -881,7 +881,7 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
if (tcp_key_is_md5(key))
tot_len += TCPOLEN_MD5SIG_ALIGNED;
if (tcp_key_is_ao(key))
- tot_len += tcp_ao_len(key->ao_key);
+ tot_len += tcp_ao_len_aligned(key->ao_key);

#ifdef CONFIG_MPTCP
if (rst && !tcp_key_is_md5(key)) {
--
2.43.0

2023-11-24 00:28:15

by Dmitry Safonov

Subject: [PATCH v2 4/7] net/tcp: Allow removing current/rnext TCP-AO keys on TCP_LISTEN sockets

TCP_LISTEN sockets are not connected to any peer, so having
current_key/rnext_key doesn't make sense.

Userspace may stumble over this by setting the current or rnext
TCP-AO key before the listen() syscall. setsockopt(TCP_AO_DEL_KEY) doesn't
allow removing a key that is in use (in accordance with RFC 5925), so
such keys could otherwise be destroyed only together with the
listener socket, which is inconvenient.
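
A simplified userspace model of the intended behaviour, for
illustration only. All demo_* types and names are hypothetical, and
returning -EBUSY for an in-use key is an assumption of this sketch,
not something taken from the patch.

```c
#include <errno.h>
#include <stddef.h>

enum demo_state { DEMO_CLOSE, DEMO_LISTEN, DEMO_ESTABLISHED };

struct demo_key {
	int id;
};

struct demo_ao_info {
	struct demo_key *current_key;
	struct demo_key *rnext_key;
};

/* Returns 0 when the key may be deleted, -EBUSY (assumed error code)
 * when it is still referenced as current or rnext key.
 */
static int demo_del_key(enum demo_state state, struct demo_ao_info *ao,
			struct demo_key *key)
{
	if (state == DEMO_LISTEN) {
		/* A listener has no peer, so any current/rnext state is
		 * stale and is dropped before the in-use check.
		 */
		ao->current_key = NULL;
		ao->rnext_key = NULL;
	}

	if (ao->current_key == key || ao->rnext_key == key)
		return -EBUSY;

	return 0;
}
```

In this model a key that was set as current/rnext before listen() is
refused on a connected socket but can be removed on a listener.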

Fixes: 4954f17ddefc ("net/tcp: Introduce TCP_AO setsockopt()s")
Signed-off-by: Dmitry Safonov <[email protected]>
---
net/ipv4/tcp_ao.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index c8be1d526eac..bf41be6d4721 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -1818,8 +1818,16 @@ static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family,
if (!new_rnext)
return -ENOENT;
}
- if (cmd.del_async && sk->sk_state != TCP_LISTEN)
- return -EINVAL;
+ if (sk->sk_state == TCP_LISTEN) {
+ /* Cleaning up possible "stale" current/rnext keys state,
+ * that may have preserved from TCP_CLOSE, before sys_listen()
+ */
+ ao_info->current_key = NULL;
+ ao_info->rnext_key = NULL;
+ } else {
+ if (cmd.del_async)
+ return -EINVAL;
+ }

if (family == AF_INET) {
struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.addr;
--
2.43.0

2023-11-24 00:28:16

by Dmitry Safonov

Subject: [PATCH v2 6/7] net/tcp: Add sne_lock to access SNEs

RFC 5925 (6.2):
> TCP-AO emulates a 64-bit sequence number space by inferring when to
> increment the high-order 32-bit portion (the SNE) based on
> transitions in the low-order portion (the TCP sequence number).

snd_sne and rcv_sne are the upper 4 bytes of the extended SEQ number.
Unfortunately, reading two 4-byte values can't be performed
atomically (without synchronization).

Let's keep it KISS and add an rwlock - that shouldn't create much
contention as SNEs are updated every 4 GB of traffic and the atomic region
is quite small.
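
As a rough userspace model of the problem (hypothetical demo_* names,
with the wrap-around comparison modelled on the kernel's before()
helper), the SNE for a segment is inferred from a (sne, next_seq) pair,
so both values have to be sampled from the same moment in time:

```c
#include <stdint.h>

/* Wrap-around aware "a is before b" for 32-bit sequence numbers. */
static int demo_seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Infer the SNE of segment 'seq', given the SNE that belongs to
 * 'next_seq'. Mirrors the logic of tcp_ao_compute_sne() in this patch.
 */
static uint32_t demo_compute_sne(uint32_t sne, uint32_t next_seq, uint32_t seq)
{
	if (demo_seq_before(seq, next_seq)) {
		/* Older segment from before the last wrap. */
		if (seq > next_seq)
			sne--;
	} else {
		/* Newer segment that already crossed the wrap. */
		if (seq < next_seq)
			sne++;
	}
	return sne;
}

/* The emulated 64-bit sequence number from RFC 5925 (6.2). */
static uint64_t demo_extended_seq(uint32_t sne, uint32_t seq)
{
	return ((uint64_t)sne << 32) | seq;
}
```

Suppose the sequence space has just wrapped, so the consistent state is
sne = 6, next_seq = 0x10. A reader that sees the new next_seq together
with the stale sne = 5 computes SNE 5 for seq = 0x20 instead of 6, and
the MAC is then calculated over the wrong extended sequence number.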

Fixes: 64382c71a557 ("net/tcp: Add TCP-AO SNE support")
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp_ao.h | 2 +-
net/ipv4/tcp_ao.c | 34 +++++++++++++++++++++-------------
net/ipv4/tcp_input.c | 16 ++++++++++++++--
3 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 647781080613..beea3e6b39e2 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -123,6 +123,7 @@ struct tcp_ao_info {
*/
u32 snd_sne;
u32 rcv_sne;
+ rwlock_t sne_lock;
refcount_t refcnt; /* Protects twsk destruction */
struct rcu_head rcu;
};
@@ -212,7 +213,6 @@ enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
const struct sk_buff *skb, unsigned short int family,
const struct request_sock *req, int l3index,
const struct tcp_ao_hdr *aoh);
-u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq);
struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk, int l3index,
const union tcp_ao_addr *addr,
int family, int sndid, int rcvid);
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 2d000e275ce7..74db80aeeef3 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -230,6 +230,7 @@ static struct tcp_ao_info *tcp_ao_alloc_info(gfp_t flags)
return NULL;
INIT_HLIST_HEAD(&ao->head);
refcount_set(&ao->refcnt, 1);
+ rwlock_init(&ao->sne_lock);

return ao;
}
@@ -472,10 +473,8 @@ static int tcp_ao_hash_pseudoheader(unsigned short int family,
return -EAFNOSUPPORT;
}

-u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq)
+static u32 tcp_ao_compute_sne(u32 sne, u32 next_seq, u32 seq)
{
- u32 sne = next_sne;
-
if (before(seq, next_seq)) {
if (seq > next_seq)
sne--;
@@ -483,7 +482,6 @@ u32 tcp_ao_compute_sne(u32 next_sne, u32 next_seq, u32 seq)
if (seq < next_seq)
sne++;
}
-
return sne;
}

@@ -763,14 +761,15 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
*keyid = (*key)->rcvid;
} else {
struct tcp_ao_key *rnext_key;
- u32 snd_basis;
+ const u32 *snd_basis;
+ unsigned long flags;

if (sk->sk_state == TCP_TIME_WAIT) {
ao_info = rcu_dereference(tcp_twsk(sk)->ao_info);
- snd_basis = tcp_twsk(sk)->tw_snd_nxt;
+ snd_basis = &tcp_twsk(sk)->tw_snd_nxt;
} else {
ao_info = rcu_dereference(tcp_sk(sk)->ao_info);
- snd_basis = tcp_sk(sk)->snd_una;
+ snd_basis = &tcp_sk(sk)->snd_una;
}
if (!ao_info)
return -ENOENT;
@@ -781,8 +780,10 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
*traffic_key = snd_other_key(*key);
rnext_key = READ_ONCE(ao_info->rnext_key);
*keyid = rnext_key->rcvid;
- *sne = tcp_ao_compute_sne(READ_ONCE(ao_info->snd_sne),
- snd_basis, seq);
+ read_lock_irqsave(&ao_info->sne_lock, flags);
+ *sne = tcp_ao_compute_sne(ao_info->snd_sne,
+ READ_ONCE(*snd_basis), seq);
+ read_unlock_irqrestore(&ao_info->sne_lock, flags);
}
return 0;
}
@@ -795,6 +796,7 @@ int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb,
struct tcp_sock *tp = tcp_sk(sk);
struct tcp_ao_info *ao;
void *tkey_buf = NULL;
+ unsigned long flags;
u8 *traffic_key;
u32 sne;

@@ -816,8 +818,10 @@ int tcp_ao_transmit_skb(struct sock *sk, struct sk_buff *skb,
tp->af_specific->ao_calc_key_sk(key, traffic_key,
sk, ao->lisn, disn, true);
}
- sne = tcp_ao_compute_sne(READ_ONCE(ao->snd_sne), READ_ONCE(tp->snd_una),
- ntohl(th->seq));
+ read_lock_irqsave(&ao->sne_lock, flags);
+ sne = tcp_ao_compute_sne(ao->snd_sne,
+ READ_ONCE(tp->snd_una), ntohl(th->seq));
+ read_unlock_irqrestore(&ao->sne_lock, flags);
tp->af_specific->calc_ao_hash(hash_location, key, sk, skb, traffic_key,
hash_location - (u8 *)th, sne);
kfree(tkey_buf);
@@ -938,8 +942,9 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,

/* Fast-path */
if (likely((1 << sk->sk_state) & TCP_AO_ESTABLISHED)) {
- enum skb_drop_reason err;
struct tcp_ao_key *current_key;
+ enum skb_drop_reason err;
+ unsigned long flags;

/* Check if this socket's rnext_key matches the keyid in the
* packet. If not we lookup the key based on the keyid
@@ -956,8 +961,11 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
if (unlikely(th->syn && !th->ack))
goto verify_hash;

- sne = tcp_ao_compute_sne(info->rcv_sne, tcp_sk(sk)->rcv_nxt,
+ read_lock_irqsave(&info->sne_lock, flags);
+ sne = tcp_ao_compute_sne(info->rcv_sne,
+ READ_ONCE(tcp_sk(sk)->rcv_nxt),
ntohl(th->seq));
+ read_unlock_irqrestore(&info->sne_lock, flags);
/* Established socket, traffic key are cached */
traffic_key = rcv_other_key(key);
err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index bcb55d98004c..fc3c27ce2b73 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3582,8 +3582,14 @@ static void tcp_snd_sne_update(struct tcp_sock *tp, u32 ack)

ao = rcu_dereference_protected(tp->ao_info,
lockdep_sock_is_held((struct sock *)tp));
- if (ao && ack < tp->snd_una)
+ if (ao && ack < tp->snd_una) {
+ unsigned long flags;
+
+ write_lock_irqsave(&ao->sne_lock, flags);
ao->snd_sne++;
+ tp->snd_una = ack;
+ write_unlock_irqrestore(&ao->sne_lock, flags);
+ }
#endif
}

@@ -3608,8 +3614,14 @@ static void tcp_rcv_sne_update(struct tcp_sock *tp, u32 seq)

ao = rcu_dereference_protected(tp->ao_info,
lockdep_sock_is_held((struct sock *)tp));
- if (ao && seq < tp->rcv_nxt)
+ if (ao && seq < tp->rcv_nxt) {
+ unsigned long flags;
+
+ write_lock_irqsave(&ao->sne_lock, flags);
ao->rcv_sne++;
+ WRITE_ONCE(tp->rcv_nxt, seq);
+ write_unlock_irqrestore(&ao->sne_lock, flags);
+ }
#endif
}

--
2.43.0

2023-11-24 00:28:26

by Dmitry Safonov

Subject: [PATCH v2 5/7] net/tcp: Don't add key with non-matching VRF on connected sockets

If the connection was established, don't allow adding TCP-AO keys that
don't match the peer. Currently, there are checks for IP address
matching, but the L3 index check is missing. Add it to stop userspace
from shooting itself somewhere.
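
The state test added here uses the common one-bit-per-state mask idiom.
A minimal userspace sketch, assuming state numbers that mirror the
kernel's TCP_* enum (the DEMO_* names and demo_vrf_key_add_allowed()
are hypothetical):

```c
#include <stdbool.h>

/* Assumption: values mirror the kernel's TCP state enum. */
enum demo_tcp_state {
	DEMO_TCP_ESTABLISHED = 1,
	DEMO_TCP_SYN_SENT    = 2,
	DEMO_TCP_CLOSE       = 7,
	DEMO_TCP_LISTEN      = 10,
};

/* One bit per state, so a set of states becomes a single mask. */
#define DEMO_TCPF(state) (1u << (state))

/* Judging by the hunk's context, a key that carries an L3 index is
 * only accepted while the socket is not yet connected.
 */
static bool demo_vrf_key_add_allowed(enum demo_tcp_state state)
{
	const unsigned int allowed = DEMO_TCPF(DEMO_TCP_LISTEN) |
				     DEMO_TCPF(DEMO_TCP_CLOSE);

	return (DEMO_TCPF(state) & allowed) != 0;
}
```

The single mask test replaces a chain of state comparisons and makes it
cheap to extend the set of permitted states later.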

Fixes: 248411b8cb89 ("net/tcp: Wire up l3index to TCP-AO")
Signed-off-by: Dmitry Safonov <[email protected]>
---
net/ipv4/tcp_ao.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index bf41be6d4721..2d000e275ce7 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -1608,6 +1608,9 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family,
if (!dev || !l3index)
return -EINVAL;

+ if (!((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)))
+ return -EINVAL;
+
/* It's still possible to bind after adding keys or even
* re-bind to a different dev (with CAP_NET_RAW).
* So, no reason to return error here, rather try to be
--
2.43.0

2023-11-24 00:28:33

by Dmitry Safonov

Subject: [PATCH v2 7/7] net/tcp: Don't store TCP-AO maclen on reqsk

This extra check doesn't work for a handshake when the SYN segment has
(current_key.maclen != rnext_key.maclen). It could be amended to
preserve rnext_key.maclen instead of current_key.maclen, but that
requires a lookup on the listen socket.

Originally, this extra maclen check was introduced just because it was
cheap. Drop it and convert tcp_request_sock::maclen into the boolean
tcp_request_sock::used_tcp_ao.
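
To make the failing case concrete, here is a small userspace sketch
with hypothetical demo_* names. The SYN is signed with the peer's
current key, while the SYN-ACK is signed with the key the peer asked
for via RNextKeyID, and the two keys may have different MAC lengths.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_key {
	unsigned char id;
	unsigned char maclen;
};

/* Old behaviour: the MAC length seen in the SYN had to match the
 * MAC length of the key chosen for the SYN-ACK.
 */
static bool demo_old_synack_ok(const struct demo_key *synack_key,
			       unsigned char syn_maclen)
{
	return synack_key != NULL && synack_key->maclen == syn_maclen;
}

/* New behaviour: it is enough that the requested key exists. */
static bool demo_new_synack_ok(const struct demo_key *synack_key)
{
	return synack_key != NULL;
}
```

With a 12-byte MAC in the SYN and a 16-byte MAC on the requested key,
the old predicate refuses to send the SYN-ACK and the new one allows it.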

Fixes: 06b22ef29591 ("net/tcp: Wire TCP-AO to request sockets")
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/linux/tcp.h | 8 ++------
net/ipv4/tcp_ao.c | 4 ++--
net/ipv4/tcp_input.c | 5 +++--
net/ipv4/tcp_output.c | 9 +++------
4 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 68f3d315d2e1..b646b574b060 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -169,7 +169,7 @@ struct tcp_request_sock {
#ifdef CONFIG_TCP_AO
u8 ao_keyid;
u8 ao_rcv_next;
- u8 maclen;
+ bool used_tcp_ao;
#endif
};

@@ -180,14 +180,10 @@ static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)

static inline bool tcp_rsk_used_ao(const struct request_sock *req)
{
- /* The real length of MAC is saved in the request socket,
- * signing anything with zero-length makes no sense, so here is
- * a little hack..
- */
#ifndef CONFIG_TCP_AO
return false;
#else
- return tcp_rsk(req)->maclen != 0;
+ return tcp_rsk(req)->used_tcp_ao;
#endif
}

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 74db80aeeef3..cfa264c320a7 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -855,7 +855,7 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
const struct tcp_ao_hdr *aoh;
struct tcp_ao_key *key;

- treq->maclen = 0;
+ treq->used_tcp_ao = false;

if (tcp_parse_auth_options(th, NULL, &aoh) || !aoh)
return;
@@ -867,7 +867,7 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,

treq->ao_rcv_next = aoh->keyid;
treq->ao_keyid = aoh->rnext_keyid;
- treq->maclen = tcp_ao_maclen(key);
+ treq->used_tcp_ao = true;
}

static enum skb_drop_reason
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index fc3c27ce2b73..0135a6c6f600 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -7194,11 +7194,12 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh))
goto drop_and_release; /* Invalid TCP options */
if (aoh) {
- tcp_rsk(req)->maclen = aoh->length - sizeof(struct tcp_ao_hdr);
+ tcp_rsk(req)->used_tcp_ao = true;
tcp_rsk(req)->ao_rcv_next = aoh->keyid;
tcp_rsk(req)->ao_keyid = aoh->rnext_keyid;
+
} else {
- tcp_rsk(req)->maclen = 0;
+ tcp_rsk(req)->used_tcp_ao = false;
}
#endif
tcp_rsk(req)->snt_isn = isn;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 93eef1dbbc55..f5ef15e1d9ac 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3720,7 +3720,6 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
if (tcp_rsk_used_ao(req)) {
#ifdef CONFIG_TCP_AO
struct tcp_ao_key *ao_key = NULL;
- u8 maclen = tcp_rsk(req)->maclen;
u8 keyid = tcp_rsk(req)->ao_keyid;

ao_key = tcp_sk(sk)->af_specific->ao_lookup(sk, req_to_sk(req),
@@ -3730,13 +3729,11 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
* for another peer-matching key, but the peer has requested
* ao_keyid (RFC5925 RNextKeyID), so let's keep it simple here.
*/
- if (unlikely(!ao_key || tcp_ao_maclen(ao_key) != maclen)) {
- u8 key_maclen = ao_key ? tcp_ao_maclen(ao_key) : 0;
-
+ if (unlikely(!ao_key)) {
rcu_read_unlock();
kfree_skb(skb);
- net_warn_ratelimited("TCP-AO: the keyid %u with maclen %u|%u from SYN packet is not present - not sending SYNACK\n",
- keyid, maclen, key_maclen);
+ net_warn_ratelimited("TCP-AO: the keyid %u from SYN packet is not present - not sending SYNACK\n",
+ keyid);
return NULL;
}
key.ao_key = ao_key;
--
2.43.0

2023-11-27 11:42:22

by Paolo Abeni

Subject: Re: [PATCH v2 6/7] net/tcp: Add sne_lock to access SNEs

On Fri, 2023-11-24 at 00:27 +0000, Dmitry Safonov wrote:
> RFC 5925 (6.2):
> > TCP-AO emulates a 64-bit sequence number space by inferring when to
> > increment the high-order 32-bit portion (the SNE) based on
> > transitions in the low-order portion (the TCP sequence number).
>
> snd_sne and rcv_sne are the upper 4 bytes of the extended SEQ number.
> Unfortunately, reading two 4-byte values can't be performed
> atomically (without synchronization).
>
> Let's keep it KISS and add an rwlock - that shouldn't create much
> contention as SNEs are updated every 4 GB of traffic and the atomic region
> is quite small.
>
> Fixes: 64382c71a557 ("net/tcp: Add TCP-AO SNE support")
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> include/net/tcp_ao.h | 2 +-
> net/ipv4/tcp_ao.c | 34 +++++++++++++++++++++-------------
> net/ipv4/tcp_input.c | 16 ++++++++++++++--
> 3 files changed, 36 insertions(+), 16 deletions(-)
>
> diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
> index 647781080613..beea3e6b39e2 100644
> --- a/include/net/tcp_ao.h
> +++ b/include/net/tcp_ao.h
> @@ -123,6 +123,7 @@ struct tcp_ao_info {
> */
> u32 snd_sne;
> u32 rcv_sne;
> + rwlock_t sne_lock;

RW locks are problematic in the networking code, see commit
dbca1596bbb08318f5e3b3b99f8ca0a0d3830a65.

I think you can use a plain spinlock here, as both read and write
appear to be in the fastpath (?!?)

> @@ -781,8 +780,10 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
> *traffic_key = snd_other_key(*key);
> rnext_key = READ_ONCE(ao_info->rnext_key);
> *keyid = rnext_key->rcvid;
> - *sne = tcp_ao_compute_sne(READ_ONCE(ao_info->snd_sne),
> - snd_basis, seq);
> + read_lock_irqsave(&ao_info->sne_lock, flags);
> + *sne = tcp_ao_compute_sne(ao_info->snd_sne,
> + READ_ONCE(*snd_basis), seq);
> + read_unlock_irqrestore(&ao_info->sne_lock, flags);

Why are you using the irqsave variant? bh should suffice.

Cheers,

Paolo

2023-11-27 14:42:59

by Dmitry Safonov

Subject: Re: [PATCH v2 6/7] net/tcp: Add sne_lock to access SNEs

On 11/27/23 11:41, Paolo Abeni wrote:
> On Fri, 2023-11-24 at 00:27 +0000, Dmitry Safonov wrote:
>> RFC 5925 (6.2):
>>> TCP-AO emulates a 64-bit sequence number space by inferring when to
>>> increment the high-order 32-bit portion (the SNE) based on
>>> transitions in the low-order portion (the TCP sequence number).
>>
>> snd_sne and rcv_sne are the upper 4 bytes of the extended SEQ number.
>> Unfortunately, reading two 4-byte values can't be performed
>> atomically (without synchronization).
>>
>> Let's keep it KISS and add an rwlock - that shouldn't create much
>> contention as SNEs are updated every 4 GB of traffic and the atomic region
>> is quite small.
>>
>> Fixes: 64382c71a557 ("net/tcp: Add TCP-AO SNE support")
>> Signed-off-by: Dmitry Safonov <[email protected]>
>> ---
>> include/net/tcp_ao.h | 2 +-
>> net/ipv4/tcp_ao.c | 34 +++++++++++++++++++++-------------
>> net/ipv4/tcp_input.c | 16 ++++++++++++++--
>> 3 files changed, 36 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
>> index 647781080613..beea3e6b39e2 100644
>> --- a/include/net/tcp_ao.h
>> +++ b/include/net/tcp_ao.h
>> @@ -123,6 +123,7 @@ struct tcp_ao_info {
>> */
>> u32 snd_sne;
>> u32 rcv_sne;
>> + rwlock_t sne_lock;
>
> RW locks are problematic in the networking code, see commit
> dbca1596bbb08318f5e3b3b99f8ca0a0d3830a65.

Thanks, was not aware of this pitfall.

> I think you can use a plain spinlock here, as both read and write
> appear to be in the fastpath (?!?)

Yeah, I wanted to avoid hurting RX concurrency here, as writing happens
only once per 4 GB. I'll make another attempt to prevent that in v3.

>> @@ -781,8 +780,10 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
>> *traffic_key = snd_other_key(*key);
>> rnext_key = READ_ONCE(ao_info->rnext_key);
>> *keyid = rnext_key->rcvid;
>> - *sne = tcp_ao_compute_sne(READ_ONCE(ao_info->snd_sne),
>> - snd_basis, seq);
>> + read_lock_irqsave(&ao_info->sne_lock, flags);
>> + *sne = tcp_ao_compute_sne(ao_info->snd_sne,
>> + READ_ONCE(*snd_basis), seq);
>> + read_unlock_irqrestore(&ao_info->sne_lock, flags);
>
> Why are you using the irqsave variant? bh should suffice.

It should, yes :)

>
> Cheers,
>
> Paolo
>

Thanks,
Dmitry