2023-09-12 07:53:08

by Dmitry Safonov

Subject: [PATCH v11 net-next 00/23] net/tcp: Add TCP-AO support

Hi,

This is version 11 of TCP-AO support. The changes from v10 address
Simon's review comments.

There's one Sparse warning introduced by tcp_sigpool_start():
__cond_acquires() currently seems to be broken. I've described
the reasoning for it in the v9 cover letter. Also, checkpatch.pl warnings
were addressed, but I've left the ones that are more a matter of personal
preference (e.g. the 80-column limit). Please ping me if you feel
strongly about any of them.

The following changes since commit 73be7fb14e83d24383f840a22f24d3ed222ca319:

Merge tag 'net-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (2023-09-07 18:33:07 -0700)

are available in the Git repository at:

[email protected]:0x7f454c46/linux.git tcp-ao-v11

for you to fetch changes up to 43b9bbf84d46412054f8a30a0085b7af15855703:

Documentation/tcp: Add TCP-AO documentation (2023-09-11 20:16:20 +0100)

----------------------------------------------------------------

And another branch with selftests, that will be sent later separately:
[email protected]:0x7f454c46/linux.git tcp-ao-v11-with-selftests

Thanks for your time and reviews,
Dmitry

--- Changelog ---

Changes from v10:
- Make seq (u32) in tcp_ao_prepare_reset() and declare the argument
in "net/tcp: Add TCP-AO SNE support", where it gets used (Simon)
- Fix rebase artifact in tcp_v6_reqsk_send_ack(), which caused a
compile error on a patch in the middle of the series (Simon)
- Another rebase artifact in tcp_v6_reqsk_send_ack() that caused the
keyid requested by the peer to be ignored on IPv6 reqsk ACKs (Simon)

Version 10: https://lore.kernel.org/all/[email protected]/T/#u

Changes from v9:
- Read sk_family only once in tcp_ao_ignore_icmp() (Eric)
- Don't WARN_ON_ONCE() on unexpected sk_family (Eric)
- Call tcp_ao_ignore_icmp() outside bh_lock_sock() (Eric)
- Make struct sock *sk `const' in tcp_ao_ignore_icmp() (Eric)
- WRITE_ONCE() for tcp_md5_sigpool_id (Eric)
- Cc Mohammad, who wants to contribute with PPC testing, reviews, etc

Version 9: https://lore.kernel.org/all/[email protected]/T/#u

Changes from v8:
- Based on net-next
- Now doing git request-pull, rather than GitHub URLs
- Fix tmp_key buffer leak, introduced in v7 (Simon)
- More checkpatch.pl warning fixes (even to the code that existed but
was touched)
- More reverse Xmas tree declarations (Simon)
- static code analysis fixes
- Removed TCP-AO key port matching code
- Removed `inline' for static functions in .c files to make
netdev/source_inline happy (I didn't know it's a thing)
- Moved tcp_ao_do_lookup() to a commit that uses it (Simon)
- __tcp_ao_key_cmp(): prefixlen is bits, but memcmp() uses bytes
- Added TCP port matching limitation to Documentation/networking/tcp_ao.rst
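
The bits-versus-bytes point in the __tcp_ao_key_cmp() item above can be
illustrated with a small userspace sketch. This is not the kernel code:
addr_prefix_equal() and its parameters are hypothetical names chosen for
illustration. It only shows that a prefix length in bits has to be split
into whole bytes (for memcmp()) and a masked remainder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the kernel's __tcp_ao_key_cmp(): returns true
 * when the first prefixlen bits of two addresses are equal.
 * addr_len is the address size in bytes (4 for IPv4, 16 for IPv6).
 */
static bool addr_prefix_equal(const uint8_t *a, const uint8_t *b,
			      size_t addr_len, unsigned int prefixlen)
{
	size_t full_bytes = prefixlen / 8;	/* whole bytes covered by the prefix */
	unsigned int rem_bits = prefixlen % 8;	/* leftover bits in the next byte */

	assert(prefixlen <= addr_len * 8);

	/* memcmp() takes a length in bytes, so convert from bits first */
	if (memcmp(a, b, full_bytes) != 0)
		return false;

	if (rem_bits) {
		/* Keep only the rem_bits most significant bits of the byte */
		uint8_t mask = (uint8_t)(0xff << (8 - rem_bits));

		return ((a[full_bytes] ^ b[full_bytes]) & mask) == 0;
	}
	return true;
}
```

Passing prefixlen directly to memcmp() would compare eight times too many
bytes, which is the class of mistake the changelog item describes.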

Version 8: https://lore.kernel.org/all/[email protected]/T/#u

Changes from v7:
- Fixed copy'n'paste typo in unsigned-md5.c selftest output
- Fix build error in tcp_v6_send_reset() (kernel test robot <[email protected]>)
- Make CONFIG_TCP_AO imply IPV6 != m
- Cleanup EXPORT_SYMBOL*() as they aren't needed with IPV6 != m
- Used scratch area instead of on-stack buffer for scatter-gather list
in tcp_v{4,6}_ao_calc_key(). Fixes CONFIG_VMAP_STACK=y + CONFIG_DEBUG_SG=y
- Allocated digest_size'd buffers for traffic keys in tcp_ao_key instead
of maximum-sized buffers of TCP_AO_MAX_HASH_SIZE. That will save
a little space per key and also potentially allow algorithms with
digest size > TCP_AO_MAX_HASH_SIZE.
- Removed TCP_AO_MAX_HASH_SIZE and used kmalloc(GFP_ATOMIC) instead of
on-stack hash buffer.
- Don't treat fd=0 as invalid in selftests
- Make TCP-AO selftests work with CONFIG_CRYPTO_FIPS=y
- Don't tcp_ao_compute_sne() for snd_sne on twsk: it's redundant as
no data can be sent on twsk
- Get rid of {snd,rcv}_sne_seq: use snd_nxt/snd_una or rcv_nxt instead
- {rcv,snd}_sne and tcp_ao_compute_sne() now are introduced in
"net/tcp: Add TCP-AO SNE support" patch
- trivial copy_to_sockptr() fixup for tcp_ao_get_repair() - it could
try copying bigger struct than the kernel one (embarrassing!)
- Added Documentation/networking/tcp_ao.rst that describes the uAPI,
has a FAQ on RFC 5925 and covers implementation details of Linux TCP-AO
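
The digest_size'd traffic key buffers mentioned in the list above can be
sketched in userspace as a flexible array member holding two buffers of
digest_size bytes each. The struct and function names below are
hypothetical and heavily trimmed; only the layout idea (receive key first,
send key at offset digest_size) follows the helpers in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for struct tcp_ao_key */
struct ao_key_sketch {
	unsigned int digest_size;
	uint8_t traffic_keys[];	/* two buffers of digest_size bytes each */
};

/* Total allocation size: header plus both traffic key buffers */
static size_t ao_key_sketch_size(unsigned int digest_size)
{
	return sizeof(struct ao_key_sketch) + 2 * (size_t)digest_size;
}

/* Allocate a zeroed key sized for the given digest; NULL on failure */
static struct ao_key_sketch *ao_key_sketch_alloc(unsigned int digest_size)
{
	struct ao_key_sketch *key;

	assert(digest_size > 0);
	key = calloc(1, ao_key_sketch_size(digest_size));
	if (key)
		key->digest_size = digest_size;
	return key;
}

/* The receive-side traffic key sits at the start of the array */
static uint8_t *sketch_rcv_key(struct ao_key_sketch *key)
{
	return key->traffic_keys;
}

/* The send-side traffic key follows it, digest_size bytes later */
static uint8_t *sketch_snd_key(struct ao_key_sketch *key)
{
	return key->traffic_keys + key->digest_size;
}
```

With this layout a key for a 20-byte digest carries 40 bytes of traffic key
storage rather than two maximum-sized buffers.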

Version 7: https://lore.kernel.org/all/[email protected]/T/#u

Changes from v6:
- Some more trivial build warnings fixups (kernel test robot <[email protected]>)
- Added TCP_AO_REPAIR setsockopt(), getsockopt()
- Allowed TCP_AO_* setsockopts if (tp->repair) is on
- Added selftests for TCP_AO_REPAIR, that also check incorrect
ISNs/SNEs, which result in a broken TCP-AO connection - that verifies
that both Initial Sequence Numbers and Sequence Number Extension are
part of MAC generation
- Using TCP_AO_REPAIR, added a selftest for SEQ number rollover,
checking that SNE was incremented, the connection is alive post-rollover
and no TCP segments with a wrong signature arrived
- Wrote a selftest for RST segments: both active reset (goes through
transmit_skb()) and passive reset (goes through tcp_v{4,6}_send_reset()).
- Refactored and made readable tcp_v{4,6}_send_reset(), also adding
support for TCP_LISTEN/TCP_NEW_SYN_RECV
- Dropped per-CPU ahash requests allocations in favor of Herbert's
clone-tfm crypto API
- Added Donald Cassidy to Cc as he's interested in getting it into RHEL.
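
The SEQ rollover selftest mentioned above relies on the Sequence Number
Extension (SNE) being bumped when the 32-bit sequence number wraps. A
minimal userspace sketch of that bookkeeping follows. The names sne_state,
sne_advance() and sne_full_seq() are hypothetical; the wrap test mirrors
the "new value is numerically smaller than the old one" check that
tcp_input.c performs when snd_una or rcv_nxt move forward.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if seq1 precedes seq2 in 32-bit modular sequence space */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* Hypothetical per-direction state: low 32 bits plus the extension */
struct sne_state {
	uint32_t next_seq;	/* e.g. rcv_nxt or snd_una */
	uint32_t sne;		/* upper 32 bits of the extended sequence */
};

/* Move next_seq forward and bump the SNE if the 32-bit value wrapped */
static void sne_advance(struct sne_state *st, uint32_t new_seq)
{
	/* This sketch only tracks forward movement */
	assert(!seq_before(new_seq, st->next_seq));

	/* Ahead in sequence space yet numerically smaller: it wrapped */
	if (new_seq < st->next_seq)
		st->sne++;
	st->next_seq = new_seq;
}

/* 64-bit extended sequence number: SNE in the high half */
static uint64_t sne_full_seq(const struct sne_state *st)
{
	return ((uint64_t)st->sne << 32) | st->next_seq;
}
```

Since the SNE is part of MAC generation, a peer that misses the increment
would compute a different MAC after the rollover and segments would fail
verification.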

Version 6: https://lore.kernel.org/all/[email protected]/T/#u

iperf[3] benchmarks for version 6:
                           v6.4-rc1        TCP-AO-v6
TCP                        43.9 Gbits/sec  43.5 Gbits/sec
TCP-MD5                    2.20 Gbits/sec  2.25 Gbits/sec
TCP-AO(hmac(sha1))                         2.53 Gbits/sec
TCP-AO(hmac(sha512))                       1.67 Gbits/sec
TCP-AO(hmac(sha384))                       1.77 Gbits/sec
TCP-AO(hmac(sha224))                       1.29 Gbits/sec
TCP-AO(hmac(sha3-512))                     481 Mbits/sec
TCP-AO(hmac(md5))                          2.07 Gbits/sec
TCP-AO(hmac(rmd160))                       1.01 Gbits/sec
TCP-AO(cmac(aes128))                       2.11 Gbits/sec

Changes from v5:
- removed check for TCP_AO_KEYF_IFINDEX in delete command:
VRF might have been destroyed, there still needs to be a way to delete
keys that were bound to that l3intf (should tcp_v{4,6}_parse_md5_keys()
avoid the same check as well?)
- corrected copy'n'paste typo in tcp_ao_info_cmd() (assign ao_info->rnext_key)
- simplified a bit tcp_ao_copy_mkts_to_user(); added more UAPI checks
for getsockopt(TCP_AO_GET_KEYS)
- More UAPI selftests in setsockopt-closed: 29 => 120
- ported TCP-AO patches on Herbert's clone-tfm changes
- adjusted iperf patch for TCP-AO UAPI changes from version 5
- added measures for TCP-AO with tcp_sigpool & clone_tfm backends

Version 5: https://lore.kernel.org/all/[email protected]/T/#u

Changes from v4:
- Renamed tcp_ao_matched_key() => tcp_ao_established_key()
- Missed `static` in function definitions
(kernel test robot <[email protected]>)
- Fixed CONFIG_IPV6=m build
- Unexported tcp_md5_*_sigpool() functions
- Cleaned up tcp_ao.h: undeclared tcp_ao_cache_traffic_keys(),
tcp_v4_ao_calc_key_skb(); removed tcp_v4_inbound_ao_hash()
- Marked "net/tcp: Prepare tcp_md5sig_pool for TCP-AO" as a [draft] patch
- getsockopt() now returns TCP-AO per-key counters
- Another getsockopt() now returns per-ao_info stats: counters
and accept_icmps flag state
- Wired up getsockopt() returning counters to selftests
- Fixed a porting mistake: TCP-AO hash in some cases was written in TCP
header without accounting for MAC length of the key, overwriting skb
shared info
- Fail adding a key with L3 ifindex when !TCP_AO_KEYF_IFINDEX, instead
of ignoring tcpa_ifindex (stricter UAPI check)
- Added more test-cases to setsockopt-closed.c selftest
- tcp_ao_hash_skb_data() was a copy'n'paste of tcp_md5_hash_skb_data();
share it now under tcp_sigpool_hash_skb_data()
- tcp_ao_mkt_overlap_v{4,6}() deleted as they just re-invented
tcp_ao_do_lookup(). That fixes an issue with multiple IPv4-mapped-IPv6
keys for different peers on a listening socket.
- getsockopt() now is tested to return correct VRF number for a key
- TCP-AO and TCP-MD5 interaction in non/default VRFs: added +19 selftests,
made them SKIP when CONFIG_VRF=n
- unsigned-md5 selftest now checks both scenarios:
(1) adding TCP-AO key _after_ TCP-MD5 key
(2) adding TCP-MD5 key _after_ TCP-AO key
- Added a ratelimited warning if TCP-AO key.ifindex doesn't match
sk->sk_bound_dev_if - that will warn a user for potential VRF issues
- tcp_v{4,6}_parse_md5_keys() now allows adding TCP-MD5 key with
ifindex=0 and TCP_MD5SIG_FLAG_IFINDEX together with TCP-AO key from
another VRF
- Add TCP_AO_CMDF_AO_REQUIRED, which makes a socket TCP-AO only,
rejecting TCP-MD5 keys or any unsigned TCP segments
- Remove `tcpa_' prefix for UAPI structure members
- UAPI cleanup: I've separated & renamed per-socket settings
(such as ao_info flags + current/rnext set) from per-key changes:
TCP_AO => TCP_AO_ADD_KEY
TCP_AO_DEL => TCP_AO_DEL_KEY
TCP_AO_GET => TCP_AO_GET_KEYS
TCP_AO_MOD => TCP_AO_INFO, the structure is now valid for both
getsockopt() and setsockopt().
- tcp_ao_current_rnext() was split up in order to fail earlier when
sndid/rcvid specified can't be set, before anything was changed in ao_info
- fetch current_key before dumping TCP-AO keys in getsockopt(TCP_AO_GET_KEYS):
it may race with changing current_key by RX, which in result might
produce a dump with no current_key for userspace.
- instead of TCP_AO_CMDF_* flags, used bitfields: the flags weren't
shared between all TCP_AO_{ADD,GET,DEL}_KEY{,S}, so bitfields are more
descriptive here
- use READ_ONCE()/WRITE_ONCE() for current_key and rnext_key more
consistently; document in comment the rules for accessing them
- selftests: check all setsockopts()/getsockopts() support extending
option structs

Version 4: https://lore.kernel.org/all/[email protected]/T/#u

Changes from v3:
- TCP_MD5 dynamic static key enable/disable patches merged separately [4]
- crypto_pool patches were nacked [5], so instead this patch set extends
TCP-MD5-sigpool to be used for TCP-AO as well as for TCP-MD5
- Added missing `static' for tcp_v6_ao_calc_key()
(kernel test robot <[email protected]>)
- Removed CONFIG_TCP_AO default=y and added "If unsure, say N."
- Don't leak ao_info and don't create an unsigned TCP socket if there was
a TCP-AO key during handshake, but it was removed from listening socket
while the connection was being established
- Migrate to use static_key_fast_inc_not_disabled() and check return
code of static_branch_inc()
- Change some return codes to EAFNOSUPPORT for error paths where
family is neither AF_INET nor AF_INET6
- setsockopt()s on a closed/listen socket might have created stray ao_info,
remove it if connect() is called with a correct TCP-MD5 key, the same
for the reverse situation: remove md5sig_info straight away from the
socket if it's going to be TCP-AO connection
- IPv4-mapped-IPv6 addresses + selftest in fcnal-test.sh (by Salam)
- fix using uninitialized sisn/disn from stack - it would only make
non-SYN packets fail verification on a listen socket, which are not
expected anyway (kernel test robot <[email protected]>)
- implicit padding in UAPI TCP-AO structures converted to explicit
(spotted-by David Laight)
- Some selftests missed zero-initializers for uapi structs on stack
- Removed tcp_ao_do_lookup_rcvid() and tcp_ao_do_lookup_sndid() in
favor of unified tcp_ao_matched_key()
- Disallowed setting current/rnext keys on listen sockets - that wasn't
supported and didn't affect anything, cleanup for the UAPI
- VRFs support for TCP-AO

Version 3: https://lore.kernel.org/all/[email protected]/T/#u

Changes from v2:
- Added more missing `static' declarations for local functions
(kernel test robot <[email protected]>)
- Building now with CONFIG_TCP_AO=n and CONFIG_TCP_MD5SIG=n
(kernel test robot <[email protected]>)
- Now setsockopt(TCP_AO) is allowed when it's TCP_LISTEN or TCP_CLOSE
state OR the key added is not the first key on a socket (by Salam)
- CONFIG_TCP_AO does not depend on CONFIG_TCP_MD5SIG anymore
- Don't leak tcp_md5_needed static branch counter when TCP-MD5 key
is modified/changed
- TCP-AO lookups are dynamically enabled/disabled with static key when
there is ao_info in the system (and when it is destroyed)
- Wired SYN cookies up to TCP-AO (by Salam)
- Fix verification for possible re-transmitted SYN packets (by Salam)
- use sockopt_lock_sock() instead of lock_sock()
(from v6.1 rebase, commit d51bbff2aba7)
- use sockptr_t in getsockopt(TCP_AO_GET)
(from v6.1 rebase, commit 34704ef024ae)
- Fixed reallocating crypto_pool's scratch area by IPI while
it was held via crypto_pool_get() by another CPU
- selftests on older kernels (or with CONFIG_TCP_AO=n) should exit with
SKIP, not FAIL (Shuah Khan <[email protected]>)
- selftests that check interaction between TCP-AO and TCP-MD5 now
SKIP when CONFIG_TCP_MD5SIG=n
- Measured the performance of different hashing algorithms for TCP-AO
and compared it with TCP-MD5 performance. This is done with hacky patches
to iperf (see [3]). At this moment I've done it in qemu/KVM with CPU
affinities set on Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz.
No performance degradation was noticed before/after patches, but given
that the measurements were done in a VM, without measuring on a physical
DUT, they only give a hint of the relative speed of different hash
algorithms with TCP-AO. Here are the results, averaged over 30 measurements each:
TCP: 3.51Gbits/sec
TCP-MD5: 1.12Gbits/sec
TCP-AO(HMAC(SHA1)): 1.53Gbits/sec
TCP-AO(CMAC(AES128)): 621Mbits/sec
TCP-AO(HMAC(SHA512)): 1.21Gbits/sec
TCP-AO(HMAC(SHA384)): 1.20Gbits/sec
TCP-AO(HMAC(SHA224)): 961Mbits/sec
TCP-AO(HMAC(SHA3-512)): 157Mbits/sec
TCP-AO(HMAC(RMD160)): 659Mbits/sec
TCP-AO(HMAC(MD5)): 1.12Gbits/sec
(the last one is just for fun, but may make sense as it provides
the same security as TCP-MD5, but allows multiple keys and a mechanism
to change them from RFC5925)

Version 2: https://lore.kernel.org/all/[email protected]/T/#u

Changes from v1:
- Building now with CONFIG_IPV6=n (kernel test robot <[email protected]>)
- Added missing static declarations for local functions
(kernel test robot <[email protected]>)
- Addressed static analyzer and review comments by Dan Carpenter
(thanks, they were very useful!)
- Fix elif without defined() for !CONFIG_TCP_AO
- Recursively build selftests/net/tcp_ao (Shuah Khan), patches in:
https://lore.kernel.org/all/[email protected]/T/#u
- Don't leak crypto_pool reference when TCP-MD5 key is modified/changed
- Add TCP-AO support for nettest.c and fcnal-test.sh
(will be used for VRF testing in later versions)

Comparison between Leonard proposal and this (overview):
https://lore.kernel.org/all/[email protected]/T/#u

Version 1: https://lore.kernel.org/all/[email protected]/T/#u

This patchset implements the TCP-AO option as described in RFC5925. There
is a request from industry to move away from TCP-MD5SIG and it seems the time
is right to have a TCP-AO upstreamed. This TCP option is meant to replace
the TCP MD5 option and address its shortcomings. Specifically, it provides
more secure hashing, key rotation and support for long-lived connections
(see the summary of TCP-AO advantages over TCP-MD5 in (1.3) of RFC5925).
The patch series starts with six patches that are not specific to TCP-AO
but implement a general crypto facility that we thought is useful
to eliminate code duplication between TCP-MD5SIG and TCP-AO as well as other
crypto users. These six patches are being submitted separately in
a different patchset [1]. Including them here will show better the gain
in code sharing. Next are 18 patches that implement the actual TCP-AO option,
followed by patches implementing selftests.

The patch set was written as a collaboration of three authors (in alphabetical
order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine. Additional
credits should be given to Prasad Koya, who was involved in early prototyping
a few years back. There is also a separate submission done by Leonard Crestez
whom we thank for his efforts getting an implementation of RFC5925 submitted
for review upstream [2]. This is an independent implementation that makes
different design decisions.

For example, we chose a similar design to the TCP-MD5SIG implementation and
used setsockopts to program per-socket keys, avoiding the extra complexity
of managing a centralized key database in the kernel. A centralized database
in the kernel has dubious benefits since it doesn’t eliminate per-socket
setsockopts needed to specify which sockets need TCP-AO and what are the
currently preferred keys. It also complicates traffic key caching and
preventing deletion of in-use keys.

In this implementation, a centralized database of keys can be thought of
as living in user space and user applications would have to program those
keys on matching sockets. On the server side, the user application programs
keys (MKTs in TCP-AO nomenclature) on the listening socket for all peers that
are expected to connect. Prefix matching on the peer address is supported.
When a peer issues a successful connect, all the MKTs matching the IP address
of the peer are copied to the newly created socket. On the active side,
when a connect() is issued all MKTs that do not match the peer are deleted
from the socket since they will never match the peer. This implementation
uses three setsockopt()s for adding, deleting and modifying keys on a socket.
All three setsockopt()s have extensive sanity checks that prevent
inconsistencies in the keys on a given socket. A getsockopt() is provided
to get key information from any given socket.
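
The passive-side behaviour described above (MKTs on the listening socket
that match the peer address are copied to the newly created socket) can be
sketched in userspace as follows. This is an IPv4-only illustration under
simplifying assumptions: struct mkt_sketch, mkt_matches() and
mkt_copy_matching() are hypothetical names, addresses are plain host-order
integers, and the kernel's real lookup also considers things like VRF and
key IDs that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IPv4-only MKT for illustration; not the kernel struct */
struct mkt_sketch {
	uint32_t peer;		/* peer address, host byte order */
	uint8_t prefixlen;	/* 0..32 */
	uint8_t sndid;
	uint8_t rcvid;
};

/* True when the MKT's address prefix covers the given peer address */
static bool mkt_matches(const struct mkt_sketch *mkt, uint32_t peer)
{
	uint32_t mask;

	assert(mkt->prefixlen <= 32);
	/* A shift by 32 is undefined in C, so /0 is handled separately */
	mask = mkt->prefixlen ?
	       (uint32_t)(~(uint32_t)0 << (32 - mkt->prefixlen)) : 0;
	return ((mkt->peer ^ peer) & mask) == 0;
}

/* Copy the listener's MKTs that match peer into child[];
 * returns the number of MKTs copied (at most child_cap).
 */
static size_t mkt_copy_matching(const struct mkt_sketch *listener, size_t n,
				uint32_t peer, struct mkt_sketch *child,
				size_t child_cap)
{
	size_t copied = 0;

	for (size_t i = 0; i < n && copied < child_cap; i++) {
		if (mkt_matches(&listener[i], peer))
			child[copied++] = listener[i];
	}
	return copied;
}
```

The active side is the mirror image: on connect(), the MKTs for which a
check like mkt_matches() fails would be the ones dropped from the socket.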

A few things to note about this implementation:
- Traffic keys are cached for established connections avoiding the cost of
such calculation for each packet received or sent.
- Great care has been taken to avoid deleting in-use MKTs
as required by the RFC.
- Any crypto algorithm supported by the Linux kernel can be used
to calculate packet hashes.
- Fastopen works with TCP-AO but hasn’t been tested extensively.
- Tested for interop with other major networking vendors (on linux-4.19),
including testing for key rotation and long lived connections.

[1]: https://lore.kernel.org/all/[email protected]/
[2]: https://lore.kernel.org/all/[email protected]/
[3]: https://github.com/0x7f454c46/iperf/tree/tcp-md5-ao
[4]: https://lore.kernel.org/all/166995421700.16716.17446147162780881407.git-patchwork-notify@kernel.org/T/#u
[5]: https://lore.kernel.org/all/[email protected]/T/#u
[6]: https://lore.kernel.org/all/[email protected]/T/#u

Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bob Gilligan <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: David Laight <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Donald Cassidy <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Francesco Ruggeri <[email protected]>
Cc: Gaillardetz, Dominik <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Ivan Delalande <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Leonard Crestez <[email protected]>
Cc: Nassiri, Mohammad <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Salam Noureddine <[email protected]>
Cc: Simon Horman <[email protected]>
Cc: Tetreault, Francois <[email protected]>
Cc: [email protected]
Cc: [email protected]

Dmitry Safonov (23):
net/tcp: Prepare tcp_md5sig_pool for TCP-AO
net/tcp: Add TCP-AO config and structures
net/tcp: Introduce TCP_AO setsockopt()s
net/tcp: Prevent TCP-MD5 with TCP-AO being set
net/tcp: Calculate TCP-AO traffic keys
net/tcp: Add TCP-AO sign to outgoing packets
net/tcp: Add tcp_parse_auth_options()
net/tcp: Add AO sign to RST packets
net/tcp: Add TCP-AO sign to twsk
net/tcp: Wire TCP-AO to request sockets
net/tcp: Sign SYN-ACK segments with TCP-AO
net/tcp: Verify inbound TCP-AO signed segments
net/tcp: Add TCP-AO segments counters
net/tcp: Add TCP-AO SNE support
net/tcp: Add tcp_hash_fail() ratelimited logs
net/tcp: Ignore specific ICMPs for TCP-AO connections
net/tcp: Add option for TCP-AO to (not) hash header
net/tcp: Add TCP-AO getsockopt()s
net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs)
net/tcp: Add static_key for TCP-AO
net/tcp: Wire up l3index to TCP-AO
net/tcp: Add TCP_AO_REPAIR
Documentation/tcp: Add TCP-AO documentation

Documentation/networking/index.rst | 1 +
Documentation/networking/tcp_ao.rst | 434 +++++
include/linux/sockptr.h | 23 +
include/linux/tcp.h | 30 +-
include/net/dropreason-core.h | 30 +
include/net/tcp.h | 218 ++-
include/net/tcp_ao.h | 347 ++++
include/uapi/linux/snmp.h | 5 +
include/uapi/linux/tcp.h | 105 ++
net/ipv4/Kconfig | 17 +
net/ipv4/Makefile | 2 +
net/ipv4/proc.c | 5 +
net/ipv4/syncookies.c | 4 +
net/ipv4/tcp.c | 246 +--
net/ipv4/tcp_ao.c | 2341 +++++++++++++++++++++++++++
net/ipv4/tcp_input.c | 97 +-
net/ipv4/tcp_ipv4.c | 334 +++-
net/ipv4/tcp_minisocks.c | 50 +-
net/ipv4/tcp_output.c | 232 ++-
net/ipv4/tcp_sigpool.c | 358 ++++
net/ipv6/Makefile | 1 +
net/ipv6/syncookies.c | 5 +
net/ipv6/tcp_ao.c | 168 ++
net/ipv6/tcp_ipv6.c | 342 +++-
24 files changed, 5022 insertions(+), 373 deletions(-)
create mode 100644 Documentation/networking/tcp_ao.rst
create mode 100644 include/net/tcp_ao.h
create mode 100644 net/ipv4/tcp_ao.c
create mode 100644 net/ipv4/tcp_sigpool.c
create mode 100644 net/ipv6/tcp_ao.c


base-commit: 73be7fb14e83d24383f840a22f24d3ed222ca319
--
2.41.0


2023-09-12 09:54:18

by Dmitry Safonov

Subject: [PATCH v11 net-next 20/23] net/tcp: Add static_key for TCP-AO

Similarly to TCP-MD5, add a static key to TCP-AO that is patched out
when there are no keys on a machine and dynamically enabled when the
first setsockopt(TCP_AO) adds a key on any socket. The static key is
likewise dynamically disabled later when the socket is destructed.

The lifetime of the enabled static key here is the same as ao_info: it is
enabled on allocation, passed over from full socket to twsk and
destructed when ao_info is scheduled for destruction.
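
The lifetime rule above can be pictured with a rough userspace analogue.
It is only a sketch: a real static key patches the branch in the kernel
text, and this series uses a deferred variant for the disable side, neither
of which is modelled here. A plain use count stands in for the key, and all
names (feature_gate, sock_sketch and the helpers) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a static key: a use count gating the fast path */
struct feature_gate {
	unsigned int users;
};

/* Hypothetical socket that may or may not own an ao_info */
struct sock_sketch {
	bool has_ao_info;
};

static bool gate_enabled(const struct feature_gate *g)
{
	return g->users > 0;
}

/* The first key on a socket allocates ao_info and takes a reference */
static void sock_add_first_key(struct feature_gate *g, struct sock_sketch *sk)
{
	if (!sk->has_ao_info) {
		g->users++;
		sk->has_ao_info = true;
	}
}

/* Destroying ao_info drops the reference taken at allocation */
static void sock_destroy(struct feature_gate *g, struct sock_sketch *sk)
{
	if (sk->has_ao_info) {
		assert(g->users > 0);
		sk->has_ao_info = false;
		g->users--;
	}
}

/* Fast path: bail out early when nothing on the system uses the feature */
static bool sock_needs_ao(const struct feature_gate *g,
			  const struct sock_sketch *sk)
{
	if (!gate_enabled(g))
		return false;
	return sk->has_ao_info;
}
```

One reference is held per ao_info rather than per key, so adding further
keys to a socket that already has one does not change the count.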

Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: David Ahern <[email protected]>
---
include/net/tcp.h | 3 +++
include/net/tcp_ao.h | 2 ++
net/ipv4/tcp_ao.c | 22 +++++++++++++++++++++
net/ipv4/tcp_input.c | 46 +++++++++++++++++++++++++++++---------------
4 files changed, 57 insertions(+), 16 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index ac5f96f0ce19..fa100e4f2cde 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2611,6 +2611,9 @@ static inline bool tcp_ao_required(struct sock *sk, const void *saddr,
struct tcp_ao_info *ao_info;
struct tcp_ao_key *ao_key;

+ if (!static_branch_unlikely(&tcp_ao_needed.key))
+ return false;
+
ao_info = rcu_dereference_check(tcp_sk(sk)->ao_info,
lockdep_sock_is_held(sk));
if (!ao_info)
diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 09cf9d216b3a..b97e1b3c6448 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -151,6 +151,8 @@ do { \

#ifdef CONFIG_TCP_AO
/* TCP-AO structures and functions */
+#include <linux/jump_label.h>
+extern struct static_key_false_deferred tcp_ao_needed;

struct tcp4_ao_context {
__be32 saddr;
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index c5bde089916d..24fd8772deea 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -17,6 +17,8 @@
#include <net/ipv6.h>
#include <net/icmp.h>

+DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ);
+
int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
unsigned int len, struct tcp_sigpool *hp)
{
@@ -50,6 +52,9 @@ bool tcp_ao_ignore_icmp(const struct sock *sk, int type, int code)
bool ignore_icmp = false;
struct tcp_ao_info *ao;

+ if (!static_branch_unlikely(&tcp_ao_needed.key))
+ return false;
+
/* RFC5925, 7.8:
* >> A TCP-AO implementation MUST default to ignore incoming ICMPv4
* messages of Type 3 (destination unreachable), Codes 2-4 (protocol
@@ -185,6 +190,9 @@ static struct tcp_ao_key *__tcp_ao_do_lookup(const struct sock *sk,
struct tcp_ao_key *key;
struct tcp_ao_info *ao;

+ if (!static_branch_unlikely(&tcp_ao_needed.key))
+ return NULL;
+
ao = rcu_dereference_check(tcp_sk(sk)->ao_info,
lockdep_sock_is_held(sk));
if (!ao)
@@ -276,6 +284,7 @@ void tcp_ao_destroy_sock(struct sock *sk, bool twsk)
}

kfree_rcu(ao, rcu);
+ static_branch_slow_dec_deferred(&tcp_ao_needed);
}

void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp)
@@ -1129,6 +1138,11 @@ int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk,
goto free_and_exit;
}

+ if (!static_key_fast_inc_not_disabled(&tcp_ao_needed.key.key)) {
+ ret = -EUSERS;
+ goto free_and_exit;
+ }
+
key_head = rcu_dereference(hlist_first_rcu(&new_ao->head));
first_key = hlist_entry_safe(key_head, struct tcp_ao_key, node);

@@ -1556,6 +1570,10 @@ static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family,

tcp_ao_link_mkt(ao_info, key);
if (first) {
+ if (!static_branch_inc(&tcp_ao_needed.key)) {
+ ret = -EUSERS;
+ goto err_free_sock;
+ }
sk_gso_disable(sk);
rcu_assign_pointer(tcp_sk(sk)->ao_info, ao_info);
}
@@ -1824,6 +1842,10 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family,
if (new_rnext)
WRITE_ONCE(ao_info->rnext_key, new_rnext);
if (first) {
+ if (!static_branch_inc(&tcp_ao_needed.key)) {
+ err = -EUSERS;
+ goto out;
+ }
sk_gso_disable(sk);
rcu_assign_pointer(tcp_sk(sk)->ao_info, ao_info);
}
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1e9e423bb718..414c49d37390 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3528,41 +3528,55 @@ static inline bool tcp_may_update_window(const struct tcp_sock *tp,
(ack_seq == tp->snd_wl1 && (nwin > tp->snd_wnd || !nwin));
}

-/* If we update tp->snd_una, also update tp->bytes_acked */
-static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack)
+static void tcp_snd_sne_update(struct tcp_sock *tp, u32 ack)
{
- u32 delta = ack - tp->snd_una;
#ifdef CONFIG_TCP_AO
struct tcp_ao_info *ao;
-#endif

- sock_owned_by_me((struct sock *)tp);
- tp->bytes_acked += delta;
-#ifdef CONFIG_TCP_AO
+ if (!static_branch_unlikely(&tcp_ao_needed.key))
+ return;
+
ao = rcu_dereference_protected(tp->ao_info,
lockdep_sock_is_held((struct sock *)tp));
if (ao && ack < tp->snd_una)
ao->snd_sne++;
#endif
+}
+
+/* If we update tp->snd_una, also update tp->bytes_acked */
+static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack)
+{
+ u32 delta = ack - tp->snd_una;
+
+ sock_owned_by_me((struct sock *)tp);
+ tp->bytes_acked += delta;
+ tcp_snd_sne_update(tp, ack);
tp->snd_una = ack;
}

+static void tcp_rcv_sne_update(struct tcp_sock *tp, u32 seq)
+{
+#ifdef CONFIG_TCP_AO
+ struct tcp_ao_info *ao;
+
+ if (!static_branch_unlikely(&tcp_ao_needed.key))
+ return;
+
+ ao = rcu_dereference_protected(tp->ao_info,
+ lockdep_sock_is_held((struct sock *)tp));
+ if (ao && seq < tp->rcv_nxt)
+ ao->rcv_sne++;
+#endif
+}
+
/* If we update tp->rcv_nxt, also update tp->bytes_received */
static void tcp_rcv_nxt_update(struct tcp_sock *tp, u32 seq)
{
u32 delta = seq - tp->rcv_nxt;
-#ifdef CONFIG_TCP_AO
- struct tcp_ao_info *ao;
-#endif

sock_owned_by_me((struct sock *)tp);
tp->bytes_received += delta;
-#ifdef CONFIG_TCP_AO
- ao = rcu_dereference_protected(tp->ao_info,
- lockdep_sock_is_held((struct sock *)tp));
- if (ao && seq < tp->rcv_nxt)
- ao->rcv_sne++;
-#endif
+ tcp_rcv_sne_update(tp, seq);
WRITE_ONCE(tp->rcv_nxt, seq);
}

--
2.41.0

2023-09-12 11:55:45

by Dmitry Safonov

Subject: [PATCH v11 net-next 02/23] net/tcp: Add TCP-AO config and structures

Introduce a new kernel config option and common structures, as well as
helpers, to be used by TCP-AO code.

Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: David Ahern <[email protected]>
---
include/linux/tcp.h | 9 +++-
include/net/tcp.h | 8 +---
include/net/tcp_ao.h | 90 ++++++++++++++++++++++++++++++++++++++++
include/uapi/linux/tcp.h | 2 +
net/ipv4/Kconfig | 13 ++++++
5 files changed, 114 insertions(+), 8 deletions(-)
create mode 100644 include/net/tcp_ao.h

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 3c5efeeb024f..fc98c7d63360 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -437,13 +437,18 @@ struct tcp_sock {
bool syn_smc; /* SYN includes SMC */
#endif

-#ifdef CONFIG_TCP_MD5SIG
-/* TCP AF-Specific parts; only used by MD5 Signature support so far */
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
+/* TCP AF-Specific parts; only used by TCP-AO/MD5 Signature support so far */
const struct tcp_sock_af_ops *af_specific;

+#ifdef CONFIG_TCP_MD5SIG
/* TCP MD5 Signature Option information */
struct tcp_md5sig_info __rcu *md5sig_info;
#endif
+#ifdef CONFIG_TCP_AO
+ struct tcp_ao_info __rcu *ao_info;
+#endif
+#endif

/* TCP fastopen related information */
struct tcp_fastopen_request *fastopen_req;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index cb8fadde8c5c..cd93b2aa88c8 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -37,6 +37,7 @@
#include <net/snmp.h>
#include <net/ip.h>
#include <net/tcp_states.h>
+#include <net/tcp_ao.h>
#include <net/inet_ecn.h>
#include <net/dst.h>
#include <net/mptcp.h>
@@ -1650,12 +1651,7 @@ static inline void tcp_clear_all_retrans_hints(struct tcp_sock *tp)
tp->retransmit_skb_hint = NULL;
}

-union tcp_md5_addr {
- struct in_addr a4;
-#if IS_ENABLED(CONFIG_IPV6)
- struct in6_addr a6;
-#endif
-};
+#define tcp_md5_addr tcp_ao_addr

/* - key database */
struct tcp_md5sig_key {
diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
new file mode 100644
index 000000000000..af76e1c47bea
--- /dev/null
+++ b/include/net/tcp_ao.h
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _TCP_AO_H
+#define _TCP_AO_H
+
+#define TCP_AO_KEY_ALIGN 1
+#define __tcp_ao_key_align __aligned(TCP_AO_KEY_ALIGN)
+
+union tcp_ao_addr {
+ struct in_addr a4;
+#if IS_ENABLED(CONFIG_IPV6)
+ struct in6_addr a6;
+#endif
+};
+
+struct tcp_ao_hdr {
+ u8 kind;
+ u8 length;
+ u8 keyid;
+ u8 rnext_keyid;
+};
+
+struct tcp_ao_key {
+ struct hlist_node node;
+ union tcp_ao_addr addr;
+ u8 key[TCP_AO_MAXKEYLEN] __tcp_ao_key_align;
+ unsigned int tcp_sigpool_id;
+ unsigned int digest_size;
+ u8 prefixlen;
+ u8 family;
+ u8 keylen;
+ u8 keyflags;
+ u8 sndid;
+ u8 rcvid;
+ u8 maclen;
+ struct rcu_head rcu;
+ u8 traffic_keys[];
+};
+
+static inline u8 *rcv_other_key(struct tcp_ao_key *key)
+{
+ return key->traffic_keys;
+}
+
+static inline u8 *snd_other_key(struct tcp_ao_key *key)
+{
+ return key->traffic_keys + key->digest_size;
+}
+
+static inline int tcp_ao_maclen(const struct tcp_ao_key *key)
+{
+ return key->maclen;
+}
+
+static inline int tcp_ao_len(const struct tcp_ao_key *key)
+{
+ return tcp_ao_maclen(key) + sizeof(struct tcp_ao_hdr);
+}
+
+static inline unsigned int tcp_ao_digest_size(struct tcp_ao_key *key)
+{
+ return key->digest_size;
+}
+
+static inline int tcp_ao_sizeof_key(const struct tcp_ao_key *key)
+{
+ return sizeof(struct tcp_ao_key) + (key->digest_size << 1);
+}
+
+struct tcp_ao_info {
+ /* List of tcp_ao_key's */
+ struct hlist_head head;
+ /* current_key and rnext_key aren't maintained on listen sockets.
+ * Their purpose is to cache keys on established connections,
+ * saving needless lookups. Never dereference any of them from
+ * listen sockets.
+ * ::current_key may change in RX to the key that was requested by
+ * the peer, please use READ_ONCE()/WRITE_ONCE() in order to avoid
+ * load/store tearing.
+ * Do the same for ::rnext_key, if you don't hold socket lock
+ * (it's changed only by userspace request in setsockopt()).
+ */
+ struct tcp_ao_key *current_key;
+ struct tcp_ao_key *rnext_key;
+ u32 flags;
+ __be32 lisn;
+ __be32 risn;
+ struct rcu_head rcu;
+};
+
+#endif /* _TCP_AO_H */
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 879eeb0a084b..5655bfe28b8d 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -348,6 +348,8 @@ struct tcp_diag_md5sig {
__u8 tcpm_key[TCP_MD5SIG_MAXKEYLEN];
};

+#define TCP_AO_MAXKEYLEN 80
+
/* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */

#define TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT 0x1
diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index 89e2ab023272..8e94ed7c56a0 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -744,6 +744,19 @@ config DEFAULT_TCP_CONG
config TCP_SIGPOOL
tristate

+config TCP_AO
+ bool "TCP: Authentication Option (RFC5925)"
+ select CRYPTO
+ select TCP_SIGPOOL
+ depends on 64BIT && IPV6 != m # seq-number extension needs WRITE_ONCE(u64)
+ help
+ TCP-AO specifies the use of stronger Message Authentication Codes (MACs),
+ protects against replays for long-lived TCP connections, and
+ provides more details on the association of security with TCP
+ connections than TCP MD5 (See RFC5925)
+
+ If unsure, say N.
+
config TCP_MD5SIG
bool "TCP: MD5 Signature Option support (RFC2385)"
select CRYPTO
--
2.41.0

2023-09-12 14:09:42

by Dmitry Safonov

Subject: [PATCH v11 net-next 15/23] net/tcp: Add tcp_hash_fail() ratelimited logs

Add a helper for logging connection-detailed messages for failed TCP
hash verification (both MD5 and AO).
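
For illustration only, the flag prefix that the helper puts in front of
the per-call details can be sketched in userspace C as below. The
struct and function names are hypothetical and are not part of the
patch; the sketch sizes its buffer for the worst case of four flag
letters, one separating space and the terminating NUL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the four TCP header bits that get logged. */
struct demo_tcp_flags {
	bool fin, syn, rst, ack;
};

/* Worst case is "FSRA" + ' ' + NUL, so callers need 6 bytes. */
#define DEMO_FLAGS_BUFLEN 6

/* Writes e.g. "SA " into buf (or "" when no flag is set); returns buf. */
static const char *demo_format_flags(const struct demo_tcp_flags *fl,
				     char buf[DEMO_FLAGS_BUFLEN])
{
	char *f = buf;

	assert(fl != NULL && buf != NULL);
	if (fl->fin)
		*f++ = 'F';
	if (fl->syn)
		*f++ = 'S';
	if (fl->rst)
		*f++ = 'R';
	if (fl->ack)
		*f++ = 'A';
	if (f != buf)
		*f++ = ' ';	/* separator before the caller's details */
	*f = '\0';
	return buf;
}
```

With SYN and ACK set the sketch yields "SA ", so a log line would read
roughly "<msg> for (<saddr>, <sport>)->(<daddr>, <dport>) SA <details>".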

Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: David Ahern <[email protected]>
---
include/net/tcp.h | 14 ++++++++++++--
include/net/tcp_ao.h | 29 +++++++++++++++++++++++++++++
net/ipv4/tcp.c | 23 +++++++++++++----------
net/ipv4/tcp_ao.c | 7 +++++++
4 files changed, 61 insertions(+), 12 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 7003b64527d4..ac5f96f0ce19 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2641,12 +2641,18 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
int l3index;

/* Invalid option or two times meet any of auth options */
- if (tcp_parse_auth_options(th, &md5_location, &aoh))
+ if (tcp_parse_auth_options(th, &md5_location, &aoh)) {
+ tcp_hash_fail("TCP segment has incorrect auth options set",
+ family, skb, "");
return SKB_DROP_REASON_TCP_AUTH_HDR;
+ }

if (req) {
if (tcp_rsk_used_ao(req) != !!aoh) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD);
+ tcp_hash_fail("TCP connection can't start/end using TCP-AO",
+ family, skb, "%s",
+ !aoh ? "missing AO" : "AO signed");
return SKB_DROP_REASON_TCP_AOFAILURE;
}
}
@@ -2663,10 +2669,14 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
* the last key is impossible to remove, so there's
* always at least one current_key.
*/
- if (tcp_ao_required(sk, saddr, family, true))
+ if (tcp_ao_required(sk, saddr, family, true)) {
+ tcp_hash_fail("AO hash is required, but not found",
+ family, skb, "L3 index %d", l3index);
return SKB_DROP_REASON_TCP_AONOTFOUND;
+ }
if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
+ tcp_hash_fail("MD5 Hash not found", family, skb, "");
return SKB_DROP_REASON_TCP_MD5NOTFOUND;
}
return SKB_NOT_DROPPED_YET;
diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index e62452bc17d6..5c5d16b6f9f9 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -118,6 +118,35 @@ struct tcp_ao_info {
struct rcu_head rcu;
};

+#define tcp_hash_fail(msg, family, skb, fmt, ...) \
+do { \
+ const struct tcphdr *th = tcp_hdr(skb); \
+ char hdr_flags[6] = {}; /* "FSRA" + ' ' + NUL */ \
+ char *f = hdr_flags; \
+ \
+ if (th->fin) \
+ *f++ = 'F'; \
+ if (th->syn) \
+ *f++ = 'S'; \
+ if (th->rst) \
+ *f++ = 'R'; \
+ if (th->ack) \
+ *f++ = 'A'; \
+ if (f != hdr_flags) \
+ *f = ' '; \
+ if ((family) == AF_INET) { \
+ net_info_ratelimited("%s for (%pI4, %d)->(%pI4, %d) %s" fmt "\n", \
+ msg, &ip_hdr(skb)->saddr, ntohs(th->source), \
+ &ip_hdr(skb)->daddr, ntohs(th->dest), \
+ hdr_flags, ##__VA_ARGS__); \
+ } else { \
+ net_info_ratelimited("%s for [%pI6c]:%u->[%pI6c]:%u %s" fmt "\n", \
+ msg, &ipv6_hdr(skb)->saddr, ntohs(th->source), \
+ &ipv6_hdr(skb)->daddr, ntohs(th->dest), \
+ hdr_flags, ##__VA_ARGS__); \
+ } \
+} while (0)
+
#ifdef CONFIG_TCP_AO
/* TCP-AO structures and functions */

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 8506be193843..ab6eb3cc38e1 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4367,7 +4367,6 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
* o MD5 hash and we're not expecting one.
* o MD5 hash and its wrong.
*/
- const struct tcphdr *th = tcp_hdr(skb);
const struct tcp_sock *tp = tcp_sk(sk);
struct tcp_md5sig_key *key;
u8 newhash[16];
@@ -4377,6 +4376,7 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,

if (!key && hash_location) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
+ tcp_hash_fail("Unexpected MD5 Hash found", family, skb, "");
return SKB_DROP_REASON_TCP_MD5UNEXPECTED;
}

@@ -4392,16 +4392,19 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
if (genhash || memcmp(hash_location, newhash, 16) != 0) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE);
if (family == AF_INET) {
- net_info_ratelimited("MD5 Hash failed for (%pI4, %d)->(%pI4, %d)%s L3 index %d\n",
- saddr, ntohs(th->source),
- daddr, ntohs(th->dest),
- genhash ? " tcp_v4_calc_md5_hash failed"
- : "", l3index);
+ tcp_hash_fail("MD5 Hash failed", AF_INET, skb, "%s L3 index %d",
+ genhash ? "tcp_v4_calc_md5_hash failed"
+ : "", l3index);
} else {
- net_info_ratelimited("MD5 Hash %s for [%pI6c]:%u->[%pI6c]:%u L3 index %d\n",
- genhash ? "failed" : "mismatch",
- saddr, ntohs(th->source),
- daddr, ntohs(th->dest), l3index);
+ if (genhash) {
+ tcp_hash_fail("MD5 Hash failed",
+ AF_INET6, skb, "L3 index %d",
+ l3index);
+ } else {
+ tcp_hash_fail("MD5 Hash mismatch",
+ AF_INET6, skb, "L3 index %d",
+ l3index);
+ }
}
return SKB_DROP_REASON_TCP_MD5FAILURE;
}
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 0b5621cbc744..2283203f1ac5 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -764,6 +764,8 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb,
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD);
atomic64_inc(&info->counters.pkt_bad);
atomic64_inc(&key->pkt_bad);
+ tcp_hash_fail("AO hash wrong length", family, skb,
+ "%u != %d", maclen, tcp_ao_maclen(key));
return SKB_DROP_REASON_TCP_AOFAILURE;
}

@@ -778,6 +780,7 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb,
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD);
atomic64_inc(&info->counters.pkt_bad);
atomic64_inc(&key->pkt_bad);
+ tcp_hash_fail("AO hash mismatch", family, skb, "");
kfree(hash_buf);
return SKB_DROP_REASON_TCP_AOFAILURE;
}
@@ -805,6 +808,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
info = rcu_dereference(tcp_sk(sk)->ao_info);
if (!info) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND);
+ tcp_hash_fail("AO key not found", family, skb,
+ "keyid: %u", aoh->keyid);
return SKB_DROP_REASON_TCP_AOUNEXPECTED;
}

@@ -907,6 +912,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
key_not_found:
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND);
atomic64_inc(&info->counters.key_not_found);
+ tcp_hash_fail("Requested by the peer AO key id not found",
+ family, skb, "");
return SKB_DROP_REASON_TCP_AOKEYNOTFOUND;
}

--
2.41.0

2023-09-12 17:42:19

by Dmitry Safonov

Subject: [PATCH v11 net-next 03/23] net/tcp: Introduce TCP_AO setsockopt()s

Add 3 setsockopt()s:
1. TCP_AO_ADD_KEY to add a new Master Key Tuple (MKT) on a socket
2. TCP_AO_DEL_KEY to delete an existing MKT from a socket
3. TCP_AO_INFO to change flags, Current_key/RNext_key on a TCP-AO sk

Userspace has to introduce keys on every socket it wants to use TCP-AO
option on, similarly to TCP_MD5SIG/TCP_MD5SIG_EXT.
RFC5925 prohibits defining MKTs that would match the same peer,
so do sanity checks on the data provided by userspace. Be as
conservative as possible: for example, refuse to define an MKT on
an established connection with no AO, or to remove a key that is in use.
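
As a rough IPv4-only sketch of the overlap rule (not the kernel code;
all demo_* names are hypothetical), two MKTs are treated as matching
the same peer when their addresses are equal under the shorter of the
two prefixes, and a new MKT is refused when it overlaps an existing
one while reusing either its SendID or its RecvID:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of an MKT: IPv4 only, no VRF. */
struct demo_mkt {
	uint32_t addr;		/* network byte order, host bits zero */
	uint8_t prefixlen;	/* 0..32 */
	uint8_t sndid;
	uint8_t rcvid;
};

/* Network-order mask for an IPv4 prefix length in 0..32. */
static uint32_t demo_make_mask(unsigned int prefixlen)
{
	assert(prefixlen <= 32);
	if (prefixlen == 0)
		return 0;
	return htonl((uint32_t)(~UINT32_C(0) << (32 - prefixlen)));
}

/* Peers overlap when they agree under the shorter prefix. */
static bool demo_peers_overlap(const struct demo_mkt *a,
			       const struct demo_mkt *b)
{
	unsigned int plen = a->prefixlen < b->prefixlen ?
			    a->prefixlen : b->prefixlen;
	uint32_t mask = demo_make_mask(plen);

	return (a->addr & mask) == (b->addr & mask);
}

/* True when cand must be refused because of an existing MKT. */
static bool demo_mkt_conflicts(const struct demo_mkt *existing,
			       const struct demo_mkt *cand)
{
	if (!demo_peers_overlap(existing, cand))
		return false;
	return existing->sndid == cand->sndid ||
	       existing->rcvid == cand->rcvid;
}
```

Comparing under the shorter prefix means a key for 10.0.0.0/8 conflicts
with a later key for 10.1.2.3/32 that reuses one of its IDs, while the
same IDs stay usable for an unrelated peer.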

(1) and (2) are to be used by a userspace key manager to add/remove keys.
The main purpose of (3) is to set RNext_key, which (as prescribed by
RFC5925) is the KeyID requested from the peer in the TCP-AO header:
the peer is asked to sign its segments with that key.
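
A minimal userspace sketch of how a key manager might fill the
TCP_AO_ADD_KEY argument for one IPv4 peer is shown below. It carries
its own copy of the structure as laid out in this version of the patch,
with struct sockaddr_storage standing in for
struct __kernel_sockaddr_storage. The demo_* names are hypothetical,
and the layout is an assumption tied to this series: it may not match
the headers of any released kernel.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#define DEMO_TCP_AO_ADD_KEY	38	/* value proposed by this patch */
#define DEMO_TCP_AO_MAXKEYLEN	80

/* Local mirror of struct tcp_ao_add from this patch (assumption). */
struct demo_tcp_ao_add {
	struct sockaddr_storage addr;	/* peer's address for the key */
	char	 alg_name[64];		/* crypto hash algorithm to use */
	int32_t	 ifindex;
	uint32_t set_current :1,
		 set_rnext   :1,
		 reserved    :30;	/* must be 0 */
	uint16_t reserved2;		/* must be 0 */
	uint8_t	 prefix;
	uint8_t	 sndid;
	uint8_t	 rcvid;
	uint8_t	 maclen;		/* 0 selects the default */
	uint8_t	 keyflags;
	uint8_t	 keylen;
	uint8_t	 key[DEMO_TCP_AO_MAXKEYLEN];
} __attribute__((aligned(8)));

/* Fills cmd for a single IPv4 host (/32); returns 0 or -EINVAL. */
static int demo_fill_ao_add(struct demo_tcp_ao_add *cmd, const char *peer,
			    const char *alg, const void *key, size_t keylen,
			    uint8_t sndid, uint8_t rcvid)
{
	struct sockaddr_in *sin;

	assert(cmd != NULL && peer != NULL && alg != NULL && key != NULL);
	if (keylen > DEMO_TCP_AO_MAXKEYLEN ||
	    strlen(alg) >= sizeof(cmd->alg_name))
		return -EINVAL;

	memset(cmd, 0, sizeof(*cmd));	/* reserved fields must be 0 */
	sin = (struct sockaddr_in *)&cmd->addr;
	sin->sin_family = AF_INET;
	sin->sin_port = 0;		/* a non-zero port is rejected */
	if (inet_pton(AF_INET, peer, &sin->sin_addr) != 1)
		return -EINVAL;

	strcpy(cmd->alg_name, alg);	/* length checked above */
	cmd->prefix = 32;
	cmd->sndid = sndid;
	cmd->rcvid = rcvid;
	cmd->keylen = (uint8_t)keylen;
	memcpy(cmd->key, key, keylen);
	return 0;
}

/* Hands the key to the kernel; returns 0 or a negative errno. */
static int demo_add_key(int fd, const struct demo_tcp_ao_add *cmd)
{
	if (setsockopt(fd, IPPROTO_TCP, DEMO_TCP_AO_ADD_KEY,
		       cmd, sizeof(*cmd)) < 0)
		return -errno;
	return 0;
}
```

A caller would run demo_fill_ao_add() and then demo_add_key() on a
socket that is not yet connected, since the first TCP-AO setsockopt()
is accepted only in the LISTEN or CLOSE state. On a kernel built
without CONFIG_TCP_AO the setsockopt() call is expected to fail.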

At this moment the life of ao_info ends in tcp_v4_destroy_sock().

Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: David Ahern <[email protected]>
---
include/linux/sockptr.h | 23 ++
include/net/tcp.h | 3 +
include/net/tcp_ao.h | 17 +-
include/uapi/linux/tcp.h | 46 +++
net/ipv4/Makefile | 1 +
net/ipv4/tcp.c | 17 +
net/ipv4/tcp_ao.c | 794 +++++++++++++++++++++++++++++++++++++++
net/ipv4/tcp_ipv4.c | 10 +-
net/ipv6/Makefile | 1 +
net/ipv6/tcp_ao.c | 19 +
net/ipv6/tcp_ipv6.c | 39 +-
11 files changed, 952 insertions(+), 18 deletions(-)
create mode 100644 net/ipv4/tcp_ao.c
create mode 100644 net/ipv6/tcp_ao.c

diff --git a/include/linux/sockptr.h b/include/linux/sockptr.h
index bae5e2369b4f..307961b41541 100644
--- a/include/linux/sockptr.h
+++ b/include/linux/sockptr.h
@@ -55,6 +55,29 @@ static inline int copy_from_sockptr(void *dst, sockptr_t src, size_t size)
return copy_from_sockptr_offset(dst, src, 0, size);
}

+static inline int copy_struct_from_sockptr(void *dst, size_t ksize,
+ sockptr_t src, size_t usize)
+{
+ size_t size = min(ksize, usize);
+ size_t rest = max(ksize, usize) - size;
+
+ if (!sockptr_is_kernel(src))
+ return copy_struct_from_user(dst, ksize, src.user, size);
+
+ if (usize < ksize) {
+ memset(dst + size, 0, rest);
+ } else if (usize > ksize) {
+ char *p = src.kernel + ksize; /* only the trailing bytes must be zero */
+
+ while (rest--) {
+ if (*p++)
+ return -E2BIG;
+ }
+ }
+ memcpy(dst, src.kernel, size);
+ return 0;
+}
+
static inline int copy_to_sockptr_offset(sockptr_t dst, size_t offset,
const void *src, size_t size)
{
diff --git a/include/net/tcp.h b/include/net/tcp.h
index cd93b2aa88c8..6b5bf9e9b9f1 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2133,6 +2133,9 @@ struct tcp_sock_af_ops {
sockptr_t optval,
int optlen);
#endif
+#ifdef CONFIG_TCP_AO
+ int (*ao_parse)(struct sock *sk, int optname, sockptr_t optval, int optlen);
+#endif
};

struct tcp_request_sock_ops {
diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index af76e1c47bea..a81e40fd255a 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -81,10 +81,25 @@ struct tcp_ao_info {
*/
struct tcp_ao_key *current_key;
struct tcp_ao_key *rnext_key;
- u32 flags;
+ u32 ao_required :1,
+ __unused :31;
__be32 lisn;
__be32 risn;
struct rcu_head rcu;
};

+#ifdef CONFIG_TCP_AO
+int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family,
+ sockptr_t optval, int optlen);
+void tcp_ao_destroy_sock(struct sock *sk);
+/* ipv4 specific functions */
+int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen);
+/* ipv6 specific functions */
+int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen);
+#else
+static inline void tcp_ao_destroy_sock(struct sock *sk)
+{
+}
+#endif
+
#endif /* _TCP_AO_H */
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 5655bfe28b8d..250e0ce2cc38 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -129,6 +129,9 @@ enum {

#define TCP_TX_DELAY 37 /* delay outgoing packets by XX usec */

+#define TCP_AO_ADD_KEY 38 /* Add/Set MKT */
+#define TCP_AO_DEL_KEY 39 /* Delete MKT */
+#define TCP_AO_INFO 40 /* Modify TCP-AO per-socket options */

#define TCP_REPAIR_ON 1
#define TCP_REPAIR_OFF 0
@@ -350,6 +353,49 @@ struct tcp_diag_md5sig {

#define TCP_AO_MAXKEYLEN 80

+#define TCP_AO_KEYF_IFINDEX (1 << 0) /* L3 ifindex for VRF */
+
+struct tcp_ao_add { /* setsockopt(TCP_AO_ADD_KEY) */
+ struct __kernel_sockaddr_storage addr; /* peer's address for the key */
+ char alg_name[64]; /* crypto hash algorithm to use */
+ __s32 ifindex; /* L3 dev index for VRF */
+ __u32 set_current :1, /* set key as Current_key at once */
+ set_rnext :1, /* request it from peer with RNext_key */
+ reserved :30; /* must be 0 */
+ __u16 reserved2; /* padding, must be 0 */
+ __u8 prefix; /* peer's address prefix */
+ __u8 sndid; /* SendID for outgoing segments */
+ __u8 rcvid; /* RecvID to match for incoming seg */
+ __u8 maclen; /* length of authentication code (hash) */
+ __u8 keyflags; /* see TCP_AO_KEYF_ */
+ __u8 keylen; /* length of ::key */
+ __u8 key[TCP_AO_MAXKEYLEN];
+} __attribute__((aligned(8)));
+
+struct tcp_ao_del { /* setsockopt(TCP_AO_DEL_KEY) */
+ struct __kernel_sockaddr_storage addr; /* peer's address for the key */
+ __s32 ifindex; /* L3 dev index for VRF */
+ __u32 set_current :1, /* corresponding ::current_key */
+ set_rnext :1, /* corresponding ::rnext */
+ reserved :30; /* must be 0 */
+ __u16 reserved2; /* padding, must be 0 */
+ __u8 prefix; /* peer's address prefix */
+ __u8 sndid; /* SendID for outgoing segments */
+ __u8 rcvid; /* RecvID to match for incoming seg */
+ __u8 current_key; /* KeyID to set as Current_key */
+ __u8 rnext; /* KeyID to set as Rnext_key */
+ __u8 keyflags; /* see TCP_AO_KEYF_ */
+} __attribute__((aligned(8)));
+
+struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */
+ __u32 set_current :1, /* corresponding ::current_key */
+ set_rnext :1, /* corresponding ::rnext */
+ ao_required :1, /* don't accept non-AO connects */
+ reserved :29; /* must be 0 */
+ __u8 current_key; /* KeyID to set as Current_key */
+ __u8 rnext; /* KeyID to set as Rnext_key */
+} __attribute__((aligned(8)));
+
/* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */

#define TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT 0x1
diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index cd760793cfcb..e144a02a6a61 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -69,6 +69,7 @@ obj-$(CONFIG_NETLABEL) += cipso_ipv4.o

obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \
xfrm4_output.o xfrm4_protocol.o
+obj-$(CONFIG_TCP_AO) += tcp_ao.o

ifeq ($(CONFIG_BPF_JIT),y)
obj-$(CONFIG_BPF_SYSCALL) += bpf_tcp_ca.o
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 99ad9b1b1f62..fad58dd85be7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3597,6 +3597,23 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
__tcp_sock_set_quickack(sk, val);
break;

+#ifdef CONFIG_TCP_AO
+ case TCP_AO_ADD_KEY:
+ case TCP_AO_DEL_KEY:
+ case TCP_AO_INFO: {
+ /* If this is the first TCP-AO setsockopt() on the socket,
+ * sk_state has to be LISTEN or CLOSE
+ */
+ if (((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) ||
+ rcu_dereference_protected(tcp_sk(sk)->ao_info,
+ lockdep_sock_is_held(sk)))
+ err = tp->af_specific->ao_parse(sk, optname, optval,
+ optlen);
+ else
+ err = -EISCONN;
+ break;
+ }
+#endif
#ifdef CONFIG_TCP_MD5SIG
case TCP_MD5SIG:
case TCP_MD5SIG_EXT:
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
new file mode 100644
index 000000000000..9121f1eeb224
--- /dev/null
+++ b/net/ipv4/tcp_ao.c
@@ -0,0 +1,794 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * INET An implementation of the TCP Authentication Option (TCP-AO).
+ * See RFC5925.
+ *
+ * Authors: Dmitry Safonov <[email protected]>
+ * Francesco Ruggeri <[email protected]>
+ * Salam Noureddine <[email protected]>
+ */
+#define pr_fmt(fmt) "TCP: " fmt
+
+#include <crypto/hash.h>
+#include <linux/inetdevice.h>
+#include <linux/tcp.h>
+
+#include <net/tcp.h>
+#include <net/ipv6.h>
+
+/* Optimized version of tcp_ao_do_lookup(): only for sockets for which
+ * it's known that the keys in ao_info are matching peer's
+ * family/address/VRF/etc.
+ */
+static struct tcp_ao_key *tcp_ao_established_key(struct tcp_ao_info *ao,
+ int sndid, int rcvid)
+{
+ struct tcp_ao_key *key;
+
+ hlist_for_each_entry_rcu(key, &ao->head, node) {
+ if ((sndid >= 0 && key->sndid != sndid) ||
+ (rcvid >= 0 && key->rcvid != rcvid))
+ continue;
+ return key;
+ }
+
+ return NULL;
+}
+
+static int ipv4_prefix_cmp(const struct in_addr *addr1,
+ const struct in_addr *addr2,
+ unsigned int prefixlen)
+{
+ __be32 mask = inet_make_mask(prefixlen);
+ __be32 a1 = addr1->s_addr & mask;
+ __be32 a2 = addr2->s_addr & mask;
+
+ if (a1 == a2)
+ return 0;
+ return memcmp(&a1, &a2, sizeof(a1));
+}
+
+static int __tcp_ao_key_cmp(const struct tcp_ao_key *key,
+ const union tcp_ao_addr *addr, u8 prefixlen,
+ int family, int sndid, int rcvid)
+{
+ if (sndid >= 0 && key->sndid != sndid)
+ return (key->sndid > sndid) ? 1 : -1;
+ if (rcvid >= 0 && key->rcvid != rcvid)
+ return (key->rcvid > rcvid) ? 1 : -1;
+
+ if (family == AF_UNSPEC)
+ return 0;
+ if (key->family != family)
+ return (key->family > family) ? 1 : -1;
+
+ if (family == AF_INET) {
+ if (ntohl(key->addr.a4.s_addr) == INADDR_ANY)
+ return 0;
+ if (ntohl(addr->a4.s_addr) == INADDR_ANY)
+ return 0;
+ return ipv4_prefix_cmp(&key->addr.a4, &addr->a4, prefixlen);
+#if IS_ENABLED(CONFIG_IPV6)
+ } else {
+ if (ipv6_addr_any(&key->addr.a6) || ipv6_addr_any(&addr->a6))
+ return 0;
+ if (ipv6_prefix_equal(&key->addr.a6, &addr->a6, prefixlen))
+ return 0;
+ return memcmp(&key->addr.a6, &addr->a6, sizeof(addr->a6));
+#endif
+ }
+ return -1;
+}
+
+static int tcp_ao_key_cmp(const struct tcp_ao_key *key,
+ const union tcp_ao_addr *addr, u8 prefixlen,
+ int family, int sndid, int rcvid)
+{
+#if IS_ENABLED(CONFIG_IPV6)
+ if (family == AF_INET6 && ipv6_addr_v4mapped(&addr->a6)) {
+ __be32 addr4 = addr->a6.s6_addr32[3];
+
+ return __tcp_ao_key_cmp(key, (union tcp_ao_addr *)&addr4,
+ prefixlen, AF_INET, sndid, rcvid);
+ }
+#endif
+ return __tcp_ao_key_cmp(key, addr, prefixlen, family, sndid, rcvid);
+}
+
+static struct tcp_ao_key *__tcp_ao_do_lookup(const struct sock *sk,
+ const union tcp_ao_addr *addr, int family, u8 prefix,
+ int sndid, int rcvid)
+{
+ struct tcp_ao_key *key;
+ struct tcp_ao_info *ao;
+
+ ao = rcu_dereference_check(tcp_sk(sk)->ao_info,
+ lockdep_sock_is_held(sk));
+ if (!ao)
+ return NULL;
+
+ hlist_for_each_entry_rcu(key, &ao->head, node) {
+ u8 prefixlen = min(prefix, key->prefixlen);
+
+ if (!tcp_ao_key_cmp(key, addr, prefixlen, family, sndid, rcvid))
+ return key;
+ }
+ return NULL;
+}
+
+static struct tcp_ao_info *tcp_ao_alloc_info(gfp_t flags)
+{
+ struct tcp_ao_info *ao;
+
+ ao = kzalloc(sizeof(*ao), flags);
+ if (!ao)
+ return NULL;
+ INIT_HLIST_HEAD(&ao->head);
+
+ return ao;
+}
+
+static void tcp_ao_link_mkt(struct tcp_ao_info *ao, struct tcp_ao_key *mkt)
+{
+ hlist_add_head_rcu(&mkt->node, &ao->head);
+}
+
+static void tcp_ao_key_free_rcu(struct rcu_head *head)
+{
+ struct tcp_ao_key *key = container_of(head, struct tcp_ao_key, rcu);
+
+ tcp_sigpool_release(key->tcp_sigpool_id);
+ kfree(key);
+}
+
+void tcp_ao_destroy_sock(struct sock *sk)
+{
+ struct tcp_ao_info *ao;
+ struct tcp_ao_key *key;
+ struct hlist_node *n;
+
+ ao = rcu_dereference_protected(tcp_sk(sk)->ao_info, 1);
+ tcp_sk(sk)->ao_info = NULL;
+
+ if (!ao)
+ return;
+
+ hlist_for_each_entry_safe(key, n, &ao->head, node) {
+ hlist_del_rcu(&key->node);
+ atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc);
+ call_rcu(&key->rcu, tcp_ao_key_free_rcu);
+ }
+
+ kfree_rcu(ao, rcu);
+}
+
+static bool tcp_ao_can_set_current_rnext(struct sock *sk)
+{
+ /* There aren't current/rnext keys on TCP_LISTEN sockets */
+ if (sk->sk_state == TCP_LISTEN)
+ return false;
+ return true;
+}
+
+static int tcp_ao_verify_ipv4(struct sock *sk, struct tcp_ao_add *cmd,
+ union tcp_ao_addr **addr)
+{
+ struct sockaddr_in *sin = (struct sockaddr_in *)&cmd->addr;
+ struct inet_sock *inet = inet_sk(sk);
+
+ if (sin->sin_family != AF_INET)
+ return -EINVAL;
+
+ /* Currently matching is not performed on port (or port ranges) */
+ if (sin->sin_port != 0)
+ return -EINVAL;
+
+ /* Check prefix and trailing 0's in addr */
+ if (cmd->prefix != 0) {
+ __be32 mask;
+
+ if (ntohl(sin->sin_addr.s_addr) == INADDR_ANY)
+ return -EINVAL;
+ if (cmd->prefix > 32)
+ return -EINVAL;
+
+ mask = inet_make_mask(cmd->prefix);
+ if (sin->sin_addr.s_addr & ~mask)
+ return -EINVAL;
+
+ /* Check that MKT address is consistent with socket */
+ if (ntohl(inet->inet_daddr) != INADDR_ANY &&
+ (inet->inet_daddr & mask) != sin->sin_addr.s_addr)
+ return -EINVAL;
+ } else {
+ if (ntohl(sin->sin_addr.s_addr) != INADDR_ANY)
+ return -EINVAL;
+ }
+
+ *addr = (union tcp_ao_addr *)&sin->sin_addr;
+ return 0;
+}
+
+static int tcp_ao_parse_crypto(struct tcp_ao_add *cmd, struct tcp_ao_key *key)
+{
+ unsigned int syn_tcp_option_space;
+ bool is_kdf_aes_128_cmac = false;
+ struct crypto_ahash *tfm;
+ struct tcp_sigpool hp;
+ void *tmp_key = NULL;
+ int err;
+
+ /* RFC5926, 3.1.1.2. KDF_AES_128_CMAC */
+ if (!strcmp("cmac(aes128)", cmd->alg_name)) {
+ strscpy(cmd->alg_name, "cmac(aes)", sizeof(cmd->alg_name));
+ is_kdf_aes_128_cmac = (cmd->keylen != 16);
+ tmp_key = kmalloc(cmd->keylen, GFP_KERNEL);
+ if (!tmp_key)
+ return -ENOMEM;
+ }
+
+ key->maclen = cmd->maclen ?: 12; /* 12 is the default in RFC5925 */
+
+ /* Check: maclen + tcp-ao header <= (MAX_TCP_OPTION_SPACE - mss
+ * - tstamp - wscale - sackperm),
+ * see tcp_syn_options(), tcp_synack_options(), commit 33ad798c924b.
+ *
+ * In order to allow D-SACK with TCP-AO, the header size should be:
+ * (MAX_TCP_OPTION_SPACE - TCPOLEN_TSTAMP_ALIGNED
+ * - TCPOLEN_SACK_BASE_ALIGNED
+ * - 2 * TCPOLEN_SACK_PERBLOCK) = 8 (maclen = 4),
+ * see tcp_established_options().
+ *
+ * RFC5925, 2.2:
+ * Typical MACs are 96-128 bits (12-16 bytes), but any length
+ * that fits in the header of the segment being authenticated
+ * is allowed.
+ *
+ * RFC5925, 7.6:
+ * TCP-AO continues to consume 16 bytes in non-SYN segments,
+ * leaving a total of 24 bytes for other options, of which
+ * the timestamp consumes 10. This leaves 14 bytes, of which 10
+ * are used for a single SACK block. When two SACK blocks are used,
+ * such as to handle D-SACK, a smaller TCP-AO MAC would be required
+ * to make room for the additional SACK block (i.e., to leave 18
+ * bytes for the D-SACK variant of the SACK option) [RFC2883].
+ * Note that D-SACK is not supportable in TCP MD5 in the presence
+ * of timestamps, because TCP MD5’s MAC length is fixed and too
+ * large to leave sufficient option space.
+ */
+ syn_tcp_option_space = MAX_TCP_OPTION_SPACE;
+ syn_tcp_option_space -= TCPOLEN_TSTAMP_ALIGNED;
+ syn_tcp_option_space -= TCPOLEN_WSCALE_ALIGNED;
+ syn_tcp_option_space -= TCPOLEN_SACKPERM_ALIGNED;
+ if (tcp_ao_len(key) > syn_tcp_option_space) {
+ err = -EMSGSIZE;
+ goto err_kfree;
+ }
+
+ key->keylen = cmd->keylen;
+ memcpy(key->key, cmd->key, cmd->keylen);
+
+ err = tcp_sigpool_start(key->tcp_sigpool_id, &hp);
+ if (err)
+ goto err_kfree;
+
+ tfm = crypto_ahash_reqtfm(hp.req);
+ if (is_kdf_aes_128_cmac) {
+ void *scratch = hp.scratch;
+ struct scatterlist sg;
+
+ memcpy(tmp_key, cmd->key, cmd->keylen);
+ sg_init_one(&sg, tmp_key, cmd->keylen);
+
+ /* Using zero-key of 16 bytes as described in RFC5926 */
+ memset(scratch, 0, 16);
+ err = crypto_ahash_setkey(tfm, scratch, 16);
+ if (err)
+ goto err_pool_end;
+
+ err = crypto_ahash_init(hp.req);
+ if (err)
+ goto err_pool_end;
+
+ ahash_request_set_crypt(hp.req, &sg, key->key, cmd->keylen);
+ err = crypto_ahash_update(hp.req);
+ if (err)
+ goto err_pool_end;
+
+ err |= crypto_ahash_final(hp.req);
+ if (err)
+ goto err_pool_end;
+ key->keylen = 16;
+ }
+
+ err = crypto_ahash_setkey(tfm, key->key, key->keylen);
+ if (err)
+ goto err_pool_end;
+
+ tcp_sigpool_end(&hp);
+ kfree(tmp_key);
+
+ if (tcp_ao_maclen(key) > key->digest_size)
+ return -EINVAL;
+
+ return 0;
+
+err_pool_end:
+ tcp_sigpool_end(&hp);
+err_kfree:
+ kfree(tmp_key);
+ return err;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static int tcp_ao_verify_ipv6(struct sock *sk, struct tcp_ao_add *cmd,
+ union tcp_ao_addr **paddr,
+ unsigned short int *family)
+{
+ struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd->addr;
+ struct in6_addr *addr = &sin6->sin6_addr;
+ u8 prefix = cmd->prefix;
+
+ if (sin6->sin6_family != AF_INET6)
+ return -EINVAL;
+
+ /* Currently matching is not performed on port (or port ranges) */
+ if (sin6->sin6_port != 0)
+ return -EINVAL;
+
+ /* Check prefix and trailing 0's in addr */
+ if (cmd->prefix != 0 && ipv6_addr_v4mapped(addr)) {
+ __be32 addr4 = addr->s6_addr32[3];
+ __be32 mask;
+
+ if (prefix > 32 || ntohl(addr4) == INADDR_ANY)
+ return -EINVAL;
+
+ mask = inet_make_mask(prefix);
+ if (addr4 & ~mask)
+ return -EINVAL;
+
+ /* Check that MKT address is consistent with socket */
+ if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
+ __be32 daddr4 = sk->sk_v6_daddr.s6_addr32[3];
+
+ if (!ipv6_addr_v4mapped(&sk->sk_v6_daddr))
+ return -EINVAL;
+ if ((daddr4 & mask) != addr4)
+ return -EINVAL;
+ }
+
+ *paddr = (union tcp_ao_addr *)&addr->s6_addr32[3];
+ *family = AF_INET;
+ return 0;
+ } else if (cmd->prefix != 0) {
+ struct in6_addr pfx;
+
+ if (ipv6_addr_any(addr) || prefix > 128)
+ return -EINVAL;
+
+ ipv6_addr_prefix(&pfx, addr, prefix);
+ if (ipv6_addr_cmp(&pfx, addr))
+ return -EINVAL;
+
+ /* Check that MKT address is consistent with socket */
+ if (!ipv6_addr_any(&sk->sk_v6_daddr) &&
+ !ipv6_prefix_equal(&sk->sk_v6_daddr, addr, prefix))
+
+ return -EINVAL;
+ } else {
+ if (!ipv6_addr_any(addr))
+ return -EINVAL;
+ }
+
+ *paddr = (union tcp_ao_addr *)addr;
+ return 0;
+}
+#else
+static int tcp_ao_verify_ipv6(struct sock *sk, struct tcp_ao_add *cmd,
+ union tcp_ao_addr **paddr,
+ unsigned short int *family)
+{
+ return -EOPNOTSUPP;
+}
+#endif
+
+static struct tcp_ao_info *setsockopt_ao_info(struct sock *sk)
+{
+ if (sk_fullsock(sk)) {
+ return rcu_dereference_protected(tcp_sk(sk)->ao_info,
+ lockdep_sock_is_held(sk));
+ }
+ return ERR_PTR(-ESOCKTNOSUPPORT);
+}
+
+#define TCP_AO_KEYF_ALL (0)
+
+static struct tcp_ao_key *tcp_ao_key_alloc(struct sock *sk,
+ struct tcp_ao_add *cmd)
+{
+ const char *algo = cmd->alg_name;
+ unsigned int digest_size;
+ struct crypto_ahash *tfm;
+ struct tcp_ao_key *key;
+ struct tcp_sigpool hp;
+ int err, pool_id;
+ size_t size;
+
+ /* Force null-termination of alg_name */
+ cmd->alg_name[ARRAY_SIZE(cmd->alg_name) - 1] = '\0';
+
+ /* RFC5926, 3.1.1.2. KDF_AES_128_CMAC */
+ if (!strcmp("cmac(aes128)", algo))
+ algo = "cmac(aes)";
+
+ /* Full TCP header (th->doff << 2) should fit into scratch area,
+ * see tcp_ao_hash_header().
+ */
+ pool_id = tcp_sigpool_alloc_ahash(algo, 60);
+ if (pool_id < 0)
+ return ERR_PTR(pool_id);
+
+ err = tcp_sigpool_start(pool_id, &hp);
+ if (err)
+ goto err_free_pool;
+
+ tfm = crypto_ahash_reqtfm(hp.req);
+ if (crypto_ahash_alignmask(tfm) > TCP_AO_KEY_ALIGN) {
+ err = -EOPNOTSUPP;
+ goto err_pool_end;
+ }
+ digest_size = crypto_ahash_digestsize(tfm);
+ tcp_sigpool_end(&hp);
+
+ size = sizeof(struct tcp_ao_key) + (digest_size << 1);
+ key = sock_kmalloc(sk, size, GFP_KERNEL);
+ if (!key) {
+ err = -ENOMEM;
+ goto err_free_pool;
+ }
+
+ key->tcp_sigpool_id = pool_id;
+ key->digest_size = digest_size;
+ return key;
+
+err_pool_end:
+ tcp_sigpool_end(&hp);
+err_free_pool:
+ tcp_sigpool_release(pool_id);
+ return ERR_PTR(err);
+}
+
+static int tcp_ao_add_cmd(struct sock *sk, unsigned short int family,
+ sockptr_t optval, int optlen)
+{
+ struct tcp_ao_info *ao_info;
+ union tcp_ao_addr *addr;
+ struct tcp_ao_key *key;
+ struct tcp_ao_add cmd;
+ bool first = false;
+ int ret;
+
+ if (optlen < sizeof(cmd))
+ return -EINVAL;
+
+ ret = copy_struct_from_sockptr(&cmd, sizeof(cmd), optval, optlen);
+ if (ret)
+ return ret;
+
+ if (cmd.keylen > TCP_AO_MAXKEYLEN)
+ return -EINVAL;
+
+ if (cmd.reserved != 0 || cmd.reserved2 != 0)
+ return -EINVAL;
+
+ if (family == AF_INET)
+ ret = tcp_ao_verify_ipv4(sk, &cmd, &addr);
+ else
+ ret = tcp_ao_verify_ipv6(sk, &cmd, &addr, &family);
+ if (ret)
+ return ret;
+
+ if (cmd.keyflags & ~TCP_AO_KEYF_ALL)
+ return -EINVAL;
+
+ if (cmd.set_current || cmd.set_rnext) {
+ if (!tcp_ao_can_set_current_rnext(sk))
+ return -EINVAL;
+ }
+
+ ao_info = setsockopt_ao_info(sk);
+ if (IS_ERR(ao_info))
+ return PTR_ERR(ao_info);
+
+ if (!ao_info) {
+ ao_info = tcp_ao_alloc_info(GFP_KERNEL);
+ if (!ao_info)
+ return -ENOMEM;
+ first = true;
+ } else {
+ /* Check that neither RecvID nor SendID match any
+ * existing key for the peer, RFC5925 3.1:
+ * > The IDs of MKTs MUST NOT overlap where their
+ * > TCP connection identifiers overlap.
+ */
+ if (__tcp_ao_do_lookup(sk, addr, family,
+ cmd.prefix, -1, cmd.rcvid))
+ return -EEXIST;
+ if (__tcp_ao_do_lookup(sk, addr, family,
+ cmd.prefix, cmd.sndid, -1))
+ return -EEXIST;
+ }
+
+ key = tcp_ao_key_alloc(sk, &cmd);
+ if (IS_ERR(key)) {
+ ret = PTR_ERR(key);
+ goto err_free_ao;
+ }
+
+ INIT_HLIST_NODE(&key->node);
+ memcpy(&key->addr, addr, (family == AF_INET) ? sizeof(struct in_addr) :
+ sizeof(struct in6_addr));
+ key->prefixlen = cmd.prefix;
+ key->family = family;
+ key->keyflags = cmd.keyflags;
+ key->sndid = cmd.sndid;
+ key->rcvid = cmd.rcvid;
+
+ ret = tcp_ao_parse_crypto(&cmd, key);
+ if (ret < 0)
+ goto err_free_sock;
+
+ tcp_ao_link_mkt(ao_info, key);
+ if (first) {
+ sk_gso_disable(sk);
+ rcu_assign_pointer(tcp_sk(sk)->ao_info, ao_info);
+ }
+
+ if (cmd.set_current)
+ WRITE_ONCE(ao_info->current_key, key);
+ if (cmd.set_rnext)
+ WRITE_ONCE(ao_info->rnext_key, key);
+ return 0;
+
+err_free_sock:
+ atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc);
+ tcp_sigpool_release(key->tcp_sigpool_id);
+ kfree(key);
+err_free_ao:
+ if (first)
+ kfree(ao_info);
+ return ret;
+}
+
+static int tcp_ao_delete_key(struct sock *sk, struct tcp_ao_info *ao_info,
+ struct tcp_ao_key *key,
+ struct tcp_ao_key *new_current,
+ struct tcp_ao_key *new_rnext)
+{
+ int err;
+
+ hlist_del_rcu(&key->node);
+
+ /* At this moment another CPU could have looked this key up
+ * just before it was unlinked from the list. Wait for an RCU grace
+ * period, after which the key is off-list and can't be looked up
+ * again; the rx path [just before RCU came] might have used it and
+ * set it as current_key (very unlikely).
+ */
+ synchronize_rcu();
+ if (new_current)
+ WRITE_ONCE(ao_info->current_key, new_current);
+ if (new_rnext)
+ WRITE_ONCE(ao_info->rnext_key, new_rnext);
+
+ if (unlikely(READ_ONCE(ao_info->current_key) == key ||
+ READ_ONCE(ao_info->rnext_key) == key)) {
+ err = -EBUSY;
+ goto add_key;
+ }
+
+ atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc);
+ call_rcu(&key->rcu, tcp_ao_key_free_rcu);
+
+ return 0;
+add_key:
+ hlist_add_head_rcu(&key->node, &ao_info->head);
+ return err;
+}
+
+static int tcp_ao_del_cmd(struct sock *sk, unsigned short int family,
+ sockptr_t optval, int optlen)
+{
+ struct tcp_ao_key *key, *new_current = NULL, *new_rnext = NULL;
+ struct tcp_ao_info *ao_info;
+ union tcp_ao_addr *addr;
+ struct tcp_ao_del cmd;
+ int addr_len;
+ __u8 prefix;
+ u16 port;
+ int err;
+
+ if (optlen < sizeof(cmd))
+ return -EINVAL;
+
+ err = copy_struct_from_sockptr(&cmd, sizeof(cmd), optval, optlen);
+ if (err)
+ return err;
+
+ if (cmd.reserved != 0 || cmd.reserved2 != 0)
+ return -EINVAL;
+
+ if (cmd.set_current || cmd.set_rnext) {
+ if (!tcp_ao_can_set_current_rnext(sk))
+ return -EINVAL;
+ }
+
+ ao_info = setsockopt_ao_info(sk);
+ if (IS_ERR(ao_info))
+ return PTR_ERR(ao_info);
+ if (!ao_info)
+ return -ENOENT;
+
+ /* For sockets in TCP_CLOSE it's possible to set keys that aren't
+ * matching the future peer (address/VRF/etc),
+ * tcp_ao_connect_init() will choose a correct matching MKT
+ * if there's any.
+ */
+ if (cmd.set_current) {
+ new_current = tcp_ao_established_key(ao_info, cmd.current_key, -1);
+ if (!new_current)
+ return -ENOENT;
+ }
+ if (cmd.set_rnext) {
+ new_rnext = tcp_ao_established_key(ao_info, -1, cmd.rnext);
+ if (!new_rnext)
+ return -ENOENT;
+ }
+
+ if (family == AF_INET) {
+ struct sockaddr_in *sin = (struct sockaddr_in *)&cmd.addr;
+
+ addr = (union tcp_ao_addr *)&sin->sin_addr;
+ addr_len = sizeof(struct in_addr);
+ port = ntohs(sin->sin_port);
+ } else {
+ struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&cmd.addr;
+ struct in6_addr *addr6 = &sin6->sin6_addr;
+
+ if (ipv6_addr_v4mapped(addr6)) {
+ addr = (union tcp_ao_addr *)&addr6->s6_addr32[3];
+ addr_len = sizeof(struct in_addr);
+ family = AF_INET;
+ } else {
+ addr = (union tcp_ao_addr *)addr6;
+ addr_len = sizeof(struct in6_addr);
+ }
+ port = ntohs(sin6->sin6_port);
+ }
+ prefix = cmd.prefix;
+
+ /* Currently matching is not performed on port (or port ranges) */
+ if (port != 0)
+ return -EINVAL;
+
+ /* We could choose a random present key here for current/rnext
+ * but that's less predictable. Let's be strict and not
+ * allow removing a key that's in use. RFC5925 doesn't
+ * specify how to coordinate key removal, but says:
+ * "It is presumed that an MKT affecting a particular
+ * connection cannot be destroyed during an active connection"
+ */
+ hlist_for_each_entry_rcu(key, &ao_info->head, node) {
+ if (cmd.sndid != key->sndid ||
+ cmd.rcvid != key->rcvid)
+ continue;
+
+ if (family != key->family ||
+ prefix != key->prefixlen ||
+ memcmp(addr, &key->addr, addr_len))
+ continue;
+
+ if (key == new_current || key == new_rnext)
+ continue;
+
+ return tcp_ao_delete_key(sk, ao_info, key,
+ new_current, new_rnext);
+ }
+ return -ENOENT;
+}
+
+static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family,
+ sockptr_t optval, int optlen)
+{
+ struct tcp_ao_key *new_current = NULL, *new_rnext = NULL;
+ struct tcp_ao_info *ao_info;
+ struct tcp_ao_info_opt cmd;
+ bool first = false;
+ int err;
+
+ if (optlen < sizeof(cmd))
+ return -EINVAL;
+
+ err = copy_struct_from_sockptr(&cmd, sizeof(cmd), optval, optlen);
+ if (err)
+ return err;
+
+ if (cmd.set_current || cmd.set_rnext) {
+ if (!tcp_ao_can_set_current_rnext(sk))
+ return -EINVAL;
+ }
+
+ if (cmd.reserved != 0)
+ return -EINVAL;
+
+ ao_info = setsockopt_ao_info(sk);
+ if (IS_ERR(ao_info))
+ return PTR_ERR(ao_info);
+ if (!ao_info) {
+ ao_info = tcp_ao_alloc_info(GFP_KERNEL);
+ if (!ao_info)
+ return -ENOMEM;
+ first = true;
+ }
+
+ /* For sockets in TCP_CLOSED it's possible to set keys that aren't
+ * matching the future peer (address/port/VRF/etc),
+ * tcp_ao_connect_init() will choose a correct matching MKT
+ * if there's any.
+ */
+ if (cmd.set_current) {
+ new_current = tcp_ao_established_key(ao_info, cmd.current_key, -1);
+ if (!new_current) {
+ err = -ENOENT;
+ goto out;
+ }
+ }
+ if (cmd.set_rnext) {
+ new_rnext = tcp_ao_established_key(ao_info, -1, cmd.rnext);
+ if (!new_rnext) {
+ err = -ENOENT;
+ goto out;
+ }
+ }
+
+ ao_info->ao_required = cmd.ao_required;
+ if (new_current)
+ WRITE_ONCE(ao_info->current_key, new_current);
+ if (new_rnext)
+ WRITE_ONCE(ao_info->rnext_key, new_rnext);
+ if (first) {
+ sk_gso_disable(sk);
+ rcu_assign_pointer(tcp_sk(sk)->ao_info, ao_info);
+ }
+ return 0;
+out:
+ if (first)
+ kfree(ao_info);
+ return err;
+}
+
+int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family,
+ sockptr_t optval, int optlen)
+{
+ if (WARN_ON_ONCE(family != AF_INET && family != AF_INET6))
+ return -EAFNOSUPPORT;
+
+ switch (cmd) {
+ case TCP_AO_ADD_KEY:
+ return tcp_ao_add_cmd(sk, family, optval, optlen);
+ case TCP_AO_DEL_KEY:
+ return tcp_ao_del_cmd(sk, family, optval, optlen);
+ case TCP_AO_INFO:
+ return tcp_ao_info_cmd(sk, family, optval, optlen);
+ default:
+ WARN_ON_ONCE(1);
+ return -EINVAL;
+ }
+}
+
+int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen)
+{
+ return tcp_parse_ao(sk, cmd, AF_INET, optval, optlen);
+}
+
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index edc416897ac2..d89591df71cd 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2268,11 +2268,16 @@ const struct inet_connection_sock_af_ops ipv4_specific = {
};
EXPORT_SYMBOL(ipv4_specific);

-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
static const struct tcp_sock_af_ops tcp_sock_ipv4_specific = {
+#ifdef CONFIG_TCP_MD5SIG
.md5_lookup = tcp_v4_md5_lookup,
.calc_md5_hash = tcp_v4_md5_hash_skb,
.md5_parse = tcp_v4_parse_md5_keys,
+#endif
+#ifdef CONFIG_TCP_AO
+ .ao_parse = tcp_v4_parse_ao,
+#endif
};
#endif

@@ -2287,7 +2292,7 @@ static int tcp_v4_init_sock(struct sock *sk)

icsk->icsk_af_ops = &ipv4_specific;

-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
tcp_sk(sk)->af_specific = &tcp_sock_ipv4_specific;
#endif

@@ -2338,6 +2343,7 @@ void tcp_v4_destroy_sock(struct sock *sk)
rcu_assign_pointer(tp->md5sig_info, NULL);
}
#endif
+ tcp_ao_destroy_sock(sk);

/* Clean up a referenced TCP bind bucket. */
if (inet_csk(sk)->icsk_bind_hash)
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 3036a45e8a1e..d283c59df4c1 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -52,4 +52,5 @@ obj-$(subst m,y,$(CONFIG_IPV6)) += inet6_hashtables.o
ifneq ($(CONFIG_IPV6),)
obj-$(CONFIG_NET_UDP_TUNNEL) += ip6_udp_tunnel.o
obj-y += mcast_snoop.o
+obj-$(CONFIG_TCP_AO) += tcp_ao.o
endif
diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c
new file mode 100644
index 000000000000..049ddbabe049
--- /dev/null
+++ b/net/ipv6/tcp_ao.c
@@ -0,0 +1,19 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * INET An implementation of the TCP Authentication Option (TCP-AO).
+ * See RFC5925.
+ *
+ * Authors: Dmitry Safonov <[email protected]>
+ * Francesco Ruggeri <[email protected]>
+ * Salam Noureddine <[email protected]>
+ */
+#include <linux/tcp.h>
+
+#include <net/tcp.h>
+#include <net/ipv6.h>
+
+int tcp_v6_parse_ao(struct sock *sk, int cmd,
+ sockptr_t optval, int optlen)
+{
+ return tcp_parse_ao(sk, cmd, AF_INET6, optval, optlen);
+}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 5a06bcfd6cd1..21e2dc011b23 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -76,16 +76,9 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb);

static const struct inet_connection_sock_af_ops ipv6_mapped;
const struct inet_connection_sock_af_ops ipv6_specific;
-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
static const struct tcp_sock_af_ops tcp_sock_ipv6_specific;
static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific;
-#else
-static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk,
- const struct in6_addr *addr,
- int l3index)
-{
- return NULL;
-}
#endif

/* Helper returning the inet6 address from a given tcp socket.
@@ -239,7 +232,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
if (sk_is_mptcp(sk))
mptcpv6_handle_mapped(sk, true);
sk->sk_backlog_rcv = tcp_v4_do_rcv;
-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
tp->af_specific = &tcp_sock_ipv6_mapped_specific;
#endif

@@ -252,7 +245,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
if (sk_is_mptcp(sk))
mptcpv6_handle_mapped(sk, false);
sk->sk_backlog_rcv = tcp_v6_do_rcv;
-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
tp->af_specific = &tcp_sock_ipv6_specific;
#endif
goto failure;
@@ -768,7 +761,13 @@ static int tcp_v6_md5_hash_skb(char *md5_hash,
memset(md5_hash, 0, 16);
return 1;
}
-
+#else /* CONFIG_TCP_MD5SIG */
+static struct tcp_md5sig_key *tcp_v6_md5_do_lookup(const struct sock *sk,
+ const struct in6_addr *addr,
+ int l3index)
+{
+ return NULL;
+}
#endif

static void tcp_v6_init_req(struct request_sock *req,
@@ -1229,7 +1228,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
if (sk_is_mptcp(newsk))
mptcpv6_handle_mapped(newsk, true);
newsk->sk_backlog_rcv = tcp_v4_do_rcv;
-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
newtp->af_specific = &tcp_sock_ipv6_mapped_specific;
#endif

@@ -1893,11 +1892,16 @@ const struct inet_connection_sock_af_ops ipv6_specific = {
.mtu_reduced = tcp_v6_mtu_reduced,
};

-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
static const struct tcp_sock_af_ops tcp_sock_ipv6_specific = {
+#ifdef CONFIG_TCP_MD5SIG
.md5_lookup = tcp_v6_md5_lookup,
.calc_md5_hash = tcp_v6_md5_hash_skb,
.md5_parse = tcp_v6_parse_md5_keys,
+#endif
+#ifdef CONFIG_TCP_AO
+ .ao_parse = tcp_v6_parse_ao,
+#endif
};
#endif

@@ -1919,11 +1923,16 @@ static const struct inet_connection_sock_af_ops ipv6_mapped = {
.mtu_reduced = tcp_v4_mtu_reduced,
};

-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = {
+#ifdef CONFIG_TCP_MD5SIG
.md5_lookup = tcp_v4_md5_lookup,
.calc_md5_hash = tcp_v4_md5_hash_skb,
.md5_parse = tcp_v6_parse_md5_keys,
+#endif
+#ifdef CONFIG_TCP_AO
+ .ao_parse = tcp_v6_parse_ao,
+#endif
};
#endif

@@ -1938,7 +1947,7 @@ static int tcp_v6_init_sock(struct sock *sk)

icsk->icsk_af_ops = &ipv6_specific;

-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
tcp_sk(sk)->af_specific = &tcp_sock_ipv6_specific;
#endif

--
2.41.0

2023-09-12 18:43:13

by Dmitry Safonov

Subject: [PATCH v11 net-next 11/23] net/tcp: Sign SYN-ACK segments with TCP-AO

Similarly to RST segments, wire SYN-ACKs to TCP-AO.
tcp_rsk_used_ao() is handy here to check if the request socket used AO
and needs a signature on the outgoing segments.

Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: David Ahern <[email protected]>
---
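For reviewers who want the SYN-ACK decision in isolation, here is a minimal
user-space sketch of the branch this patch adds to tcp_make_synack(). It is
not kernel code: the model_* types and functions and the flat key array are
assumptions made for illustration (the kernel walks an hlist and matches on
the peer address too). It only shows the rule that a request which used AO
is either answered with an AO-signed SYN-ACK or not answered at all.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tcp_ao_key, reduced to two fields. */
struct model_ao_key {
	uint8_t sndid;	/* key ID we put into outgoing segments */
	uint8_t maclen;	/* MAC length carried in the TCP-AO option */
};

enum synack_action {
	SYNACK_SEND_UNSIGNED,
	SYNACK_SIGN_AO,
	SYNACK_SIGN_MD5,
	SYNACK_DROP,
};

/* Look a key up by send ID; stands in for ->ao_lookup(sk, ..., keyid, -1). */
static const struct model_ao_key *
model_ao_lookup(const struct model_ao_key *keys, size_t nkeys, uint8_t keyid)
{
	for (size_t i = 0; i < nkeys; i++)
		if (keys[i].sndid == keyid)
			return &keys[i];
	return NULL;
}

/* Decide how to answer a SYN, given what the request socket recorded. */
static enum synack_action
model_synack_action(int req_used_ao, uint8_t keyid, uint8_t maclen,
		    const struct model_ao_key *keys, size_t nkeys,
		    int have_md5_key)
{
	if (req_used_ao) {
		const struct model_ao_key *key;

		key = model_ao_lookup(keys, nkeys, keyid);
		/* No key, or a different MAC length: never reply unsigned. */
		if (!key || key->maclen != maclen)
			return SYNACK_DROP;
		return SYNACK_SIGN_AO;
	}
	/* The MD5 lookup only happens when the request did not use AO. */
	return have_md5_key ? SYNACK_SIGN_MD5 : SYNACK_SEND_UNSIGNED;
}
```

With keys {sndid 100, maclen 12} and {sndid 101, maclen 16}, a request that
recorded keyid 100 and maclen 12 gets an AO-signed SYN-ACK, while keyid 100
with maclen 16 is dropped.
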
include/net/tcp.h | 3 +++
include/net/tcp_ao.h | 6 +++++
net/ipv4/tcp_ao.c | 22 +++++++++++++++++
net/ipv4/tcp_ipv4.c | 1 +
net/ipv4/tcp_output.c | 57 ++++++++++++++++++++++++++++++++++++++-----
net/ipv6/tcp_ao.c | 22 +++++++++++++++++
net/ipv6/tcp_ipv6.c | 1 +
7 files changed, 106 insertions(+), 6 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 5daa2e98e6a3..56f4180443c7 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2179,6 +2179,9 @@ struct tcp_request_sock_ops {
struct request_sock *req,
int sndid, int rcvid);
int (*ao_calc_key)(struct tcp_ao_key *mkt, u8 *key, struct request_sock *sk);
+ int (*ao_synack_hash)(char *ao_hash, struct tcp_ao_key *mkt,
+ struct request_sock *req, const struct sk_buff *skb,
+ int hash_offset, u32 sne);
#endif
#ifdef CONFIG_SYN_COOKIES
__u32 (*cookie_init_seq)(const struct sk_buff *skb,
diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index d26d98f1b048..c922d2e31d08 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -144,6 +144,9 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen);
struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk,
int sndid, int rcvid);
+int tcp_v4_ao_synack_hash(char *ao_hash, struct tcp_ao_key *mkt,
+ struct request_sock *req, const struct sk_buff *skb,
+ int hash_offset, u32 sne);
int tcp_v4_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key,
const struct sock *sk,
__be32 sisn, __be32 disn, bool send);
@@ -178,6 +181,9 @@ int tcp_v6_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key,
const struct sock *sk, const struct sk_buff *skb,
const u8 *tkey, int hash_offset, u32 sne);
int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen);
+int tcp_v6_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key,
+ struct request_sock *req, const struct sk_buff *skb,
+ int hash_offset, u32 sne);
void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb);
void tcp_ao_connect_init(struct sock *sk);
void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index b114f3d901a0..0d8ea381300b 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -568,6 +568,28 @@ int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key,
tkey, hash_offset, sne);
}

+int tcp_v4_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key,
+ struct request_sock *req, const struct sk_buff *skb,
+ int hash_offset, u32 sne)
+{
+ void *hash_buf = NULL;
+ int err;
+
+ hash_buf = kmalloc(tcp_ao_digest_size(ao_key), GFP_ATOMIC);
+ if (!hash_buf)
+ return -ENOMEM;
+
+ err = tcp_v4_ao_calc_key_rsk(ao_key, hash_buf, req);
+ if (err)
+ goto out;
+
+ err = tcp_ao_hash_skb(AF_INET, ao_hash, ao_key, req_to_sk(req), skb,
+ hash_buf, hash_offset, sne);
+out:
+ kfree(hash_buf);
+ return err;
+}
+
struct tcp_ao_key *tcp_v4_ao_lookup_rsk(const struct sock *sk,
struct request_sock *req,
int sndid, int rcvid)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 9a4ffcc965f3..c40da33d988b 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1675,6 +1675,7 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
#ifdef CONFIG_TCP_AO
.ao_lookup = tcp_v4_ao_lookup_rsk,
.ao_calc_key = tcp_v4_ao_calc_key_rsk,
+ .ao_synack_hash = tcp_v4_ao_synack_hash,
#endif
#ifdef CONFIG_SYN_COOKIES
.cookie_init_seq = cookie_v4_init_sequence,
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 44c97e6ddd50..c9d6decef443 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -900,6 +900,7 @@ static unsigned int tcp_synack_options(const struct sock *sk,
unsigned int mss, struct sk_buff *skb,
struct tcp_out_options *opts,
const struct tcp_md5sig_key *md5,
+ const struct tcp_ao_key *ao,
struct tcp_fastopen_cookie *foc,
enum tcp_synack_type synack_type,
struct sk_buff *syn_skb)
@@ -921,6 +922,14 @@ static unsigned int tcp_synack_options(const struct sock *sk,
ireq->tstamp_ok &= !ireq->sack_ok;
}
#endif
+#ifdef CONFIG_TCP_AO
+ if (ao) {
+ opts->options |= OPTION_AO;
+ remaining -= tcp_ao_len(ao);
+ ireq->tstamp_ok &= !ireq->sack_ok;
+ }
+#endif
+ WARN_ON_ONCE(md5 && ao);

/* We always send an MSS option. */
opts->mss = mss;
@@ -3727,6 +3736,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
struct inet_request_sock *ireq = inet_rsk(req);
const struct tcp_sock *tp = tcp_sk(sk);
struct tcp_md5sig_key *md5 = NULL;
+ struct tcp_ao_key *ao_key = NULL;
struct tcp_out_options opts;
struct sk_buff *skb;
int tcp_header_size;
@@ -3777,16 +3787,43 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
tcp_rsk(req)->snt_synack = tcp_skb_timestamp_us(skb);
}

-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
rcu_read_lock();
- md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req));
#endif
+ if (tcp_rsk_used_ao(req)) {
+#ifdef CONFIG_TCP_AO
+ u8 maclen = tcp_rsk(req)->maclen;
+ u8 keyid = tcp_rsk(req)->ao_keyid;
+
+ ao_key = tcp_sk(sk)->af_specific->ao_lookup(sk, req_to_sk(req),
+ keyid, -1);
+ /* If there is no matching key - avoid sending anything,
+ * especially unsigned segments. It could try harder and look up
+ * another peer-matching key, but the peer has requested
+ * ao_keyid (RFC5925 RNextKeyID), so let's keep it simple here.
+ */
+ if (unlikely(!ao_key || tcp_ao_maclen(ao_key) != maclen)) {
+ rcu_read_unlock();
+ skb_dst_drop(skb);
+ kfree_skb(skb);
+ net_warn_ratelimited("TCP-AO: the keyid %u with maclen %u|%u from SYN packet is not present - not sending SYNACK\n",
+ keyid, maclen,
+ ao_key ? tcp_ao_maclen(ao_key) : 0);
+ return NULL;
+ }
+#endif
+ } else {
+#ifdef CONFIG_TCP_MD5SIG
+ md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk,
+ req_to_sk(req));
+#endif
+ }
skb_set_hash(skb, READ_ONCE(tcp_rsk(req)->txhash), PKT_HASH_TYPE_L4);
/* bpf program will be interested in the tcp_flags */
TCP_SKB_CB(skb)->tcp_flags = TCPHDR_SYN | TCPHDR_ACK;
tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5,
- foc, synack_type,
- syn_skb) + sizeof(*th);
+ ao_key, foc, synack_type, syn_skb)
+ + sizeof(*th);

skb_push(skb, tcp_header_size);
skb_reset_transport_header(skb);
@@ -3806,7 +3843,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,

/* RFC1323: The window in SYN & SYN/ACK segments is never scaled. */
th->window = htons(min(req->rsk_rcv_wnd, 65535U));
- tcp_options_write(th, NULL, NULL, &opts, NULL);
+ tcp_options_write(th, NULL, tcp_rsk(req), &opts, ao_key);
th->doff = (tcp_header_size >> 2);
TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS);

@@ -3814,7 +3851,15 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
/* Okay, we have all we need - do the md5 hash if needed */
if (md5)
tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location,
- md5, req_to_sk(req), skb);
+ md5, req_to_sk(req), skb);
+#endif
+#ifdef CONFIG_TCP_AO
+ if (ao_key)
+ tcp_rsk(req)->af_specific->ao_synack_hash(opts.hash_location,
+ ao_key, req, skb,
+ opts.hash_location - (u8 *)th, 0);
+#endif
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
rcu_read_unlock();
#endif

diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c
index c9a6fa84f6ce..99753e12c08c 100644
--- a/net/ipv6/tcp_ao.c
+++ b/net/ipv6/tcp_ao.c
@@ -144,3 +144,25 @@ int tcp_v6_parse_ao(struct sock *sk, int cmd,
{
return tcp_parse_ao(sk, cmd, AF_INET6, optval, optlen);
}
+
+int tcp_v6_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key,
+ struct request_sock *req, const struct sk_buff *skb,
+ int hash_offset, u32 sne)
+{
+ void *hash_buf = NULL;
+ int err;
+
+ hash_buf = kmalloc(tcp_ao_digest_size(ao_key), GFP_ATOMIC);
+ if (!hash_buf)
+ return -ENOMEM;
+
+ err = tcp_v6_ao_calc_key_rsk(ao_key, hash_buf, req);
+ if (err)
+ goto out;
+
+ err = tcp_ao_hash_skb(AF_INET6, ao_hash, ao_key, req_to_sk(req), skb,
+ hash_buf, hash_offset, sne);
+out:
+ kfree(hash_buf);
+ return err;
+}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index c060cd964f91..f57617d2921a 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -845,6 +845,7 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
#ifdef CONFIG_TCP_AO
.ao_lookup = tcp_v6_ao_lookup_rsk,
.ao_calc_key = tcp_v6_ao_calc_key_rsk,
+ .ao_synack_hash = tcp_v6_ao_synack_hash,
#endif
#ifdef CONFIG_SYN_COOKIES
.cookie_init_seq = cookie_v6_init_sequence,
--
2.41.0

2023-09-13 01:15:41

by Dmitry Safonov

Subject: [PATCH v11 net-next 17/23] net/tcp: Add option for TCP-AO to (not) hash header

Provide a setsockopt() key flag that makes TCP-AO exclude TCP options
other than TCP-AO itself from the MAC calculation for peers that match
the key. This is needed for interaction with middleboxes that may
change TCP options, see RFC5925 (9.2).

Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: David Ahern <[email protected]>
---
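To make the effect of TCP_AO_KEYF_EXCLUDE_OPT concrete, below is a small
user-space sketch of which TCP header bytes go into the MAC in each mode.
It is an illustration under assumptions, not the kernel's
tcp_ao_hash_header(): the function name, the MODEL_* constants and the
caller-supplied option offset are all hypothetical. In both modes the TCP
checksum and the MAC field are zeroed before hashing. With the flag set,
only the fixed 20-byte header and the TCP-AO option itself are covered.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_TCP_HDR_LEN	20	/* fixed TCP header without options */
#define MODEL_TCP_CSUM_OFF	16	/* offset of the 16-bit checksum */
#define MODEL_AO_OPT_HDR_LEN	4	/* kind, length, KeyID, RNextKeyID */

/*
 * Build the header part of the MAC input.
 * hdr/hdr_len: full TCP header including options.
 * ao_off/ao_len: position and total length of the TCP-AO option.
 * out must have room for hdr_len bytes.
 * Returns the number of bytes written, or 0 on inconsistent arguments.
 */
static size_t model_ao_mac_input(const uint8_t *hdr, size_t hdr_len,
				 size_t ao_off, size_t ao_len,
				 int exclude_opt, uint8_t *out)
{
	size_t mac_len, n;

	if (hdr_len < MODEL_TCP_HDR_LEN || ao_off < MODEL_TCP_HDR_LEN ||
	    ao_len < MODEL_AO_OPT_HDR_LEN || ao_off + ao_len > hdr_len)
		return 0;
	mac_len = ao_len - MODEL_AO_OPT_HDR_LEN;

	if (exclude_opt) {
		/* Fixed header, then only the TCP-AO option, MAC zeroed. */
		memcpy(out, hdr, MODEL_TCP_HDR_LEN);
		memcpy(out + MODEL_TCP_HDR_LEN, hdr + ao_off,
		       MODEL_AO_OPT_HDR_LEN);
		memset(out + MODEL_TCP_HDR_LEN + MODEL_AO_OPT_HDR_LEN,
		       0, mac_len);
		n = MODEL_TCP_HDR_LEN + ao_len;
	} else {
		/* Whole header with all options, MAC zeroed in place. */
		memcpy(out, hdr, hdr_len);
		memset(out + ao_off + MODEL_AO_OPT_HDR_LEN, 0, mac_len);
		n = hdr_len;
	}
	/* The checksum is never covered by the MAC. */
	memset(out + MODEL_TCP_CSUM_OFF, 0, 2);
	return n;
}
```

A middlebox that rewrites or strips another option (say, timestamps) changes
the output of the default mode but leaves the exclude_opt output untouched,
which is why the flag helps in the RFC5925 (9.2) scenario.
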
include/uapi/linux/tcp.h | 5 +++++
net/ipv4/tcp_ao.c | 8 +++++---
2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index ca7ed18ce67b..3275ade3293a 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -354,6 +354,11 @@ struct tcp_diag_md5sig {
#define TCP_AO_MAXKEYLEN 80

#define TCP_AO_KEYF_IFINDEX (1 << 0) /* L3 ifindex for VRF */
+#define TCP_AO_KEYF_EXCLUDE_OPT (1 << 1) /* "Indicates whether TCP
+ * options other than TCP-AO
+ * are included in the MAC
+ * calculation"
+ */

struct tcp_ao_add { /* setsockopt(TCP_AO_ADD_KEY) */
struct __kernel_sockaddr_storage addr; /* peer's address for the key */
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index a8af93972ee5..ecbf2e217f29 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -562,7 +562,8 @@ int tcp_ao_hash_hdr(unsigned short int family, char *ao_hash,
WARN_ON_ONCE(1);
goto clear_hash;
}
- if (tcp_ao_hash_header(&hp, th, false,
+ if (tcp_ao_hash_header(&hp, th,
+ !!(key->keyflags & TCP_AO_KEYF_EXCLUDE_OPT),
ao_hash, hash_offset, tcp_ao_maclen(key)))
goto clear_hash;
ahash_request_set_crypt(hp.req, NULL, hash_buf, 0);
@@ -610,7 +611,8 @@ int tcp_ao_hash_skb(unsigned short int family,
goto clear_hash;
if (tcp_ao_hash_pseudoheader(family, sk, skb, &hp, skb->len))
goto clear_hash;
- if (tcp_ao_hash_header(&hp, th, false,
+ if (tcp_ao_hash_header(&hp, th,
+ !!(key->keyflags & TCP_AO_KEYF_EXCLUDE_OPT),
ao_hash, hash_offset, tcp_ao_maclen(key)))
goto clear_hash;
if (tcp_sigpool_hash_skb_data(&hp, skb, th->doff << 2))
@@ -1403,7 +1405,7 @@ static struct tcp_ao_info *setsockopt_ao_info(struct sock *sk)
return ERR_PTR(-ESOCKTNOSUPPORT);
}

-#define TCP_AO_KEYF_ALL (0)
+#define TCP_AO_KEYF_ALL (TCP_AO_KEYF_EXCLUDE_OPT)

static struct tcp_ao_key *tcp_ao_key_alloc(struct sock *sk,
struct tcp_ao_add *cmd)
--
2.41.0

2023-09-13 01:43:32

by Dmitry Safonov

Subject: [PATCH v11 net-next 12/23] net/tcp: Verify inbound TCP-AO signed segments

Now there is a common function to verify signatures on TCP segments:
tcp_inbound_hash(). It checks all possible cross-interactions
with MD5 signatures as well as with unsigned segments.

The rules from RFC5925 are:
(1) Any TCP segment can have at most one signature.
(2) TCP connections can't switch between using TCP-MD5 and TCP-AO.
(3) TCP-AO connections can't stop using AO, and unsigned
connections can't suddenly start using AO.

Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: David Ahern <[email protected]>
---
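The three rules above reduce to a small decision table. The sketch below
restates the control flow of tcp_inbound_hash() as a pure user-space
function. The model_* names and the boolean inputs are assumptions made for
illustration: in the kernel these facts come from
tcp_parse_auth_options(), tcp_rsk_used_ao(), tcp_ao_required() and
tcp_md5_do_lookup(), and the two VERIFY results stand for the calls into
tcp_inbound_ao_hash() and tcp_inbound_md5_hash().

```c
enum model_verdict {
	MODEL_PASS,		/* unsigned segment, nothing expected */
	MODEL_DROP_AUTH_HDR,	/* more than one auth option */
	MODEL_DROP_AO_FAILURE,	/* request sock and segment disagree on AO */
	MODEL_DROP_AO_NOTFOUND,	/* AO required, segment is unsigned */
	MODEL_DROP_MD5_NOTFOUND,/* MD5 key configured, segment is unsigned */
	MODEL_VERIFY_AO,	/* go on to check the TCP-AO MAC */
	MODEL_VERIFY_MD5,	/* go on to check the TCP-MD5 hash */
};

struct model_segment {
	int md5_opts;	/* number of TCP-MD5 options in the segment */
	int ao_opts;	/* number of TCP-AO options in the segment */
};

struct model_peer {
	int has_req;	/* segment matched a request socket */
	int req_used_ao;/* that request socket was created from an AO SYN */
	int ao_required;/* an AO key (or ao_required) exists for the peer */
	int md5_key;	/* an MD5 key exists for the peer */
};

static enum model_verdict
model_inbound_hash(const struct model_segment *seg,
		   const struct model_peer *peer)
{
	/* Rule (1): at most one signature per segment. */
	if (seg->md5_opts + seg->ao_opts > 1)
		return MODEL_DROP_AUTH_HDR;

	/* Rule (3): a request socket can't start or stop using AO. */
	if (peer->has_req && (!!peer->req_used_ao != !!seg->ao_opts))
		return MODEL_DROP_AO_FAILURE;

	/* Fast path: unsigned segment. */
	if (!seg->md5_opts && !seg->ao_opts) {
		if (peer->ao_required)
			return MODEL_DROP_AO_NOTFOUND;
		if (peer->md5_key)
			return MODEL_DROP_MD5_NOTFOUND;
		return MODEL_PASS;
	}

	return seg->ao_opts ? MODEL_VERIFY_AO : MODEL_VERIFY_MD5;
}
```

Note that the sketch stops where verification begins: whether the MAC or
hash actually matches is left to the per-method helpers.
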
include/net/dropreason-core.h | 17 ++++
include/net/tcp.h | 53 ++++++++++++-
include/net/tcp_ao.h | 15 ++++
net/ipv4/tcp.c | 39 ++--------
net/ipv4/tcp_ao.c | 143 ++++++++++++++++++++++++++++++++++
net/ipv4/tcp_ipv4.c | 10 +--
net/ipv6/tcp_ao.c | 9 ++-
net/ipv6/tcp_ipv6.c | 11 +--
8 files changed, 250 insertions(+), 47 deletions(-)

diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index 216cde184db1..a01e1860fe25 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -24,6 +24,10 @@
FN(TCP_MD5NOTFOUND) \
FN(TCP_MD5UNEXPECTED) \
FN(TCP_MD5FAILURE) \
+ FN(TCP_AONOTFOUND) \
+ FN(TCP_AOUNEXPECTED) \
+ FN(TCP_AOKEYNOTFOUND) \
+ FN(TCP_AOFAILURE) \
FN(SOCKET_BACKLOG) \
FN(TCP_FLAGS) \
FN(TCP_ZEROWINDOW) \
@@ -162,6 +166,19 @@ enum skb_drop_reason {
* to LINUX_MIB_TCPMD5FAILURE
*/
SKB_DROP_REASON_TCP_MD5FAILURE,
+ /**
+ * @SKB_DROP_REASON_TCP_AONOTFOUND: no TCP-AO hash and one was expected
+ */
+ SKB_DROP_REASON_TCP_AONOTFOUND,
+ /**
+ * @SKB_DROP_REASON_TCP_AOUNEXPECTED: TCP-AO hash is present and it
+ * was not expected.
+ */
+ SKB_DROP_REASON_TCP_AOUNEXPECTED,
+ /** @SKB_DROP_REASON_TCP_AOKEYNOTFOUND: TCP-AO key is unknown */
+ SKB_DROP_REASON_TCP_AOKEYNOTFOUND,
+ /** @SKB_DROP_REASON_TCP_AOFAILURE: TCP-AO hash is wrong */
+ SKB_DROP_REASON_TCP_AOFAILURE,
/**
* @SKB_DROP_REASON_SOCKET_BACKLOG: failed to add skb to socket backlog (
* see LINUX_MIB_TCPBACKLOGDROP)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 56f4180443c7..a81836268245 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1771,7 +1771,7 @@ tcp_md5_do_lookup_any_l3index(const struct sock *sk,
enum skb_drop_reason
tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
const void *saddr, const void *daddr,
- int family, int dif, int sdif);
+ int family, int l3index, const __u8 *hash_location);


#define tcp_twsk_md5_key(twsk) ((twsk)->tw_md5_key)
@@ -1793,7 +1793,7 @@ tcp_md5_do_lookup_any_l3index(const struct sock *sk,
static inline enum skb_drop_reason
tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
const void *saddr, const void *daddr,
- int family, int dif, int sdif)
+ int family, int l3index, const __u8 *hash_location)
{
return SKB_NOT_DROPPED_YET;
}
@@ -2623,4 +2623,53 @@ static inline bool tcp_ao_required(struct sock *sk, const void *saddr,
return false;
}

+/* Called with rcu_read_lock() */
+static inline enum skb_drop_reason
+tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
+ const struct sk_buff *skb,
+ const void *saddr, const void *daddr,
+ int family, int dif, int sdif)
+{
+ const struct tcphdr *th = tcp_hdr(skb);
+ const struct tcp_ao_hdr *aoh;
+ const __u8 *md5_location;
+ int l3index;
+
+ /* Invalid option, or an auth option present more than once */
+ if (tcp_parse_auth_options(th, &md5_location, &aoh))
+ return SKB_DROP_REASON_TCP_AUTH_HDR;
+
+ if (req) {
+ if (tcp_rsk_used_ao(req) != !!aoh)
+ return SKB_DROP_REASON_TCP_AOFAILURE;
+ }
+
+ /* sdif set, means packet ingressed via a device
+ * in an L3 domain and dif is set to the l3mdev
+ */
+ l3index = sdif ? dif : 0;
+
+ /* Fast path: unsigned segments */
+ if (likely(!md5_location && !aoh)) {
+ /* Drop if there's TCP-MD5 or TCP-AO key with any rcvid/sndid
+ * for the remote peer. On TCP-AO established connection
+ * the last key is impossible to remove, so there's
+ * always at least one current_key.
+ */
+ if (tcp_ao_required(sk, saddr, family))
+ return SKB_DROP_REASON_TCP_AONOTFOUND;
+ if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) {
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
+ return SKB_DROP_REASON_TCP_MD5NOTFOUND;
+ }
+ return SKB_NOT_DROPPED_YET;
+ }
+
+ if (aoh)
+ return tcp_inbound_ao_hash(sk, skb, family, req, aoh);
+
+ return tcp_inbound_md5_hash(sk, skb, saddr, daddr, family,
+ l3index, md5_location);
+}
+
#endif /* _TCP_H */
diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index c922d2e31d08..135635203bd7 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -112,6 +112,10 @@ struct tcp6_ao_context {

struct tcp_sigpool;

+#define TCP_AO_ESTABLISHED (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | \
+ TCPF_CLOSE | TCPF_CLOSE_WAIT | \
+ TCPF_LAST_ACK | TCPF_CLOSING)
+
int tcp_ao_hash_skb(unsigned short int family,
char *ao_hash, struct tcp_ao_key *key,
const struct sock *sk, const struct sk_buff *skb,
@@ -127,6 +131,10 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
unsigned int len, struct tcp_sigpool *hp);
void tcp_ao_destroy_sock(struct sock *sk, bool twsk);
void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp);
+enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
+ const struct sk_buff *skb, unsigned short int family,
+ const struct request_sock *req,
+ const struct tcp_ao_hdr *aoh);
struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk,
const union tcp_ao_addr *addr,
int family, int sndid, int rcvid);
@@ -197,6 +205,13 @@ static inline void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
{
}

+static inline enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
+ const struct sk_buff *skb, unsigned short int family,
+ const struct request_sock *req, const struct tcp_ao_hdr *aoh)
+{
+ return SKB_NOT_DROPPED_YET;
+}
+
static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk,
const union tcp_ao_addr *addr, int family, int sndid, int rcvid)
{
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f3de3615f414..8506be193843 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4359,42 +4359,23 @@ EXPORT_SYMBOL(tcp_md5_hash_key);
enum skb_drop_reason
tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
const void *saddr, const void *daddr,
- int family, int dif, int sdif)
+ int family, int l3index, const __u8 *hash_location)
{
- /*
- * This gets called for each TCP segment that arrives
- * so we want to be efficient.
+ /* This gets called for each TCP segment that has a TCP-MD5 option.
* We have 3 drop cases:
* o No MD5 hash and one expected.
* o MD5 hash and we're not expecting one.
* o MD5 hash and its wrong.
*/
- const __u8 *hash_location = NULL;
- struct tcp_md5sig_key *hash_expected;
const struct tcphdr *th = tcp_hdr(skb);
const struct tcp_sock *tp = tcp_sk(sk);
- int genhash, l3index;
+ struct tcp_md5sig_key *key;
u8 newhash[16];
+ int genhash;

- /* sdif set, means packet ingressed via a device
- * in an L3 domain and dif is set to the l3mdev
- */
- l3index = sdif ? dif : 0;
+ key = tcp_md5_do_lookup(sk, l3index, saddr, family);

- hash_expected = tcp_md5_do_lookup(sk, l3index, saddr, family);
- if (tcp_parse_auth_options(th, &hash_location, NULL))
- return SKB_DROP_REASON_TCP_AUTH_HDR;
-
- /* We've parsed the options - do we have a hash? */
- if (!hash_expected && !hash_location)
- return SKB_NOT_DROPPED_YET;
-
- if (hash_expected && !hash_location) {
- NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
- return SKB_DROP_REASON_TCP_MD5NOTFOUND;
- }
-
- if (!hash_expected && hash_location) {
+ if (!key && hash_location) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
return SKB_DROP_REASON_TCP_MD5UNEXPECTED;
}
@@ -4404,14 +4385,10 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
* IPv4-mapped case.
*/
if (family == AF_INET)
- genhash = tcp_v4_md5_hash_skb(newhash,
- hash_expected,
- NULL, skb);
+ genhash = tcp_v4_md5_hash_skb(newhash, key, NULL, skb);
else
- genhash = tp->af_specific->calc_md5_hash(newhash,
- hash_expected,
+ genhash = tp->af_specific->calc_md5_hash(newhash, key,
NULL, skb);
-
if (genhash || memcmp(hash_location, newhash, 16) != 0) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE);
if (family == AF_INET) {
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 0d8ea381300b..4bcbf2d3fe79 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -728,6 +728,149 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
treq->maclen = tcp_ao_maclen(key);
}

+static enum skb_drop_reason
+tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb,
+ unsigned short int family, struct tcp_ao_info *info,
+ const struct tcp_ao_hdr *aoh, struct tcp_ao_key *key,
+ u8 *traffic_key, u8 *phash, u32 sne)
+{
+ u8 maclen = aoh->length - sizeof(struct tcp_ao_hdr);
+ const struct tcphdr *th = tcp_hdr(skb);
+ void *hash_buf = NULL;
+
+ if (maclen != tcp_ao_maclen(key))
+ return SKB_DROP_REASON_TCP_AOFAILURE;
+
+ hash_buf = kmalloc(tcp_ao_digest_size(key), GFP_ATOMIC);
+ if (!hash_buf)
+ return SKB_DROP_REASON_NOT_SPECIFIED;
+
+ /* XXX: make it per-AF callback? */
+ tcp_ao_hash_skb(family, hash_buf, key, sk, skb, traffic_key,
+ (phash - (u8 *)th), sne);
+ if (memcmp(phash, hash_buf, maclen)) {
+ kfree(hash_buf);
+ return SKB_DROP_REASON_TCP_AOFAILURE;
+ }
+ kfree(hash_buf);
+ return SKB_NOT_DROPPED_YET;
+}
+
+enum skb_drop_reason
+tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
+ unsigned short int family, const struct request_sock *req,
+ const struct tcp_ao_hdr *aoh)
+{
+ const struct tcphdr *th = tcp_hdr(skb);
+ u8 *phash = (u8 *)(aoh + 1); /* hash goes just after the header */
+ struct tcp_ao_info *info;
+ enum skb_drop_reason ret;
+ struct tcp_ao_key *key;
+ __be32 sisn, disn;
+ u8 *traffic_key;
+ u32 sne = 0;
+
+ info = rcu_dereference(tcp_sk(sk)->ao_info);
+ if (!info)
+ return SKB_DROP_REASON_TCP_AOUNEXPECTED;
+
+ if (unlikely(th->syn)) {
+ sisn = th->seq;
+ disn = 0;
+ }
+
+ /* Fast-path */
+ /* TODO: fix fastopen and simultaneous open (TCPF_SYN_RECV) */
+ if (likely((1 << sk->sk_state) & (TCP_AO_ESTABLISHED | TCPF_SYN_RECV))) {
+ enum skb_drop_reason err;
+ struct tcp_ao_key *current_key;
+
+ /* Check if this socket's rnext_key matches the keyid in the
+ * packet. If not, look up the key based on the keyid
+ * matching the rcvid in the MKT.
+ */
+ key = READ_ONCE(info->rnext_key);
+ if (key->rcvid != aoh->keyid) {
+ key = tcp_ao_established_key(info, -1, aoh->keyid);
+ if (!key)
+ goto key_not_found;
+ }
+
+ /* Delayed retransmitted SYN */
+ if (unlikely(th->syn && !th->ack))
+ goto verify_hash;
+
+ sne = 0;
+ /* Established socket, traffic keys are cached */
+ traffic_key = rcv_other_key(key);
+ err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key,
+ traffic_key, phash, sne);
+ if (err)
+ return err;
+ current_key = READ_ONCE(info->current_key);
+ /* Key rotation: the peer asks us to use new key (RNext) */
+ if (unlikely(aoh->rnext_keyid != current_key->sndid)) {
+ /* If the key is not found we do nothing. */
+ key = tcp_ao_established_key(info, aoh->rnext_keyid, -1);
+ if (key)
+ /* pairs with tcp_ao_del_cmd */
+ WRITE_ONCE(info->current_key, key);
+ }
+ return SKB_NOT_DROPPED_YET;
+ }
+
+ /* Lookup key based on peer address and keyid.
+ * current_key and rnext_key must not be used on tcp listen
+ * sockets as otherwise:
+ * - request sockets would race on those key pointers
+ * - tcp_ao_del_cmd() allows async key removal
+ */
+ key = tcp_ao_inbound_lookup(family, sk, skb, -1, aoh->keyid);
+ if (!key)
+ goto key_not_found;
+
+ if (th->syn && !th->ack)
+ goto verify_hash;
+
+ if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) {
+ /* Make the initial syn the likely case here */
+ if (unlikely(req)) {
+ sne = 0;
+ sisn = htonl(tcp_rsk(req)->rcv_isn);
+ disn = htonl(tcp_rsk(req)->snt_isn);
+ } else if (unlikely(th->ack && !th->syn)) {
+ /* Possible syncookie packet */
+ sisn = htonl(ntohl(th->seq) - 1);
+ disn = htonl(ntohl(th->ack_seq) - 1);
+ sne = 0;
+ } else if (unlikely(!th->syn)) {
+ /* no way to figure out initial sisn/disn - drop */
+ return SKB_DROP_REASON_TCP_FLAGS;
+ }
+ } else if (sk->sk_state == TCP_SYN_SENT) {
+ disn = info->lisn;
+ if (th->syn || th->rst)
+ sisn = th->seq;
+ else
+ sisn = info->risn;
+ } else {
+ WARN_ONCE(1, "TCP-AO: Unexpected sk_state %d", sk->sk_state);
+ return SKB_DROP_REASON_TCP_AOFAILURE;
+ }
+verify_hash:
+ traffic_key = kmalloc(tcp_ao_digest_size(key), GFP_ATOMIC);
+ if (!traffic_key)
+ return SKB_DROP_REASON_NOT_SPECIFIED;
+ tcp_ao_calc_key_skb(key, traffic_key, skb, sisn, disn, family);
+ ret = tcp_ao_verify_hash(sk, skb, family, info, aoh, key,
+ traffic_key, phash, sne);
+ kfree(traffic_key);
+ return ret;
+
+key_not_found:
+ return SKB_DROP_REASON_TCP_AOKEYNOTFOUND;
+}
+
static int tcp_ao_cache_traffic_keys(const struct sock *sk,
struct tcp_ao_info *ao,
struct tcp_ao_key *ao_key)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index c40da33d988b..96ca1baf0cd6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2197,9 +2197,9 @@ int tcp_v4_rcv(struct sk_buff *skb)
if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
drop_reason = SKB_DROP_REASON_XFRM_POLICY;
else
- drop_reason = tcp_inbound_md5_hash(sk, skb,
- &iph->saddr, &iph->daddr,
- AF_INET, dif, sdif);
+ drop_reason = tcp_inbound_hash(sk, req, skb,
+ &iph->saddr, &iph->daddr,
+ AF_INET, dif, sdif);
if (unlikely(drop_reason)) {
sk_drops_add(sk, skb);
reqsk_put(req);
@@ -2276,8 +2276,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
goto discard_and_relse;
}

- drop_reason = tcp_inbound_md5_hash(sk, skb, &iph->saddr,
- &iph->daddr, AF_INET, dif, sdif);
+ drop_reason = tcp_inbound_hash(sk, NULL, skb, &iph->saddr, &iph->daddr,
+ AF_INET, dif, sdif);
if (drop_reason)
goto discard_and_relse;

diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c
index 99753e12c08c..8b04611c9078 100644
--- a/net/ipv6/tcp_ao.c
+++ b/net/ipv6/tcp_ao.c
@@ -53,11 +53,12 @@ int tcp_v6_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key,
const struct sk_buff *skb,
__be32 sisn, __be32 disn)
{
- const struct ipv6hdr *iph = ipv6_hdr(skb);
- const struct tcphdr *th = tcp_hdr(skb);
+ const struct ipv6hdr *iph = ipv6_hdr(skb);
+ const struct tcphdr *th = tcp_hdr(skb);

- return tcp_v6_ao_calc_key(mkt, key, &iph->saddr, &iph->daddr,
- th->source, th->dest, sisn, disn);
+ return tcp_v6_ao_calc_key(mkt, key, &iph->saddr,
+ &iph->daddr, th->source,
+ th->dest, sisn, disn);
}

int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key,
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index f57617d2921a..39674a5485be 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1781,9 +1781,9 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
struct sock *nsk;

sk = req->rsk_listener;
- drop_reason = tcp_inbound_md5_hash(sk, skb,
- &hdr->saddr, &hdr->daddr,
- AF_INET6, dif, sdif);
+ drop_reason = tcp_inbound_hash(sk, req, skb,
+ &hdr->saddr, &hdr->daddr,
+ AF_INET6, dif, sdif);
if (drop_reason) {
sk_drops_add(sk, skb);
reqsk_put(req);
@@ -1856,8 +1856,8 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
goto discard_and_relse;
}

- drop_reason = tcp_inbound_md5_hash(sk, skb, &hdr->saddr, &hdr->daddr,
- AF_INET6, dif, sdif);
+ drop_reason = tcp_inbound_hash(sk, NULL, skb, &hdr->saddr, &hdr->daddr,
+ AF_INET6, dif, sdif);
if (drop_reason)
goto discard_and_relse;

@@ -2085,6 +2085,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = {
.ao_lookup = tcp_v6_ao_lookup,
.calc_ao_hash = tcp_v4_ao_hash_skb,
.ao_parse = tcp_v6_parse_ao,
+ .ao_calc_key_sk = tcp_v4_ao_calc_key_sk,
#endif
};
#endif
--
2.41.0

2023-09-13 02:17:05

by Eric Dumazet

Subject: Re: [PATCH v11 net-next 11/23] net/tcp: Sign SYN-ACK segments with TCP-AO

On Mon, Sep 11, 2023 at 11:04 PM Dmitry Safonov <[email protected]> wrote:
>
> Similarly to RST segments, wire SYN-ACKs to TCP-AO.
> tcp_rsk_used_ao() is handy here to check if the request socket used AO
> and needs a signature on the outgoing segments.
>
> Co-developed-by: Francesco Ruggeri <[email protected]>
> Signed-off-by: Francesco Ruggeri <[email protected]>
> Co-developed-by: Salam Noureddine <[email protected]>
> Signed-off-by: Salam Noureddine <[email protected]>
> Signed-off-by: Dmitry Safonov <[email protected]>
> Acked-by: David Ahern <[email protected]>
> ---
> include/net/tcp.h | 3 +++
> include/net/tcp_ao.h | 6 +++++
> net/ipv4/tcp_ao.c | 22 +++++++++++++++++
> net/ipv4/tcp_ipv4.c | 1 +
> net/ipv4/tcp_output.c | 57 ++++++++++++++++++++++++++++++++++++++-----
> net/ipv6/tcp_ao.c | 22 +++++++++++++++++
> net/ipv6/tcp_ipv6.c | 1 +
> 7 files changed, 106 insertions(+), 6 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 5daa2e98e6a3..56f4180443c7 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -2179,6 +2179,9 @@ struct tcp_request_sock_ops {
> struct request_sock *req,
> int sndid, int rcvid);
> int (*ao_calc_key)(struct tcp_ao_key *mkt, u8 *key, struct request_sock *sk);
> + int (*ao_synack_hash)(char *ao_hash, struct tcp_ao_key *mkt,
> + struct request_sock *req, const struct sk_buff *skb,
> + int hash_offset, u32 sne);
> #endif
> #ifdef CONFIG_SYN_COOKIES
> __u32 (*cookie_init_seq)(const struct sk_buff *skb,
> diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
> index d26d98f1b048..c922d2e31d08 100644
> --- a/include/net/tcp_ao.h
> +++ b/include/net/tcp_ao.h
> @@ -144,6 +144,9 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
> int tcp_v4_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen);
> struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk,
> int sndid, int rcvid);
> +int tcp_v4_ao_synack_hash(char *ao_hash, struct tcp_ao_key *mkt,
> + struct request_sock *req, const struct sk_buff *skb,
> + int hash_offset, u32 sne);
> int tcp_v4_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key,
> const struct sock *sk,
> __be32 sisn, __be32 disn, bool send);
> @@ -178,6 +181,9 @@ int tcp_v6_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key,
> const struct sock *sk, const struct sk_buff *skb,
> const u8 *tkey, int hash_offset, u32 sne);
> int tcp_v6_parse_ao(struct sock *sk, int cmd, sockptr_t optval, int optlen);
> +int tcp_v6_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key,
> + struct request_sock *req, const struct sk_buff *skb,
> + int hash_offset, u32 sne);
> void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb);
> void tcp_ao_connect_init(struct sock *sk);
> void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
> diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
> index b114f3d901a0..0d8ea381300b 100644
> --- a/net/ipv4/tcp_ao.c
> +++ b/net/ipv4/tcp_ao.c
> @@ -568,6 +568,28 @@ int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key,
> tkey, hash_offset, sne);
> }
>
> +int tcp_v4_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key,
> + struct request_sock *req, const struct sk_buff *skb,
> + int hash_offset, u32 sne)
> +{
> + void *hash_buf = NULL;
> + int err;
> +
> + hash_buf = kmalloc(tcp_ao_digest_size(ao_key), GFP_ATOMIC);
> + if (!hash_buf)
> + return -ENOMEM;
> +
> + err = tcp_v4_ao_calc_key_rsk(ao_key, hash_buf, req);
> + if (err)
> + goto out;
> +
> + err = tcp_ao_hash_skb(AF_INET, ao_hash, ao_key, req_to_sk(req), skb,
> + hash_buf, hash_offset, sne);
> +out:
> + kfree(hash_buf);
> + return err;
> +}
> +
> struct tcp_ao_key *tcp_v4_ao_lookup_rsk(const struct sock *sk,
> struct request_sock *req,
> int sndid, int rcvid)
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 9a4ffcc965f3..c40da33d988b 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1675,6 +1675,7 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
> #ifdef CONFIG_TCP_AO
> .ao_lookup = tcp_v4_ao_lookup_rsk,
> .ao_calc_key = tcp_v4_ao_calc_key_rsk,
> + .ao_synack_hash = tcp_v4_ao_synack_hash,
> #endif
> #ifdef CONFIG_SYN_COOKIES
> .cookie_init_seq = cookie_v4_init_sequence,
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 44c97e6ddd50..c9d6decef443 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -900,6 +900,7 @@ static unsigned int tcp_synack_options(const struct sock *sk,
> unsigned int mss, struct sk_buff *skb,
> struct tcp_out_options *opts,
> const struct tcp_md5sig_key *md5,
> + const struct tcp_ao_key *ao,
> struct tcp_fastopen_cookie *foc,
> enum tcp_synack_type synack_type,
> struct sk_buff *syn_skb)
> @@ -921,6 +922,14 @@ static unsigned int tcp_synack_options(const struct sock *sk,
> ireq->tstamp_ok &= !ireq->sack_ok;
> }
> #endif
> +#ifdef CONFIG_TCP_AO
> + if (ao) {
> + opts->options |= OPTION_AO;
> + remaining -= tcp_ao_len(ao);
> + ireq->tstamp_ok &= !ireq->sack_ok;
> + }
> +#endif
> + WARN_ON_ONCE(md5 && ao);
>
> /* We always send an MSS option. */
> opts->mss = mss;
> @@ -3727,6 +3736,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
> struct inet_request_sock *ireq = inet_rsk(req);
> const struct tcp_sock *tp = tcp_sk(sk);
> struct tcp_md5sig_key *md5 = NULL;
> + struct tcp_ao_key *ao_key = NULL;
> struct tcp_out_options opts;
> struct sk_buff *skb;
> int tcp_header_size;
> @@ -3777,16 +3787,43 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
> tcp_rsk(req)->snt_synack = tcp_skb_timestamp_us(skb);
> }
>
> -#ifdef CONFIG_TCP_MD5SIG
> +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
> rcu_read_lock();
> - md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req));
> #endif
> + if (tcp_rsk_used_ao(req)) {
> +#ifdef CONFIG_TCP_AO
> + u8 maclen = tcp_rsk(req)->maclen;
> + u8 keyid = tcp_rsk(req)->ao_keyid;
> +
> + ao_key = tcp_sk(sk)->af_specific->ao_lookup(sk, req_to_sk(req),
> + keyid, -1);
> + /* If there is no matching key - avoid sending anything,
> + * especially unsigned segments. It could try harder and lookup
> + * for another peer-matching key, but the peer has requested
> + * ao_keyid (RFC5925 RNextKeyID), so let's keep it simple here.
> + */
> + if (unlikely(!ao_key || tcp_ao_maclen(ao_key) != maclen)) {
> + rcu_read_unlock();
> + skb_dst_drop(skb);

This does not look necessary? kfree_skb(skb) should also skb_dst_drop(skb);


> + kfree_skb(skb);
> + net_warn_ratelimited("TCP-AO: the keyid %u with maclen %u|%u from SYN packet is not present - not sending SYNACK\n",
> + keyid, maclen,
> + ao_key ? tcp_ao_maclen(ao_key) : 0);

dereferencing ao_key after rcu_read_unlock() is a bug.


> + return NULL;
> + }
> +#endif
> + } else {
> +#ifdef CONFIG_TCP_MD5SIG
> + md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk,
> + req_to_sk(req));
> +#endif
> + }
> skb_set_hash(skb, READ_ONCE(tcp_rsk(req)->txhash), PKT_HASH_TYPE_L4);
> /* bpf program will be interested in the tcp_flags */
> TCP_SKB_CB(skb)->tcp_flags = TCPHDR_SYN | TCPHDR_ACK;
> tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5,
> - foc, synack_type,
> - syn_skb) + sizeof(*th);
> + ao_key, foc, synack_type, syn_skb)
> + + sizeof(*th);
>
> skb_push(skb, tcp_header_size);
> skb_reset_transport_header(skb);
> @@ -3806,7 +3843,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
>
> /* RFC1323: The window in SYN & SYN/ACK segments is never scaled. */
> th->window = htons(min(req->rsk_rcv_wnd, 65535U));
> - tcp_options_write(th, NULL, NULL, &opts, NULL);
> + tcp_options_write(th, NULL, tcp_rsk(req), &opts, ao_key);
> th->doff = (tcp_header_size >> 2);
> TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS);
>
> @@ -3814,7 +3851,15 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
> /* Okay, we have all we need - do the md5 hash if needed */
> if (md5)
> tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location,
> - md5, req_to_sk(req), skb);
> + md5, req_to_sk(req), skb);
> +#endif
> +#ifdef CONFIG_TCP_AO
> + if (ao_key)
> + tcp_rsk(req)->af_specific->ao_synack_hash(opts.hash_location,
> + ao_key, req, skb,
> + opts.hash_location - (u8 *)th, 0);
> +#endif
> +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
> rcu_read_unlock();
> #endif
>
> diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c
> index c9a6fa84f6ce..99753e12c08c 100644
> --- a/net/ipv6/tcp_ao.c
> +++ b/net/ipv6/tcp_ao.c
> @@ -144,3 +144,25 @@ int tcp_v6_parse_ao(struct sock *sk, int cmd,
> {
> return tcp_parse_ao(sk, cmd, AF_INET6, optval, optlen);
> }
> +
> +int tcp_v6_ao_synack_hash(char *ao_hash, struct tcp_ao_key *ao_key,
> + struct request_sock *req, const struct sk_buff *skb,
> + int hash_offset, u32 sne)
> +{
> + void *hash_buf = NULL;
> + int err;
> +
> + hash_buf = kmalloc(tcp_ao_digest_size(ao_key), GFP_ATOMIC);
> + if (!hash_buf)
> + return -ENOMEM;
> +
> + err = tcp_v6_ao_calc_key_rsk(ao_key, hash_buf, req);
> + if (err)
> + goto out;
> +
> + err = tcp_ao_hash_skb(AF_INET6, ao_hash, ao_key, req_to_sk(req), skb,
> + hash_buf, hash_offset, sne);
> +out:
> + kfree(hash_buf);
> + return err;
> +}
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index c060cd964f91..f57617d2921a 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -845,6 +845,7 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
> #ifdef CONFIG_TCP_AO
> .ao_lookup = tcp_v6_ao_lookup_rsk,
> .ao_calc_key = tcp_v6_ao_calc_key_rsk,
> + .ao_synack_hash = tcp_v6_ao_synack_hash,
> #endif
> #ifdef CONFIG_SYN_COOKIES
> .cookie_init_seq = cookie_v6_init_sequence,
> --
> 2.41.0
>

2023-09-13 03:47:15

by Dmitry Safonov

Subject: Re: [PATCH v11 net-next 11/23] net/tcp: Sign SYN-ACK segments with TCP-AO

On 9/12/23 17:47, Eric Dumazet wrote:
> On Mon, Sep 11, 2023 at 11:04 PM Dmitry Safonov <[email protected]> wrote:
[..]
>> @@ -3777,16 +3787,43 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
>> tcp_rsk(req)->snt_synack = tcp_skb_timestamp_us(skb);
>> }
>>
>> -#ifdef CONFIG_TCP_MD5SIG
>> +#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AO)
>> rcu_read_lock();
>> - md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req));
>> #endif
>> + if (tcp_rsk_used_ao(req)) {
>> +#ifdef CONFIG_TCP_AO
>> + u8 maclen = tcp_rsk(req)->maclen;
>> + u8 keyid = tcp_rsk(req)->ao_keyid;
>> +
>> + ao_key = tcp_sk(sk)->af_specific->ao_lookup(sk, req_to_sk(req),
>> + keyid, -1);
>> + /* If there is no matching key - avoid sending anything,
>> + * especially unsigned segments. It could try harder and lookup
>> + * for another peer-matching key, but the peer has requested
>> + * ao_keyid (RFC5925 RNextKeyID), so let's keep it simple here.
>> + */
>> + if (unlikely(!ao_key || tcp_ao_maclen(ao_key) != maclen)) {
>> + rcu_read_unlock();
>> + skb_dst_drop(skb);
>
> This does not look necessary? kfree_skb(skb) should also skb_dst_drop(skb);

Yeah, it seems not necessary, will drop this.

>
>
>> + kfree_skb(skb);
>> + net_warn_ratelimited("TCP-AO: the keyid %u with maclen %u|%u from SYN packet is not present - not sending SYNACK\n",
>> + keyid, maclen,
>> + ao_key ? tcp_ao_maclen(ao_key) : 0);
>
> dereferencing ao_key after rcu_read_unlock() is a bug.

Thanks for catching, will fix!

--
Dmitry
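
For readers following the review: the second comment concerns the error path, where the warning reads the key's MAC length after rcu_read_unlock(). One possible shape of a fix is to snapshot the value while still inside the read-side section and log only the snapshot afterwards. The sketch below is a userspace illustration, not the fix that went into a later revision; rcu_depth, unsafe_accesses, struct ao_key, key_maclen() and synack_key_usable() are hypothetical stand-ins, and unlike tcp_make_synack() it releases the lock on the success path as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel RCU primitives (hypothetical). */
static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Simplified stand-in for struct tcp_ao_key. */
struct ao_key {
	unsigned char maclen;
};

/* Counts reads of a key made outside the read-side critical section. */
static int unsafe_accesses;

static unsigned char key_maclen(const struct ao_key *key)
{
	if (rcu_depth == 0)
		unsafe_accesses++;
	return key->maclen;
}

/* Returns true if `key` exists and its MAC length matches `want`. */
static bool synack_key_usable(const struct ao_key *key, unsigned char want)
{
	unsigned char have;
	bool usable;

	rcu_read_lock();
	/* Snapshot everything needed for logging while still protected. */
	have = key ? key_maclen(key) : 0;
	usable = key != NULL && have == want;
	rcu_read_unlock();

	/* From here on only the local copy is used, never `key` itself. */
	if (!usable)
		printf("TCP-AO: maclen %u|%u mismatch, not sending SYNACK\n",
		       (unsigned int)want, (unsigned int)have);
	return usable;
}
```

The unsafe_accesses counter exists only so the sketch can check that no key field is read outside the critical section.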

2023-09-14 05:25:33

by Dmitry Safonov

Subject: [PATCH v11 net-next 16/23] net/tcp: Ignore specific ICMPs for TCP-AO connections

Similarly to IPsec, RFC5925 prescribes:
">> A TCP-AO implementation MUST default to ignore incoming ICMPv4
messages of Type 3 (destination unreachable), Codes 2-4 (protocol
unreachable, port unreachable, and fragmentation needed -- ’hard
errors’), and ICMPv6 Type 1 (destination unreachable), Code 1
(administratively prohibited) and Code 4 (port unreachable) intended
for connections in synchronized states (ESTABLISHED, FIN-WAIT-1, FIN-
WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT) that match MKTs."

A selftest (later in the patch series) verifies that an attack using
such ICMPs is not possible in this TCP-AO implementation.

Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: David Ahern <[email protected]>
---
include/net/tcp_ao.h | 10 ++++++-
include/uapi/linux/snmp.h | 1 +
include/uapi/linux/tcp.h | 4 ++-
net/ipv4/proc.c | 1 +
net/ipv4/tcp_ao.c | 58 +++++++++++++++++++++++++++++++++++++++
net/ipv4/tcp_ipv4.c | 7 +++++
net/ipv6/tcp_ipv6.c | 7 +++++
7 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 5c5d16b6f9f9..4c290c647272 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -24,6 +24,7 @@ struct tcp_ao_counters {
atomic64_t pkt_bad;
atomic64_t key_not_found;
atomic64_t ao_required;
+ atomic64_t dropped_icmp;
};

struct tcp_ao_key {
@@ -92,7 +93,8 @@ struct tcp_ao_info {
struct tcp_ao_key *rnext_key;
struct tcp_ao_counters counters;
u32 ao_required :1,
- __unused :31;
+ accept_icmps :1,
+ __unused :30;
__be32 lisn;
__be32 risn;
/* Sequence Number Extension (SNE) are upper 4 bytes for SEQ,
@@ -189,6 +191,7 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
unsigned int len, struct tcp_sigpool *hp);
void tcp_ao_destroy_sock(struct sock *sk, bool twsk);
void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp);
+bool tcp_ao_ignore_icmp(const struct sock *sk, int type, int code);
enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
const struct sk_buff *skb, unsigned short int family,
const struct request_sock *req,
@@ -264,6 +267,11 @@ static inline void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
{
}

+static inline bool tcp_ao_ignore_icmp(const struct sock *sk, int type, int code)
+{
+ return false;
+}
+
static inline enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
const struct sk_buff *skb, unsigned short int family,
const struct request_sock *req, const struct tcp_ao_hdr *aoh)
diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 06ddf4cd295c..47a6b47da66f 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -300,6 +300,7 @@ enum
LINUX_MIB_TCPAOBAD, /* TCPAOBad */
LINUX_MIB_TCPAOKEYNOTFOUND, /* TCPAOKeyNotFound */
LINUX_MIB_TCPAOGOOD, /* TCPAOGood */
+ LINUX_MIB_TCPAODROPPEDICMPS, /* TCPAODroppedIcmps */
__LINUX_MIB_MAX
};

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 3fe0612ec59a..ca7ed18ce67b 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -392,7 +392,8 @@ struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */
set_rnext :1, /* corresponding ::rnext */
ao_required :1, /* don't accept non-AO connects */
set_counters :1, /* set/clear ::pkt_* counters */
- reserved :28; /* must be 0 */
+ accept_icmps :1, /* accept incoming ICMPs */
+ reserved :27; /* must be 0 */
__u16 reserved2; /* padding, must be 0 */
__u8 current_key; /* KeyID to set as Current_key */
__u8 rnext; /* KeyID to set as Rnext_key */
@@ -400,6 +401,7 @@ struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */
__u64 pkt_bad; /* failed verification */
__u64 pkt_key_not_found; /* could not find a key to verify */
__u64 pkt_ao_required; /* segments missing TCP-AO sign */
+ __u64 pkt_dropped_icmp; /* ICMPs that were ignored */
} __attribute__((aligned(8)));

/* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 3f643cd29cfe..5d3c9c96773e 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -302,6 +302,7 @@ static const struct snmp_mib snmp4_net_list[] = {
SNMP_MIB_ITEM("TCPAOBad", LINUX_MIB_TCPAOBAD),
SNMP_MIB_ITEM("TCPAOKeyNotFound", LINUX_MIB_TCPAOKEYNOTFOUND),
SNMP_MIB_ITEM("TCPAOGood", LINUX_MIB_TCPAOGOOD),
+ SNMP_MIB_ITEM("TCPAODroppedIcmps", LINUX_MIB_TCPAODROPPEDICMPS),
SNMP_MIB_SENTINEL
};

diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 2283203f1ac5..a8af93972ee5 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -15,6 +15,7 @@

#include <net/tcp.h>
#include <net/ipv6.h>
+#include <net/icmp.h>

int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
unsigned int len, struct tcp_sigpool *hp)
@@ -44,6 +45,60 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
return 1;
}

+bool tcp_ao_ignore_icmp(const struct sock *sk, int type, int code)
+{
+ bool ignore_icmp = false;
+ struct tcp_ao_info *ao;
+
+ /* RFC5925, 7.8:
+ * >> A TCP-AO implementation MUST default to ignore incoming ICMPv4
+ * messages of Type 3 (destination unreachable), Codes 2-4 (protocol
+ * unreachable, port unreachable, and fragmentation needed -- ’hard
+ * errors’), and ICMPv6 Type 1 (destination unreachable), Code 1
+ * (administratively prohibited) and Code 4 (port unreachable) intended
+ * for connections in synchronized states (ESTABLISHED, FIN-WAIT-1, FIN-
+ * WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT) that match MKTs.
+ */
+ if (sk->sk_family == AF_INET) {
+ if (type != ICMP_DEST_UNREACH)
+ return false;
+ if (code < ICMP_PROT_UNREACH || code > ICMP_FRAG_NEEDED)
+ return false;
+ } else {
+ if (type != ICMPV6_DEST_UNREACH)
+ return false;
+ if (code != ICMPV6_ADM_PROHIBITED && code != ICMPV6_PORT_UNREACH)
+ return false;
+ }
+
+ rcu_read_lock();
+ switch (sk->sk_state) {
+ case TCP_TIME_WAIT:
+ ao = rcu_dereference(tcp_twsk(sk)->ao_info);
+ break;
+ case TCP_SYN_SENT:
+ case TCP_SYN_RECV:
+ case TCP_LISTEN:
+ case TCP_NEW_SYN_RECV:
+ /* RFC5925 specifies to ignore ICMPs *only* on connections
+ * in synchronized states.
+ */
+ rcu_read_unlock();
+ return false;
+ default:
+ ao = rcu_dereference(tcp_sk(sk)->ao_info);
+ }
+
+ if (ao && !ao->accept_icmps) {
+ ignore_icmp = true;
+ __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAODROPPEDICMPS);
+ atomic64_inc(&ao->counters.dropped_icmp);
+ }
+ rcu_read_unlock();
+
+ return ignore_icmp;
+}
+
/* Optimized version of tcp_ao_do_lookup(): only for sockets for which
* it's known that the keys in ao_info are matching peer's
* family/address/VRF/etc.
@@ -1035,6 +1090,7 @@ int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk,
new_ao->lisn = htonl(tcp_rsk(req)->snt_isn);
new_ao->risn = htonl(tcp_rsk(req)->rcv_isn);
new_ao->ao_required = ao->ao_required;
+ new_ao->accept_icmps = ao->accept_icmps;

if (family == AF_INET) {
addr = (union tcp_ao_addr *)&newsk->sk_daddr;
@@ -1741,9 +1797,11 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family,
atomic64_set(&ao_info->counters.pkt_bad, cmd.pkt_bad);
atomic64_set(&ao_info->counters.key_not_found, cmd.pkt_key_not_found);
atomic64_set(&ao_info->counters.ao_required, cmd.pkt_ao_required);
+ atomic64_set(&ao_info->counters.dropped_icmp, cmd.pkt_dropped_icmp);
}

ao_info->ao_required = cmd.ao_required;
+ ao_info->accept_icmps = cmd.accept_icmps;
if (new_current)
WRITE_ONCE(ao_info->current_key, new_current);
if (new_rnext)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 0f702b33fdef..a4aef27b0f72 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -493,6 +493,8 @@ int tcp_v4_err(struct sk_buff *skb, u32 info)
return -ENOENT;
}
if (sk->sk_state == TCP_TIME_WAIT) {
+ /* To increase the counter of ignored icmps for TCP-AO */
+ tcp_ao_ignore_icmp(sk, type, code);
inet_twsk_put(inet_twsk(sk));
return 0;
}
@@ -506,6 +508,11 @@ int tcp_v4_err(struct sk_buff *skb, u32 info)
return 0;
}

+ if (tcp_ao_ignore_icmp(sk, type, code)) {
+ sock_put(sk);
+ return 0;
+ }
+
bh_lock_sock(sk);
/* If too many ICMPs get dropped on busy
* servers this needs to be solved differently.
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 59719c8bc5ac..60c2129890ec 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -395,6 +395,8 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
}

if (sk->sk_state == TCP_TIME_WAIT) {
+ /* To increase the counter of ignored icmps for TCP-AO */
+ tcp_ao_ignore_icmp(sk, type, code);
inet_twsk_put(inet_twsk(sk));
return 0;
}
@@ -405,6 +407,11 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
return 0;
}

+ if (tcp_ao_ignore_icmp(sk, type, code)) {
+ sock_put(sk);
+ return 0;
+ }
+
bh_lock_sock(sk);
if (sock_owned_by_user(sk) && type != ICMPV6_PKT_TOOBIG)
__NET_INC_STATS(net, LINUX_MIB_LOCKDROPPEDICMPS);
--
2.41.0
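
As a reading aid for the patch above, the decision made by tcp_ao_ignore_icmp() can be restated as a pure function over the ICMP type and code, the connection state, and the accept_icmps flag. This is a userspace sketch under stated assumptions, not kernel code: the function name ao_should_ignore_icmp() and its boolean parameters are hypothetical, the constants are redefined locally with the values from the ICMP and ICMPv6 specifications, and the counter updates done by the patch are left out.

```c
#include <stdbool.h>

/* ICMPv4 destination-unreachable type and the codes named in RFC 5925, 7.8 */
enum {
	ICMP_DEST_UNREACH = 3,
	ICMP_PROT_UNREACH = 2,
	ICMP_PORT_UNREACH = 3,
	ICMP_FRAG_NEEDED  = 4,
};

/* ICMPv6 destination-unreachable type and the codes named in RFC 5925, 7.8 */
enum {
	ICMPV6_DEST_UNREACH   = 1,
	ICMPV6_ADM_PROHIBITED = 1,
	ICMPV6_PORT_UNREACH   = 4,
};

/*
 * ipv6:         true for an ICMPv6 message, false for ICMPv4
 * synchronized: true for ESTABLISHED, FIN-WAIT-1/2, CLOSE-WAIT, CLOSING,
 *               LAST-ACK and TIME-WAIT; false for SYN-SENT, SYN-RECV, LISTEN
 * accept_icmps: the per-socket opt-out added by the patch
 */
static bool ao_should_ignore_icmp(bool ipv6, int type, int code,
				  bool synchronized, bool accept_icmps)
{
	if (!ipv6) {
		if (type != ICMP_DEST_UNREACH)
			return false;
		if (code < ICMP_PROT_UNREACH || code > ICMP_FRAG_NEEDED)
			return false;
	} else {
		if (type != ICMPV6_DEST_UNREACH)
			return false;
		if (code != ICMPV6_ADM_PROHIBITED && code != ICMPV6_PORT_UNREACH)
			return false;
	}

	/* The RFC covers only connections in synchronized states. */
	if (!synchronized)
		return false;

	return !accept_icmps;
}
```

With these definitions, an ICMPv4 port-unreachable aimed at an established TCP-AO connection is ignored by default, while the same message for a socket in SYN-SENT, or for a socket that set accept_icmps, is processed as usual.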