Hi,
This is version 8 of TCP-AO support. I base it on master and there
weren't any conflicts on my tentative merge to linux-next.
The good news is that all pre-required patches have merged to
Torvald's/master. Thanks to Herbert, crypto clone-tfm just works on
master for all TCP-AO supported algorithms.
So, this is the first version of the patch set that has only net-related
changes (well, selftests as well, but they'll be upstreamed separately).
In this version, I've finally spent time and written Documentation/ page
on TCP-AO. It has Frequently Asked Questions (FAQ) on RFC 5925 - I found
it very useful to answer those before writing the actual code.
It provides answers to common questions that arise on a quick read of
the RFC as well as how they were answered. There's also a comparison
to the TCP-MD5 option, an evaluation of per-socket vs in-kernel-DB
approaches and a description of uAPI provided.
I hope it will be as useful for reviewing the code as it was for writing.
The most important changes in this version are:
- CONFIG_TCP_AO implies CONFIG_IPV6 != m. I don't feel like that
combination would be useful to anyone and it'd be painful to fix.
- uAPI change in TCP_AO_REPAIR (introduced in version 7): I removed
{snd,rcv}_sne_seq counters. They were just copies of snd_nxt/snd_una.
No reason for polluting uAPI as well as needlessly copying them.
- TCP_AO_MAX_HASH_SIZE is removed and all temporary buffers are
kmalloc()'d. That also saves a couple of bytes for hmac(sha1) and
cmac(aes128) traffic keys as they now are allocated with
exact hash algo's digest_size.
There's an independent patch set for TCP-MD5 to verify segments on twsk:
https://lore.kernel.org/all/[email protected]/T/#u
That may be used to verify TCP-AO segments on twsk as well.
There seem to be more people that connected me off-list asking me about
the status of patches and when I expect them to merge. Cc'ing more
interested parties here (ping me directly if you don't want to be in
copy). It would be helpful if you provide your reviews and tested-by's.
As far as I'm aware, version 7 was ported to RHEL, so now there are
probably more downstream kernels with TCP-AO support.
Also available as a git branch for pulling:
https://github.com/0x7f454c46/linux/tree/tcp-ao-v8
And another branch with selftests, that will be sent later separately:
https://github.com/0x7f454c46/linux/tree/tcp-ao-v8-with-selftests
Thanks for your time and reviews,
Dmitry
--- Changelog ---
Changes from v7:
- Fixed copy'n'paste typo in unsigned-md5.c selftest output
- Fix build error in tcp_v6_send_reset() (kernel test robot <[email protected]>)
- Make CONFIG_TCP_AO imply IPV6 != m
- Cleanup EXPORT_SYMBOL*() as they aren't needed with IPV6 != m
- Used scratch area instead of on-stack buffer for scatter-gather list
in tcp_v{4,6}_ao_calc_key(). Fixes CONFIG_VMAP_STACK=y + CONFIG_DEBUG_SG=y
- Allocated digest_size'd buffers for traffic keys in tcp_ao_key instead
of maximum-sized buffers of TCP_AO_MAX_HASH_SIZE. That will save
little space per key and also potentially allow algorithms with
digest size > TCP_AO_MAX_HASH_SIZE.
- Removed TCP_AO_MAX_HASH_SIZE and used kmalloc(GFP_ATOMIC) instead of
on-stack hash buffer.
- Don't treat fd=0 as invalid in selftests
- Make TCP-AO selftests work with CONFIG_CRYPTO_FIPS=y
- Don't tcp_ao_compute_sne() for snd_sne on twsk: it's redundant as
no data can be sent on twsk
- Get rid of {snd,rcv}_sne_seq: use snd_nxt/snd_una or rcv_nxt instead
- {rcv,snd}_sne and tcp_ao_compute_sne() now are introduced in
"net/tcp: Add TCP-AO SNE support" patch
- trivial copy_to_sockptr() fixup for tcp_ao_get_repair() - it could
try copying bigger struct than the kernel one (embarrassing!)
- Added Documentation/networking/tcp_ao.rst that describes:
uAPI, has FAQ on RFC 5925 and has implementation details of Linux TCP-AO
Changes from v6:
- Some more trivial build warnings fixups (kernel test robot <[email protected]>)
- Added TCP_AO_REPAIR setsockopt(), getsockopt()
- Allowed TCP_AO_* setsockopts if (tp->repair) is on
- Added selftests for TCP_AO_REPAIR, that also check incorrect
ISNs/SNEs, which result in a broken TCP-AO connection - that verifies
that both Initial Sequence Numbers and Sequence Number Extension are
part of MAC generation
- Using TCP_AO_REPAIR added a selftest for SEQ numbers rollover,
checking that SNE was incremented, connection is alive post-rolloever
and no TCP segments with a wrong signature arrived
- Wrote a selftest for RST segments: both active reset (goes through
transmit_skb()) and passive reset (goes through tcp_v{4,6}_send_reset()).
- Refactored and made readable tcp_v{4,6}_send_reset(), also adding
support for TCP_LISTEN/TCP_NEW_SYN_RECV
- Dropped per-CPU ahash requests allocations in favor of Herbert's
clone-tfm crypto API
- Added Donald Cassidy to Cc as he's interested in getting it into RHEL.
Version 6: https://lore.kernel.org/all/[email protected]/T/#u
iperf[3] benchmarks for version 6:
v6.4-rc1 TCP-AO-v6
TCP 43.9 Gbits/sec 43.5 Gbits/sec
TCP-MD5 2.20 Gbits/sec 2.25 Gbits/sec
TCP-AO(hmac(sha1)) 2.53 Gbits/sec
TCP-AO(hmac(sha512)) 1.67 Gbits/sec
TCP-AO(hmac(sha384)) 1.77 Gbits/sec
TCP-AO(hmac(sha224)) 1.29 Gbits/sec
TCP-AO(hmac(sha3-512)) 481 Mbits/sec
TCP-AO(hmac(md5)) 2.07 Gbits/sec
TCP-AO(hmac(rmd160)) 1.01 Gbits/sec
TCP-AO(cmac(aes128)) 2.11 Gbits/sec
Changes from v5:
- removed check for TCP_AO_KEYF_IFINDEX in delete command:
VRF might have been destroyed, there still needs to be a way to delete
keys that were bound to that l3intf (should tcp_v{4,6}_parse_md5_keys()
avoid the same check as well?)
- corrected copy'n'paste typo in tcp_ao_info_cmd() (assign ao_info->rnext_key)
- simplified a bit tcp_ao_copy_mkts_to_user(); added more UAPI checks
for getsockopt(TCP_AO_GET_KEYS)
- More UAPI selftests in setsockopt-closed: 29 => 120
- ported TCP-AO patches on Herbert's clone-tfm changes
- adjusted iperf patch for TCP-AO UAPI changes from version 5
- added measures for TCP-AO with tcp_sigpool & clone_tfm backends
Version 5: https://lore.kernel.org/all/[email protected]/T/#u
Changes from v4:
- Renamed tcp_ao_matched_key() => tcp_ao_established_key()
- Missed `static` in function definitions
(kernel test robot <[email protected]>)
- Fixed CONFIG_IPV6=m build
- Unexported tcp_md5_*_sigpool() functions
- Cleaned up tcp_ao.h: undeclared tcp_ao_cache_traffic_keys(),
tcp_v4_ao_calc_key_skb(); removed tcp_v4_inbound_ao_hash()
- Marked "net/tcp: Prepare tcp_md5sig_pool for TCP-AO" as a [draft] patch
- getsockopt() now returns TCP-AO per-key counters
- Another getsockopt() now returns per-ao_info stats: counters
and accept_icmps flag state
- Wired up getsockopt() returning counters to selftests
- Fixed a porting mistake: TCP-AO hash in some cases was written in TCP
header without accounting for MAC length of the key, rewritting skb
shared info
- Fail adding a key with L3 ifindex when !TCP_AO_KEYF_IFINDEX, instead
of ignoring tcpa_ifindex (stricter UAPI check)
- Added more test-cases to setsockopt-closed.c selftest
- tcp_ao_hash_skb_data() was a copy'n'paste of tcp_md5_hash_skb_data()
share it now under tcp_sigpool_hash_skb_data()
- tcp_ao_mkt_overlap_v{4,6}() deleted as they just re-invented
tcp_ao_do_lookup(). That fixes an issue with multiple IPv4-mapped-IPv6
keys for different peers on a listening socket.
- getsockopt() now is tested to return correct VRF number for a key
- TCP-AO and TCP-MD5 interraction in non/default VRFs: added +19 selftests
made them SKIP when CONFIG_VRF=n
- unsigned-md5 selftests now checks both scenarios:
(1) adding TCP-AO key _after_ TCP-MD5 key
(2) adding TCP-MD5 key _after_ TCP-AO key
- Added a ratelimited warning if TCP-AO key.ifindex doesn't match
sk->sk_bound_dev_if - that will warn a user for potential VRF issues
- tcp_v{4,6}_parse_md5_keys() now allows adding TCP-MD5 key with
ifindex=0 and TCP_MD5SIG_FLAG_IFINDEX together with TCP-AO key from
another VRF
- Add TCP_AO_CMDF_AO_REQUIRED, which makes a socket TCP-AO only,
rejecting TCP-MD5 keys or any unsigned TCP segments
- Remove `tcpa_' prefix for UAPI structure members
- UAPI cleanup: I've separated & renamed per-socket settings
(such as ao_info flags + current/rnext set) from per-key changes:
TCP_AO => TCP_AO_ADD_KEY
TCP_AO_DEL => TCP_AO_DEL_KEY
TCP_AO_GET => TCP_AO_GET_KEYS
TCP_AO_MOD => TCP_AO_INFO, the structure is now valid for both
getsockopt() and setsockopt().
- tcp_ao_current_rnext() was split up in order to fail earlier when
sndid/rcvid specified can't be set, before anything was changed in ao_info
- fetch current_key before dumping TCP-AO keys in getsockopt(TCP_AO_GET_KEYS):
it may race with changing current_key by RX, which in result might
produce a dump with no current_key for userspace.
- instead of TCP_AO_CMDF_* flags, used bitfileds: the flags weren't
shared between all TCP_AO_{ADD,GET,DEL}_KEY{,S}, so bitfields are more
descriptive here
- use READ_ONCE()/WRITE_ONCE() for current_key and rnext_key more
consistently; document in comment the rules for accessing them
- selftests: check all setsockopts()/getsockopts() support extending
option structs
Version 4: https://lore.kernel.org/all/[email protected]/T/#u
Changes from v3:
- TCP_MD5 dynamic static key enable/disable patches merged separately [4]
- crypto_pool patches were nacked [5], so instead this patch set extends
TCP-MD5-sigpool to be used for TCP-AO as well as for TCP-MD5
- Added missing `static' for tcp_v6_ao_calc_key()
(kernel test robot <[email protected]>)
- Removed CONFIG_TCP_AO default=y and added "If unsure, say N."
- Don't leak ao_info and don't create an unsigned TCP socket if there was
a TCP-AO key during handshake, but it was removed from listening socket
while the connection was being established
- Migrate to use static_key_fast_inc_not_disabled() and check return
code of static_branch_inc()
- Change some return codes to EAFNOSUPPORT for error-pathes where
family is neither AF_INET nor AF_INET6
- setsockopt()s on a closed/listen socket might have created stray ao_info,
remove it if connect() is called with a correct TCP-MD5 key, the same
for the reverse situation: remove md5sig_info straight away from the
socket if it's going to be TCP-AO connection
- IPv4-mapped-IPv6 addresses + selftest in fcnal-test.sh (by Salam)
- fix using uninitialized sisn/disn from stack - it would only make
non-SYN packets fail verification on a listen socket, which are not
expected anyway (kernel test robot <[email protected]>)
- implicit padding in UAPI TCP-AO structures converted to explicit
(spotted-by David Laight)
- Some selftests missed zero-initializers for uapi structs on stack
- Removed tcp_ao_do_lookup_rcvid() and tcp_ao_do_lookup_sndid() in
favor of unified tcp_ao_matched_key()
- Disallowed setting current/rnext keys on listen sockets - that wasn't
supported and didn't affect anything, cleanup for the UAPI
- VRFs support for TCP-AO
Version 3: https://lore.kernel.org/all/[email protected]/T/#u
Changes from v2:
- Added more missing `static' declarations for local functions
(kernel test robot <[email protected]>)
- Building now with CONFIG_TCP_AO=n and CONFIG_TCP_MD5SIG=n
(kernel test robot <[email protected]>)
- Now setsockopt(TCP_AO) is allowed when it's TCP_LISTEN or TCP_CLOSE
state OR the key added is not the first key on a socket (by Salam)
- CONFIG_TCP_AO does not depend on CONFIG_TCP_MD5SIG anymore
- Don't leak tcp_md5_needed static branch counter when TCP-MD5 key
is modified/changed
- TCP-AO lookups are dynamically enabled/disabled with static key when
there is ao_info in the system (and when it is destroyed)
- Wired SYN cookies up to TCP-AO (by Salam)
- Fix verification for possible re-transmitted SYN packets (by Salam)
- use sockopt_lock_sock() instead of lock_sock()
(from v6.1 rebase, commit d51bbff2aba7)
- use sockptr_t in getsockopt(TCP_AO_GET)
(from v6.1 rebase, commit 34704ef024ae)
- Fixed reallocating crypto_pool's scratch area by IPI while
crypto_pool_get() was get by another CPU
- selftests on older kernels (or with CONFIG_TCP_AO=n) should exit with
SKIP, not FAIL (Shuah Khan <[email protected]>)
- selftests that check interaction between TCP-AO and TCP-MD5 now
SKIP when CONFIG_TCP_MD5SIG=n
- Measured the performance of different hashing algorithms for TCP-AO
and compare with TCP-MD5 performance. This is done with hacky patches
to iperf (see [3]). At this moment I've done it in qemu/KVM with CPU
affinities set on Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz.
No performance degradation was noticed before/after patches, but given
the measures were done in a VM, without measuring it on a physical dut
it only gives a hint of relative speed for different hash algorithms
with TCP-AO. Here are results, averaging on 30 measures each:
TCP: 3.51Gbits/sec
TCP-MD5: 1.12Gbits/sec
TCP-AO(HMAC(SHA1)): 1.53Gbits/sec
TCP-AO(CMAC(AES128)): 621Mbits/sec
TCP-AO(HMAC(SHA512)): 1.21Gbits/sec
TCP-AO(HMAC(SHA384)): 1.20Gbits/sec
TCP-AO(HMAC(SHA224)): 961Mbits/sec
TCP-AO(HMAC(SHA3-512)): 157Mbits/sec
TCP-AO(HMAC(RMD160)): 659Mbits/sec
TCP-AO(HMAC(MD5): 1.12Gbits/sec
(the last one is just for fun, but may make sense as it provides
the same security as TCP-MD5, but allows multiple keys and a mechanism
to change them from RFC5925)
Version 2: https://lore.kernel.org/all/[email protected]/T/#u
Changes from v1:
- Building now with CONFIG_IPV6=n (kernel test robot <[email protected]>)
- Added missing static declarations for local functions
(kernel test robot <[email protected]>)
- Addressed static analyzer and review comments by Dan Carpenter
(thanks, they were very useful!)
- Fix elif without defined() for !CONFIG_TCP_AO
- Recursively build selftests/net/tcp_ao (Shuah Khan), patches in:
https://lore.kernel.org/all/[email protected]/T/#u
- Don't leak crypto_pool reference when TCP-MD5 key is modified/changed
- Add TCP-AO support for nettest.c and fcnal-test.sh
(will be used for VRF testing in later versions)
Comparison between Leonard proposal and this (overview):
https://lore.kernel.org/all/[email protected]/T/#u
Version 1: https://lore.kernel.org/all/[email protected]/T/#u
This patchset implements the TCP-AO option as described in RFC5925. There
is a request from industry to move away from TCP-MD5SIG and it seems the time
is right to have a TCP-AO upstreamed. This TCP option is meant to replace
the TCP MD5 option and address its shortcomings. Specifically, it provides
more secure hashing, key rotation and support for long-lived connections
(see the summary of TCP-AO advantages over TCP-MD5 in (1.3) of RFC5925).
The patch series starts with six patches that are not specific to TCP-AO
but implement a general crypto facility that we thought is useful
to eliminate code duplication between TCP-MD5SIG and TCP-AO as well as other
crypto users. These six patches are being submitted separately in
a different patchset [1]. Including them here will show better the gain
in code sharing. Next are 18 patches that implement the actual TCP-AO option,
followed by patches implementing selftests.
The patch set was written as a collaboration of three authors (in alphabetical
order): Dmitry Safonov, Francesco Ruggeri and Salam Noureddine. Additional
credits should be given to Prasad Koya, who was involved in early prototyping
a few years back. There is also a separate submission done by Leonard Crestez
whom we thank for his efforts getting an implementation of RFC5925 submitted
for review upstream [2]. This is an independent implementation that makes
different design decisions.
For example, we chose a similar design to the TCP-MD5SIG implementation and
used setsockopts to program per-socket keys, avoiding the extra complexity
of managing a centralized key database in the kernel. A centralized database
in the kernel has dubious benefits since it doesn’t eliminate per-socket
setsockopts needed to specify which sockets need TCP-AO and what are the
currently preferred keys. It also complicates traffic key caching and
preventing deletion of in-use keys.
In this implementation, a centralized database of keys can be thought of
as living in user space and user applications would have to program those
keys on matching sockets. On the server side, the user application programs
keys (MKTS in TCP-AO nomenclature) on the listening socket for all peers that
are expected to connect. Prefix matching on the peer address is supported.
When a peer issues a successful connect, all the MKTs matching the IP address
of the peer are copied to the newly created socket. On the active side,
when a connect() is issued all MKTs that do not match the peer are deleted
from the socket since they will never match the peer. This implementation
uses three setsockopt()s for adding, deleting and modifying keys on a socket.
All three setsockopt()s have extensive sanity checks that prevent
inconsistencies in the keys on a given socket. A getsockopt() is provided
to get key information from any given socket.
Few things to note about this implementation:
- Traffic keys are cached for established connections avoiding the cost of
such calculation for each packet received or sent.
- Great care has been taken to avoid deleting in-use MKTs
as required by the RFC.
- Any crypto algorithm supported by the Linux kernel can be used
to calculate packet hashes.
- Fastopen works with TCP-AO but hasn’t been tested extensively.
- Tested for interop with other major networking vendors (on linux-4.19),
including testing for key rotation and long lived connections.
[1]: https://lore.kernel.org/all/[email protected]/
[2]: https://lore.kernel.org/all/[email protected]/
[3]: https://github.com/0x7f454c46/iperf/tree/tcp-md5-ao
[4]: https://lore.kernel.org/all/166995421700.16716.17446147162780881407.git-patchwork-notify@kernel.org/T/#u
[5]: https://lore.kernel.org/all/[email protected]/T/#u
[6]: https://lore.kernel.org/all/[email protected]/T/#u
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Bob Gilligan <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: David Laight <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Donald Cassidy <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Francesco Ruggeri <[email protected]>
Cc: Gaillardetz, Dominik <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Ivan Delalande <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Leonard Crestez <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Salam Noureddine <[email protected]>
Cc: Tetreault, Francois <[email protected]>
Cc: [email protected]
Cc: [email protected]
Dmitry Safonov (23):
net/tcp: Prepare tcp_md5sig_pool for TCP-AO
net/tcp: Add TCP-AO config and structures
net/tcp: Introduce TCP_AO setsockopt()s
net/tcp: Prevent TCP-MD5 with TCP-AO being set
net/tcp: Calculate TCP-AO traffic keys
net/tcp: Add TCP-AO sign to outgoing packets
net/tcp: Add tcp_parse_auth_options()
net/tcp: Add AO sign to RST packets
net/tcp: Add TCP-AO sign to twsk
net/tcp: Wire TCP-AO to request sockets
net/tcp: Sign SYN-ACK segments with TCP-AO
net/tcp: Verify inbound TCP-AO signed segments
net/tcp: Add TCP-AO segments counters
net/tcp: Add TCP-AO SNE support
net/tcp: Add tcp_hash_fail() ratelimited logs
net/tcp: Ignore specific ICMPs for TCP-AO connections
net/tcp: Add option for TCP-AO to (not) hash header
net/tcp: Add TCP-AO getsockopt()s
net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs)
net/tcp: Add static_key for TCP-AO
net/tcp: Wire up l3index to TCP-AO
net/tcp: Add TCP_AO_REPAIR
Documentation/tcp: Add TCP-AO documentation
Documentation/networking/index.rst | 1 +
Documentation/networking/tcp_ao.rst | 422 +++++
include/linux/sockptr.h | 23 +
include/linux/tcp.h | 30 +-
include/net/dropreason-core.h | 30 +
include/net/tcp.h | 224 ++-
include/net/tcp_ao.h | 349 ++++
include/uapi/linux/snmp.h | 5 +
include/uapi/linux/tcp.h | 105 ++
net/ipv4/Kconfig | 17 +
net/ipv4/Makefile | 2 +
net/ipv4/proc.c | 5 +
net/ipv4/syncookies.c | 4 +
net/ipv4/tcp.c | 236 +--
net/ipv4/tcp_ao.c | 2367 +++++++++++++++++++++++++++
net/ipv4/tcp_input.c | 97 +-
net/ipv4/tcp_ipv4.c | 333 +++-
net/ipv4/tcp_minisocks.c | 50 +-
net/ipv4/tcp_output.c | 232 ++-
net/ipv4/tcp_sigpool.c | 357 ++++
net/ipv6/Makefile | 1 +
net/ipv6/syncookies.c | 5 +
net/ipv6/tcp_ao.c | 167 ++
net/ipv6/tcp_ipv6.c | 334 +++-
24 files changed, 5026 insertions(+), 370 deletions(-)
create mode 100644 Documentation/networking/tcp_ao.rst
create mode 100644 include/net/tcp_ao.h
create mode 100644 net/ipv4/tcp_ao.c
create mode 100644 net/ipv4/tcp_sigpool.c
create mode 100644 net/ipv6/tcp_ao.c
base-commit: ccff6d117d8dc8d8d86e8695a75e5f8b01e573bf
--
2.41.0
Similarly to IPsec, RFC5925 prescribes:
">> A TCP-AO implementation MUST default to ignore incoming ICMPv4
messages of Type 3 (destination unreachable), Codes 2-4 (protocol
unreachable, port unreachable, and fragmentation needed -- ’hard
errors’), and ICMPv6 Type 1 (destination unreachable), Code 1
(administratively prohibited) and Code 4 (port unreachable) intended
for connections in synchronized states (ESTABLISHED, FIN-WAIT-1, FIN-
WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT) that match MKTs."
A selftest (later in patch series) verifies that this attack is not
possible in this TCP-AO implementation.
Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp_ao.h | 10 ++++++-
include/uapi/linux/snmp.h | 1 +
include/uapi/linux/tcp.h | 4 ++-
net/ipv4/proc.c | 1 +
net/ipv4/tcp_ao.c | 61 +++++++++++++++++++++++++++++++++++++++
net/ipv4/tcp_ipv4.c | 5 ++++
net/ipv6/tcp_ipv6.c | 4 +++
7 files changed, 84 insertions(+), 2 deletions(-)
diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 801dca405df7..cb16af2a9f39 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -24,6 +24,7 @@ struct tcp_ao_counters {
atomic64_t pkt_bad;
atomic64_t key_not_found;
atomic64_t ao_required;
+ atomic64_t dropped_icmp;
};
struct tcp_ao_key {
@@ -93,7 +94,8 @@ struct tcp_ao_info {
struct tcp_ao_key *rnext_key;
struct tcp_ao_counters counters;
u32 ao_required :1,
- __unused :31;
+ accept_icmps :1,
+ __unused :30;
__be32 lisn;
__be32 risn;
/* Sequence Number Extension (SNE) are upper 4 bytes for SEQ,
@@ -188,6 +190,7 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
unsigned int len, struct tcp_sigpool *hp);
void tcp_ao_destroy_sock(struct sock *sk, bool twsk);
void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp);
+bool tcp_ao_ignore_icmp(struct sock *sk, int type, int code);
enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
const struct sk_buff *skb, unsigned short int family,
const struct request_sock *req,
@@ -266,6 +269,11 @@ static inline void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
{
}
+static inline bool tcp_ao_ignore_icmp(struct sock *sk, int type, int code)
+{
+ return false;
+}
+
static inline enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
const struct sk_buff *skb, unsigned short int family,
const struct request_sock *req, const struct tcp_ao_hdr *aoh)
diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 06ddf4cd295c..47a6b47da66f 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -300,6 +300,7 @@ enum
LINUX_MIB_TCPAOBAD, /* TCPAOBad */
LINUX_MIB_TCPAOKEYNOTFOUND, /* TCPAOKeyNotFound */
LINUX_MIB_TCPAOGOOD, /* TCPAOGood */
+ LINUX_MIB_TCPAODROPPEDICMPS, /* TCPAODroppedIcmps */
__LINUX_MIB_MAX
};
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 3fe0612ec59a..ca7ed18ce67b 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -392,7 +392,8 @@ struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */
set_rnext :1, /* corresponding ::rnext */
ao_required :1, /* don't accept non-AO connects */
set_counters :1, /* set/clear ::pkt_* counters */
- reserved :28; /* must be 0 */
+ accept_icmps :1, /* accept incoming ICMPs */
+ reserved :27; /* must be 0 */
__u16 reserved2; /* padding, must be 0 */
__u8 current_key; /* KeyID to set as Current_key */
__u8 rnext; /* KeyID to set as Rnext_key */
@@ -400,6 +401,7 @@ struct tcp_ao_info_opt { /* setsockopt(TCP_AO_INFO) */
__u64 pkt_bad; /* failed verification */
__u64 pkt_key_not_found; /* could not find a key to verify */
__u64 pkt_ao_required; /* segments missing TCP-AO sign */
+ __u64 pkt_dropped_icmp; /* ICMPs that were ignored */
} __attribute__((aligned(8)));
/* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 3f643cd29cfe..5d3c9c96773e 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -302,6 +302,7 @@ static const struct snmp_mib snmp4_net_list[] = {
SNMP_MIB_ITEM("TCPAOBad", LINUX_MIB_TCPAOBAD),
SNMP_MIB_ITEM("TCPAOKeyNotFound", LINUX_MIB_TCPAOKEYNOTFOUND),
SNMP_MIB_ITEM("TCPAOGood", LINUX_MIB_TCPAOGOOD),
+ SNMP_MIB_ITEM("TCPAODroppedIcmps", LINUX_MIB_TCPAODROPPEDICMPS),
SNMP_MIB_SENTINEL
};
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 9ee7fc9f6a76..4bde58ede63a 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -15,6 +15,7 @@
#include <net/tcp.h>
#include <net/ipv6.h>
+#include <net/icmp.h>
int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
unsigned int len, struct tcp_sigpool *hp)
@@ -44,6 +45,63 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
return 1;
}
+bool tcp_ao_ignore_icmp(struct sock *sk, int type, int code)
+{
+ struct tcp_ao_info *ao;
+ bool ignore_icmp = false;
+
+ /* RFC5925, 7.8:
+ * >> A TCP-AO implementation MUST default to ignore incoming ICMPv4
+ * messages of Type 3 (destination unreachable), Codes 2-4 (protocol
+ * unreachable, port unreachable, and fragmentation needed -- ’hard
+ * errors’), and ICMPv6 Type 1 (destination unreachable), Code 1
+ * (administratively prohibited) and Code 4 (port unreachable) intended
+ * for connections in synchronized states (ESTABLISHED, FIN-WAIT-1, FIN-
+ * WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT) that match MKTs.
+ */
+ if (sk->sk_family == AF_INET) {
+ if (type != ICMP_DEST_UNREACH)
+ return false;
+ if (code < ICMP_PROT_UNREACH || code > ICMP_FRAG_NEEDED)
+ return false;
+ } else if (sk->sk_family == AF_INET6) {
+ if (type != ICMPV6_DEST_UNREACH)
+ return false;
+ if (code != ICMPV6_ADM_PROHIBITED && code != ICMPV6_PORT_UNREACH)
+ return false;
+ } else {
+ WARN_ON_ONCE(1);
+ return false;
+ }
+
+ rcu_read_lock();
+ switch (sk->sk_state) {
+ case TCP_TIME_WAIT:
+ ao = rcu_dereference(tcp_twsk(sk)->ao_info);
+ break;
+ case TCP_SYN_SENT:
+ case TCP_SYN_RECV:
+ case TCP_LISTEN:
+ case TCP_NEW_SYN_RECV:
+ /* RFC5925 specifies to ignore ICMPs *only* on connections
+ * in synchronized states.
+ */
+ rcu_read_unlock();
+ return false;
+ default:
+ ao = rcu_dereference(tcp_sk(sk)->ao_info);
+ }
+
+ if (ao && !ao->accept_icmps) {
+ ignore_icmp = true;
+ __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAODROPPEDICMPS);
+ atomic64_inc(&ao->counters.dropped_icmp);
+ }
+ rcu_read_unlock();
+
+ return ignore_icmp;
+}
+
/* Optimized version of tcp_ao_do_lookup(): only for sockets for which
* it's known that the keys in ao_info are matching peer's
* family/address/port/VRF/etc.
@@ -1044,6 +1102,7 @@ int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk,
new_ao->lisn = htonl(tcp_rsk(req)->snt_isn);
new_ao->risn = htonl(tcp_rsk(req)->rcv_isn);
new_ao->ao_required = ao->ao_required;
+ new_ao->accept_icmps = ao->accept_icmps;
if (family == AF_INET) {
addr = (union tcp_ao_addr *)&newsk->sk_daddr;
@@ -1764,9 +1823,11 @@ static int tcp_ao_info_cmd(struct sock *sk, unsigned short int family,
atomic64_set(&ao_info->counters.pkt_bad, cmd.pkt_bad);
atomic64_set(&ao_info->counters.key_not_found, cmd.pkt_key_not_found);
atomic64_set(&ao_info->counters.ao_required, cmd.pkt_ao_required);
+ atomic64_set(&ao_info->counters.dropped_icmp, cmd.pkt_dropped_icmp);
}
ao_info->ao_required = cmd.ao_required;
+ ao_info->accept_icmps = cmd.accept_icmps;
if (new_current)
WRITE_ONCE(ao_info->current_key, new_current);
if (new_rnext)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 1b53c4bd37ed..6cb282e0197c 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -492,6 +492,8 @@ int tcp_v4_err(struct sk_buff *skb, u32 info)
return -ENOENT;
}
if (sk->sk_state == TCP_TIME_WAIT) {
+ /* To increase the counter of ignored icmps for TCP-AO */
+ tcp_ao_ignore_icmp(sk, type, code);
inet_twsk_put(inet_twsk(sk));
return 0;
}
@@ -506,6 +508,9 @@ int tcp_v4_err(struct sk_buff *skb, u32 info)
}
bh_lock_sock(sk);
+ if (tcp_ao_ignore_icmp(sk, type, code))
+ goto out;
+
/* If too many ICMPs get dropped on busy
* servers this needs to be solved differently.
* We do take care of PMTU discovery (RFC1191) special case :
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index a666e8bab9d9..0adcd7faea08 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -395,6 +395,8 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
}
if (sk->sk_state == TCP_TIME_WAIT) {
+ /* To increase the counter of ignored icmps for TCP-AO */
+ tcp_ao_ignore_icmp(sk, type, code);
inet_twsk_put(inet_twsk(sk));
return 0;
}
@@ -406,6 +408,8 @@ static int tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
}
bh_lock_sock(sk);
+ if (tcp_ao_ignore_icmp(sk, type, code))
+ goto out;
if (sock_owned_by_user(sk) && type != ICMPV6_PKT_TOOBIG)
__NET_INC_STATS(net, LINUX_MIB_LOCKDROPPEDICMPS);
--
2.41.0
Now there is a common function to verify signature on TCP segments:
tcp_inbound_hash(). It has checks for all possible cross-interactions
with MD5 signs as well as with unsigned segments.
The rules from RFC5925 are:
(1) Any TCP segment can have at max only one signature.
(2) TCP connections can't switch between using TCP-MD5 and TCP-AO.
(3) TCP-AO connections can't stop using AO, as well as unsigned
connections can't suddenly start using AO.
Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/dropreason-core.h | 17 ++++
include/net/tcp.h | 53 ++++++++++-
include/net/tcp_ao.h | 17 ++++
net/ipv4/tcp.c | 39 ++------
net/ipv4/tcp_ao.c | 167 ++++++++++++++++++++++++++++++++++
net/ipv4/tcp_ipv4.c | 10 +-
net/ipv6/tcp_ao.c | 12 +++
net/ipv6/tcp_ipv6.c | 11 ++-
8 files changed, 283 insertions(+), 43 deletions(-)
diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index 383ac5215284..0ff272d3b680 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -24,6 +24,10 @@
FN(TCP_MD5NOTFOUND) \
FN(TCP_MD5UNEXPECTED) \
FN(TCP_MD5FAILURE) \
+ FN(TCP_AONOTFOUND) \
+ FN(TCP_AOUNEXPECTED) \
+ FN(TCP_AOKEYNOTFOUND) \
+ FN(TCP_AOFAILURE) \
FN(SOCKET_BACKLOG) \
FN(TCP_FLAGS) \
FN(TCP_ZEROWINDOW) \
@@ -160,6 +164,19 @@ enum skb_drop_reason {
* to LINUX_MIB_TCPMD5FAILURE
*/
SKB_DROP_REASON_TCP_MD5FAILURE,
+ /**
+ * @SKB_DROP_REASON_TCP_AONOTFOUND: no TCP-AO hash and one was expected
+ */
+ SKB_DROP_REASON_TCP_AONOTFOUND,
+ /**
+ * @SKB_DROP_REASON_TCP_AOUNEXPECTED: TCP-AO hash is present and it
+ * was not expected.
+ */
+ SKB_DROP_REASON_TCP_AOUNEXPECTED,
+ /** @SKB_DROP_REASON_TCP_AOKEYNOTFOUND: TCP-AO key is unknown */
+ SKB_DROP_REASON_TCP_AOKEYNOTFOUND,
+ /** @SKB_DROP_REASON_TCP_AOFAILURE: TCP-AO hash is wrong */
+ SKB_DROP_REASON_TCP_AOFAILURE,
/**
* @SKB_DROP_REASON_SOCKET_BACKLOG: failed to add skb to socket backlog (
* see LINUX_MIB_TCPBACKLOGDROP)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 77896df7952e..67b46787c3dd 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1733,7 +1733,7 @@ tcp_md5_do_lookup_any_l3index(const struct sock *sk,
enum skb_drop_reason
tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
const void *saddr, const void *daddr,
- int family, int dif, int sdif);
+ int family, int l3index, const __u8 *hash_location);
#define tcp_twsk_md5_key(twsk) ((twsk)->tw_md5_key)
@@ -1755,7 +1755,7 @@ tcp_md5_do_lookup_any_l3index(const struct sock *sk,
static inline enum skb_drop_reason
tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
const void *saddr, const void *daddr,
- int family, int dif, int sdif)
+ int family, int l3index, const __u8 *hash_location)
{
return SKB_NOT_DROPPED_YET;
}
@@ -2592,4 +2592,53 @@ static inline bool tcp_ao_required(struct sock *sk, const void *saddr,
return false;
}
+/* Called with rcu_read_lock() */
+static inline enum skb_drop_reason
+tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
+ const struct sk_buff *skb,
+ const void *saddr, const void *daddr,
+ int family, int dif, int sdif)
+{
+ const struct tcphdr *th = tcp_hdr(skb);
+ const struct tcp_ao_hdr *aoh;
+ const __u8 *md5_location;
+ int l3index;
+
+ /* Invalid option or two times meet any of auth options */
+ if (tcp_parse_auth_options(th, &md5_location, &aoh))
+ return SKB_DROP_REASON_TCP_AUTH_HDR;
+
+ if (req) {
+ if (tcp_rsk_used_ao(req) != !!aoh)
+ return SKB_DROP_REASON_TCP_AOFAILURE;
+ }
+
+ /* sdif set, means packet ingressed via a device
+ * in an L3 domain and dif is set to the l3mdev
+ */
+ l3index = sdif ? dif : 0;
+
+ /* Fast path: unsigned segments */
+ if (likely(!md5_location && !aoh)) {
+ /* Drop if there's TCP-MD5 or TCP-AO key with any rcvid/sndid
+ * for the remote peer. On TCP-AO established connection
+ * the last key is impossible to remove, so there's
+ * always at least one current_key.
+ */
+ if (tcp_ao_required(sk, saddr, family))
+ return SKB_DROP_REASON_TCP_AONOTFOUND;
+ if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) {
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
+ return SKB_DROP_REASON_TCP_MD5NOTFOUND;
+ }
+ return SKB_NOT_DROPPED_YET;
+ }
+
+ if (aoh)
+ return tcp_inbound_ao_hash(sk, skb, family, req, aoh);
+
+ return tcp_inbound_md5_hash(sk, skb, saddr, daddr, family,
+ l3index, md5_location);
+}
+
#endif /* _TCP_H */
diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 4979c993973f..1b7ca7bf93a3 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -112,6 +112,9 @@ struct tcp6_ao_context {
};
struct tcp_sigpool;
+#define TCP_AO_ESTABLISHED (TCPF_ESTABLISHED|TCPF_FIN_WAIT1|TCPF_FIN_WAIT2|\
+ TCPF_CLOSE|TCPF_CLOSE_WAIT|TCPF_LAST_ACK|TCPF_CLOSING)
+
int tcp_ao_hash_skb(unsigned short int family,
char *ao_hash, struct tcp_ao_key *key,
const struct sock *sk, const struct sk_buff *skb,
@@ -127,6 +130,10 @@ int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
unsigned int len, struct tcp_sigpool *hp);
void tcp_ao_destroy_sock(struct sock *sk, bool twsk);
void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp);
+enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
+ const struct sk_buff *skb, unsigned short int family,
+ const struct request_sock *req,
+ const struct tcp_ao_hdr *aoh);
struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk,
const union tcp_ao_addr *addr,
int family, int sndid, int rcvid, u16 port);
@@ -162,6 +169,9 @@ int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key,
int tcp_v6_ao_hash_pseudoheader(struct tcp_sigpool *hp,
const struct in6_addr *daddr,
const struct in6_addr *saddr, int nbytes);
+int tcp_v6_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key,
+ const struct sk_buff *skb, __be32 sisn,
+ __be32 disn);
int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key,
const struct sock *sk, __be32 sisn,
__be32 disn, bool send);
@@ -197,6 +207,13 @@ static inline void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
{
}
+static inline enum skb_drop_reason tcp_inbound_ao_hash(struct sock *sk,
+ const struct sk_buff *skb, unsigned short int family,
+ const struct request_sock *req, const struct tcp_ao_hdr *aoh)
+{
+ return SKB_NOT_DROPPED_YET;
+}
+
static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk,
const union tcp_ao_addr *addr,
int family, int sndid, int rcvid, u16 port)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0f9bc2b64018..fcedec0ae08e 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4368,42 +4368,23 @@ EXPORT_SYMBOL(tcp_md5_hash_key);
enum skb_drop_reason
tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
const void *saddr, const void *daddr,
- int family, int dif, int sdif)
+ int family, int l3index, const __u8 *hash_location)
{
- /*
- * This gets called for each TCP segment that arrives
- * so we want to be efficient.
+ /* This gets called for each TCP segment that has TCP-MD5 option.
* We have 3 drop cases:
* o No MD5 hash and one expected.
* o MD5 hash and we're not expecting one.
* o MD5 hash and its wrong.
*/
- const __u8 *hash_location = NULL;
- struct tcp_md5sig_key *hash_expected;
const struct tcphdr *th = tcp_hdr(skb);
const struct tcp_sock *tp = tcp_sk(sk);
- int genhash, l3index;
+ struct tcp_md5sig_key *key;
+ int genhash;
u8 newhash[16];
- /* sdif set, means packet ingressed via a device
- * in an L3 domain and dif is set to the l3mdev
- */
- l3index = sdif ? dif : 0;
+ key = tcp_md5_do_lookup(sk, l3index, saddr, family);
- hash_expected = tcp_md5_do_lookup(sk, l3index, saddr, family);
- if (tcp_parse_auth_options(th, &hash_location, NULL))
- return SKB_DROP_REASON_TCP_AUTH_HDR;
-
- /* We've parsed the options - do we have a hash? */
- if (!hash_expected && !hash_location)
- return SKB_NOT_DROPPED_YET;
-
- if (hash_expected && !hash_location) {
- NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
- return SKB_DROP_REASON_TCP_MD5NOTFOUND;
- }
-
- if (!hash_expected && hash_location) {
+ if (!key && hash_location) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
return SKB_DROP_REASON_TCP_MD5UNEXPECTED;
}
@@ -4413,14 +4394,10 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
* IPv4-mapped case.
*/
if (family == AF_INET)
- genhash = tcp_v4_md5_hash_skb(newhash,
- hash_expected,
- NULL, skb);
+ genhash = tcp_v4_md5_hash_skb(newhash, key, NULL, skb);
else
- genhash = tp->af_specific->calc_md5_hash(newhash,
- hash_expected,
+ genhash = tp->af_specific->calc_md5_hash(newhash, key,
NULL, skb);
-
if (genhash || memcmp(hash_location, newhash, 16) != 0) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE);
if (family == AF_INET) {
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 0233cf363de9..073be1680818 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -321,6 +321,30 @@ int tcp_v4_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key,
htonl(tcp_rsk(req)->rcv_isn));
}
+static int tcp_v4_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key,
+ const struct sk_buff *skb,
+ __be32 sisn, __be32 disn)
+{
+ const struct iphdr *iph = ip_hdr(skb);
+ const struct tcphdr *th = tcp_hdr(skb);
+
+ return tcp_v4_ao_calc_key(mkt, key, iph->saddr, iph->daddr,
+ th->source, th->dest, sisn, disn);
+}
+
+static int tcp_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key,
+ const struct sk_buff *skb, __be32 sisn,
+ __be32 disn, int family)
+{
+ if (family == AF_INET)
+ return tcp_v4_ao_calc_key_skb(mkt, key, skb, sisn, disn);
+#if IS_ENABLED(CONFIG_IPV6)
+ else if (family == AF_INET6)
+ return tcp_v6_ao_calc_key_skb(mkt, key, skb, sisn, disn);
+#endif
+ return -EAFNOSUPPORT;
+}
+
static int tcp_v4_ao_hash_pseudoheader(struct tcp_sigpool *hp,
__be32 daddr, __be32 saddr,
int nbytes)
@@ -711,6 +735,149 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
treq->maclen = tcp_ao_maclen(key);
}
+static enum skb_drop_reason
+tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb,
+ unsigned short int family, struct tcp_ao_info *info,
+ const struct tcp_ao_hdr *aoh, struct tcp_ao_key *key,
+ u8 *traffic_key, u8 *phash, u32 sne)
+{
+ u8 maclen = aoh->length - sizeof(struct tcp_ao_hdr);
+ const struct tcphdr *th = tcp_hdr(skb);
+ void *hash_buf = NULL;
+
+ if (maclen != tcp_ao_maclen(key))
+ return SKB_DROP_REASON_TCP_AOFAILURE;
+
+ hash_buf = kmalloc(tcp_ao_digest_size(key), GFP_ATOMIC);
+ if (!hash_buf)
+ return SKB_DROP_REASON_NOT_SPECIFIED;
+
+ /* XXX: make it per-AF callback? */
+ tcp_ao_hash_skb(family, hash_buf, key, sk, skb, traffic_key,
+ (phash - (u8 *)th), sne);
+ if (memcmp(phash, hash_buf, maclen)) {
+ kfree(hash_buf);
+ return SKB_DROP_REASON_TCP_AOFAILURE;
+ }
+ kfree(hash_buf);
+ return SKB_NOT_DROPPED_YET;
+}
+
+enum skb_drop_reason
+tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
+ unsigned short int family, const struct request_sock *req,
+ const struct tcp_ao_hdr *aoh)
+{
+ const struct tcphdr *th = tcp_hdr(skb);
+ u8 *phash = (u8 *)(aoh + 1); /* hash goes just after the header */
+ struct tcp_ao_info *info;
+ enum skb_drop_reason ret;
+ struct tcp_ao_key *key;
+ __be32 sisn, disn;
+ u8 *traffic_key;
+ u32 sne = 0;
+
+ info = rcu_dereference(tcp_sk(sk)->ao_info);
+ if (!info)
+ return SKB_DROP_REASON_TCP_AOUNEXPECTED;
+
+ if (unlikely(th->syn)) {
+ sisn = th->seq;
+ disn = 0;
+ }
+
+ /* Fast-path */
+ /* TODO: fix fastopen and simultaneous open (TCPF_SYN_RECV) */
+ if (likely((1 << sk->sk_state) & (TCP_AO_ESTABLISHED | TCPF_SYN_RECV))) {
+ enum skb_drop_reason err;
+ struct tcp_ao_key *current_key;
+
+ /* Check if this socket's rnext_key matches the keyid in the
+ * packet. If not we lookup the key based on the keyid
+ * matching the rcvid in the mkt.
+ */
+ key = READ_ONCE(info->rnext_key);
+ if (key->rcvid != aoh->keyid) {
+ key = tcp_ao_established_key(info, -1, aoh->keyid);
+ if (!key)
+ goto key_not_found;
+ }
+
+ /* Delayed retransmitted SYN */
+ if (unlikely(th->syn && !th->ack))
+ goto verify_hash;
+
+ sne = 0;
+ /* Established socket, traffic key are cached */
+ traffic_key = rcv_other_key(key);
+ err = tcp_ao_verify_hash(sk, skb, family, info, aoh, key,
+ traffic_key, phash, sne);
+ if (err)
+ return err;
+ current_key = READ_ONCE(info->current_key);
+ /* Key rotation: the peer asks us to use new key (RNext) */
+ if (unlikely(aoh->rnext_keyid != current_key->sndid)) {
+ /* If the key is not found we do nothing. */
+ key = tcp_ao_established_key(info, aoh->rnext_keyid, -1);
+ if (key)
+ /* pairs with tcp_ao_del_cmd */
+ WRITE_ONCE(info->current_key, key);
+ }
+ return SKB_NOT_DROPPED_YET;
+ }
+
+ /* Lookup key based on peer address and keyid.
+ * current_key and rnext_key must not be used on tcp listen
+ * sockets as otherwise:
+ * - request sockets would race on those key pointers
+ * - tcp_ao_del_cmd() allows async key removal
+ */
+ key = tcp_ao_inbound_lookup(family, sk, skb, -1, aoh->keyid);
+ if (!key)
+ goto key_not_found;
+
+ if (th->syn && !th->ack)
+ goto verify_hash;
+
+ if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) {
+ /* Make the initial syn the likely case here */
+ if (unlikely(req)) {
+ sne = 0;
+ sisn = htonl(tcp_rsk(req)->rcv_isn);
+ disn = htonl(tcp_rsk(req)->snt_isn);
+ } else if (unlikely(th->ack && !th->syn)) {
+ /* Possible syncookie packet */
+ sisn = htonl(ntohl(th->seq) - 1);
+ disn = htonl(ntohl(th->ack_seq) - 1);
+ sne = 0;
+ } else if (unlikely(!th->syn)) {
+ /* no way to figure out initial sisn/disn - drop */
+ return SKB_DROP_REASON_TCP_FLAGS;
+ }
+ } else if (sk->sk_state == TCP_SYN_SENT) {
+ disn = info->lisn;
+ if (th->syn || th->rst)
+ sisn = th->seq;
+ else
+ sisn = info->risn;
+ } else {
+ WARN_ONCE(1, "TCP-AO: Unexpected sk_state %d", sk->sk_state);
+ return SKB_DROP_REASON_TCP_AOFAILURE;
+ }
+verify_hash:
+ traffic_key = kmalloc(tcp_ao_digest_size(key), GFP_ATOMIC);
+ if (!traffic_key)
+ return SKB_DROP_REASON_NOT_SPECIFIED;
+ tcp_ao_calc_key_skb(key, traffic_key, skb, sisn, disn, family);
+ ret = tcp_ao_verify_hash(sk, skb, family, info, aoh, key,
+ traffic_key, phash, sne);
+ kfree(traffic_key);
+ return ret;
+
+key_not_found:
+ return SKB_DROP_REASON_TCP_AOKEYNOTFOUND;
+}
+
static int tcp_ao_cache_traffic_keys(const struct sock *sk,
struct tcp_ao_info *ao,
struct tcp_ao_key *ao_key)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 43a5de1b14fa..a10a5011c60f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2197,9 +2197,9 @@ int tcp_v4_rcv(struct sk_buff *skb)
if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
drop_reason = SKB_DROP_REASON_XFRM_POLICY;
else
- drop_reason = tcp_inbound_md5_hash(sk, skb,
- &iph->saddr, &iph->daddr,
- AF_INET, dif, sdif);
+ drop_reason = tcp_inbound_hash(sk, req, skb,
+ &iph->saddr, &iph->daddr,
+ AF_INET, dif, sdif);
if (unlikely(drop_reason)) {
sk_drops_add(sk, skb);
reqsk_put(req);
@@ -2276,8 +2276,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
goto discard_and_relse;
}
- drop_reason = tcp_inbound_md5_hash(sk, skb, &iph->saddr,
- &iph->daddr, AF_INET, dif, sdif);
+ drop_reason = tcp_inbound_hash(sk, NULL, skb, &iph->saddr, &iph->daddr,
+ AF_INET, dif, sdif);
if (drop_reason)
goto discard_and_relse;
diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c
index ec320781bc5c..984a75f96e20 100644
--- a/net/ipv6/tcp_ao.c
+++ b/net/ipv6/tcp_ao.c
@@ -49,6 +49,18 @@ static int tcp_v6_ao_calc_key(struct tcp_ao_key *mkt, u8 *key,
return err;
}
+int tcp_v6_ao_calc_key_skb(struct tcp_ao_key *mkt, u8 *key,
+ const struct sk_buff *skb,
+ __be32 sisn, __be32 disn)
+{
+ const struct ipv6hdr *iph = ipv6_hdr(skb);
+ const struct tcphdr *th = tcp_hdr(skb);
+
+ return tcp_v6_ao_calc_key(mkt, key, &iph->saddr,
+ &iph->daddr, th->source,
+ th->dest, sisn, disn);
+}
+
int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key,
const struct sock *sk, __be32 sisn,
__be32 disn, bool send)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 1ef428d9f052..026e2d49706a 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1779,9 +1779,9 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
struct sock *nsk;
sk = req->rsk_listener;
- drop_reason = tcp_inbound_md5_hash(sk, skb,
- &hdr->saddr, &hdr->daddr,
- AF_INET6, dif, sdif);
+ drop_reason = tcp_inbound_hash(sk, req, skb,
+ &hdr->saddr, &hdr->daddr,
+ AF_INET6, dif, sdif);
if (drop_reason) {
sk_drops_add(sk, skb);
reqsk_put(req);
@@ -1854,8 +1854,8 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
goto discard_and_relse;
}
- drop_reason = tcp_inbound_md5_hash(sk, skb, &hdr->saddr, &hdr->daddr,
- AF_INET6, dif, sdif);
+ drop_reason = tcp_inbound_hash(sk, NULL, skb, &hdr->saddr, &hdr->daddr,
+ AF_INET6, dif, sdif);
if (drop_reason)
goto discard_and_relse;
@@ -2083,6 +2083,7 @@ static const struct tcp_sock_af_ops tcp_sock_ipv6_mapped_specific = {
.ao_lookup = tcp_v6_ao_lookup,
.calc_ao_hash = tcp_v4_ao_hash_skb,
.ao_parse = tcp_v6_parse_ao,
+ .ao_calc_key_sk = tcp_v4_ao_calc_key_sk,
#endif
};
#endif
--
2.41.0
Now when the new request socket is created from the listening socket,
it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is
a new helper for checking if TCP-AO option was used to create the
request socket.
tcp_ao_copy_all_matching() will copy all keys that match the peer on the
request socket, as well as preparing them for the usage (creating
traffic keys).
Co-developed-by: Francesco Ruggeri <[email protected]>
Signed-off-by: Francesco Ruggeri <[email protected]>
Co-developed-by: Salam Noureddine <[email protected]>
Signed-off-by: Salam Noureddine <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/linux/tcp.h | 18 ++++
include/net/tcp.h | 7 ++
include/net/tcp_ao.h | 22 ++++
net/ipv4/syncookies.c | 2 +
net/ipv4/tcp_ao.c | 219 ++++++++++++++++++++++++++++++++++++++-
net/ipv4/tcp_input.c | 15 +++
net/ipv4/tcp_ipv4.c | 65 ++++++++++--
net/ipv4/tcp_minisocks.c | 10 ++
net/ipv4/tcp_output.c | 42 +++++---
net/ipv6/syncookies.c | 2 +
net/ipv6/tcp_ao.c | 21 ++++
net/ipv6/tcp_ipv6.c | 76 ++++++++++++--
12 files changed, 467 insertions(+), 32 deletions(-)
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 0c50a9aaa780..a8409d37c6b8 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -165,6 +165,11 @@ struct tcp_request_sock {
* after data-in-SYN.
*/
u8 syn_tos;
+#ifdef CONFIG_TCP_AO
+ u8 ao_keyid;
+ u8 ao_rcv_next;
+ u8 maclen;
+#endif
};
static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)
@@ -172,6 +177,19 @@ static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)
return (struct tcp_request_sock *)req;
}
+static inline bool tcp_rsk_used_ao(const struct request_sock *req)
+{
+ /* The real length of MAC is saved in the request socket,
+ * signing anything with zero-length makes no sense, so here is
+ * a little hack..
+ */
+#ifndef CONFIG_TCP_AO
+ return false;
+#else
+ return tcp_rsk(req)->maclen != 0;
+#endif
+}
+
struct tcp_sock {
/* inet_connection_sock has to be the first member of tcp_sock */
struct inet_connection_sock inet_conn;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index c9e27e4ab928..37ee218fe09c 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2140,6 +2140,13 @@ struct tcp_request_sock_ops {
const struct sock *sk,
const struct sk_buff *skb);
#endif
+#ifdef CONFIG_TCP_AO
+ struct tcp_ao_key *(*ao_lookup)(const struct sock *sk,
+ struct request_sock *req,
+ int sndid, int rcvid);
+ int (*ao_calc_key)(struct tcp_ao_key *mkt, u8 *key,
+ struct request_sock *sk);
+#endif
#ifdef CONFIG_SYN_COOKIES
__u32 (*cookie_init_seq)(const struct sk_buff *skb,
__u16 *mss);
diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 7f6849f16d7a..b63c4f2c36ea 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -120,6 +120,9 @@ int tcp_parse_ao(struct sock *sk, int cmd, unsigned short int family,
sockptr_t optval, int optlen);
struct tcp_ao_key *tcp_ao_established_key(struct tcp_ao_info *ao,
int sndid, int rcvid);
+int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk,
+ struct request_sock *req, struct sk_buff *skb,
+ int family);
int tcp_ao_calc_traffic_key(struct tcp_ao_key *mkt, u8 *key, void *ctx,
unsigned int len, struct tcp_sigpool *hp);
void tcp_ao_destroy_sock(struct sock *sk, bool twsk);
@@ -144,6 +147,11 @@ struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk,
int tcp_v4_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key,
const struct sock *sk,
__be32 sisn, __be32 disn, bool send);
+int tcp_v4_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key,
+ struct request_sock *req);
+struct tcp_ao_key *tcp_v4_ao_lookup_rsk(const struct sock *sk,
+ struct request_sock *req,
+ int sndid, int rcvid);
int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key,
const struct sock *sk, const struct sk_buff *skb,
const u8 *tkey, int hash_offset, u32 sne);
@@ -154,9 +162,17 @@ int tcp_v6_ao_hash_pseudoheader(struct tcp_sigpool *hp,
int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key,
const struct sock *sk, __be32 sisn,
__be32 disn, bool send);
+int tcp_v6_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key,
+ struct request_sock *req);
+struct tcp_ao_key *tcp_v6_ao_do_lookup(const struct sock *sk,
+ const struct in6_addr *addr,
+ int sndid, int rcvid);
struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk,
struct sock *addr_sk,
int sndid, int rcvid);
+struct tcp_ao_key *tcp_v6_ao_lookup_rsk(const struct sock *sk,
+ struct request_sock *req,
+ int sndid, int rcvid);
int tcp_v6_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key,
const struct sock *sk, const struct sk_buff *skb,
const u8 *tkey, int hash_offset, u32 sne);
@@ -169,6 +185,12 @@ void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
unsigned short int family);
#else /* CONFIG_TCP_AO */
+static inline void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
+ struct tcp_request_sock *treq,
+ unsigned short int family)
+{
+}
+
static inline struct tcp_ao_key *tcp_ao_do_lookup(const struct sock *sk,
const union tcp_ao_addr *addr,
int family, int sndid, int rcvid, u16 port)
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index dc478a0574cb..23fca22bc992 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -394,6 +394,8 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
treq->snt_synack = 0;
treq->tfo_listener = false;
+ tcp_ao_syncookie(sk, skb, treq, AF_INET);
+
if (IS_ENABLED(CONFIG_SMC))
ireq->smc_ok = 0;
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index dc1423239160..be9349239139 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -171,6 +171,23 @@ static void tcp_ao_link_mkt(struct tcp_ao_info *ao, struct tcp_ao_key *mkt)
hlist_add_head_rcu(&mkt->node, &ao->head);
}
+static struct tcp_ao_key *tcp_ao_copy_key(struct sock *sk,
+ struct tcp_ao_key *key)
+{
+ struct tcp_ao_key *new_key;
+
+ new_key = sock_kmalloc(sk, tcp_ao_sizeof_key(key),
+ GFP_ATOMIC);
+ if (!new_key)
+ return NULL;
+
+ *new_key = *key;
+ INIT_HLIST_NODE(&new_key->node);
+ tcp_sigpool_get(new_key->tcp_sigpool_id);
+
+ return new_key;
+}
+
static void tcp_ao_key_free_rcu(struct rcu_head *head)
{
struct tcp_ao_key *key = container_of(head, struct tcp_ao_key, rcu);
@@ -292,6 +309,18 @@ static int tcp_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key,
return -EOPNOTSUPP;
}
+int tcp_v4_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key,
+ struct request_sock *req)
+{
+ struct inet_request_sock *ireq = inet_rsk(req);
+
+ return tcp_v4_ao_calc_key(mkt, key,
+ ireq->ir_loc_addr, ireq->ir_rmt_addr,
+ htons(ireq->ir_num), ireq->ir_rmt_port,
+ htonl(tcp_rsk(req)->snt_isn),
+ htonl(tcp_rsk(req)->rcv_isn));
+}
+
static int tcp_v4_ao_hash_pseudoheader(struct tcp_sigpool *hp,
__be32 daddr, __be32 saddr,
int nbytes)
@@ -517,6 +546,16 @@ int tcp_v4_ao_hash_skb(char *ao_hash, struct tcp_ao_key *key,
tkey, hash_offset, sne);
}
+struct tcp_ao_key *tcp_v4_ao_lookup_rsk(const struct sock *sk,
+ struct request_sock *req,
+ int sndid, int rcvid)
+{
+ union tcp_ao_addr *addr =
+ (union tcp_ao_addr *)&inet_rsk(req)->ir_rmt_addr;
+
+ return tcp_ao_do_lookup(sk, addr, AF_INET, sndid, rcvid, 0);
+}
+
struct tcp_ao_key *tcp_v4_ao_lookup(const struct sock *sk, struct sock *addr_sk,
int sndid, int rcvid)
{
@@ -545,7 +584,46 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
return -ENOTCONN;
if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_NEW_SYN_RECV)) {
- return -1;
+ union tcp_ao_addr *addr;
+ unsigned int family;
+ __be32 disn, sisn;
+
+ if (sk->sk_state == TCP_NEW_SYN_RECV) {
+ struct request_sock *req = inet_reqsk(sk);
+
+ sisn = htonl(tcp_rsk(req)->rcv_isn);
+ disn = htonl(tcp_rsk(req)->snt_isn);
+ *sne = 0;
+ } else {
+ sisn = th->seq;
+ disn = 0;
+ }
+ if (IS_ENABLED(CONFIG_IPV6) && sk->sk_family == AF_INET6)
+ addr = (union tcp_md5_addr *)&ipv6_hdr(skb)->saddr;
+ else
+ addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr;
+ family = sk->sk_family;
+#if IS_ENABLED(CONFIG_IPV6)
+ if (family == AF_INET6 && ipv6_addr_v4mapped(&sk->sk_v6_daddr))
+ family = AF_INET;
+#endif
+
+ sk = sk_const_to_full_sk(sk);
+ ao_info = rcu_dereference(tcp_sk(sk)->ao_info);
+ if (!ao_info)
+ return -ENOENT;
+ *key = tcp_ao_do_lookup(sk, l3index, addr, family,
+ -1, aoh->rnext_keyid, ntohs(th->source));
+ if (!*key)
+ return -ENOENT;
+ *traffic_key = kmalloc(tcp_ao_digest_size(*key), GFP_ATOMIC);
+ if (!*traffic_key)
+ return -ENOMEM;
+ *allocated_traffic_key = true;
+ if (tcp_ao_calc_key_skb(*key, *traffic_key, skb,
+ sisn, disn, family))
+ return -1;
+ *keyid = (*key)->rcvid;
} else {
struct tcp_ao_key *rnext_key;
@@ -567,6 +645,50 @@ int tcp_ao_prepare_reset(const struct sock *sk, struct sk_buff *skb,
return 0;
}
+static struct tcp_ao_key *tcp_ao_inbound_lookup(unsigned short int family,
+ const struct sock *sk, const struct sk_buff *skb,
+ int sndid, int rcvid)
+{
+ if (family == AF_INET) {
+ const struct iphdr *iph = ip_hdr(skb);
+
+ return tcp_ao_do_lookup(sk, (union tcp_ao_addr *)&iph->saddr,
+ AF_INET, sndid, rcvid, 0);
+ } else {
+ const struct ipv6hdr *iph = ipv6_hdr(skb);
+
+ return tcp_ao_do_lookup(sk, (union tcp_ao_addr *)&iph->saddr,
+ AF_INET6, sndid, rcvid, 0);
+ }
+}
+
+void tcp_ao_syncookie(struct sock *sk, const struct sk_buff *skb,
+ struct tcp_request_sock *treq,
+ unsigned short int family)
+{
+ const struct tcphdr *th = tcp_hdr(skb);
+ const struct tcp_ao_hdr *aoh;
+ struct tcp_ao_key *key;
+
+ treq->maclen = 0;
+
+ /* Shouldn't fail as this has been called on this packet
+ * in tcp_inbound_hash()
+ */
+ tcp_parse_auth_options(th, NULL, &aoh);
+ if (!aoh)
+ return;
+
+ key = tcp_ao_inbound_lookup(family, sk, skb, -1, aoh->keyid);
+ if (!key)
+ /* Key not found, continue without TCP-AO */
+ return;
+
+ treq->ao_rcv_next = aoh->keyid;
+ treq->ao_keyid = aoh->rnext_keyid;
+ treq->maclen = tcp_ao_maclen(key);
+}
+
static int tcp_ao_cache_traffic_keys(const struct sock *sk,
struct tcp_ao_info *ao,
struct tcp_ao_key *ao_key)
@@ -659,6 +781,101 @@ void tcp_ao_finish_connect(struct sock *sk, struct sk_buff *skb)
tcp_ao_cache_traffic_keys(sk, ao, key);
}
+int tcp_ao_copy_all_matching(const struct sock *sk, struct sock *newsk,
+ struct request_sock *req, struct sk_buff *skb,
+ int family)
+{
+ struct tcp_ao_key *key, *new_key, *first_key;
+ struct tcp_ao_info *new_ao, *ao;
+ struct hlist_node *key_head;
+ union tcp_ao_addr *addr;
+ bool match = false;
+ int ret = -ENOMEM;
+
+ ao = rcu_dereference(tcp_sk(sk)->ao_info);
+ if (!ao)
+ return 0;
+
+ /* New socket without TCP-AO on it */
+ if (!tcp_rsk_used_ao(req))
+ return 0;
+
+ new_ao = tcp_ao_alloc_info(GFP_ATOMIC);
+ if (!new_ao)
+ return -ENOMEM;
+ new_ao->lisn = htonl(tcp_rsk(req)->snt_isn);
+ new_ao->risn = htonl(tcp_rsk(req)->rcv_isn);
+ new_ao->ao_required = ao->ao_required;
+
+ if (family == AF_INET) {
+ addr = (union tcp_ao_addr *)&newsk->sk_daddr;
+#if IS_ENABLED(CONFIG_IPV6)
+ } else if (family == AF_INET6) {
+ addr = (union tcp_ao_addr *)&newsk->sk_v6_daddr;
+#endif
+ } else {
+ ret = -EAFNOSUPPORT;
+ goto free_ao;
+ }
+
+ hlist_for_each_entry_rcu(key, &ao->head, node) {
+ if (tcp_ao_key_cmp(key, addr, key->prefixlen, family,
+ -1, -1, 0))
+ continue;
+
+ new_key = tcp_ao_copy_key(newsk, key);
+ if (!new_key)
+ goto free_and_exit;
+
+ tcp_ao_cache_traffic_keys(newsk, new_ao, new_key);
+ tcp_ao_link_mkt(new_ao, new_key);
+ match = true;
+ }
+
+ if (!match) {
+ /* RFC5925 (7.4.1) specifies that the TCP-AO status
+ * of a connection is determined on the initial SYN.
+ * At this point the connection was TCP-AO enabled, so
+ * it can't switch to being unsigned if peer's key
+ * disappears on the listening socket.
+ */
+ ret = -EKEYREJECTED;
+ goto free_and_exit;
+ }
+
+ key_head = rcu_dereference(hlist_first_rcu(&new_ao->head));
+ first_key = hlist_entry_safe(key_head, struct tcp_ao_key, node);
+
+ key = tcp_ao_established_key(new_ao, tcp_rsk(req)->ao_keyid, -1);
+ if (key)
+ new_ao->current_key = key;
+ else
+ new_ao->current_key = first_key;
+
+ /* set rnext_key */
+ key = tcp_ao_established_key(new_ao, -1, tcp_rsk(req)->ao_rcv_next);
+ if (key)
+ new_ao->rnext_key = key;
+ else
+ new_ao->rnext_key = first_key;
+
+ sk_gso_disable(newsk);
+ rcu_assign_pointer(tcp_sk(newsk)->ao_info, new_ao);
+
+ return 0;
+
+free_and_exit:
+ hlist_for_each_entry_safe(key, key_head, &new_ao->head, node) {
+ hlist_del(&key->node);
+ tcp_sigpool_release(key->tcp_sigpool_id);
+ atomic_sub(tcp_ao_sizeof_key(key), &newsk->sk_omem_alloc);
+ kfree(key);
+ }
+free_ao:
+ kfree(new_ao);
+ return ret;
+}
+
static bool tcp_ao_can_set_current_rnext(struct sock *sk)
{
/* There aren't current/rnext keys on TCP_LISTEN sockets */
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 47817b93db9a..fc2fbacfbd0b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6963,6 +6963,10 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
struct flowi fl;
u8 syncookies;
+#ifdef CONFIG_TCP_AO
+ const struct tcp_ao_hdr *aoh;
+#endif
+
syncookies = READ_ONCE(net->ipv4.sysctl_tcp_syncookies);
/* TW buckets are converted to open requests without
@@ -7048,6 +7052,17 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
inet_rsk(req)->ecn_ok = 0;
}
+#ifdef CONFIG_TCP_AO
+ if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh))
+ goto drop_and_release; /* Invalid TCP options */
+ if (aoh) {
+ tcp_rsk(req)->maclen = aoh->length - sizeof(struct tcp_ao_hdr);
+ tcp_rsk(req)->ao_rcv_next = aoh->keyid;
+ tcp_rsk(req)->ao_keyid = aoh->rnext_keyid;
+ } else {
+ tcp_rsk(req)->maclen = 0;
+ }
+#endif
tcp_rsk(req)->snt_isn = isn;
tcp_rsk(req)->txhash = net_tx_rndhash();
tcp_rsk(req)->syn_tos = TCP_SKB_CB(skb)->ip_dsfield;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index eaf79e6674d1..57513de663ef 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1064,32 +1064,77 @@ static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
struct request_sock *req)
{
+ struct tcp_md5sig_key *md5_key = NULL;
+ struct tcp_ao_key *ao_key = NULL;
const union tcp_md5_addr *addr;
- int l3index;
+ u8 keyid = 0;
+ u8 *traffic_key = NULL;
+#ifdef CONFIG_TCP_AO
+ const struct tcp_ao_hdr *aoh;
+#endif
/* sk->sk_state == TCP_LISTEN -> for regular TCP_SYN_RECV
* sk->sk_state == TCP_SYN_RECV -> for Fast Open.
*/
u32 seq = (sk->sk_state == TCP_LISTEN) ? tcp_rsk(req)->snt_isn + 1 :
tcp_sk(sk)->snd_nxt;
+ addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr;
+ if (tcp_rsk_used_ao(req)) {
+#ifdef CONFIG_TCP_AO
+ /* Invalid TCP option size or twice included auth */
+ if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh))
+ return;
+
+ if (!aoh)
+ return;
+
+ ao_key = tcp_ao_do_lookup(sk, addr, AF_INET,
+ aoh->rnext_keyid, -1, 0);
+ if (unlikely(!ao_key)) {
+ /* Send ACK with any matching MKT for the peer */
+ ao_key = tcp_ao_do_lookup(sk, addr,
+ AF_INET, -1, -1, 0);
+ /* Matching key disappeared (user removed the key?)
+ * let the handshake timeout.
+ */
+ if (!ao_key) {
+ net_info_ratelimited("TCP-AO key for (%pI4, %d)->(%pI4, %d) suddenly disappeared, won't ACK new connection\n",
+ addr,
+ ntohs(tcp_hdr(skb)->source),
+ &ip_hdr(skb)->daddr,
+ ntohs(tcp_hdr(skb)->dest));
+ return;
+ }
+ }
+ traffic_key = kmalloc(tcp_ao_digest_size(ao_key), GFP_ATOMIC);
+ if (!traffic_key)
+ return;
+
+ keyid = aoh->keyid;
+ tcp_v4_ao_calc_key_rsk(ao_key, traffic_key, req);
+#endif
+ } else {
+ int l3index;
+
+ l3index = tcp_v4_sdif(skb) ? inet_iif(skb) : 0;
+ md5_key = tcp_md5_do_lookup(sk, l3index, addr, AF_INET);
+ }
/* RFC 7323 2.3
* The window field (SEG.WND) of every outgoing segment, with the
* exception of <SYN> segments, MUST be right-shifted by
* Rcv.Wind.Shift bits:
*/
- addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr;
- l3index = tcp_v4_sdif(skb) ? inet_iif(skb) : 0;
tcp_v4_send_ack(sk, skb, seq,
tcp_rsk(req)->rcv_nxt,
req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
req->ts_recent,
0,
- tcp_md5_do_lookup(sk, l3index, addr, AF_INET),
- NULL, NULL, 0, 0,
+ md5_key, ao_key, traffic_key, keyid, 0,
inet_rsk(req)->no_srccheck ? IP_REPLY_ARG_NOSRCCHECK : 0,
ip_hdr(skb)->tos, tcp_rsk(req)->txhash);
+ kfree(traffic_key);
}
/*
@@ -1627,6 +1672,10 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv4_ops = {
.req_md5_lookup = tcp_v4_md5_lookup,
.calc_md5_hash = tcp_v4_md5_hash_skb,
#endif
+#ifdef CONFIG_TCP_AO
+ .ao_lookup = tcp_v4_ao_lookup_rsk,
+ .ao_calc_key = tcp_v4_ao_calc_key_rsk,
+#endif
#ifdef CONFIG_SYN_COOKIES
.cookie_init_seq = cookie_v4_init_sequence,
#endif
@@ -1728,12 +1777,16 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
/* Copy over the MD5 key from the original socket */
addr = (union tcp_md5_addr *)&newinet->inet_daddr;
key = tcp_md5_do_lookup(sk, l3index, addr, AF_INET);
- if (key) {
+ if (key && !tcp_rsk_used_ao(req)) {
if (tcp_md5_key_copy(newsk, addr, AF_INET, 32, l3index, key))
goto put_and_exit;
sk_gso_disable(newsk);
}
#endif
+#ifdef CONFIG_TCP_AO
+ if (tcp_ao_copy_all_matching(sk, newsk, req, skb, AF_INET))
+ goto put_and_exit; /* OOM, release back memory */
+#endif
if (__inet_inherit_port(sk, newsk) < 0)
goto put_and_exit;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 42af2e91c462..969e662fd733 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -506,6 +506,9 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
const struct tcp_sock *oldtp;
struct tcp_sock *newtp;
u32 seq;
+#ifdef CONFIG_TCP_AO
+ struct tcp_ao_key *ao_key;
+#endif
if (!newsk)
return NULL;
@@ -584,6 +587,13 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
if (treq->af_specific->req_md5_lookup(sk, req_to_sk(req)))
newtp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED;
#endif
+#ifdef CONFIG_TCP_AO
+ newtp->ao_info = NULL;
+ ao_key = treq->af_specific->ao_lookup(sk, req,
+ tcp_rsk(req)->ao_keyid, -1);
+ if (ao_key)
+ newtp->tcp_header_len += tcp_ao_len(ao_key);
+ #endif
if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len)
newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len;
newtp->rx_opt.mss_clamp = req->mss;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6d59b77958e5..f8296851e820 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -608,6 +608,7 @@ static void bpf_skops_write_hdr_opt(struct sock *sk, struct sk_buff *skb,
* (but it may well be that other scenarios fail similarly).
*/
static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp,
+ const struct tcp_request_sock *tcprsk,
struct tcp_out_options *opts,
struct tcp_ao_key *ao_key)
{
@@ -622,23 +623,36 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp,
ptr += 4;
}
#ifdef CONFIG_TCP_AO
- if (unlikely(OPTION_AO & options) && tp) {
- struct tcp_ao_key *rnext_key;
- struct tcp_ao_info *ao_info;
+ if (unlikely(OPTION_AO & options)) {
u8 maclen;
if (WARN_ON_ONCE(!ao_key))
goto out_ao;
- ao_info = rcu_dereference_check(tp->ao_info,
- lockdep_sock_is_held(&tp->inet_conn.icsk_inet.sk));
- rnext_key = READ_ONCE(ao_info->rnext_key);
- if (WARN_ON_ONCE(!rnext_key))
- goto out_ao;
maclen = tcp_ao_maclen(ao_key);
- *ptr++ = htonl((TCPOPT_AO << 24) |
- (tcp_ao_len(ao_key) << 16) |
- (ao_key->sndid << 8) |
- (rnext_key->rcvid));
+
+ if (tp) {
+ struct tcp_ao_key *rnext_key;
+ struct tcp_ao_info *ao_info;
+
+ ao_info = rcu_dereference_check(tp->ao_info,
+ lockdep_sock_is_held(&tp->inet_conn.icsk_inet.sk));
+ rnext_key = READ_ONCE(ao_info->rnext_key);
+ if (WARN_ON_ONCE(!rnext_key))
+ goto out_ao;
+ *ptr++ = htonl((TCPOPT_AO << 24) |
+ (tcp_ao_len(ao_key) << 16) |
+ (ao_key->sndid << 8) |
+ (rnext_key->rcvid));
+ } else if (tcprsk) {
+ u8 aolen = maclen + sizeof(struct tcp_ao_hdr);
+
+ *ptr++ = htonl((TCPOPT_AO << 24) | (aolen << 16) |
+ (tcprsk->ao_keyid << 8) |
+ (tcprsk->ao_rcv_next));
+ } else {
+ WARN_ON_ONCE(1);
+ goto out_ao;
+ }
opts->hash_location = (__u8 *)ptr;
ptr += maclen / sizeof(*ptr);
if (unlikely(maclen % sizeof(*ptr))) {
@@ -1414,7 +1428,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
th->window = htons(min(tp->rcv_wnd, 65535U));
}
- tcp_options_write(th, tp, &opts, ao_key);
+ tcp_options_write(th, tp, NULL, &opts, ao_key);
#ifdef CONFIG_TCP_MD5SIG
/* Calculate the MD5 hash, as we have all we need now */
@@ -3787,7 +3801,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
/* RFC1323: The window in SYN & SYN/ACK segments is never scaled. */
th->window = htons(min(req->rsk_rcv_wnd, 65535U));
- tcp_options_write(th, NULL, &opts, NULL);
+ tcp_options_write(th, NULL, NULL, &opts, NULL);
th->doff = (tcp_header_size >> 2);
TCP_INC_STATS(sock_net(sk), TCP_MIB_OUTSEGS);
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index 5014aa663452..ad7a8caa7b2a 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -214,6 +214,8 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
treq->snt_isn = cookie;
treq->ts_off = 0;
treq->txhash = net_tx_rndhash();
+ tcp_ao_syncookie(sk, skb, treq, AF_INET6);
+
if (IS_ENABLED(CONFIG_SMC))
ireq->smc_ok = 0;
diff --git a/net/ipv6/tcp_ao.c b/net/ipv6/tcp_ao.c
index 54c3714a05b4..65fa52f8c2e3 100644
--- a/net/ipv6/tcp_ao.c
+++ b/net/ipv6/tcp_ao.c
@@ -63,6 +63,18 @@ int tcp_v6_ao_calc_key_sk(struct tcp_ao_key *mkt, u8 *key,
htons(sk->sk_num), disn, sisn);
}
+int tcp_v6_ao_calc_key_rsk(struct tcp_ao_key *mkt, u8 *key,
+ struct request_sock *req)
+{
+ struct inet_request_sock *ireq = inet_rsk(req);
+
+ return tcp_v6_ao_calc_key(mkt, key,
+ &ireq->ir_v6_loc_addr, &ireq->ir_v6_rmt_addr,
+ htons(ireq->ir_num), ireq->ir_rmt_port,
+ htonl(tcp_rsk(req)->snt_isn),
+ htonl(tcp_rsk(req)->rcv_isn));
+}
+
struct tcp_ao_key *tcp_v6_ao_do_lookup(const struct sock *sk,
const struct in6_addr *addr,
int sndid, int rcvid)
@@ -80,6 +92,15 @@ struct tcp_ao_key *tcp_v6_ao_lookup(const struct sock *sk,
return tcp_v6_ao_do_lookup(sk, addr, sndid, rcvid);
}
+struct tcp_ao_key *tcp_v6_ao_lookup_rsk(const struct sock *sk,
+ struct request_sock *req,
+ int sndid, int rcvid)
+{
+ struct in6_addr *addr = &inet_rsk(req)->ir_v6_rmt_addr;
+
+ return tcp_v6_ao_do_lookup(sk, addr, sndid, rcvid);
+}
+
int tcp_v6_ao_hash_pseudoheader(struct tcp_sigpool *hp,
const struct in6_addr *daddr,
const struct in6_addr *saddr, int nbytes)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 99266446ef54..367fa10c5e4d 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -842,6 +842,10 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
.req_md5_lookup = tcp_v6_md5_lookup,
.calc_md5_hash = tcp_v6_md5_hash_skb,
#endif
+#ifdef CONFIG_TCP_AO
+ .ao_lookup = tcp_v6_ao_lookup_rsk,
+ .ao_calc_key = tcp_v6_ao_calc_key_rsk,
+#endif
#ifdef CONFIG_SYN_COOKIES
.cookie_init_seq = cookie_v6_init_sequence,
#endif
@@ -1194,9 +1198,51 @@ static void tcp_v6_timewait_ack(struct sock *sk, struct sk_buff *skb)
static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
struct request_sock *req)
{
+ struct tcp_md5sig_key *md5_key = NULL;
+ struct tcp_ao_key *ao_key = NULL;
+ const struct in6_addr *addr;
+ u8 *traffic_key = NULL;
+ u8 keyid = 0;
int l3index;
l3index = tcp_v6_sdif(skb) ? tcp_v6_iif_l3_slave(skb) : 0;
+ addr = &ipv6_hdr(skb)->saddr;
+
+ if (tcp_rsk_used_ao(req)) {
+#ifdef CONFIG_TCP_AO
+ const struct tcp_ao_hdr *aoh;
+
+ /* Invalid TCP option size or twice included auth */
+ if (tcp_parse_auth_options(tcp_hdr(skb), NULL, &aoh))
+ return;
+ if (!aoh)
+ return;
+ ao_key = tcp_v6_ao_do_lookup(sk, addr, aoh->rnext_keyid, -1);
+ if (unlikely(!ao_key)) {
+ /* Send ACK with any matching MKT for the peer */
+ ao_key = tcp_v6_ao_do_lookup(sk, addr, -1, -1);
+ /* Matching key disappeared (user removed the key?)
+ * let the handshake timeout.
+ */
+ if (!ao_key) {
+ net_info_ratelimited("TCP-AO key for (%pI6, %d)->(%pI6, %d) suddenly disappeared, won't ACK new connection\n",
+ addr,
+ ntohs(tcp_hdr(skb)->source),
+ &ipv6_hdr(skb)->daddr,
+ ntohs(tcp_hdr(skb)->dest));
+ return;
+ }
+ }
+ traffic_key = kmalloc(tcp_ao_digest_size(ao_key), GFP_ATOMIC);
+ if (!traffic_key)
+ return;
+
+ keyid = aoh->keyid;
+ tcp_v6_ao_calc_key_rsk(ao_key, traffic_key, req);
+#endif
+ } else {
+ md5_key = tcp_v6_md5_do_lookup(sk, addr, l3index);
+ }
/* sk->sk_state == TCP_LISTEN -> for regular TCP_SYN_RECV
* sk->sk_state == TCP_SYN_RECV -> for Fast Open.
@@ -1212,9 +1258,10 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
tcp_time_stamp_raw() + tcp_rsk(req)->ts_off,
req->ts_recent, sk->sk_bound_dev_if,
- tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->saddr, l3index),
+ md5_key,
ipv6_get_dsfield(ipv6_hdr(skb)), 0, sk->sk_priority,
- tcp_rsk(req)->txhash, NULL, NULL, 0, 0);
+ tcp_rsk(req)->txhash, ao_key, traffic_key, keyid, 0);
+ kfree(traffic_key);
}
@@ -1444,19 +1491,26 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
#ifdef CONFIG_TCP_MD5SIG
l3index = l3mdev_master_ifindex_by_index(sock_net(sk), ireq->ir_iif);
- /* Copy over the MD5 key from the original socket */
- key = tcp_v6_md5_do_lookup(sk, &newsk->sk_v6_daddr, l3index);
- if (key) {
- const union tcp_md5_addr *addr;
+ if (!tcp_rsk_used_ao(req)) {
+ /* Copy over the MD5 key from the original socket */
+ key = tcp_v6_md5_do_lookup(sk, &newsk->sk_v6_daddr, l3index);
+ if (key) {
+ const union tcp_md5_addr *addr;
- addr = (union tcp_md5_addr *)&newsk->sk_v6_daddr;
- if (tcp_md5_key_copy(newsk, addr, AF_INET6, 128, l3index, key)) {
- inet_csk_prepare_forced_close(newsk);
- tcp_done(newsk);
- goto out;
+ addr = (union tcp_md5_addr *)&newsk->sk_v6_daddr;
+ if (tcp_md5_key_copy(newsk, addr, AF_INET6, 128, l3index, key)) {
+ inet_csk_prepare_forced_close(newsk);
+ tcp_done(newsk);
+ goto out;
+ }
}
}
#endif
+#ifdef CONFIG_TCP_AO
+ /* Copy over tcp_ao_info if any */
+ if (tcp_ao_copy_all_matching(sk, newsk, req, skb, AF_INET6))
+ goto out; /* OOM */
+#endif
if (__inet_inherit_port(sk, newsk) < 0) {
inet_csk_prepare_forced_close(newsk);
--
2.41.0
On Wed, 19 Jul 2023 21:26:05 +0100 Dmitry Safonov wrote:
> This is version 8 of TCP-AO support. I base it on master and there
> weren't any conflicts on my tentative merge to linux-next.
doesn't apply
--
pw-bot: cr
On 7/20/23 18:53, Jakub Kicinski wrote:
> On Wed, 19 Jul 2023 21:26:05 +0100 Dmitry Safonov wrote:
>> This is version 8 of TCP-AO support. I base it on master and there
>> weren't any conflicts on my tentative merge to linux-next.
>
> doesn't apply
Yeah, it did yesterday on next-20230719, but today's next-20230720 has
commit 5e5265522a9a ("tcp: annotate data-races around
tcp_rsk(req)->txhash"), which has a conflict.
I'll resend rebased v8.1-next
Thanks,
Dmitry