2021-12-08 11:38:09

by Leonard Crestez

Subject: [PATCH v3 00/18] tcp: Initial support for RFC5925 auth option

This is similar to TCP MD5 in functionality but it's sufficiently
different that wire formats are incompatible. Compared to TCP-MD5 more
algorithms are supported and multiple keys can be used on the same
connection but there is still no negotiation mechanism.
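To make the wire-format difference concrete, here is a minimal user-space sketch that walks a raw TCP options buffer and extracts the RFC5925 option (kind 29, versus kind 19 for TCP-MD5). The layout mirrors the kind/len/keyid/rnextkeyid/mac header used later in this series; `struct ao_view` and `find_ao_option` are hypothetical names for illustration only and are not the kernel parser.

```c
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_EOL	0
#define TCPOPT_NOP	1
#define TCPOPT_MD5SIG	19	/* RFC2385 */
#define TCPOPT_AUTHOPT	29	/* RFC5925 */

/* Hypothetical view of a parsed TCP-AO option (illustration only). */
struct ao_view {
	uint8_t keyid;		/* key used to sign this segment */
	uint8_t rnextkeyid;	/* key the sender wants to receive next */
	const uint8_t *mac;	/* points into the options buffer */
	size_t maclen;		/* option length minus the 4-byte header */
};

/* Returns 0 if a TCP-AO option was found, -1 if absent or malformed. */
static int find_ao_option(const uint8_t *opts, size_t len, struct ao_view *out)
{
	size_t i = 0;

	while (i < len) {
		uint8_t kind = opts[i];
		uint8_t olen;

		if (kind == TCPOPT_EOL)
			break;
		if (kind == TCPOPT_NOP) {
			i++;
			continue;
		}
		/* Every other option carries a length byte. */
		if (i + 1 >= len)
			return -1;
		olen = opts[i + 1];
		if (olen < 2 || i + olen > len)
			return -1;
		if (kind == TCPOPT_AUTHOPT) {
			if (olen < 4)
				return -1;
			out->keyid = opts[i + 2];
			out->rnextkeyid = opts[i + 3];
			out->mac = &opts[i + 4];
			out->maclen = (size_t)olen - 4;
			return 0;
		}
		i += olen;
	}
	return -1;
}
```

With the 96-bit MACs from RFC5926 the whole option is 16 bytes (4 header bytes plus 12 MAC bytes), while the MD5 option is 18 bytes and carries no key identifiers.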

Expected use-case is protecting long-duration BGP/LDP connections
between routers using pre-shared keys. The goal of this series is to
allow routers using the linux TCP stack to interoperate with vendors
such as Cisco and Juniper.

Both algorithms described in RFC5926 are implemented but the code is not
very easily extensible beyond that. In particular there are several code
paths making stack allocations based on RFC5926 maximum, those would
have to be increased. Support for arbitrary algorithms was requested
in reply to previous posts but I believe there is no real use case for
that.

The current implementation is somewhat loose regarding configuration:
* Overlapping MKTs can be configured despite what RFC5925 says
* Current key can be deleted
* If multiple keys are valid for a destination the kernel picks one
in an unpredictable manner (this can be overridden).
These conditions could be tightened but it is not clear the kernel
should prevent misconfiguration from userspace.

This version implements prefixlen and incorporates comments from v2 as
well as some unrelated fixes. Here are some known flaws and limitations:

* Crypto API is used with buffers on the stack and inside struct sock,
this might not work on all arches. I'm currently only testing x64 VMs.
* Interaction with TCP-MD5 not tested in all corners.
* Interaction with FASTOPEN not tested and unlikely to work because of
sequence number assumptions for syn/ack.
* Not clear if crypto_ahash_setkey might sleep. If some implementations
do that then maybe they could be excluded through alloc flags.
* Traffic key is not cached (reducing performance)
* There is no useful way to list keys, making userspace debug difficult.

Some testing support is included in nettest and fcnal-test.sh, similar
to the current level of tcp-md5 testing.

A more elaborate test suite using pytest and scapy is available out of
tree: https://github.com/cdleonard/tcp-authopt-test That test suite is
much larger than the kernel code and did not receive many comments so
I will attempt to push it separately (if at all).

Changes for frr (old): https://github.com/FRRouting/frr/pull/9442
That PR was made early for ABI feedback, it has many issues.

Changes for yabgp (old): https://github.com/cdleonard/yabgp/commits/tcp_authopt
This can be used for easy interoperability testing with cisco/juniper/etc.

Changes since PATCH v2:
* Protect tcp_authopt_alg_get/put_tfm with local_bh_disable instead of
preempt_disable. With preempt_disable, signatures were corrupted when a send
path executing with BH enabled was interrupted by recv.
* Fix accepted keyids not configured locally as "unexpected". If any key
is configured that matches the peer then traffic MUST be signed.
* Fix issues related to sne rollover during handshake itself. (Francesco)
* Implement and test prefixlen (David)
* Replace shash with ahash and reuse some of the MD5 code (Dmitry)
* Parse md5+ao options only once in the same function (Dmitry)
* Pass tcp_authopt_info into inbound check path, this avoids second rcu
dereference for same packet.
* Pass tcp_request_socket into inbound check path instead of just listen
socket. This is required for SNE rollover during handshake and clarifies
ISN handling.
* Do not allow disabling via sysctl after enabling once, this is difficult
to support well (David)
* Verbose check for sysctl_tcp_authopt (Dmitry)
* Use netif_index_is_l3_master (David)
* Cleanup ipvx_addr_match (David)
* Add a #define tcp_authopt_needed to wrap static key usage because it looks
nicer.
* Replace rcu_read_lock with rcu_dereference_protected in SNE updates (Eric)
Link: https://lore.kernel.org/netdev/[email protected]/

Changes since PATCH v1:
* Implement Sequence Number Extension
* Implement l3index for vrf: TCP_AUTHOPT_KEY_IFINDEX as equivalent of
TCP_MD5SIG_FLAG_IFINDEX
* Expand TCP-AO tests in fcnal-test.sh to near-parity with md5.
* Show addr/port on failure similar to md5
* Remove tox dependency from test suite (create venv directly)
* Switch default pytest output format to TAP (kselftest standard)
* Fix _copy_from_sockptr_tolerant stack corruption on short sockopts.
This was covered in test but error was invisible without STACKPROTECTOR=y
* Fix sysctl_tcp_authopt check in tcp_get_authopt_val before memset. This
was harmless because error code is checked in getsockopt anyway.
* Fix dropping md5 packets on all sockets with AO enabled
* Fix checking (key->recv_id & TCP_AUTHOPT_KEY_ADDR_BIND) instead of
key->flags in tcp_authopt_key_match_exact
* Fix PATCH 1/19 not compiling due to missing "int err" declaration
* Add ratelimited message for AO and MD5 both present
* Export all symbols required by CONFIG_IPV6=m (again)
* Fix compilation with CONFIG_TCP_AUTHOPT=y CONFIG_TCP_MD5SIG=n
* Fix checkpatch issues
* Pass -rrequirements.txt to tox to avoid dependency variation.
Link: https://lore.kernel.org/netdev/[email protected]/

Changes since RFCv3:
* Implement TCP_AUTHOPT handling for timewait and reset replies. Write
tests to execute these paths by injecting packets with scapy
* Handle combining md5 and authopt: if both are configured use authopt.
* Fix locking issues around send_key, introduced in one of the later patches.
* Handle IPv4-mapped-IPv6 addresses: it used to be that an ipv4 SYN sent
to an ipv6 socket with TCP-AO triggered WARN
* Implement un-namespaced sysctl; this feature is disabled by default
* Allocate new key before removing any old one in setsockopt (Dmitry)
* Remove tcp_authopt_key_info.local_id because it's no longer used (Dmitry)
* Propagate errors from TCP_AUTHOPT getsockopt (Dmitry)
* Fix no-longer-correct TCP_AUTHOPT_KEY_DEL docs (Dmitry)
* Simplify crypto allocation (Eric)
* Use kzalloc instead of __GFP_ZERO (Eric)
* Add static_key_false tcp_authopt_needed (Eric)
* Clear authopt_info copied from oldsk in __tcp_authopt_openreq (Eric)
* Replace memcmp in ipv4 and ipv6 addr comparisons (Eric)
* Export symbols for CONFIG_IPV6=m (kernel test robot)
* Mark more functions static (kernel test robot)
* Fix build with CONFIG_PROVE_RCU_LIST=y (kernel test robot)
Link: https://lore.kernel.org/netdev/[email protected]/

Changes since RFCv2:
* Removed local_id from ABI and match on send_id/recv_id/addr
* Add all relevant out-of-tree tests to tools/testing/selftests
* Return an error instead of ignoring unknown flags, hopefully this makes
it easier to extend.
* Check sk_family before __tcp_authopt_info_get_or_create in tcp_set_authopt_key
* Use sock_owned_by_me instead of WARN_ON(!lockdep_sock_is_held(sk))
* Fix some intermediate build failures reported by kbuild robot
* Improve documentation
Link: https://lore.kernel.org/netdev/[email protected]/

Changes since RFC:
* Split into per-topic commits for ease of review. The intermediate
commits compile with a few "unused function" warnings and don't do
anything useful by themselves.
* Add ABI documentation including kernel-doc on uapi
* Fix lockdep warnings from crypto by creating pools with one shash for
each cpu
* Accept short options to setsockopt by padding with zeros; this
approach allows increasing the size of the structs in the future.
* Support for aes-128-cmac-96
* Support for binding addresses to keys in a way similar to old tcp_md5
* Add support for retrieving received keyid/rnextkeyid and controlling
the keyid/rnextkeyid being sent.
Link: https://lore.kernel.org/netdev/01383a8751e97ef826ef2adf93bfde3a08195a43.1626693859.git.cdleonard@gmail.com/
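The "accept short options by padding with zeros" approach from the list above can be sketched in user space as follows. This is only an analogue of the idea, not the kernel helper: `copy_opt_tolerant` is a hypothetical name, and rejecting input longer than the destination with -EINVAL is an assumption of this sketch rather than something stated in the changelog.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space analogue of a "tolerant" option copy.
 *
 * A caller built against an older, smaller struct passes srclen < dstlen;
 * the missing tail is zero-filled so newer fields read as "unset".
 * Longer input is rejected here (assumption of this sketch).
 */
static int copy_opt_tolerant(void *dst, size_t dstlen,
			     const void *src, size_t srclen)
{
	if (!dst || (!src && srclen))
		return -EFAULT;
	if (srclen > dstlen)
		return -EINVAL;

	/* Zero everything first so the uncopied tail is well defined. */
	memset(dst, 0, dstlen);
	if (srclen)
		memcpy(dst, src, srclen);
	return 0;
}
```

Zeroing the destination before copying is what lets struct sizes grow later: fields added at the end read as zero when old binaries pass the shorter layout.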

Leonard Crestez (18):
tcp: authopt: Initial support and key management
docs: Add user documentation for tcp_authopt
tcp: authopt: Add crypto initialization
tcp: md5: Refactor tcp_sig_hash_skb_data for AO
tcp: authopt: Compute packet signatures
tcp: authopt: Hook into tcp core
tcp: authopt: Disable via sysctl by default
tcp: authopt: Implement Sequence Number Extension
tcp: ipv6: Add AO signing for tcp_v6_send_response
tcp: authopt: Add support for signing skb-less replies
tcp: ipv4: Add AO signing for skb-less replies
tcp: authopt: Add key selection controls
tcp: authopt: Add initial l3index support
tcp: authopt: Add NOSEND/NORECV flags
tcp: authopt: Add prefixlen support
selftests: nettest: Rename md5_prefix to key_addr_prefix
selftests: nettest: Initial tcp_authopt support
selftests: net/fcnal: Initial tcp_authopt support

Documentation/networking/index.rst | 1 +
Documentation/networking/ip-sysctl.rst | 6 +
Documentation/networking/tcp_authopt.rst | 69 +
include/linux/tcp.h | 9 +
include/net/tcp.h | 27 +-
include/net/tcp_authopt.h | 316 ++++
include/uapi/linux/snmp.h | 1 +
include/uapi/linux/tcp.h | 137 ++
net/ipv4/Kconfig | 14 +
net/ipv4/Makefile | 1 +
net/ipv4/proc.c | 1 +
net/ipv4/sysctl_net_ipv4.c | 39 +
net/ipv4/tcp.c | 68 +-
net/ipv4/tcp_authopt.c | 1671 +++++++++++++++++++++
net/ipv4/tcp_input.c | 41 +-
net/ipv4/tcp_ipv4.c | 136 +-
net/ipv4/tcp_minisocks.c | 12 +
net/ipv4/tcp_output.c | 86 +-
net/ipv6/tcp_ipv6.c | 108 +-
tools/testing/selftests/net/fcnal-test.sh | 298 ++++
tools/testing/selftests/net/nettest.c | 123 +-
21 files changed, 3085 insertions(+), 79 deletions(-)
create mode 100644 Documentation/networking/tcp_authopt.rst
create mode 100644 include/net/tcp_authopt.h
create mode 100644 net/ipv4/tcp_authopt.c


base-commit: 1fe5b01262844be03de98afdd56d1d393df04d7e
--
2.25.1



2021-12-08 11:38:15

by Leonard Crestez

Subject: [PATCH v3 04/18] tcp: md5: Refactor tcp_sig_hash_skb_data for AO

This chunk of code is identical between the implementation of TCP-MD5
and TCP-AO, so rename and refactor it.

Signed-off-by: Leonard Crestez <[email protected]>
---
include/net/tcp.h | 2 +-
net/ipv4/tcp.c | 38 ++++++++++++++++++++------------------
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv6/tcp_ipv6.c | 2 +-
4 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6cc2eeb45deb..1a0513b0ead0 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1688,11 +1688,11 @@ struct tcp_md5sig_pool *tcp_get_md5sig_pool(void);
static inline void tcp_put_md5sig_pool(void)
{
local_bh_enable();
}

-int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *, const struct sk_buff *,
+int tcp_sig_hash_skb_data(struct ahash_request *, const struct sk_buff *,
unsigned int header_len);
int tcp_md5_hash_key(struct tcp_md5sig_pool *hp,
const struct tcp_md5sig_key *key);

/* From tcp_fastopen.c */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 022811dd705d..8e6cbb5a1da7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4400,16 +4400,31 @@ struct tcp_md5sig_pool *tcp_get_md5sig_pool(void)
local_bh_enable();
return NULL;
}
EXPORT_SYMBOL(tcp_get_md5sig_pool);

-int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
+int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key *key)
+{
+ u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in tcp_md5_do_add */
+ struct scatterlist sg;
+
+ sg_init_one(&sg, key->key, keylen);
+ ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen);
+
+ /* We use data_race() because tcp_md5_do_add() might change key->key under us */
+ return data_race(crypto_ahash_update(hp->md5_req));
+}
+EXPORT_SYMBOL(tcp_md5_hash_key);
+#endif /* CONFIG_TCP_MD5SIG */
+
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AUTHOPT)
+
+int tcp_sig_hash_skb_data(struct ahash_request *req,
const struct sk_buff *skb, unsigned int header_len)
{
struct scatterlist sg;
const struct tcphdr *tp = tcp_hdr(skb);
- struct ahash_request *req = hp->md5_req;
unsigned int i;
const unsigned int head_data_len = skb_headlen(skb) > header_len ?
skb_headlen(skb) - header_len : 0;
const struct skb_shared_info *shi = skb_shinfo(skb);
struct sk_buff *frag_iter;
@@ -4432,31 +4447,18 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
if (crypto_ahash_update(req))
return 1;
}

skb_walk_frags(skb, frag_iter)
- if (tcp_md5_hash_skb_data(hp, frag_iter, 0))
+ if (tcp_sig_hash_skb_data(req, frag_iter, 0))
return 1;

return 0;
}
-EXPORT_SYMBOL(tcp_md5_hash_skb_data);
+EXPORT_SYMBOL(tcp_sig_hash_skb_data);

-int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key *key)
-{
- u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in tcp_md5_do_add */
- struct scatterlist sg;
-
- sg_init_one(&sg, key->key, keylen);
- ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen);
-
- /* We use data_race() because tcp_md5_do_add() might change key->key under us */
- return data_race(crypto_ahash_update(hp->md5_req));
-}
-EXPORT_SYMBOL(tcp_md5_hash_key);
-
-#endif
+#endif /* defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AUTHOPT) */

void tcp_done(struct sock *sk)
{
struct request_sock *req;

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 0aa5122b29e0..91cad11db32e 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1381,11 +1381,11 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
if (crypto_ahash_init(req))
goto clear_hash;

if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len))
goto clear_hash;
- if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
+ if (tcp_sig_hash_skb_data(hp->md5_req, skb, th->doff << 2))
goto clear_hash;
if (tcp_md5_hash_key(hp, key))
goto clear_hash;
ahash_request_set_crypt(req, NULL, md5_hash, 0);
if (crypto_ahash_final(req))
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 3b7d6ede1364..e98fc6f12c61 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -750,11 +750,11 @@ static int tcp_v6_md5_hash_skb(char *md5_hash,
if (crypto_ahash_init(req))
goto clear_hash;

if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len))
goto clear_hash;
- if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
+ if (tcp_sig_hash_skb_data(hp->md5_req, skb, th->doff << 2))
goto clear_hash;
if (tcp_md5_hash_key(hp, key))
goto clear_hash;
ahash_request_set_crypt(req, NULL, md5_hash, 0);
if (crypto_ahash_final(req))
--
2.25.1


2021-12-08 11:38:19

by Leonard Crestez

Subject: [PATCH v3 02/18] docs: Add user documentation for tcp_authopt

The .rst documentation contains a brief description of the user
interface and includes kernel-doc generated from uapi header.

Signed-off-by: Leonard Crestez <[email protected]>
---
Documentation/networking/index.rst | 1 +
Documentation/networking/tcp_authopt.rst | 44 ++++++++++++++++++++++++
2 files changed, 45 insertions(+)
create mode 100644 Documentation/networking/tcp_authopt.rst

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 58bc8cd367c6..f5c324a060d8 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -100,10 +100,11 @@ Contents:
strparser
switchdev
sysfs-tagging
tc-actions-env-rules
tcp-thin
+ tcp_authopt
team
timestamping
tipc
tproxy
tuntap
diff --git a/Documentation/networking/tcp_authopt.rst b/Documentation/networking/tcp_authopt.rst
new file mode 100644
index 000000000000..484f66f41ad5
--- /dev/null
+++ b/Documentation/networking/tcp_authopt.rst
@@ -0,0 +1,44 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================
+TCP Authentication Option
+=========================
+
+The TCP Authentication option specified by RFC5925 replaces the TCP MD5
+Signature option. It is similar in goals but not compatible in either wire formats
+or ABI.
+
+Interface
+=========
+
+Individual keys can be added to or removed from a TCP socket by using
+TCP_AUTHOPT_KEY setsockopt and a ``struct tcp_authopt_key``. There is no
+support for reading back keys and updates always replace the old key. These
+structures represent "Master Key Tuples (MKTs)" as described by the RFC.
+
+Per-socket options can be set or read using the TCP_AUTHOPT sockopt and a ``struct
+tcp_authopt``. This is optional: doing setsockopt TCP_AUTHOPT_KEY is
+sufficient to enable the feature.
+
+Configuration associated with TCP Authentication is independently attached to
+each TCP socket. After listen and accept the newly returned socket gets an
+independent copy of relevant settings from the listen socket.
+
+Key binding
+-----------
+
+Keys can be bound to remote addresses in a way that is similar to TCP_MD5.
+
+ * The full address must match (/32 or /128)
+ * Ports are ignored
+ * Address binding is optional, by default keys match all addresses
+
+RFC5925 requires that key ids do not overlap when tcp identifiers (addr/port)
+overlap. This is not enforced by linux, configuring ambiguous keys will result
+in packet drops and lost connections.
+
+ABI Reference
+=============
+
+.. kernel-doc:: include/uapi/linux/tcp.h
+ :identifiers: tcp_authopt tcp_authopt_flag tcp_authopt_key tcp_authopt_key_flag tcp_authopt_alg
--
2.25.1


2021-12-08 11:38:19

by Leonard Crestez

Subject: [PATCH v3 03/18] tcp: authopt: Add crypto initialization

The crypto_shash API is used in order to compute packet signatures. The
API comes with several unfortunate limitations:

1) Allocating a crypto_shash can sleep and must be done in user context.
2) Packet signatures must be computed in softirq context.
3) Packet signatures use dynamic "traffic keys" which require exclusive
access to crypto_shash for crypto_shash_setkey.

The solution is to allocate one crypto_shash for each possible cpu for
each algorithm at setsockopt time. The per-cpu tfm is then borrowed from
softirq context, signatures are computed and the tfm is returned.

The pool for each algorithm is allocated on first use.
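The allocate-on-first-use pattern described above can be sketched in user space roughly as follows. This is an analogue under stated assumptions, not the kernel code: a fixed `NSLOTS` array stands in for per-cpu storage, `struct ctx` stands in for a crypto transform, a pthread mutex stands in for the kernel mutex, and all names (`alg_pool`, `alg_pool_require`, `alg_pool_get`) are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

#define NSLOTS 4			/* stand-in for the number of possible CPUs */

struct ctx {				/* stand-in for a crypto transform */
	int alg_id;
};

struct alg_pool {
	pthread_mutex_t init_mutex;	/* serializes first-use initialization */
	bool init_done;
	struct ctx *slots[NSLOTS];	/* one context per slot ("per cpu") */
	int init_calls;			/* instrumentation for this sketch only */
};

static void alg_pool_free(struct alg_pool *p)
{
	for (int i = 0; i < NSLOTS; i++) {
		free(p->slots[i]);
		p->slots[i] = NULL;
	}
}

/* May block (allocation), so it must run in a context that can sleep. */
static int alg_pool_init(struct alg_pool *p, int alg_id)
{
	p->init_calls++;
	for (int i = 0; i < NSLOTS; i++) {
		p->slots[i] = malloc(sizeof(*p->slots[i]));
		if (!p->slots[i]) {
			alg_pool_free(p);
			return -ENOMEM;
		}
		p->slots[i]->alg_id = alg_id;
	}
	return 0;
}

/* Called from the configuration path; initializes at most once. */
static int alg_pool_require(struct alg_pool *p, int alg_id)
{
	int err = 0;

	pthread_mutex_lock(&p->init_mutex);
	if (!p->init_done) {
		err = alg_pool_init(p, alg_id);
		if (!err)
			p->init_done = true;
	}
	pthread_mutex_unlock(&p->init_mutex);
	return err;
}

/* Borrow the context for one slot; never allocates, so it cannot block. */
static struct ctx *alg_pool_get(struct alg_pool *p, unsigned int slot)
{
	return p->slots[slot % NSLOTS];
}
```

The point of the split is that everything that can sleep happens once in `alg_pool_require`, while the borrow step only indexes preallocated storage. In the kernel the borrow and return are additionally bracketed by local_bh_disable()/local_bh_enable() so the per-cpu transform is used exclusively.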

Signed-off-by: Leonard Crestez <[email protected]>
---
include/net/tcp_authopt.h | 16 ++++
net/ipv4/tcp_authopt.c | 166 ++++++++++++++++++++++++++++++++++++++
2 files changed, 182 insertions(+)

diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index 42ad764e98c2..5217b6c7c900 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -2,10 +2,24 @@
#ifndef _LINUX_TCP_AUTHOPT_H
#define _LINUX_TCP_AUTHOPT_H

#include <uapi/linux/tcp.h>

+/* According to RFC5925 the length of the authentication option varies based on
+ * the signature algorithm. Linux only implements the algorithms defined in
+ * RFC5926 which have a constant length of 16.
+ *
+ * This is used in stack allocation of tcp option buffers for output. It is
+ * shorter than the length of the MD5 option.
+ *
+ * Input packets can have authentication options of different lengths but they
+ * will always be flagged as invalid (since no such algorithms are supported).
+ */
+#define TCPOLEN_AUTHOPT_OUTPUT 16
+
+struct tcp_authopt_alg_imp;
+
/**
* struct tcp_authopt_key_info - Representation of a Master Key Tuple as per RFC5925
*
* Key structure lifetime is only protected by RCU so readers needs to hold a
* single rcu_read_lock until they're done with the key.
@@ -27,10 +41,12 @@ struct tcp_authopt_key_info {
u8 keylen;
/** @key: Same as &tcp_authopt_key.key */
u8 key[TCP_AUTHOPT_MAXKEYLEN];
/** @addr: Same as &tcp_authopt_key.addr */
struct sockaddr_storage addr;
+ /** @alg: Algorithm implementation matching alg_id */
+ struct tcp_authopt_alg_imp *alg;
};

/**
* struct tcp_authopt_info - Per-socket information regarding tcp_authopt
*
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index e6740ef29a84..478969d53094 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -3,10 +3,164 @@
#include <linux/kernel.h>
#include <net/tcp.h>
#include <net/tcp_authopt.h>
#include <crypto/hash.h>

+/* All current algorithms have a mac length of 12 but crypto API digestsize can be larger */
+#define TCP_AUTHOPT_MAXMACBUF 20
+#define TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN 20
+#define TCP_AUTHOPT_MACLEN 12
+
+/* Constant data with per-algorithm information from RFC5926
+ * The "KDF" and "MAC" happen to be the same for both algorithms.
+ */
+struct tcp_authopt_alg_imp {
+ /* Name of algorithm in crypto-api */
+ const char *alg_name;
+ /* One of the TCP_AUTHOPT_ALG_* constants from uapi */
+ u8 alg_id;
+ /* Length of traffic key */
+ u8 traffic_key_len;
+
+ /* shared crypto_shash */
+ struct mutex init_mutex;
+ bool init_done;
+ struct crypto_shash * __percpu *tfms;
+};
+
+static struct tcp_authopt_alg_imp tcp_authopt_alg_list[] = {
+ {
+ .alg_id = TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
+ .alg_name = "hmac(sha1)",
+ .traffic_key_len = 20,
+ .init_mutex = __MUTEX_INITIALIZER(tcp_authopt_alg_list[0].init_mutex),
+ },
+ {
+ .alg_id = TCP_AUTHOPT_ALG_AES_128_CMAC_96,
+ .alg_name = "cmac(aes)",
+ .traffic_key_len = 16,
+ .init_mutex = __MUTEX_INITIALIZER(tcp_authopt_alg_list[1].init_mutex),
+ },
+};
+
+/* get a pointer to the tcp_authopt_alg instance or NULL if id invalid */
+static inline struct tcp_authopt_alg_imp *tcp_authopt_alg_get(int alg_num)
+{
+ if (alg_num <= 0 || alg_num > 2)
+ return NULL;
+ return &tcp_authopt_alg_list[alg_num - 1];
+}
+
+static void __tcp_authopt_alg_free(struct tcp_authopt_alg_imp *alg)
+{
+ int cpu;
+ struct crypto_shash *tfm;
+
+ if (!alg->tfms)
+ return;
+ for_each_possible_cpu(cpu) {
+ tfm = *per_cpu_ptr(alg->tfms, cpu);
+ if (tfm) {
+ crypto_free_shash(tfm);
+ *per_cpu_ptr(alg->tfms, cpu) = NULL;
+ }
+ }
+ free_percpu(alg->tfms);
+ alg->tfms = NULL;
+}
+
+static int __tcp_authopt_alg_init(struct tcp_authopt_alg_imp *alg)
+{
+ struct crypto_shash *tfm;
+ int cpu;
+ int err;
+
+ BUILD_BUG_ON(TCP_AUTHOPT_MAXMACBUF < TCPOLEN_AUTHOPT_OUTPUT);
+ if (WARN_ON_ONCE(alg->traffic_key_len > TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN))
+ return -ENOBUFS;
+
+ alg->tfms = alloc_percpu(struct crypto_shash *);
+ if (!alg->tfms)
+ return -ENOMEM;
+ for_each_possible_cpu(cpu) {
+ tfm = crypto_alloc_shash(alg->alg_name, 0, 0);
+ if (IS_ERR(tfm)) {
+ err = PTR_ERR(tfm);
+ goto out_err;
+ }
+
+ /* sanity checks: */
+ if (WARN_ON_ONCE(crypto_shash_digestsize(tfm) != alg->traffic_key_len)) {
+ err = -EINVAL;
+ goto out_err;
+ }
+ if (WARN_ON_ONCE(crypto_shash_digestsize(tfm) > TCP_AUTHOPT_MAXMACBUF)) {
+ err = -EINVAL;
+ goto out_err;
+ }
+
+ *per_cpu_ptr(alg->tfms, cpu) = tfm;
+ }
+ return 0;
+
+out_err:
+ __tcp_authopt_alg_free(alg);
+ return err;
+}
+
+static int tcp_authopt_alg_require(struct tcp_authopt_alg_imp *alg)
+{
+ int err = 0;
+
+ mutex_lock(&alg->init_mutex);
+ if (alg->init_done)
+ goto out;
+ err = __tcp_authopt_alg_init(alg);
+ if (err)
+ goto out;
+ pr_info("initialized tcp-ao algorithm %s", alg->alg_name);
+ alg->init_done = true;
+
+out:
+ mutex_unlock(&alg->init_mutex);
+ return err;
+}
+
+static struct crypto_shash *tcp_authopt_alg_get_tfm(struct tcp_authopt_alg_imp *alg)
+{
+ local_bh_disable();
+ return *this_cpu_ptr(alg->tfms);
+}
+
+static void tcp_authopt_alg_put_tfm(struct tcp_authopt_alg_imp *alg, struct crypto_shash *tfm)
+{
+ WARN_ON(tfm != *this_cpu_ptr(alg->tfms));
+ local_bh_enable();
+}
+
+static struct crypto_shash *tcp_authopt_get_kdf_shash(struct tcp_authopt_key_info *key)
+{
+ return tcp_authopt_alg_get_tfm(key->alg);
+}
+
+static void tcp_authopt_put_kdf_shash(struct tcp_authopt_key_info *key,
+ struct crypto_shash *tfm)
+{
+ return tcp_authopt_alg_put_tfm(key->alg, tfm);
+}
+
+static struct crypto_shash *tcp_authopt_get_mac_shash(struct tcp_authopt_key_info *key)
+{
+ return tcp_authopt_alg_get_tfm(key->alg);
+}
+
+static void tcp_authopt_put_mac_shash(struct tcp_authopt_key_info *key,
+ struct crypto_shash *tfm)
+{
+ return tcp_authopt_alg_put_tfm(key->alg, tfm);
+}
+
/* checks that ipv4 or ipv6 addr matches. */
static bool ipvx_addr_match(struct sockaddr_storage *a1,
struct sockaddr_storage *a2)
{
if (a1->ss_family != a2->ss_family)
@@ -202,10 +356,11 @@ void tcp_authopt_clear(struct sock *sk)
int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
{
struct tcp_authopt_key opt;
struct tcp_authopt_info *info;
struct tcp_authopt_key_info *key_info, *old_key_info;
+ struct tcp_authopt_alg_imp *alg;
int err;

sock_owned_by_me(sk);

err = _copy_from_sockptr_tolerant((u8 *)&opt, sizeof(opt), optval, optlen);
@@ -239,10 +394,20 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
/* Initialize tcp_authopt_info if not already set */
info = __tcp_authopt_info_get_or_create(sk);
if (IS_ERR(info))
return PTR_ERR(info);

+ /* check the algorithm */
+ alg = tcp_authopt_alg_get(opt.alg);
+ if (!alg)
+ return -EINVAL;
+ if (WARN_ON_ONCE(alg->alg_id != opt.alg))
+ return -EINVAL;
+ err = tcp_authopt_alg_require(alg);
+ if (err)
+ return err;
+
key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL | __GFP_ZERO);
if (!key_info)
return -ENOMEM;
/* If an old key exists with exact ID then remove and replace.
* RCU-protected readers might observe both and pick any.
@@ -252,10 +417,11 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
tcp_authopt_key_del(sk, info, old_key_info);
key_info->flags = opt.flags & TCP_AUTHOPT_KEY_KNOWN_FLAGS;
key_info->send_id = opt.send_id;
key_info->recv_id = opt.recv_id;
key_info->alg_id = opt.alg;
+ key_info->alg = alg;
key_info->keylen = opt.keylen;
memcpy(key_info->key, opt.key, opt.keylen);
memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr));
hlist_add_head_rcu(&key_info->node, &info->head);

--
2.25.1


2021-12-08 11:38:30

by Leonard Crestez

Subject: [PATCH v3 05/18] tcp: authopt: Compute packet signatures

Computing tcp authopt packet signatures is a two-step process:

* traffic key is computed based on tcp 4-tuple, initial sequence numbers
and the secret key.
* packet mac is computed based on traffic key and content of individual
packets.

The traffic key could be cached for established sockets but it is not.

A single code path exists for ipv4/ipv6 and input/output. This keeps the
code short but slightly slower due to lots of conditionals.

On output we read remote IP address from socket members, we
can't use skb network header because it's computed after TCP options.

On input we read remote IP address from skb network headers, we can't
use socket binding members because those are not available for SYN.
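As an illustration of the first step, the sketch below assembles the byte string that is fed to the KDF for an IPv4 connection, assuming the RFC5926 input layout of a one-byte counter, the ASCII label "TCP-AO", the RFC5925 connection context (addresses, ports, both ISNs) and a two-byte output length in bits. The struct and function names are hypothetical, and the actual keyed hash (HMAC-SHA1 or AES-CMAC over this buffer with the master key) is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical IPv4 connection context; every field in network byte order. */
struct ao_ctx_v4 {
	uint8_t saddr[4];
	uint8_t daddr[4];
	uint8_t sport[2];
	uint8_t dport[2];
	uint8_t sisn[4];	/* source ISN */
	uint8_t disn[4];	/* destination ISN, zero for an initial SYN */
};

/* counter (1) + label (6) + context (20) + output length (2) */
#define AO_KDF_INPUT_V4_LEN (1 + 6 + 20 + 2)

/* Append len bytes from src at *off and advance the offset. */
static void ao_put(uint8_t *buf, size_t *off, const void *src, size_t len)
{
	memcpy(buf + *off, src, len);
	*off += len;
}

/*
 * Fill buf with the KDF input and return its length, or 0 if buf is too
 * small. outlen_bits is the traffic key size in bits, e.g. 160 for
 * HMAC-SHA-1-96 and 128 for AES-128-CMAC-96.
 */
static size_t ao_kdf_input_v4(uint8_t *buf, size_t buflen,
			      const struct ao_ctx_v4 *ctx,
			      uint16_t outlen_bits)
{
	const uint8_t counter = 1;
	const uint8_t outlen[2] = {
		(uint8_t)(outlen_bits >> 8),
		(uint8_t)(outlen_bits & 0xff),
	};
	size_t off = 0;

	if (buflen < AO_KDF_INPUT_V4_LEN)
		return 0;

	ao_put(buf, &off, &counter, 1);
	ao_put(buf, &off, "TCP-AO", 6);
	ao_put(buf, &off, ctx->saddr, sizeof(ctx->saddr));
	ao_put(buf, &off, ctx->daddr, sizeof(ctx->daddr));
	ao_put(buf, &off, ctx->sport, sizeof(ctx->sport));
	ao_put(buf, &off, ctx->dport, sizeof(ctx->dport));
	ao_put(buf, &off, ctx->sisn, sizeof(ctx->sisn));
	ao_put(buf, &off, ctx->disn, sizeof(ctx->disn));
	ao_put(buf, &off, outlen, sizeof(outlen));
	return off;
}
```

Because both ISNs are part of the context, the traffic key differs per connection and per direction, which is why the input path has to take addresses from the packet and ISNs from the request socket before an established socket exists.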

Signed-off-by: Leonard Crestez <[email protected]>
---
include/net/tcp_authopt.h | 9 +
net/ipv4/tcp_authopt.c | 537 +++++++++++++++++++++++++++++++++++---
2 files changed, 510 insertions(+), 36 deletions(-)

diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index 5217b6c7c900..ce005f7ce797 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -64,10 +64,19 @@ struct tcp_authopt_info {
u32 src_isn;
/** @dst_isn: Remote Initial Sequence Number */
u32 dst_isn;
};

+/* TCP authopt as found in header */
+struct tcphdr_authopt {
+ u8 num;
+ u8 len;
+ u8 keyid;
+ u8 rnextkeyid;
+ u8 mac[0];
+};
+
#ifdef CONFIG_TCP_AUTHOPT
void tcp_authopt_clear(struct sock *sk);
int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen);
int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key);
int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen);
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index 478969d53094..29524ed56733 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -8,10 +8,15 @@
/* All current algorithms have a mac length of 12 but crypto API digestsize can be larger */
#define TCP_AUTHOPT_MAXMACBUF 20
#define TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN 20
#define TCP_AUTHOPT_MACLEN 12

+struct tcp_authopt_alg_pool {
+ struct crypto_ahash *tfm;
+ struct ahash_request *req;
+};
+
/* Constant data with per-algorithm information from RFC5926
* The "KDF" and "MAC" happen to be the same for both algorithms.
*/
struct tcp_authopt_alg_imp {
/* Name of algorithm in crypto-api */
@@ -19,14 +24,14 @@ struct tcp_authopt_alg_imp {
/* One of the TCP_AUTHOPT_ALG_* constants from uapi */
u8 alg_id;
/* Length of traffic key */
u8 traffic_key_len;

- /* shared crypto_shash */
+ /* shared crypto_ahash */
struct mutex init_mutex;
bool init_done;
- struct crypto_shash * __percpu *tfms;
+ struct tcp_authopt_alg_pool __percpu *pool;
};

static struct tcp_authopt_alg_imp tcp_authopt_alg_list[] = {
{
.alg_id = TCP_AUTHOPT_ALG_HMAC_SHA_1_96,
@@ -48,59 +53,77 @@ static inline struct tcp_authopt_alg_imp *tcp_authopt_alg_get(int alg_num)
if (alg_num <= 0 || alg_num > 2)
return NULL;
return &tcp_authopt_alg_list[alg_num - 1];
}

+static int tcp_authopt_alg_pool_init(struct tcp_authopt_alg_imp *alg,
+ struct tcp_authopt_alg_pool *pool)
+{
+ pool->tfm = crypto_alloc_ahash(alg->alg_name, 0, CRYPTO_ALG_ASYNC);
+ if (IS_ERR(pool->tfm))
+ return PTR_ERR(pool->tfm);
+
+ pool->req = ahash_request_alloc(pool->tfm, GFP_ATOMIC);
+ if (!pool->req)
+ return -ENOMEM;
+ ahash_request_set_callback(pool->req, 0, NULL, NULL);
+
+ return 0;
+}
+
+static void tcp_authopt_alg_pool_free(struct tcp_authopt_alg_pool *pool)
+{
+ ahash_request_free(pool->req);
+ pool->req = NULL;
+ crypto_free_ahash(pool->tfm);
+ pool->tfm = NULL;
+}
+
static void __tcp_authopt_alg_free(struct tcp_authopt_alg_imp *alg)
{
int cpu;
- struct crypto_shash *tfm;
+ struct tcp_authopt_alg_pool *pool;

- if (!alg->tfms)
+ if (!alg->pool)
return;
for_each_possible_cpu(cpu) {
- tfm = *per_cpu_ptr(alg->tfms, cpu);
- if (tfm) {
- crypto_free_shash(tfm);
- *per_cpu_ptr(alg->tfms, cpu) = NULL;
- }
+ pool = per_cpu_ptr(alg->pool, cpu);
+ tcp_authopt_alg_pool_free(pool);
}
- free_percpu(alg->tfms);
- alg->tfms = NULL;
+ free_percpu(alg->pool);
+ alg->pool = NULL;
}

static int __tcp_authopt_alg_init(struct tcp_authopt_alg_imp *alg)
{
- struct crypto_shash *tfm;
+ struct tcp_authopt_alg_pool *pool;
int cpu;
int err;

BUILD_BUG_ON(TCP_AUTHOPT_MAXMACBUF < TCPOLEN_AUTHOPT_OUTPUT);
if (WARN_ON_ONCE(alg->traffic_key_len > TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN))
return -ENOBUFS;

- alg->tfms = alloc_percpu(struct crypto_shash *);
- if (!alg->tfms)
+ alg->pool = alloc_percpu(struct tcp_authopt_alg_pool);
+ if (!alg->pool)
return -ENOMEM;
for_each_possible_cpu(cpu) {
- tfm = crypto_alloc_shash(alg->alg_name, 0, 0);
- if (IS_ERR(tfm)) {
- err = PTR_ERR(tfm);
+ pool = per_cpu_ptr(alg->pool, cpu);
+ err = tcp_authopt_alg_pool_init(alg, pool);
+ if (err)
goto out_err;
- }

+ pool = per_cpu_ptr(alg->pool, cpu);
/* sanity checks: */
- if (WARN_ON_ONCE(crypto_shash_digestsize(tfm) != alg->traffic_key_len)) {
+ if (WARN_ON_ONCE(crypto_ahash_digestsize(pool->tfm) != alg->traffic_key_len)) {
err = -EINVAL;
goto out_err;
}
- if (WARN_ON_ONCE(crypto_shash_digestsize(tfm) > TCP_AUTHOPT_MAXMACBUF)) {
+ if (WARN_ON_ONCE(crypto_ahash_digestsize(pool->tfm) > TCP_AUTHOPT_MAXMACBUF)) {
err = -EINVAL;
goto out_err;
}
-
- *per_cpu_ptr(alg->tfms, cpu) = tfm;
}
return 0;

out_err:
__tcp_authopt_alg_free(alg);
@@ -123,42 +146,43 @@ static int tcp_authopt_alg_require(struct tcp_authopt_alg_imp *alg)
out:
mutex_unlock(&alg->init_mutex);
return err;
}

-static struct crypto_shash *tcp_authopt_alg_get_tfm(struct tcp_authopt_alg_imp *alg)
+static struct tcp_authopt_alg_pool *tcp_authopt_alg_get_pool(struct tcp_authopt_alg_imp *alg)
{
local_bh_disable();
- return *this_cpu_ptr(alg->tfms);
+ return this_cpu_ptr(alg->pool);
}

-static void tcp_authopt_alg_put_tfm(struct tcp_authopt_alg_imp *alg, struct crypto_shash *tfm)
+static void tcp_authopt_alg_put_pool(struct tcp_authopt_alg_imp *alg,
+ struct tcp_authopt_alg_pool *pool)
{
- WARN_ON(tfm != *this_cpu_ptr(alg->tfms));
+ WARN_ON(pool != this_cpu_ptr(alg->pool));
local_bh_enable();
}

-static struct crypto_shash *tcp_authopt_get_kdf_shash(struct tcp_authopt_key_info *key)
+static struct tcp_authopt_alg_pool *tcp_authopt_get_kdf_pool(struct tcp_authopt_key_info *key)
{
- return tcp_authopt_alg_get_tfm(key->alg);
+ return tcp_authopt_alg_get_pool(key->alg);
}

-static void tcp_authopt_put_kdf_shash(struct tcp_authopt_key_info *key,
- struct crypto_shash *tfm)
+static void tcp_authopt_put_kdf_pool(struct tcp_authopt_key_info *key,
+ struct tcp_authopt_alg_pool *pool)
{
- return tcp_authopt_alg_put_tfm(key->alg, tfm);
+ return tcp_authopt_alg_put_pool(key->alg, pool);
}

-static struct crypto_shash *tcp_authopt_get_mac_shash(struct tcp_authopt_key_info *key)
+static struct tcp_authopt_alg_pool *tcp_authopt_get_mac_pool(struct tcp_authopt_key_info *key)
{
- return tcp_authopt_alg_get_tfm(key->alg);
+ return tcp_authopt_alg_get_pool(key->alg);
}

-static void tcp_authopt_put_mac_shash(struct tcp_authopt_key_info *key,
- struct crypto_shash *tfm)
+static void tcp_authopt_put_mac_pool(struct tcp_authopt_key_info *key,
+ struct tcp_authopt_alg_pool *pool)
{
- return tcp_authopt_alg_put_tfm(key->alg, tfm);
+ return tcp_authopt_alg_put_pool(key->alg, pool);
}

/* checks that ipv4 or ipv6 addr matches. */
static bool ipvx_addr_match(struct sockaddr_storage *a1,
struct sockaddr_storage *a2)
@@ -425,5 +449,446 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr));
hlist_add_head_rcu(&key_info->node, &info->head);

return 0;
}
+
+static int tcp_authopt_get_isn(struct sock *sk,
+ struct tcp_authopt_info *info,
+ struct sk_buff *skb,
+ int input,
+ __be32 *sisn,
+ __be32 *disn)
+{
+ struct tcphdr *th = tcp_hdr(skb);
+
+ /* Special cases for SYN and SYN/ACK */
+ if (th->syn && !th->ack) {
+ *sisn = th->seq;
+ *disn = 0;
+ return 0;
+ }
+ if (th->syn && th->ack) {
+ *sisn = th->seq;
+ *disn = htonl(ntohl(th->ack_seq) - 1);
+ return 0;
+ }
+
+ if (sk->sk_state == TCP_NEW_SYN_RECV) {
+ struct tcp_request_sock *rsk = (struct tcp_request_sock *)sk;
+
+ if (WARN_ONCE(!input, "Caller passed wrong socket"))
+ return -EINVAL;
+ *sisn = htonl(rsk->rcv_isn);
+ *disn = htonl(rsk->snt_isn);
+ return 0;
+ } else if (sk->sk_state == TCP_LISTEN) {
+ /* Signature computation for non-syn packet on a listen
+ * socket is not possible because we lack the initial
+ * sequence numbers.
+ *
+ * Input segments that are not matched by any request,
+ * established or timewait socket will get here. These
+ * are not normally sent by peers.
+ *
+ * Their signature might be valid but we don't have
+ * enough state to determine that. TCP-MD5 can attempt
+ * to validate and reply with a signed RST because it
+ * doesn't care about ISNs.
+ *
+ * Reporting an error from signature code causes the
+ * packet to be discarded which is good.
+ */
+ if (WARN_ONCE(!input, "Caller passed wrong socket"))
+ return -EINVAL;
+ *sisn = 0;
+ *disn = 0;
+ return 0;
+ }
+ if (WARN_ONCE(!info, "caller did not pass tcp_authopt_info\n"))
+ return -EINVAL;
+ /* Initial sequence numbers for ESTABLISHED connections from info */
+ if (input) {
+ *sisn = htonl(info->dst_isn);
+ *disn = htonl(info->src_isn);
+ } else {
+ *sisn = htonl(info->src_isn);
+ *disn = htonl(info->dst_isn);
+ }
+ return 0;
+}
+
+/* Feed one buffer into ahash
+ * The buffer is assumed to be DMA-able
+ */
+static int crypto_ahash_buf(struct ahash_request *req, u8 *buf, uint len)
+{
+ struct scatterlist sg;
+
+ sg_init_one(&sg, buf, len);
+ ahash_request_set_crypt(req, &sg, NULL, len);
+
+ return crypto_ahash_update(req);
+}
+
+/* feed traffic key into ahash */
+static int tcp_authopt_ahash_traffic_key(struct tcp_authopt_alg_pool *pool,
+ struct sock *sk,
+ struct sk_buff *skb,
+ struct tcp_authopt_info *info,
+ bool input,
+ bool ipv6)
+{
+ struct tcphdr *th = tcp_hdr(skb);
+ int err;
+ __be32 sisn, disn;
+ __be16 digestbits = htons(crypto_ahash_digestsize(pool->tfm) * 8);
+
+ // RFC5926 section 3.1.1.1
+ err = crypto_ahash_buf(pool->req, "\x01TCP-AO", 7);
+ if (err)
+ return err;
+
+ /* Addresses from packet on input and from sk_common on output
+ * This is because on output MAC is computed before prepending IP header
+ */
+ if (input) {
+ if (ipv6)
+ err = crypto_ahash_buf(pool->req, (u8 *)&ipv6_hdr(skb)->saddr, 32);
+ else
+ err = crypto_ahash_buf(pool->req, (u8 *)&ip_hdr(skb)->saddr, 8);
+ if (err)
+ return err;
+ } else {
+ if (ipv6) {
+ err = crypto_ahash_buf(pool->req, (u8 *)&sk->sk_v6_rcv_saddr, 16);
+ if (err)
+ return err;
+ err = crypto_ahash_buf(pool->req, (u8 *)&sk->sk_v6_daddr, 16);
+ if (err)
+ return err;
+ } else {
+ err = crypto_ahash_buf(pool->req, (u8 *)&sk->sk_rcv_saddr, 4);
+ if (err)
+ return err;
+ err = crypto_ahash_buf(pool->req, (u8 *)&sk->sk_daddr, 4);
+ if (err)
+ return err;
+ }
+ }
+
+ /* TCP ports from header */
+ err = crypto_ahash_buf(pool->req, (u8 *)&th->source, 4);
+ if (err)
+ return err;
+ err = tcp_authopt_get_isn(sk, info, skb, input, &sisn, &disn);
+ if (err)
+ return err;
+ err = crypto_ahash_buf(pool->req, (u8 *)&sisn, 4);
+ if (err)
+ return err;
+ err = crypto_ahash_buf(pool->req, (u8 *)&disn, 4);
+ if (err)
+ return err;
+ err = crypto_ahash_buf(pool->req, (u8 *)&digestbits, 2);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+/* Convert a variable-length key to a 16-byte fixed-length key for AES-CMAC
+ * This is described in RFC5926 section 3.1.1.2
+ */
+static int aes_setkey_derived(struct crypto_ahash *tfm, struct ahash_request *req,
+ u8 *key, size_t keylen)
+{
+ static const u8 zeros[16] = {0};
+ struct scatterlist sg;
+ u8 derived_key[16];
+ int err;
+
+ if (WARN_ON_ONCE(crypto_ahash_digestsize(tfm) != sizeof(derived_key)))
+ return -EINVAL;
+ err = crypto_ahash_setkey(tfm, zeros, sizeof(zeros));
+ if (err)
+ return err;
+ err = crypto_ahash_init(req);
+ if (err)
+ return err;
+ sg_init_one(&sg, key, keylen);
+ ahash_request_set_crypt(req, &sg, derived_key, keylen);
+ err = crypto_ahash_digest(req);
+ if (err)
+ return err;
+ return crypto_ahash_setkey(tfm, derived_key, sizeof(derived_key));
+}
+
+static int tcp_authopt_setkey(struct tcp_authopt_alg_pool *pool, struct tcp_authopt_key_info *key)
+{
+ if (key->alg_id == TCP_AUTHOPT_ALG_AES_128_CMAC_96 && key->keylen != 16)
+ return aes_setkey_derived(pool->tfm, pool->req, key->key, key->keylen);
+ else
+ return crypto_ahash_setkey(pool->tfm, key->key, key->keylen);
+}
+
+static int tcp_authopt_get_traffic_key(struct sock *sk,
+ struct sk_buff *skb,
+ struct tcp_authopt_key_info *key,
+ struct tcp_authopt_info *info,
+ bool input,
+ bool ipv6,
+ u8 *traffic_key)
+{
+ struct tcp_authopt_alg_pool *pool;
+ int err;
+
+ pool = tcp_authopt_get_kdf_pool(key);
+ if (IS_ERR(pool))
+ return PTR_ERR(pool);
+
+ err = tcp_authopt_setkey(pool, key);
+ if (err)
+ goto out;
+ err = crypto_ahash_init(pool->req);
+ if (err)
+ goto out;
+
+ err = tcp_authopt_ahash_traffic_key(pool, sk, skb, info, input, ipv6);
+ if (err)
+ goto out;
+
+ ahash_request_set_crypt(pool->req, NULL, traffic_key, 0);
+ err = crypto_ahash_final(pool->req);
+ if (err)
+ goto out;
+
+out:
+ tcp_authopt_put_kdf_pool(key, pool);
+ return err;
+}
+
+static int crypto_ahash_buf_zero(struct ahash_request *req, int len)
+{
+ u8 zeros[TCP_AUTHOPT_MACLEN] = {0};
+ int buflen, err;
+
+ /* In practice this is always called with len exactly 12.
+ * Even on input we drop unusual signature sizes early.
+ */
+ while (len) {
+ buflen = min_t(int, len, sizeof(zeros));
+ err = crypto_ahash_buf(req, zeros, buflen);
+ if (err)
+ return err;
+ len -= buflen;
+ }
+
+ return 0;
+}
+
+static int tcp_authopt_hash_tcp4_pseudoheader(struct tcp_authopt_alg_pool *pool,
+ __be32 saddr,
+ __be32 daddr,
+ int nbytes)
+{
+ struct tcp4_pseudohdr phdr = {
+ .saddr = saddr,
+ .daddr = daddr,
+ .pad = 0,
+ .protocol = IPPROTO_TCP,
+ .len = htons(nbytes)
+ };
+ return crypto_ahash_buf(pool->req, (u8 *)&phdr, sizeof(phdr));
+}
+
+static int tcp_authopt_hash_tcp6_pseudoheader(struct tcp_authopt_alg_pool *pool,
+ struct in6_addr *saddr,
+ struct in6_addr *daddr,
+ u32 plen)
+{
+ int err;
+ __be32 buf[2];
+
+ buf[0] = htonl(plen);
+ buf[1] = htonl(IPPROTO_TCP);
+
+ err = crypto_ahash_buf(pool->req, (u8 *)saddr, sizeof(*saddr));
+ if (err)
+ return err;
+ err = crypto_ahash_buf(pool->req, (u8 *)daddr, sizeof(*daddr));
+ if (err)
+ return err;
+ return crypto_ahash_buf(pool->req, (u8 *)&buf, sizeof(buf));
+}
+
+/** Hash tcphdr options.
+ *
+ * If include_options is false then only the TCPOPT_AUTHOPT option itself is hashed
+ * Pointer to AO inside TH is passed by the caller
+ */
+static int tcp_authopt_hash_opts(struct tcp_authopt_alg_pool *pool,
+ struct tcphdr *th,
+ struct tcphdr_authopt *aoptr,
+ bool include_options)
+{
+ int err;
+ /* start of options */
+ u8 *tcp_opts = (u8 *)(th + 1);
+ /* start of the AO option */
+ u8 *aobuf = (u8 *)aoptr;
+ u8 aolen = aoptr->len;
+
+ if (WARN_ONCE(aoptr->num != TCPOPT_AUTHOPT, "Bad aoptr\n"))
+ return -EINVAL;
+
+ if (include_options) {
+ /* end of options */
+ u8 *tcp_data = ((u8 *)th) + th->doff * 4;
+
+ err = crypto_ahash_buf(pool->req, tcp_opts, aobuf - tcp_opts + 4);
+ if (err)
+ return err;
+ err = crypto_ahash_buf_zero(pool->req, aolen - 4);
+ if (err)
+ return err;
+ err = crypto_ahash_buf(pool->req, aobuf + aolen, tcp_data - (aobuf + aolen));
+ if (err)
+ return err;
+ } else {
+ err = crypto_ahash_buf(pool->req, aobuf, 4);
+ if (err)
+ return err;
+ err = crypto_ahash_buf_zero(pool->req, aolen - 4);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static int tcp_authopt_hash_packet(struct tcp_authopt_alg_pool *pool,
+ struct sock *sk,
+ struct sk_buff *skb,
+ struct tcphdr_authopt *aoptr,
+ bool input,
+ bool ipv6,
+ bool include_options,
+ u8 *macbuf)
+{
+ struct tcphdr *th = tcp_hdr(skb);
+ int err;
+
+ /* NOTE: SNE unimplemented */
+ __be32 sne = 0;
+
+ err = crypto_ahash_init(pool->req);
+ if (err)
+ return err;
+
+ err = crypto_ahash_buf(pool->req, (u8 *)&sne, 4);
+ if (err)
+ return err;
+
+ if (ipv6) {
+ struct in6_addr *saddr;
+ struct in6_addr *daddr;
+
+ if (input) {
+ saddr = &ipv6_hdr(skb)->saddr;
+ daddr = &ipv6_hdr(skb)->daddr;
+ } else {
+ saddr = &sk->sk_v6_rcv_saddr;
+ daddr = &sk->sk_v6_daddr;
+ }
+ err = tcp_authopt_hash_tcp6_pseudoheader(pool, saddr, daddr, skb->len);
+ if (err)
+ return err;
+ } else {
+ __be32 saddr;
+ __be32 daddr;
+
+ if (input) {
+ saddr = ip_hdr(skb)->saddr;
+ daddr = ip_hdr(skb)->daddr;
+ } else {
+ saddr = sk->sk_rcv_saddr;
+ daddr = sk->sk_daddr;
+ }
+ err = tcp_authopt_hash_tcp4_pseudoheader(pool, saddr, daddr, skb->len);
+ if (err)
+ return err;
+ }
+
+ // TCP header with checksum set to zero
+ {
+ struct tcphdr hashed_th = *th;
+
+ hashed_th.check = 0;
+ err = crypto_ahash_buf(pool->req, (u8 *)&hashed_th, sizeof(hashed_th));
+ if (err)
+ return err;
+ }
+
+ // TCP options
+ err = tcp_authopt_hash_opts(pool, th, aoptr, include_options);
+ if (err)
+ return err;
+
+ // Rest of SKB->data
+ err = tcp_sig_hash_skb_data(pool->req, skb, th->doff << 2);
+ if (err)
+ return err;
+
+ ahash_request_set_crypt(pool->req, NULL, macbuf, 0);
+ return crypto_ahash_final(pool->req);
+}
+
+/* __tcp_authopt_calc_mac - Compute packet MAC using key
+ *
+ * The macbuf output buffer must be large enough to fit the digestsize of the
+ * underlying transform before truncation.
+ * This means TCP_AUTHOPT_MAXMACBUF, not TCP_AUTHOPT_MACLEN
+ */
+static int __tcp_authopt_calc_mac(struct sock *sk,
+ struct sk_buff *skb,
+ struct tcphdr_authopt *aoptr,
+ struct tcp_authopt_key_info *key,
+ struct tcp_authopt_info *info,
+ bool input,
+ char *macbuf)
+{
+ struct tcp_authopt_alg_pool *mac_pool;
+ u8 traffic_key[TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN];
+ int err;
+ bool ipv6 = (sk->sk_family != AF_INET);
+
+ if (sk->sk_family != AF_INET && sk->sk_family != AF_INET6)
+ return -EINVAL;
+
+ err = tcp_authopt_get_traffic_key(sk, skb, key, info, input, ipv6, traffic_key);
+ if (err)
+ return err;
+
+ mac_pool = tcp_authopt_get_mac_pool(key);
+ if (IS_ERR(mac_pool))
+ return PTR_ERR(mac_pool);
+ err = crypto_ahash_setkey(mac_pool->tfm, traffic_key, key->alg->traffic_key_len);
+ if (err)
+ goto out;
+ err = crypto_ahash_init(mac_pool->req);
+ if (err)
+ goto out;
+
+ err = tcp_authopt_hash_packet(mac_pool,
+ sk,
+ skb,
+ aoptr,
+ input,
+ ipv6,
+ !(key->flags & TCP_AUTHOPT_KEY_EXCLUDE_OPTS),
+ macbuf);
+
+out:
+ tcp_authopt_put_mac_pool(key, mac_pool);
+ return err;
+}
--
2.25.1
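
For readers following the traffic-key derivation in the patch above, here is a
minimal userspace sketch of the byte string that
tcp_authopt_ahash_traffic_key feeds into the hash for an IPv4 connection. It
mirrors the order of the crypto_ahash_buf calls (counter and label, addresses,
ports, ISNs, output length in bits) and is not the kernel code. The names
ao_kdf_context_v4, put_be16, put_be32 and AO_KDF_CTX_V4_LEN are hypothetical,
and all arguments are assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 1 counter + 6 label + 8 addresses + 4 ports + 8 ISNs + 2 length */
#define AO_KDF_CTX_V4_LEN 29

/* Hypothetical helper: store a 16-bit value in network byte order. */
static size_t put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)v;
	return 2;
}

/* Hypothetical helper: store a 32-bit value in network byte order. */
static size_t put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
	return 4;
}

/* Build the KDF input for an IPv4 connection; returns bytes written. */
static size_t ao_kdf_context_v4(uint8_t out[AO_KDF_CTX_V4_LEN],
				uint32_t saddr, uint32_t daddr,
				uint16_t sport, uint16_t dport,
				uint32_t sisn, uint32_t disn,
				uint16_t digest_bytes)
{
	size_t n = 0;

	out[n++] = 0x01;		/* counter, the "\x01" in the patch */
	memcpy(out + n, "TCP-AO", 6);	/* label */
	n += 6;
	n += put_be32(out + n, saddr);
	n += put_be32(out + n, daddr);
	n += put_be16(out + n, sport);
	n += put_be16(out + n, dport);
	n += put_be32(out + n, sisn);
	n += put_be32(out + n, disn);
	/* output length is expressed in bits, as digestbits is in the patch */
	n += put_be16(out + n, (uint16_t)(digest_bytes * 8));

	assert(n == AO_KDF_CTX_V4_LEN);
	return n;
}
```

For a SYN segment the destination ISN is zero, matching the special case in
tcp_authopt_get_isn.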


2021-12-08 11:38:35

by Leonard Crestez

Subject: [PATCH v3 06/18] tcp: authopt: Hook into tcp core

The tcp_authopt feature exposes a minimal interface to the rest of the
TCP stack. Only a few functions are exposed and if the feature is
disabled they return neutral values, avoiding ifdefs in the rest of the
code. This approach is different from MD5.
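
The stub pattern can be sketched in plain userspace C as follows. This is only
an illustration of the idea, not code from the patch: DEMO_CONFIG_AUTHOPT,
struct demo_pkt, demo_inbound_check and demo_receive are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a received segment. */
struct demo_pkt {
	const unsigned char *ao_opt;	/* NULL when no AO option is present */
};

#ifdef DEMO_CONFIG_AUTHOPT
/* Feature enabled: the real check lives in another translation unit. */
int demo_inbound_check(const struct demo_pkt *pkt);
#else
/* Feature disabled: neutral value meaning "nothing to verify". */
static inline int demo_inbound_check(const struct demo_pkt *pkt)
{
	(void)pkt;
	return 0;
}
#endif

/* The caller contains no ifdef; it only looks at the return value. */
static int demo_receive(const struct demo_pkt *pkt)
{
	assert(pkt != NULL);
	if (demo_inbound_check(pkt) < 0)
		return -1;	/* drop the segment */
	return 1;		/* continue normal processing */
}
```

With the feature compiled out the inline stub is expected to fold away, so the
call site costs nothing.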

There are very few interactions with MD5 but tcp_parse_md5sig_option was
modified to parse AO and MD5 simultaneously. If both are present the
packet is dropped as required by RFC5925.
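
A minimal userspace sketch of that combined scan, assuming the options are
given as a raw byte buffer. It follows the shape of tcp_parse_sig_options but
is not the kernel function: demo_parse_sig_options and the OPT_* constants are
hypothetical names, and the loop bound is simplified. The option kinds (19 for
MD5, 29 for AO) come from the respective RFCs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	OPT_EOL = 0,
	OPT_NOP = 1,
	OPT_MD5SIG = 19,	/* RFC 2385 */
	OPT_AUTHOPT = 29,	/* RFC 5925 */
	OPTLEN_MD5SIG = 18,
};

/*
 * Scan TCP options once, recording MD5 and AO.
 * *md5 points at the 16-byte digest, *ao at the start of the AO option.
 * Returns -1 when both are present (segment must be dropped), else 0.
 */
static int demo_parse_sig_options(const uint8_t *opts, size_t len,
				  const uint8_t **md5, const uint8_t **ao)
{
	size_t i = 0;

	*md5 = NULL;
	*ao = NULL;

	while (i < len) {
		uint8_t kind = opts[i];
		uint8_t size;

		if (kind == OPT_EOL)
			break;
		if (kind == OPT_NOP) {
			i++;
			continue;
		}
		if (len - i < 2)
			break;		/* no room for a length byte */
		size = opts[i + 1];
		if (size < 2 || size > len - i)
			break;		/* malformed option, stop scanning */
		if (kind == OPT_MD5SIG && size == OPTLEN_MD5SIG)
			*md5 = opts + i + 2;
		if (kind == OPT_AUTHOPT)
			*ao = opts + i;
		i += size;
	}

	return (*md5 && *ao) ? -1 : 0;
}
```

Returning both pointers from one pass lets the MD5-only wrapper and the AO
code share the parse.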

Add calls into tcp authopt from send, receive and accept code.

Signed-off-by: Leonard Crestez <[email protected]>
---
include/net/tcp.h | 24 +++-
include/net/tcp_authopt.h | 135 ++++++++++++++++++
include/uapi/linux/snmp.h | 1 +
net/ipv4/proc.c | 1 +
net/ipv4/tcp_authopt.c | 293 ++++++++++++++++++++++++++++++++++++++
net/ipv4/tcp_input.c | 40 ++++--
net/ipv4/tcp_ipv4.c | 50 ++++++-
net/ipv4/tcp_minisocks.c | 12 ++
net/ipv4/tcp_output.c | 85 ++++++++++-
net/ipv6/tcp_ipv6.c | 49 ++++++-
10 files changed, 665 insertions(+), 25 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 1a0513b0ead0..eed4bbfdca78 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -422,11 +422,33 @@ int tcp_mmap(struct file *file, struct socket *sock,
struct vm_area_struct *vma);
#endif
void tcp_parse_options(const struct net *net, const struct sk_buff *skb,
struct tcp_options_received *opt_rx,
int estab, struct tcp_fastopen_cookie *foc);
-const u8 *tcp_parse_md5sig_option(const struct tcphdr *th);
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AUTHOPT)
+int tcp_parse_sig_options(const struct tcphdr *th,
+ const u8 **md5ptr,
+ const u8 **aoptr);
+#else
+static inline int tcp_parse_sig_options(const struct tcphdr *th,
+ const u8 **md5ptr,
+ const u8 **aoptr)
+{
+ *aoptr = NULL;
+ *md5ptr = NULL;
+ return 0;
+}
+#endif
+static inline const u8 *tcp_parse_md5sig_option(const struct tcphdr *th)
+{
+ const u8 *md5, *ao;
+ int ret;
+
+ ret = tcp_parse_sig_options(th, &md5, &ao);
+
+ return (md5 && !ao && !ret) ? md5 : NULL;
+}

/*
* BPF SKB-less helpers
*/
u16 tcp_v4_get_syncookie(struct sock *sk, struct iphdr *iph,
diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index ce005f7ce797..fb07394de261 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -74,28 +74,163 @@ struct tcphdr_authopt {
u8 rnextkeyid;
u8 mac[0];
};

#ifdef CONFIG_TCP_AUTHOPT
+DECLARE_STATIC_KEY_FALSE(tcp_authopt_needed_key);
+#define tcp_authopt_needed (static_branch_unlikely(&tcp_authopt_needed_key))
+
+void tcp_authopt_free(struct sock *sk, struct tcp_authopt_info *info);
void tcp_authopt_clear(struct sock *sk);
int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen);
int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key);
int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen);
+struct tcp_authopt_key_info *__tcp_authopt_select_key(
+ const struct sock *sk,
+ struct tcp_authopt_info *info,
+ const struct sock *addr_sk,
+ u8 *rnextkeyid);
+static inline struct tcp_authopt_key_info *tcp_authopt_select_key(
+ const struct sock *sk,
+ const struct sock *addr_sk,
+ struct tcp_authopt_info **info,
+ u8 *rnextkeyid)
+{
+ if (tcp_authopt_needed) {
+ *info = rcu_dereference(tcp_sk(sk)->authopt_info);
+
+ if (*info)
+ return __tcp_authopt_select_key(sk, *info, addr_sk, rnextkeyid);
+ }
+ return NULL;
+}
+int tcp_authopt_hash(
+ char *hash_location,
+ struct tcp_authopt_key_info *key,
+ struct tcp_authopt_info *info,
+ struct sock *sk, struct sk_buff *skb);
+int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct request_sock *req);
+static inline int tcp_authopt_openreq(
+ struct sock *newsk,
+ const struct sock *oldsk,
+ struct request_sock *req)
+{
+ if (!rcu_dereference(tcp_sk(oldsk)->authopt_info))
+ return 0;
+ else
+ return __tcp_authopt_openreq(newsk, oldsk, req);
+}
+void __tcp_authopt_finish_connect(struct sock *sk, struct sk_buff *skb,
+ struct tcp_authopt_info *info);
+static inline void tcp_authopt_finish_connect(struct sock *sk, struct sk_buff *skb)
+{
+ struct tcp_authopt_info *info;
+
+ if (tcp_authopt_needed) {
+ info = rcu_dereference_protected(tcp_sk(sk)->authopt_info,
+ lockdep_sock_is_held(sk));
+
+ if (info)
+ __tcp_authopt_finish_connect(sk, skb, info);
+ }
+}
+static inline void tcp_authopt_time_wait(
+ struct tcp_timewait_sock *tcptw,
+ struct tcp_sock *tp)
+{
+ if (tcp_authopt_needed) {
+ /* Transfer ownership of authopt_info to the twsk
+ * This requires no other users of the origin sock.
+ */
+ sock_owned_by_me((struct sock *)tp);
+ tcptw->tw_authopt_info = tp->authopt_info;
+ tp->authopt_info = NULL;
+ } else {
+ tcptw->tw_authopt_info = NULL;
+ }
+}
+/** tcp_authopt_inbound_check - check for valid TCP-AO signature.
+ *
+ * Return negative ERRNO on error, 0 if not present and 1 if present and valid.
+ *
+ * If the AO signature is present and valid then caller skips MD5 check.
+ */
+int __tcp_authopt_inbound_check(
+ struct sock *sk,
+ struct sk_buff *skb,
+ struct tcp_authopt_info *info,
+ const u8 *opt);
+static inline int tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb, const u8 *opt)
+{
+ if (tcp_authopt_needed) {
+ struct tcp_authopt_info *info = rcu_dereference(tcp_sk(sk)->authopt_info);
+
+ if (info)
+ return __tcp_authopt_inbound_check(sk, skb, info, opt);
+ }
+ return 0;
+}
+static inline int tcp_authopt_inbound_check_req(struct request_sock *req, struct sk_buff *skb,
+ const u8 *opt)
+{
+ if (tcp_authopt_needed) {
+ struct sock *lsk = req->rsk_listener;
+ struct tcp_authopt_info *info = rcu_dereference(tcp_sk(lsk)->authopt_info);
+
+ if (info)
+ return __tcp_authopt_inbound_check((struct sock *)req, skb, info, opt);
+ }
+ return 0;
+}
#else
static inline int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
{
return -ENOPROTOOPT;
}
static inline int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key)
{
return -ENOPROTOOPT;
}
+static inline void tcp_authopt_free(struct sock *sk, struct tcp_authopt_info *info)
+{
+}
static inline void tcp_authopt_clear(struct sock *sk)
{
}
static inline int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
{
return -ENOPROTOOPT;
}
+static inline int tcp_authopt_hash(
+ char *hash_location,
+ struct tcp_authopt_key_info *key,
+ struct tcp_authopt_info *info,
+ struct sock *sk, struct sk_buff *skb)
+{
+ return -EINVAL;
+}
+static inline int tcp_authopt_openreq(struct sock *newsk,
+ const struct sock *oldsk,
+ struct request_sock *req)
+{
+ return 0;
+}
+static inline void tcp_authopt_finish_connect(struct sock *sk, struct sk_buff *skb)
+{
+}
+static inline void tcp_authopt_time_wait(
+ struct tcp_timewait_sock *tcptw,
+ struct tcp_sock *tp)
+{
+}
+static inline int tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb, const u8 *opt)
+{
+ return 0;
+}
+static inline int tcp_authopt_inbound_check_req(struct request_sock *sk, struct sk_buff *skb,
+ const u8 *opt)
+{
+ return 0;
+}
#endif

#endif /* _LINUX_TCP_AUTHOPT_H */
diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 904909d020e2..1d96030889a1 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -290,10 +290,11 @@ enum
LINUX_MIB_TCPDUPLICATEDATAREHASH, /* TCPDuplicateDataRehash */
LINUX_MIB_TCPDSACKRECVSEGS, /* TCPDSACKRecvSegs */
LINUX_MIB_TCPDSACKIGNOREDDUBIOUS, /* TCPDSACKIgnoredDubious */
LINUX_MIB_TCPMIGRATEREQSUCCESS, /* TCPMigrateReqSuccess */
LINUX_MIB_TCPMIGRATEREQFAILURE, /* TCPMigrateReqFailure */
+ LINUX_MIB_TCPAUTHOPTFAILURE, /* TCPAuthOptFailure */
__LINUX_MIB_MAX
};

/* linux Xfrm mib definitions */
enum
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index f30273afb539..70f7a8a47045 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -295,10 +295,11 @@ static const struct snmp_mib snmp4_net_list[] = {
SNMP_MIB_ITEM("TcpDuplicateDataRehash", LINUX_MIB_TCPDUPLICATEDATAREHASH),
SNMP_MIB_ITEM("TCPDSACKRecvSegs", LINUX_MIB_TCPDSACKRECVSEGS),
SNMP_MIB_ITEM("TCPDSACKIgnoredDubious", LINUX_MIB_TCPDSACKIGNOREDDUBIOUS),
SNMP_MIB_ITEM("TCPMigrateReqSuccess", LINUX_MIB_TCPMIGRATEREQSUCCESS),
SNMP_MIB_ITEM("TCPMigrateReqFailure", LINUX_MIB_TCPMIGRATEREQFAILURE),
+ SNMP_MIB_ITEM("TCPAuthOptFailure", LINUX_MIB_TCPAUTHOPTFAILURE),
SNMP_MIB_SENTINEL
};

static void icmpmsg_put_line(struct seq_file *seq, unsigned long *vals,
unsigned short *type, int count)
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index 29524ed56733..dd9b89b1f137 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -3,10 +3,14 @@
#include <linux/kernel.h>
#include <net/tcp.h>
#include <net/tcp_authopt.h>
#include <crypto/hash.h>

+/* This is enabled when first struct tcp_authopt_info is allocated and never released */
+DEFINE_STATIC_KEY_FALSE(tcp_authopt_needed_key);
+EXPORT_SYMBOL(tcp_authopt_needed_key);
+
/* All current algorithms have a mac length of 12 but crypto API digestsize can be larger */
#define TCP_AUTHOPT_MAXMACBUF 20
#define TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN 20
#define TCP_AUTHOPT_MACLEN 12

@@ -214,10 +218,55 @@ static bool tcp_authopt_key_match_exact(struct tcp_authopt_key_info *info,
return false;

return true;
}

+static bool tcp_authopt_key_match_skb_addr(struct tcp_authopt_key_info *key,
+ struct sk_buff *skb)
+{
+ u16 keyaf = key->addr.ss_family;
+ struct iphdr *iph = (struct iphdr *)skb_network_header(skb);
+
+ if (keyaf == AF_INET && iph->version == 4) {
+ struct sockaddr_in *key_addr = (struct sockaddr_in *)&key->addr;
+
+ return iph->saddr == key_addr->sin_addr.s_addr;
+ } else if (keyaf == AF_INET6 && iph->version == 6) {
+ struct ipv6hdr *ip6h = (struct ipv6hdr *)skb_network_header(skb);
+ struct sockaddr_in6 *key_addr = (struct sockaddr_in6 *)&key->addr;
+
+ return ipv6_addr_equal(&ip6h->saddr, &key_addr->sin6_addr);
+ }
+
+ /* This actually happens with IPv4-mapped IPv6 addresses:
+ * IPv6 listen sockets will be asked to validate IPv4 packets.
+ */
+ return false;
+}
+
+static bool tcp_authopt_key_match_sk_addr(struct tcp_authopt_key_info *key,
+ const struct sock *addr_sk)
+{
+ u16 keyaf = key->addr.ss_family;
+
+ /* This probably can't happen even with ipv4-mapped-ipv6 */
+ if (keyaf != addr_sk->sk_family)
+ return false;
+
+ if (keyaf == AF_INET) {
+ struct sockaddr_in *key_addr = (struct sockaddr_in *)&key->addr;
+
+ return addr_sk->sk_daddr == key_addr->sin_addr.s_addr;
+ } else if (keyaf == AF_INET6) {
+ struct sockaddr_in6 *key_addr = (struct sockaddr_in6 *)&key->addr;
+
+ return ipv6_addr_equal(&addr_sk->sk_v6_daddr, &key_addr->sin6_addr);
+ }
+
+ return false;
+}
+
static struct tcp_authopt_key_info *tcp_authopt_key_lookup_exact(const struct sock *sk,
struct tcp_authopt_info *info,
struct tcp_authopt_key *ukey)
{
struct tcp_authopt_key_info *key_info;
@@ -227,10 +276,50 @@ static struct tcp_authopt_key_info *tcp_authopt_key_lookup_exact(const struct so
return key_info;

return NULL;
}

+static struct tcp_authopt_key_info *tcp_authopt_lookup_send(struct tcp_authopt_info *info,
+ const struct sock *addr_sk,
+ int send_id)
+{
+ struct tcp_authopt_key_info *result = NULL;
+ struct tcp_authopt_key_info *key;
+
+ hlist_for_each_entry_rcu(key, &info->head, node, 0) {
+ if (send_id >= 0 && key->send_id != send_id)
+ continue;
+ if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND)
+ if (!tcp_authopt_key_match_sk_addr(key, addr_sk))
+ continue;
+ if (result && net_ratelimit())
+ pr_warn("ambiguous tcp authentication keys configured for send\n");
+ result = key;
+ }
+
+ return result;
+}
+
+/**
+ * __tcp_authopt_select_key - select key for sending
+ *
+ * @sk: socket
+ * @info: socket's tcp_authopt_info
+ * @addr_sk: socket used for address lookup. Same as sk except for synack case
+ * @rnextkeyid: value of rnextkeyid caller should write in packet
+ *
+ * Result is protected by RCU and can't be stored, it may only be passed to
+ * tcp_authopt_hash and only under a single rcu_read_lock.
+ */
+struct tcp_authopt_key_info *__tcp_authopt_select_key(const struct sock *sk,
+ struct tcp_authopt_info *info,
+ const struct sock *addr_sk,
+ u8 *rnextkeyid)
+{
+ return tcp_authopt_lookup_send(info, addr_sk, -1);
+}
+
static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
struct tcp_authopt_info *info;

@@ -240,10 +329,12 @@ static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk

info = kzalloc(sizeof(*info), GFP_KERNEL);
if (!info)
return ERR_PTR(-ENOMEM);

+ /* Never released: */
+ static_branch_inc(&tcp_authopt_needed_key);
sk_gso_disable(sk);
INIT_HLIST_HEAD(&info->head);
rcu_assign_pointer(tp->authopt_info, info);

return info;
@@ -528,10 +619,71 @@ static int crypto_ahash_buf(struct ahash_request *req, u8 *buf, uint len)
ahash_request_set_crypt(req, &sg, NULL, len);

return crypto_ahash_update(req);
}

+static int tcp_authopt_clone_keys(struct sock *newsk,
+ const struct sock *oldsk,
+ struct tcp_authopt_info *new_info,
+ struct tcp_authopt_info *old_info)
+{
+ struct tcp_authopt_key_info *old_key;
+ struct tcp_authopt_key_info *new_key;
+
+ hlist_for_each_entry_rcu(old_key, &old_info->head, node, lockdep_sock_is_held(oldsk)) {
+ new_key = sock_kmalloc(newsk, sizeof(*new_key), GFP_ATOMIC);
+ if (!new_key)
+ return -ENOMEM;
+ memcpy(new_key, old_key, sizeof(*new_key));
+ hlist_add_head_rcu(&new_key->node, &new_info->head);
+ }
+
+ return 0;
+}
+
+/** Called to create accepted sockets.
+ *
+ * Need to copy authopt info from listen socket.
+ */
+int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct request_sock *req)
+{
+ struct tcp_authopt_info *old_info;
+ struct tcp_authopt_info *new_info;
+ int err;
+
+ old_info = rcu_dereference(tcp_sk(oldsk)->authopt_info);
+ if (!old_info)
+ return 0;
+
+ /* Clear value copied from oldsk: */
+ rcu_assign_pointer(tcp_sk(newsk)->authopt_info, NULL);
+
+ new_info = kzalloc(sizeof(*new_info), GFP_ATOMIC);
+ if (!new_info)
+ return -ENOMEM;
+
+ new_info->src_isn = tcp_rsk(req)->snt_isn;
+ new_info->dst_isn = tcp_rsk(req)->rcv_isn;
+ INIT_HLIST_HEAD(&new_info->head);
+ err = tcp_authopt_clone_keys(newsk, oldsk, new_info, old_info);
+ if (err) {
+ tcp_authopt_free(newsk, new_info);
+ return err;
+ }
+ sk_gso_disable(newsk);
+ rcu_assign_pointer(tcp_sk(newsk)->authopt_info, new_info);
+
+ return 0;
+}
+
+void __tcp_authopt_finish_connect(struct sock *sk, struct sk_buff *skb,
+ struct tcp_authopt_info *info)
+{
+ info->src_isn = ntohl(tcp_hdr(skb)->ack_seq) - 1;
+ info->dst_isn = ntohl(tcp_hdr(skb)->seq);
+}
+
/* feed traffic key into ahash */
static int tcp_authopt_ahash_traffic_key(struct tcp_authopt_alg_pool *pool,
struct sock *sk,
struct sk_buff *skb,
struct tcp_authopt_info *info,
@@ -768,10 +920,11 @@ static int tcp_authopt_hash_opts(struct tcp_authopt_alg_pool *pool,

static int tcp_authopt_hash_packet(struct tcp_authopt_alg_pool *pool,
struct sock *sk,
struct sk_buff *skb,
struct tcphdr_authopt *aoptr,
+ struct tcp_authopt_info *info,
bool input,
bool ipv6,
bool include_options,
u8 *macbuf)
{
@@ -881,14 +1034,154 @@ static int __tcp_authopt_calc_mac(struct sock *sk,

err = tcp_authopt_hash_packet(mac_pool,
sk,
skb,
aoptr,
+ info,
input,
ipv6,
!(key->flags & TCP_AUTHOPT_KEY_EXCLUDE_OPTS),
macbuf);

out:
tcp_authopt_put_mac_pool(key, mac_pool);
return err;
}
+
+/* tcp_authopt_hash - fill in the mac
+ *
+ * The key must come from tcp_authopt_select_key.
+ */
+int tcp_authopt_hash(char *hash_location,
+ struct tcp_authopt_key_info *key,
+ struct tcp_authopt_info *info,
+ struct sock *sk,
+ struct sk_buff *skb)
+{
+ /* MAC inside option is truncated to 12 bytes but crypto API needs output
+ * buffer to be large enough so we use a buffer on the stack.
+ */
+ u8 macbuf[TCP_AUTHOPT_MAXMACBUF];
+ int err;
+ struct tcphdr_authopt *aoptr = (struct tcphdr_authopt *)(hash_location - 4);
+
+ err = __tcp_authopt_calc_mac(sk, skb, aoptr, key, info, false, macbuf);
+ if (err)
+ goto fail;
+ memcpy(hash_location, macbuf, TCP_AUTHOPT_MACLEN);
+
+ return 0;
+
+fail:
+ /* If mac calculation fails and caller doesn't handle the error
+ * try to make it obvious inside the packet.
+ */
+ memset(hash_location, 0, TCP_AUTHOPT_MACLEN);
+ return err;
+}
+
+static struct tcp_authopt_key_info *tcp_authopt_lookup_recv(struct sock *sk,
+ struct sk_buff *skb,
+ struct tcp_authopt_info *info,
+ int recv_id,
+ bool *anykey)
+{
+ struct tcp_authopt_key_info *result = NULL;
+ struct tcp_authopt_key_info *key;
+
+ *anykey = false;
+ /* multiple matches will cause occasional failures */
+ hlist_for_each_entry_rcu(key, &info->head, node, 0) {
+ if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND &&
+ !tcp_authopt_key_match_skb_addr(key, skb))
+ continue;
+ *anykey = true;
+ if (recv_id >= 0 && key->recv_id != recv_id)
+ continue;
+ if (!result)
+ result = key;
+ else if (result)
+ net_warn_ratelimited("ambiguous tcp authentication keys configured for recv\n");
+ }
+
+ return result;
+}
+
+/* Show a rate-limited message for authentication fail */
+static void print_tcpao_notice(const char *msg, struct sk_buff *skb)
+{
+ struct iphdr *iph = (struct iphdr *)skb_network_header(skb);
+ struct tcphdr *th = (struct tcphdr *)skb_transport_header(skb);
+
+ if (iph->version == 4) {
+ net_info_ratelimited("%s (%pI4, %d)->(%pI4, %d)\n", msg,
+ &iph->saddr, ntohs(th->source),
+ &iph->daddr, ntohs(th->dest));
+ } else if (iph->version == 6) {
+ struct ipv6hdr *ip6h = (struct ipv6hdr *)skb_network_header(skb);
+
+ net_info_ratelimited("%s (%pI6, %d)->(%pI6, %d)\n", msg,
+ &ip6h->saddr, ntohs(th->source),
+ &ip6h->daddr, ntohs(th->dest));
+ } else {
+ WARN_ONCE(1, "%s unknown IP version\n", msg);
+ }
+}
+
+int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb,
+ struct tcp_authopt_info *info, const u8 *_opt)
+{
+ struct tcphdr_authopt *opt = (struct tcphdr_authopt *)_opt;
+ struct tcp_authopt_key_info *key;
+ bool anykey;
+ u8 macbuf[TCP_AUTHOPT_MAXMACBUF];
+ int err;
+
+ key = tcp_authopt_lookup_recv(sk, skb, info, opt ? opt->keyid : -1, &anykey);
+
+ /* nothing found or expected */
+ if (!opt && !key)
+ return 0;
+ if (!opt && key) {
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
+ print_tcpao_notice("TCP Authentication Missing", skb);
+ return -EINVAL;
+ }
+ if (opt && !anykey) {
+ /* RFC5925 Section 7.3:
+ * A TCP-AO implementation MUST allow for configuration of the behavior
+ * of segments with TCP-AO but that do not match an MKT. The initial
+ * default of this configuration SHOULD be to silently accept such
+ * connections.
+ */
+ if (info->flags & TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED) {
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
+ print_tcpao_notice("TCP Authentication Unexpected: Rejected", skb);
+ return -EINVAL;
+ }
+ print_tcpao_notice("TCP Authentication Unexpected: Accepted", skb);
+ return 0;
+ }
+ if (opt && !key) {
+ /* Keys are configured for peer but with different keyid than packet */
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
+ print_tcpao_notice("TCP Authentication Failed", skb);
+ return -EINVAL;
+ }
+
+ /* bad inbound key len */
+ if (opt->len != TCPOLEN_AUTHOPT_OUTPUT)
+ return -EINVAL;
+
+ err = __tcp_authopt_calc_mac(sk, skb, opt, key, info, true, macbuf);
+ if (err)
+ return err;
+
+ if (memcmp(macbuf, opt->mac, TCP_AUTHOPT_MACLEN)) {
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
+ print_tcpao_notice("TCP Authentication Failed", skb);
+ return -EINVAL;
+ }
+
+ return 1;
+}
+EXPORT_SYMBOL(__tcp_authopt_inbound_check);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 3658b9c3dd2b..4c9e403971fb 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -70,10 +70,11 @@
#include <linux/sysctl.h>
#include <linux/kernel.h>
#include <linux/prefetch.h>
#include <net/dst.h>
#include <net/tcp.h>
+#include <net/tcp_authopt.h>
#include <net/inet_common.h>
#include <linux/ipsec.h>
#include <asm/unaligned.h>
#include <linux/errqueue.h>
#include <trace/events/tcp.h>
@@ -4173,43 +4174,60 @@ static bool tcp_fast_parse_options(const struct net *net,
tp->rx_opt.rcv_tsecr -= tp->tsoffset;

return true;
}

-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AUTHOPT)
/*
- * Parse MD5 Signature option
+ * Parse MD5 and AO options
+ *
+ * md5ptr: pointer to content of MD5 option (16-byte hash)
+ * aoptr: pointer to start of AO option (variable length)
*/
-const u8 *tcp_parse_md5sig_option(const struct tcphdr *th)
+int tcp_parse_sig_options(const struct tcphdr *th, const u8 **md5ptr, const u8 **aoptr)
{
int length = (th->doff << 2) - sizeof(*th);
const u8 *ptr = (const u8 *)(th + 1);

+ *md5ptr = NULL;
+ *aoptr = NULL;
+
/* If not enough data remaining, we can short cut */
- while (length >= TCPOLEN_MD5SIG) {
+ while (length >= 4) {
int opcode = *ptr++;
int opsize;

switch (opcode) {
case TCPOPT_EOL:
- return NULL;
+ goto out;
case TCPOPT_NOP:
length--;
continue;
default:
opsize = *ptr++;
if (opsize < 2 || opsize > length)
- return NULL;
- if (opcode == TCPOPT_MD5SIG)
- return opsize == TCPOLEN_MD5SIG ? ptr : NULL;
+ goto out;
+ if (opcode == TCPOPT_MD5SIG && opsize == TCPOLEN_MD5SIG)
+ *md5ptr = ptr;
+ if (opcode == TCPOPT_AUTHOPT)
+ *aoptr = ptr - 2;
}
ptr += opsize - 2;
length -= opsize;
}
- return NULL;
+
+out:
+ /* RFC5925 2.2: An endpoint MUST NOT use TCP-AO for the same connection
+ * in which TCP MD5 is used. When both options appear, TCP MUST silently
+ * discard the segment.
+ */
+ if (*md5ptr && *aoptr)
+ return -EINVAL;
+
+ return 0;
}
-EXPORT_SYMBOL(tcp_parse_md5sig_option);
+EXPORT_SYMBOL(tcp_parse_sig_options);
#endif

/* Sorry, PAWS as specified is broken wrt. pure-ACKs -DaveM
*
* It is not fatal. If this ACK does _not_ change critical state (seqs, window)
@@ -5992,10 +6010,12 @@ void tcp_finish_connect(struct sock *sk, struct sk_buff *skb)
struct inet_connection_sock *icsk = inet_csk(sk);

tcp_set_state(sk, TCP_ESTABLISHED);
icsk->icsk_ack.lrcvtime = tcp_jiffies32;

+ tcp_authopt_finish_connect(sk, skb);
+
if (skb) {
icsk->icsk_af_ops->sk_rx_dst_set(sk, skb);
security_inet_conn_established(sk, skb);
sk_mark_napi_id(sk, skb);
}
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 91cad11db32e..b16f263c3121 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1405,22 +1405,20 @@ EXPORT_SYMBOL(tcp_v4_md5_hash_skb);
#endif

/* Called with rcu_read_lock() */
static bool tcp_v4_inbound_md5_hash(const struct sock *sk,
const struct sk_buff *skb,
- int dif, int sdif)
+ int dif, int sdif,
+ const u8 *hash_location)
{
#ifdef CONFIG_TCP_MD5SIG
/*
- * This gets called for each TCP segment that arrives
- * so we want to be efficient.
* We have 3 drop cases:
* o No MD5 hash and one expected.
* o MD5 hash and we're not expecting one.
* o MD5 hash and its wrong.
*/
- const __u8 *hash_location = NULL;
struct tcp_md5sig_key *hash_expected;
const struct iphdr *iph = ip_hdr(skb);
const struct tcphdr *th = tcp_hdr(skb);
const union tcp_md5_addr *addr;
unsigned char newhash[16];
@@ -1431,11 +1429,10 @@ static bool tcp_v4_inbound_md5_hash(const struct sock *sk,
*/
l3index = sdif ? dif : 0;

addr = (union tcp_md5_addr *)&iph->saddr;
hash_expected = tcp_md5_do_lookup(sk, l3index, addr, AF_INET);
- hash_location = tcp_parse_md5sig_option(th);

/* We've parsed the options - do we have a hash? */
if (!hash_expected && !hash_location)
return false;

@@ -1954,10 +1951,49 @@ static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
TCP_SKB_CB(skb)->sacked = 0;
TCP_SKB_CB(skb)->has_rxtstamp =
skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
}

+static int tcp_v4_sig_check(struct sock *sk,
+ struct sk_buff *skb,
+ int dif,
+ int sdif)
+{
+ const u8 *md5, *ao;
+ int ret;
+
+ ret = tcp_parse_sig_options(tcp_hdr(skb), &md5, &ao);
+ if (ret)
+ return ret;
+ ret = tcp_authopt_inbound_check(sk, skb, ao);
+ if (ret < 0)
+ return ret;
+ if (ret == 1)
+ return 0;
+ return tcp_v4_inbound_md5_hash(sk, skb, dif, sdif, md5);
+}
+
+static int tcp_v4_sig_check_req(struct request_sock *req,
+ struct sk_buff *skb,
+ int dif,
+ int sdif)
+{
+ struct sock *lsk = req->rsk_listener;
+ const u8 *md5, *ao;
+ int ret;
+
+ ret = tcp_parse_sig_options(tcp_hdr(skb), &md5, &ao);
+ if (ret)
+ return ret;
+ ret = tcp_authopt_inbound_check_req(req, skb, ao);
+ if (ret < 0)
+ return ret;
+ if (ret == 1)
+ return 0;
+ return tcp_v4_inbound_md5_hash(lsk, skb, dif, sdif, md5);
+}
+
/*
* From tcp_input.c
*/

int tcp_v4_rcv(struct sk_buff *skb)
@@ -2011,11 +2047,11 @@ int tcp_v4_rcv(struct sk_buff *skb)
struct request_sock *req = inet_reqsk(sk);
bool req_stolen = false;
struct sock *nsk;

sk = req->rsk_listener;
- if (unlikely(tcp_v4_inbound_md5_hash(sk, skb, dif, sdif))) {
+ if (unlikely(tcp_v4_sig_check_req(req, skb, dif, sdif))) {
sk_drops_add(sk, skb);
reqsk_put(req);
goto discard_it;
}
if (tcp_checksum_complete(skb)) {
@@ -2081,11 +2117,11 @@ int tcp_v4_rcv(struct sk_buff *skb)
}

if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
goto discard_and_relse;

- if (tcp_v4_inbound_md5_hash(sk, skb, dif, sdif))
+ if (tcp_v4_sig_check(sk, skb, dif, sdif))
goto discard_and_relse;

nf_reset_ct(skb);

if (tcp_filter(sk, skb))
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index cf913a66df17..19f749b60231 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -18,10 +18,11 @@
* Arnt Gulbrandsen, <[email protected]>
* Jorge Cwik, <[email protected]>
*/

#include <net/tcp.h>
+#include <net/tcp_authopt.h>
#include <net/xfrm.h>
#include <net/busy_poll.h>

static bool tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win)
{
@@ -300,10 +301,11 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool());
}
}
} while (0);
#endif
+ tcp_authopt_time_wait(tcptw, tcp_sk(sk));

/* Get the TIME_WAIT timeout firing. */
if (timeo < rto)
timeo = rto;

@@ -342,10 +344,19 @@ void tcp_twsk_destructor(struct sock *sk)

if (twsk->tw_md5_key)
kfree_rcu(twsk->tw_md5_key, rcu);
}
#endif
+#ifdef CONFIG_TCP_AUTHOPT
+ if (tcp_authopt_needed) {
+ struct tcp_timewait_sock *twsk = tcp_twsk(sk);
+
+ /* twsk only contains sock_common so pass NULL as sk. */
+ if (twsk->tw_authopt_info)
+ tcp_authopt_free(NULL, twsk->tw_authopt_info);
+ }
+#endif
}
EXPORT_SYMBOL_GPL(tcp_twsk_destructor);

/* Warning : This function is called without sk_listener being locked.
* Be sure to read socket fields once, as their value could change under us.
@@ -532,10 +543,11 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
#ifdef CONFIG_TCP_MD5SIG
newtp->md5sig_info = NULL; /*XXX*/
if (newtp->af_specific->md5_lookup(sk, newsk))
newtp->tcp_header_len += TCPOLEN_MD5SIG_ALIGNED;
#endif
+ tcp_authopt_openreq(newsk, sk, req);
if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len)
newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len;
newtp->rx_opt.mss_clamp = req->mss;
tcp_ecn_openreq_child(newtp, req);
newtp->fastopen_req = NULL;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 5079832af5c1..b959e8b949b6 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -37,10 +37,11 @@

#define pr_fmt(fmt) "TCP: " fmt

#include <net/tcp.h>
#include <net/mptcp.h>
+#include <net/tcp_authopt.h>

#include <linux/compiler.h>
#include <linux/gfp.h>
#include <linux/module.h>
#include <linux/static_key.h>
@@ -410,10 +411,11 @@ static inline bool tcp_urg_mode(const struct tcp_sock *tp)

#define OPTION_SACK_ADVERTISE BIT(0)
#define OPTION_TS BIT(1)
#define OPTION_MD5 BIT(2)
#define OPTION_WSCALE BIT(3)
+#define OPTION_AUTHOPT BIT(4)
#define OPTION_FAST_OPEN_COOKIE BIT(8)
#define OPTION_SMC BIT(9)
#define OPTION_MPTCP BIT(10)

static void smc_options_write(__be32 *ptr, u16 *options)
@@ -434,16 +436,22 @@ static void smc_options_write(__be32 *ptr, u16 *options)
struct tcp_out_options {
u16 options; /* bit field of OPTION_* */
u16 mss; /* 0 to disable */
u8 ws; /* window scale, 0 to disable */
u8 num_sack_blocks; /* number of SACK blocks to include */
- u8 hash_size; /* bytes in hash_location */
u8 bpf_opt_len; /* length of BPF hdr option */
+#ifdef CONFIG_TCP_AUTHOPT
+ u8 authopt_rnextkeyid; /* rnextkey */
+#endif
__u8 *hash_location; /* temporary pointer, overloaded */
__u32 tsval, tsecr; /* need to include OPTION_TS */
struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */
struct mptcp_out_options mptcp;
+#ifdef CONFIG_TCP_AUTHOPT
+ struct tcp_authopt_info *authopt_info;
+ struct tcp_authopt_key_info *authopt_key;
+#endif
};

static void mptcp_options_write(__be32 *ptr, const struct tcp_sock *tp,
struct tcp_out_options *opts)
{
@@ -616,10 +624,25 @@ static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
/* overload cookie hash location */
opts->hash_location = (__u8 *)ptr;
ptr += 4;
}

+#ifdef CONFIG_TCP_AUTHOPT
+ if (unlikely(OPTION_AUTHOPT & options)) {
+ struct tcp_authopt_key_info *key = opts->authopt_key;
+
+ WARN_ON(!key);
+ *ptr = htonl((TCPOPT_AUTHOPT << 24) |
+ (TCPOLEN_AUTHOPT_OUTPUT << 16) |
+ (key->send_id << 8) |
+ opts->authopt_rnextkeyid);
+ /* overload cookie hash location */
+ opts->hash_location = (__u8 *)(ptr + 1);
+ ptr += TCPOLEN_AUTHOPT_OUTPUT / 4;
+ }
+#endif
+
if (unlikely(opts->mss)) {
*ptr++ = htonl((TCPOPT_MSS << 24) |
(TCPOLEN_MSS << 16) |
opts->mss);
}
@@ -751,10 +774,28 @@ static void mptcp_set_option_cond(const struct request_sock *req,
}
}
}
}

+static int tcp_authopt_init_options(const struct sock *sk,
+ const struct sock *addr_sk,
+ struct tcp_out_options *opts)
+{
+#ifdef CONFIG_TCP_AUTHOPT
+ struct tcp_authopt_key_info *key;
+
+ key = tcp_authopt_select_key(sk, addr_sk, &opts->authopt_info, &opts->authopt_rnextkeyid);
+ if (key) {
+ opts->options |= OPTION_AUTHOPT;
+ opts->authopt_key = key;
+ return TCPOLEN_AUTHOPT_OUTPUT;
+ }
+#endif
+
+ return 0;
+}
+
/* Compute TCP options for SYN packets. This is not the final
* network wire format yet.
*/
static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
struct tcp_out_options *opts,
@@ -763,12 +804,15 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
struct tcp_sock *tp = tcp_sk(sk);
unsigned int remaining = MAX_TCP_OPTION_SPACE;
struct tcp_fastopen_request *fastopen = tp->fastopen_req;

*md5 = NULL;
+
+ remaining -= tcp_authopt_init_options(sk, sk, opts);
#ifdef CONFIG_TCP_MD5SIG
if (static_branch_unlikely(&tcp_md5_needed) &&
+ !(opts->options & OPTION_AUTHOPT) &&
rcu_access_pointer(tp->md5sig_info)) {
*md5 = tp->af_specific->md5_lookup(sk, sk);
if (*md5) {
opts->options |= OPTION_MD5;
remaining -= TCPOLEN_MD5SIG_ALIGNED;
@@ -847,12 +891,13 @@ static unsigned int tcp_synack_options(const struct sock *sk,
struct sk_buff *syn_skb)
{
struct inet_request_sock *ireq = inet_rsk(req);
unsigned int remaining = MAX_TCP_OPTION_SPACE;

+ remaining -= tcp_authopt_init_options(sk, req_to_sk(req), opts);
#ifdef CONFIG_TCP_MD5SIG
- if (md5) {
+ if (md5 && !(opts->options & OPTION_AUTHOPT)) {
opts->options |= OPTION_MD5;
remaining -= TCPOLEN_MD5SIG_ALIGNED;

/* We can't fit any SACK blocks in a packet with MD5 + TS
* options. There was discussion about disabling SACK
@@ -918,13 +963,15 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
unsigned int size = 0;
unsigned int eff_sacks;

opts->options = 0;

+ size += tcp_authopt_init_options(sk, sk, opts);
*md5 = NULL;
#ifdef CONFIG_TCP_MD5SIG
if (static_branch_unlikely(&tcp_md5_needed) &&
+ !(opts->options & OPTION_AUTHOPT) &&
rcu_access_pointer(tp->md5sig_info)) {
*md5 = tp->af_specific->md5_lookup(sk, sk);
if (*md5) {
opts->options |= OPTION_MD5;
size += TCPOLEN_MD5SIG_ALIGNED;
@@ -1274,10 +1321,14 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,

inet = inet_sk(sk);
tcb = TCP_SKB_CB(skb);
memset(&opts, 0, sizeof(opts));

+#ifdef CONFIG_TCP_AUTHOPT
+ /* for tcp_authopt_init_options inside tcp_syn_options or tcp_established_options */
+ rcu_read_lock();
+#endif
if (unlikely(tcb->tcp_flags & TCPHDR_SYN)) {
tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5);
} else {
tcp_options_size = tcp_established_options(sk, skb, &opts,
&md5);
@@ -1362,10 +1413,17 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
sk_gso_disable(sk);
tp->af_specific->calc_md5_hash(opts.hash_location,
md5, sk, skb);
}
#endif
+#ifdef CONFIG_TCP_AUTHOPT
+ if (opts.authopt_key) {
+ sk_gso_disable(sk);
+ tcp_authopt_hash(opts.hash_location, opts.authopt_key, opts.authopt_info, sk, skb);
+ }
+ rcu_read_unlock();
+#endif

/* BPF prog is the last one writing header option */
bpf_skops_write_hdr_opt(sk, skb, NULL, NULL, 0, &opts);

INDIRECT_CALL_INET(icsk->icsk_af_ops->send_check,
@@ -1831,12 +1889,21 @@ unsigned int tcp_current_mss(struct sock *sk)
u32 mtu = dst_mtu(dst);
if (mtu != inet_csk(sk)->icsk_pmtu_cookie)
mss_now = tcp_sync_mss(sk, mtu);
}

+#ifdef CONFIG_TCP_AUTHOPT
+ /* Even if the result is not used rcu_read_lock is required when scanning for
+ * tcp authentication keys. Otherwise lockdep will complain.
+ */
+ rcu_read_lock();
+#endif
header_len = tcp_established_options(sk, NULL, &opts, &md5) +
sizeof(struct tcphdr);
+#ifdef CONFIG_TCP_AUTHOPT
+ rcu_read_unlock();
+#endif
/* The mss_cache is sized based on tp->tcp_header_len, which assumes
* some common options. If this is an odd packet (because we have SACK
* blocks etc) then our calculated header_len will be different, and
* we have to adjust mss_now correspondingly */
if (header_len != tp->tcp_header_len) {
@@ -3551,10 +3618,14 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
}

#ifdef CONFIG_TCP_MD5SIG
rcu_read_lock();
md5 = tcp_rsk(req)->af_specific->req_md5_lookup(sk, req_to_sk(req));
+#endif
+#ifdef CONFIG_TCP_AUTHOPT
+ /* for tcp_authopt_init_options inside tcp_synack_options */
+ rcu_read_lock();
#endif
skb_set_hash(skb, tcp_rsk(req)->txhash, PKT_HASH_TYPE_L4);
/* bpf program will be interested in the tcp_flags */
TCP_SKB_CB(skb)->tcp_flags = TCPHDR_SYN | TCPHDR_ACK;
tcp_header_size = tcp_synack_options(sk, req, mss, skb, &opts, md5,
@@ -3588,10 +3659,20 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
if (md5)
tcp_rsk(req)->af_specific->calc_md5_hash(opts.hash_location,
md5, req_to_sk(req), skb);
rcu_read_unlock();
#endif
+#ifdef CONFIG_TCP_AUTHOPT
+ /* If signature fails we do nothing */
+ if (opts.authopt_key)
+ tcp_authopt_hash(opts.hash_location,
+ opts.authopt_key,
+ opts.authopt_info,
+ req_to_sk(req),
+ skb);
+ rcu_read_unlock();
+#endif

bpf_skops_write_hdr_opt((struct sock *)sk, skb, req, syn_skb,
synack_type, &opts);

skb->skb_mstamp_ns = now;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index e98fc6f12c61..3105a367d6b5 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -40,10 +40,11 @@
#include <linux/icmpv6.h>
#include <linux/random.h>
#include <linux/indirect_call_wrapper.h>

#include <net/tcp.h>
+#include <net/tcp_authopt.h>
#include <net/ndisc.h>
#include <net/inet6_hashtables.h>
#include <net/inet6_connection_sock.h>
#include <net/ipv6.h>
#include <net/transp_v6.h>
@@ -772,14 +773,14 @@ static int tcp_v6_md5_hash_skb(char *md5_hash,

#endif

static bool tcp_v6_inbound_md5_hash(const struct sock *sk,
const struct sk_buff *skb,
- int dif, int sdif)
+ int dif, int sdif,
+ const u8 *hash_location)
{
#ifdef CONFIG_TCP_MD5SIG
- const __u8 *hash_location = NULL;
struct tcp_md5sig_key *hash_expected;
const struct ipv6hdr *ip6h = ipv6_hdr(skb);
const struct tcphdr *th = tcp_hdr(skb);
int genhash, l3index;
u8 newhash[16];
@@ -788,11 +789,10 @@ static bool tcp_v6_inbound_md5_hash(const struct sock *sk,
* in an L3 domain and dif is set to the l3mdev
*/
l3index = sdif ? dif : 0;

hash_expected = tcp_v6_md5_do_lookup(sk, &ip6h->saddr, l3index);
- hash_location = tcp_parse_md5sig_option(th);

/* We've parsed the options - do we have a hash? */
if (!hash_expected && !hash_location)
return false;

@@ -1619,10 +1619,49 @@ static void tcp_v6_fill_cb(struct sk_buff *skb, const struct ipv6hdr *hdr,
TCP_SKB_CB(skb)->sacked = 0;
TCP_SKB_CB(skb)->has_rxtstamp =
skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
}

+static int tcp_v6_sig_check(struct sock *sk,
+ struct sk_buff *skb,
+ int dif,
+ int sdif)
+{
+ const u8 *md5, *ao;
+ int ret;
+
+ ret = tcp_parse_sig_options(tcp_hdr(skb), &md5, &ao);
+ if (ret)
+ return ret;
+ ret = tcp_authopt_inbound_check(sk, skb, ao);
+ if (ret < 0)
+ return ret;
+ if (ret == 1)
+ return 0;
+ return tcp_v6_inbound_md5_hash(sk, skb, dif, sdif, md5);
+}
+
+static int tcp_v6_sig_check_req(struct request_sock *req,
+ struct sk_buff *skb,
+ int dif,
+ int sdif)
+{
+ struct sock *lsk = req->rsk_listener;
+ const u8 *md5, *ao;
+ int ret;
+
+ ret = tcp_parse_sig_options(tcp_hdr(skb), &md5, &ao);
+ if (ret)
+ return ret;
+ ret = tcp_authopt_inbound_check_req(req, skb, ao);
+ if (ret < 0)
+ return ret;
+ if (ret == 1)
+ return 0;
+ return tcp_v6_inbound_md5_hash(lsk, skb, dif, sdif, md5);
+}
+
INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
{
int sdif = inet6_sdif(skb);
int dif = inet6_iif(skb);
const struct tcphdr *th;
@@ -1671,11 +1710,11 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
struct request_sock *req = inet_reqsk(sk);
bool req_stolen = false;
struct sock *nsk;

sk = req->rsk_listener;
- if (tcp_v6_inbound_md5_hash(sk, skb, dif, sdif)) {
+ if (tcp_v6_sig_check_req(req, skb, dif, sdif)) {
sk_drops_add(sk, skb);
reqsk_put(req);
goto discard_it;
}
if (tcp_checksum_complete(skb)) {
@@ -1738,11 +1777,11 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
}

if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb))
goto discard_and_relse;

- if (tcp_v6_inbound_md5_hash(sk, skb, dif, sdif))
+ if (tcp_v6_sig_check(sk, skb, dif, sdif))
goto discard_and_relse;

if (tcp_filter(sk, skb))
goto discard_and_relse;
th = (const struct tcphdr *)skb->data;
--
2.25.1


2021-12-08 11:38:43

by Leonard Crestez

Subject: [PATCH v3 07/18] tcp: authopt: Disable via sysctl by default

This is mainly intended to protect against local privilege escalations
through a rarely used feature so it is deliberately not namespaced.

Enforcement is only at the setsockopt level; this should be enough to
ensure that the tcp_authopt_needed static key never turns on.

No effort is made to handle disabling when the feature is already in
use.
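
The intended behaviour can be sketched in userspace as below. This is
only an illustration, not kernel code: feature_enabled stands in for
sysctl_tcp_authopt, and set_feature_enabled/check_feature_enabled are
hypothetical names modelling the sysctl write handler and the
setsockopt-level check respectively.

```c
#include <errno.h>

/* Hypothetical stand-in for the global sysctl_tcp_authopt flag. */
static int feature_enabled;

/* Model of the sysctl write path: 0 -> 1 is allowed, 1 -> 0 is refused. */
static int set_feature_enabled(int val)
{
	if (val != 0 && val != 1)
		return -EINVAL;	/* outside the [0, 1] range */
	if (feature_enabled && !val)
		return -EINVAL;	/* enabling is permanent */
	feature_enabled = val;
	return 0;
}

/* Model of the check done at the start of each sockopt handler. */
static int check_feature_enabled(void)
{
	return feature_enabled ? 0 : -EPERM;
}
```

With this model a sockopt call fails with -EPERM until the flag is set
to 1, and a later attempt to write 0 is rejected while leaving the
feature enabled.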

Signed-off-by: Leonard Crestez <[email protected]>
---
Documentation/networking/ip-sysctl.rst | 6 ++++
include/net/tcp_authopt.h | 1 +
net/ipv4/sysctl_net_ipv4.c | 39 ++++++++++++++++++++++++++
net/ipv4/tcp_authopt.c | 27 +++++++++++++++++-
4 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index c04431144f7a..2fa992ef4e02 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -987,10 +987,16 @@ tcp_limit_output_bytes - INTEGER
tcp_challenge_ack_limit - INTEGER
Limits number of Challenge ACK sent per second, as recommended
in RFC 5961 (Improving TCP's Robustness to Blind In-Window Attacks)
Default: 1000

+tcp_authopt - BOOLEAN
+ Enable the TCP Authentication Option (RFC5925), a replacement for TCP
+ MD5 Signatures (RFC2385).
+
+ Default: 0
+
UDP variables
=============

udp_l3mdev_accept - BOOLEAN
Enabling this option allows a "global" bound socket to work
diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index fb07394de261..4e872af7b180 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -76,10 +76,11 @@ struct tcphdr_authopt {
};

#ifdef CONFIG_TCP_AUTHOPT
DECLARE_STATIC_KEY_FALSE(tcp_authopt_needed_key);
#define tcp_authopt_needed (static_branch_unlikely(&tcp_authopt_needed_key))
+extern int sysctl_tcp_authopt;

void tcp_authopt_free(struct sock *sk, struct tcp_authopt_info *info);
void tcp_authopt_clear(struct sock *sk);
int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen);
int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key);
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 97eb54774924..07de2666314c 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -17,10 +17,11 @@
#include <net/udp.h>
#include <net/cipso_ipv4.h>
#include <net/ping.h>
#include <net/protocol.h>
#include <net/netevent.h>
+#include <net/tcp_authopt.h>

static int two = 2;
static int three __maybe_unused = 3;
static int four = 4;
static int thousand = 1000;
@@ -472,10 +473,37 @@ static int proc_fib_multipath_hash_fields(struct ctl_table *table, int write,

return ret;
}
#endif

+#ifdef CONFIG_TCP_AUTHOPT
+static int proc_tcp_authopt(struct ctl_table *ctl,
+ int write, void *buffer, size_t *lenp,
+ loff_t *ppos)
+{
+ int val = sysctl_tcp_authopt;
+ struct ctl_table tmp = {
+ .data = &val,
+ .mode = ctl->mode,
+ .maxlen = sizeof(val),
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ };
+ int err;
+
+ err = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
+ if (err)
+ return err;
+ if (sysctl_tcp_authopt && !val) {
+ net_warn_ratelimited("Enabling TCP Authentication Option is permanent\n");
+ return -EINVAL;
+ }
+ sysctl_tcp_authopt = val;
+ return 0;
+}
+#endif
+
static struct ctl_table ipv4_table[] = {
{
.procname = "tcp_max_orphans",
.data = &sysctl_tcp_max_orphans,
.maxlen = sizeof(int),
@@ -583,10 +611,21 @@ static struct ctl_table ipv4_table[] = {
.mode = 0644,
.proc_handler = proc_douintvec_minmax,
.extra1 = &sysctl_fib_sync_mem_min,
.extra2 = &sysctl_fib_sync_mem_max,
},
+#ifdef CONFIG_TCP_AUTHOPT
+ {
+ .procname = "tcp_authopt",
+ .data = &sysctl_tcp_authopt,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_tcp_authopt,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+#endif
{ }
};

static struct ctl_table ipv4_net_table[] = {
{
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index dd9b89b1f137..85de430d3326 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -3,10 +3,15 @@
#include <linux/kernel.h>
#include <net/tcp.h>
#include <net/tcp_authopt.h>
#include <crypto/hash.h>

+/* This is mainly intended to protect against local privilege escalations through
+ * a rarely used feature so it is deliberately not namespaced.
+ */
+int sysctl_tcp_authopt;
+
/* This is enabled when first struct tcp_authopt_info is allocated and never released */
DEFINE_STATIC_KEY_FALSE(tcp_authopt_needed_key);
EXPORT_SYMBOL(tcp_authopt_needed_key);

/* All current algorithms have a mac length of 12 but crypto API digestsize can be larger */
@@ -377,17 +382,30 @@ static int _copy_from_sockptr_tolerant(u8 *dst,
memset(dst + srclen, 0, dstlen - srclen);

return err;
}

+static int check_sysctl_tcp_authopt(void)
+{
+ if (!sysctl_tcp_authopt) {
+ net_warn_ratelimited("TCP Authentication Option disabled by sysctl.\n");
+ return -EPERM;
+ }
+
+ return 0;
+}
+
int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
{
struct tcp_authopt opt;
struct tcp_authopt_info *info;
int err;

sock_owned_by_me(sk);
+ err = check_sysctl_tcp_authopt();
+ if (err)
+ return err;

err = _copy_from_sockptr_tolerant((u8 *)&opt, sizeof(opt), optval, optlen);
if (err)
return err;

@@ -405,14 +423,18 @@ int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)

int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *opt)
{
struct tcp_sock *tp = tcp_sk(sk);
struct tcp_authopt_info *info;
+ int err;

+ memset(opt, 0, sizeof(*opt));
sock_owned_by_me(sk);
+ err = check_sysctl_tcp_authopt();
+ if (err)
+ return err;

- memset(opt, 0, sizeof(*opt));
info = rcu_dereference_check(tp->authopt_info, lockdep_sock_is_held(sk));
if (!info)
return -ENOENT;

opt->flags = info->flags & TCP_AUTHOPT_KNOWN_FLAGS;
@@ -475,10 +497,13 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
struct tcp_authopt_key_info *key_info, *old_key_info;
struct tcp_authopt_alg_imp *alg;
int err;

sock_owned_by_me(sk);
+ err = check_sysctl_tcp_authopt();
+ if (err)
+ return err;

err = _copy_from_sockptr_tolerant((u8 *)&opt, sizeof(opt), optval, optlen);
if (err)
return err;

--
2.25.1


2021-12-08 11:39:31

by Leonard Crestez

Subject: [PATCH v3 09/18] tcp: ipv6: Add AO signing for tcp_v6_send_response

This is a special code path for acks and resets outside of normal
connection establishment and closing.
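
For reference, the option written on this path has the layout sketched
below. This is a standalone illustration rather than the kernel code:
ao_write_header and the AO_* names are hypothetical, option kind 29 is
the value assigned by RFC5925, and the 12-byte MAC (16 bytes in total)
is assumed to match TCP_AUTHOPT_MACLEN and TCPOLEN_AUTHOPT_OUTPUT as
defined earlier in the series.

```c
#include <stdint.h>
#include <string.h>

#define AO_KIND   29			/* TCP-AO option kind (RFC5925) */
#define AO_MACLEN 12			/* assumed MAC length */
#define AO_LEN    (4 + AO_MACLEN)	/* kind + len + keyid + rnextkeyid + MAC */

/* Write the 4-byte option header and zero the MAC area.
 * The MAC itself is filled in later, once the whole segment is built.
 * Returns the number of option bytes consumed.
 */
static size_t ao_write_header(uint8_t *buf, uint8_t keyid, uint8_t rnextkeyid)
{
	buf[0] = AO_KIND;
	buf[1] = AO_LEN;
	buf[2] = keyid;
	buf[3] = rnextkeyid;
	memset(buf + 4, 0, AO_MACLEN);
	return AO_LEN;
}
```

The key lookup has to happen before the SKB is allocated because the
option length contributes to tot_len; signing happens last because the
MAC covers the finished header.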

Signed-off-by: Leonard Crestez <[email protected]>
---
net/ipv4/tcp_authopt.c | 2 ++
net/ipv6/tcp_ipv6.c | 57 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 59 insertions(+)

diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index 05234923cb9f..f1213c7db63b 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -320,10 +320,11 @@ struct tcp_authopt_key_info *__tcp_authopt_select_key(const struct sock *sk,
const struct sock *addr_sk,
u8 *rnextkeyid)
{
return tcp_authopt_lookup_send(info, addr_sk, -1);
}
+EXPORT_SYMBOL(__tcp_authopt_select_key);

static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
struct tcp_authopt_info *info;
@@ -1195,10 +1196,11 @@ int tcp_authopt_hash(char *hash_location,
* try to make it obvious inside the packet.
*/
memset(hash_location, 0, TCP_AUTHOPT_MACLEN);
return err;
}
+EXPORT_SYMBOL(tcp_authopt_hash);

static struct tcp_authopt_key_info *tcp_authopt_lookup_recv(struct sock *sk,
struct sk_buff *skb,
struct tcp_authopt_info *info,
int recv_id,
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 3105a367d6b5..cd8544d08a36 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -886,10 +886,48 @@ const struct tcp_request_sock_ops tcp_request_sock_ipv6_ops = {
.init_seq = tcp_v6_init_seq,
.init_ts_off = tcp_v6_init_ts_off,
.send_synack = tcp_v6_send_synack,
};

+#ifdef CONFIG_TCP_AUTHOPT
+static int tcp_v6_send_response_init_authopt(const struct sock *sk,
+ struct tcp_authopt_info **info,
+ struct tcp_authopt_key_info **key,
+ u8 *rnextkeyid)
+{
+ /* Key lookup before SKB allocation */
+ if (!(tcp_authopt_needed && sk))
+ return 0;
+ if (sk->sk_state == TCP_TIME_WAIT)
+ *info = tcp_twsk(sk)->tw_authopt_info;
+ else
+ *info = rcu_dereference(tcp_sk(sk)->authopt_info);
+ if (!*info)
+ return 0;
+ *key = __tcp_authopt_select_key(sk, *info, sk, rnextkeyid);
+ if (*key)
+ return TCPOLEN_AUTHOPT_OUTPUT;
+ return 0;
+}
+
+static void tcp_v6_send_response_sign_authopt(const struct sock *sk,
+ struct tcp_authopt_info *info,
+ struct tcp_authopt_key_info *key,
+ struct sk_buff *skb,
+ struct tcphdr_authopt *ptr,
+ u8 rnextkeyid)
+{
+ if (!(tcp_authopt_needed && key))
+ return;
+ ptr->num = TCPOPT_AUTHOPT;
+ ptr->len = TCPOLEN_AUTHOPT_OUTPUT;
+ ptr->keyid = key->send_id;
+ ptr->rnextkeyid = rnextkeyid;
+ tcp_authopt_hash(ptr->mac, key, info, (struct sock *)sk, skb);
+}
+#endif
+
static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32 seq,
u32 ack, u32 win, u32 tsval, u32 tsecr,
int oif, struct tcp_md5sig_key *key, int rst,
u8 tclass, __be32 label, u32 priority)
{
@@ -901,13 +939,28 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
struct sock *ctl_sk = net->ipv6.tcp_sk;
unsigned int tot_len = sizeof(struct tcphdr);
__be32 mrst = 0, *topt;
struct dst_entry *dst;
__u32 mark = 0;
+#ifdef CONFIG_TCP_AUTHOPT
+ struct tcp_authopt_info *aoinfo;
+ struct tcp_authopt_key_info *aokey;
+ u8 aornextkeyid;
+ int aolen;
+#endif

if (tsecr)
tot_len += TCPOLEN_TSTAMP_ALIGNED;
+#ifdef CONFIG_TCP_AUTHOPT
+ /* Key lookup before SKB allocation */
+ aolen = tcp_v6_send_response_init_authopt(sk, &aoinfo, &aokey, &aornextkeyid);
+ if (aolen) {
+ tot_len += aolen;
+ /* Don't use MD5 */
+ key = NULL;
+ }
+#endif
#ifdef CONFIG_TCP_MD5SIG
if (key)
tot_len += TCPOLEN_MD5SIG_ALIGNED;
#endif

@@ -960,10 +1013,14 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
tcp_v6_md5_hash_hdr((__u8 *)topt, key,
&ipv6_hdr(skb)->saddr,
&ipv6_hdr(skb)->daddr, t1);
}
#endif
+#ifdef CONFIG_TCP_AUTHOPT
+ tcp_v6_send_response_sign_authopt(sk, aoinfo, aokey, buff,
+ (struct tcphdr_authopt *)topt, aornextkeyid);
+#endif

memset(&fl6, 0, sizeof(fl6));
fl6.daddr = ipv6_hdr(skb)->saddr;
fl6.saddr = ipv6_hdr(skb)->daddr;
fl6.flowlabel = label;
--
2.25.1


2021-12-08 11:39:30

by Leonard Crestez

Subject: [PATCH v3 08/18] tcp: authopt: Implement Sequence Number Extension

Add a compute_sne function which finds the value of SNE for a certain
SEQ given an already known "recent" SNE/SEQ pair. This is implemented
using the standard tcp before/after helpers and will work for SEQ values
that are within 2^31 of the SEQ for which we know the SNE.

For updating we advance the value for rcv_sne at the same time as
rcv_nxt and for snd_sne at the same time as snd_nxt. We could track
other values (for example snd_una) but this is good enough and works
very easily for timewait sockets.

This implementation is different from RFC suggestions and doesn't
require additional flags. It does pass tests from this draft:
https://datatracker.ietf.org/doc/draft-touch-sne/
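
The computation can be tried in userspace with the sketch below. It
mirrors compute_sne from this patch; seq_before is a local stand-in
assumed to behave like the kernel's before() helper (a signed comparison
of the 32-bit difference).

```c
#include <stdint.h>

/* Wrap-safe "seq1 is before seq2", modelled on the kernel's before(). */
static int seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* Given a known (sne, prev_seq) pair, return the SNE that belongs to seq. */
static uint32_t compute_sne(uint32_t sne, uint32_t prev_seq, uint32_t seq)
{
	if (seq_before(seq, prev_seq)) {
		/* seq is older yet numerically larger: it predates a wrap */
		if (seq > prev_seq)
			--sne;
	} else {
		/* seq is newer yet numerically smaller: it crossed a wrap */
		if (seq < prev_seq)
			++sne;
	}
	return sne;
}
```

For example, with prev_seq = 0xFFFFFFF0 and sne = 0, a segment at
seq = 0x10 lies after the wrap and gets sne = 1, while going the other
way from (sne = 1, prev_seq = 0x10) to seq = 0xFFFFFFF0 gives sne = 0.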

Signed-off-by: Leonard Crestez <[email protected]>
---
include/net/tcp_authopt.h | 34 ++++++++++++++
net/ipv4/tcp_authopt.c | 98 ++++++++++++++++++++++++++++++++++++++-
net/ipv4/tcp_input.c | 1 +
net/ipv4/tcp_output.c | 1 +
4 files changed, 132 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index 4e872af7b180..d5d344d599f7 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -62,10 +62,14 @@ struct tcp_authopt_info {
u32 flags;
/** @src_isn: Local Initial Sequence Number */
u32 src_isn;
/** @dst_isn: Remote Initial Sequence Number */
u32 dst_isn;
+ /** @rcv_sne: Recv-side Sequence Number Extension tracking tcp_sock.rcv_nxt */
+ u32 rcv_sne;
+ /** @snd_sne: Send-side Sequence Number Extension tracking tcp_sock.snd_nxt */
+ u32 snd_sne;
};

/* TCP authopt as found in header */
struct tcphdr_authopt {
u8 num;
@@ -180,10 +184,34 @@ static inline int tcp_authopt_inbound_check_req(struct request_sock *req, struct
if (info)
return __tcp_authopt_inbound_check((struct sock *)req, skb, info, opt);
}
return 0;
}
+void __tcp_authopt_update_rcv_sne(struct tcp_sock *tp, struct tcp_authopt_info *info, u32 seq);
+static inline void tcp_authopt_update_rcv_sne(struct tcp_sock *tp, u32 seq)
+{
+ struct tcp_authopt_info *info;
+
+ if (tcp_authopt_needed) {
+ info = rcu_dereference_protected(tp->authopt_info,
+ lockdep_sock_is_held((struct sock *)tp));
+ if (info)
+ __tcp_authopt_update_rcv_sne(tp, info, seq);
+ }
+}
+void __tcp_authopt_update_snd_sne(struct tcp_sock *tp, struct tcp_authopt_info *info, u32 seq);
+static inline void tcp_authopt_update_snd_sne(struct tcp_sock *tp, u32 seq)
+{
+ struct tcp_authopt_info *info;
+
+ if (tcp_authopt_needed) {
+ info = rcu_dereference_protected(tp->authopt_info,
+ lockdep_sock_is_held((struct sock *)tp));
+ if (info)
+ __tcp_authopt_update_snd_sne(tp, info, seq);
+ }
+}
#else
static inline int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
{
return -ENOPROTOOPT;
}
@@ -230,8 +258,14 @@ static inline int tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb
static inline int tcp_authopt_inbound_check_req(struct request_sock *sk, struct sk_buff *skb,
const u8 *opt)
{
return 0;
}
+static inline void tcp_authopt_update_rcv_sne(struct tcp_sock *tp, u32 seq)
+{
+}
+static inline void tcp_authopt_update_snd_sne(struct tcp_sock *tp, u32 seq)
+{
+}
#endif

#endif /* _LINUX_TCP_AUTHOPT_H */
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index 85de430d3326..05234923cb9f 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -631,10 +631,97 @@ static int tcp_authopt_get_isn(struct sock *sk,
*disn = htonl(info->dst_isn);
}
return 0;
}

+/* compute_sne - Calculate Sequence Number Extension
+ *
+ * Given old upper/lower 32bit values and a new lower 32bit value, determine
+ * the new value of the upper 32 bits. The new sequence number can be up to
+ * 2^31 before or after prev_seq but TCP window scaling should limit this further.
+ *
+ * For correct accounting the stored SNE value should be only updated together
+ * with the SEQ.
+ */
+static u32 compute_sne(u32 sne, u32 prev_seq, u32 seq)
+{
+ if (before(seq, prev_seq)) {
+ if (seq > prev_seq)
+ --sne;
+ } else {
+ if (seq < prev_seq)
+ ++sne;
+ }
+
+ return sne;
+}
+
+/* Update rcv_sne, must be called immediately before rcv_nxt update */
+void __tcp_authopt_update_rcv_sne(struct tcp_sock *tp,
+ struct tcp_authopt_info *info, u32 seq)
+{
+ info->rcv_sne = compute_sne(info->rcv_sne, tp->rcv_nxt, seq);
+}
+
+/* Update snd_sne, must be called immediately before snd_nxt update */
+void __tcp_authopt_update_snd_sne(struct tcp_sock *tp,
+ struct tcp_authopt_info *info, u32 seq)
+{
+ info->snd_sne = compute_sne(info->snd_sne, tp->snd_nxt, seq);
+}
+
+/* Compute SNE for a specific packet (by seq). */
+static int compute_packet_sne(struct sock *sk, struct tcp_authopt_info *info,
+ u32 seq, bool input, __be32 *sne)
+{
+ u32 rcv_nxt, snd_nxt;
+
+ // For TCP_NEW_SYN_RECV we have no tcp_authopt_info but tcp_request_sock holds ISN.
+ if (sk->sk_state == TCP_NEW_SYN_RECV) {
+ struct tcp_request_sock *rsk = tcp_rsk((struct request_sock *)sk);
+
+ if (input)
+ *sne = htonl(compute_sne(0, rsk->rcv_isn, seq));
+ else
+ *sne = htonl(compute_sne(0, rsk->snt_isn, seq));
+ return 0;
+ }
+
+ /* TCP_LISTEN only receives SYN */
+ if (sk->sk_state == TCP_LISTEN && input)
+ return 0;
+
+ /* TCP_SYN_SENT only sends SYN and receives SYN/ACK
+ * For the input case rcv_nxt is initialized after the packet is
+ * validated so tcp_sk(sk)->rcv_nxt is not initialized.
+ */
+ if (sk->sk_state == TCP_SYN_SENT)
+ return 0;
+
+ if (sk->sk_state == TCP_TIME_WAIT) {
+ rcv_nxt = tcp_twsk(sk)->tw_rcv_nxt;
+ snd_nxt = tcp_twsk(sk)->tw_snd_nxt;
+ } else {
+ if (WARN_ONCE(!sk_fullsock(sk),
+ "unexpected minisock sk=%p state=%d", sk,
+ sk->sk_state))
+ return -EINVAL;
+ rcv_nxt = tcp_sk(sk)->rcv_nxt;
+ snd_nxt = tcp_sk(sk)->snd_nxt;
+ }
+
+ if (WARN_ONCE(!info, "unexpected missing info for sk=%p sk_state=%d", sk, sk->sk_state))
+ return -EINVAL;
+
+ if (input)
+ *sne = htonl(compute_sne(info->rcv_sne, rcv_nxt, seq));
+ else
+ *sne = htonl(compute_sne(info->snd_sne, snd_nxt, seq));
+
+ return 0;
+}
+
/* Feed one buffer into ahash
* The buffer is assumed to be DMA-able
*/
static int crypto_ahash_buf(struct ahash_request *req, u8 *buf, uint len)
{
@@ -686,10 +773,13 @@ int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct r
if (!new_info)
return -ENOMEM;

new_info->src_isn = tcp_rsk(req)->snt_isn;
new_info->dst_isn = tcp_rsk(req)->rcv_isn;
+ /* Caller is tcp_create_openreq_child and already initializes snd_nxt/rcv_nxt */
+ new_info->snd_sne = compute_sne(0, new_info->src_isn, tcp_sk(newsk)->snd_nxt);
+ new_info->rcv_sne = compute_sne(0, new_info->dst_isn, tcp_sk(newsk)->rcv_nxt);
INIT_HLIST_HEAD(&new_info->head);
err = tcp_authopt_clone_keys(newsk, oldsk, new_info, old_info);
if (err) {
tcp_authopt_free(newsk, new_info);
return err;
@@ -703,10 +793,12 @@ int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct r
void __tcp_authopt_finish_connect(struct sock *sk, struct sk_buff *skb,
struct tcp_authopt_info *info)
{
info->src_isn = ntohl(tcp_hdr(skb)->ack_seq) - 1;
info->dst_isn = ntohl(tcp_hdr(skb)->seq);
+ info->snd_sne = compute_sne(0, info->src_isn, tcp_sk(sk)->snd_nxt);
+ info->rcv_sne = compute_sne(0, info->dst_isn, tcp_sk(sk)->rcv_nxt);
}

/* feed traffic key into ahash */
static int tcp_authopt_ahash_traffic_key(struct tcp_authopt_alg_pool *pool,
struct sock *sk,
@@ -952,14 +1044,16 @@ static int tcp_authopt_hash_packet(struct tcp_authopt_alg_pool *pool,
bool ipv6,
bool include_options,
u8 *macbuf)
{
struct tcphdr *th = tcp_hdr(skb);
+ __be32 sne = 0;
int err;

- /* NOTE: SNE unimplemented */
- __be32 sne = 0;
+ err = compute_packet_sne(sk, info, ntohl(th->seq), input, &sne);
+ if (err)
+ return err;

err = crypto_ahash_init(pool->req);
if (err)
return err;

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4c9e403971fb..bc0a90c72391 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3517,10 +3517,11 @@ static void tcp_snd_una_update(struct tcp_sock *tp, u32 ack)
static void tcp_rcv_nxt_update(struct tcp_sock *tp, u32 seq)
{
u32 delta = seq - tp->rcv_nxt;

sock_owned_by_me((struct sock *)tp);
+ tcp_authopt_update_rcv_sne(tp, seq);
tp->bytes_received += delta;
WRITE_ONCE(tp->rcv_nxt, seq);
}

/* Update our send window.
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b959e8b949b6..6a6024a0b9e9 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -67,10 +67,11 @@ static void tcp_event_new_data_sent(struct sock *sk, struct sk_buff *skb)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
unsigned int prior_packets = tp->packets_out;

+ tcp_authopt_update_snd_sne(tp, TCP_SKB_CB(skb)->end_seq);
WRITE_ONCE(tp->snd_nxt, TCP_SKB_CB(skb)->end_seq);

__skb_unlink(skb, &sk->sk_write_queue);
tcp_rbtree_insert(&sk->tcp_rtx_queue, skb);

--
2.25.1
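
Note on the SNE rollover logic in compute_sne() above: the following is a
minimal userspace sketch of the same comparison, assuming a local stand-in
(seq_before) for the kernel's before() helper. The names sne_next and
sne_examples are hypothetical and exist only for illustration; the worked
cases show when the upper 32 bits move forward, move backward, or stay put.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's before(): true if seq1 precedes seq2
 * in 32-bit modular sequence space.
 */
static int seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/* Same decision logic as compute_sne() in the patch (hypothetical name). */
static uint32_t sne_next(uint32_t sne, uint32_t prev_seq, uint32_t seq)
{
	if (seq_before(seq, prev_seq)) {
		/* seq is older than prev_seq; a numerically larger value
		 * means it belongs to the previous 2^32 epoch.
		 */
		if (seq > prev_seq)
			--sne;
	} else {
		/* seq is newer than prev_seq; a numerically smaller value
		 * means the lower 32 bits wrapped around.
		 */
		if (seq < prev_seq)
			++sne;
	}
	return sne;
}

/* Worked cases, one per branch. */
static void sne_examples(void)
{
	/* Forward inside the same epoch: unchanged. */
	assert(sne_next(5, 1000, 2000) == 5);
	/* Forward across the wrap (0xfffffff0 -> 0x10): incremented. */
	assert(sne_next(5, 0xfffffff0u, 0x10u) == 6);
	/* Late packet from before the wrap: decremented. */
	assert(sne_next(6, 0x10u, 0xfffffff0u) == 5);
	/* Backward inside the same epoch: unchanged. */
	assert(sne_next(5, 2000, 1000) == 5);
}
```

The per-packet path in compute_packet_sne() uses the same function with the
stored SNE and rcv_nxt/snd_nxt as the reference point, which is why the
stored value should only be updated together with the sequence number.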


2021-12-08 11:39:49

by Leonard Crestez

Subject: [PATCH v3 10/18] tcp: authopt: Add support for signing skb-less replies

This is required because tcp ipv4 sometimes sends replies without
allocating a full skb that can be signed by tcp authopt.

Handle this with additional code in tcp authopt.

Signed-off-by: Leonard Crestez <[email protected]>
---
include/net/tcp_authopt.h | 7 ++
net/ipv4/tcp_authopt.c | 144 ++++++++++++++++++++++++++++++++++++++
2 files changed, 151 insertions(+)

diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index d5d344d599f7..411e7a0bdd43 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -111,10 +111,17 @@ static inline struct tcp_authopt_key_info *tcp_authopt_select_key(
int tcp_authopt_hash(
char *hash_location,
struct tcp_authopt_key_info *key,
struct tcp_authopt_info *info,
struct sock *sk, struct sk_buff *skb);
+int tcp_v4_authopt_hash_reply(
+ char *hash_location,
+ struct tcp_authopt_info *info,
+ struct tcp_authopt_key_info *key,
+ __be32 saddr,
+ __be32 daddr,
+ struct tcphdr *th);
int __tcp_authopt_openreq(struct sock *newsk, const struct sock *oldsk, struct request_sock *req);
static inline int tcp_authopt_openreq(
struct sock *newsk,
const struct sock *oldsk,
struct request_sock *req)
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index f1213c7db63b..a4f3eac20b29 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -935,10 +935,72 @@ static int tcp_authopt_get_traffic_key(struct sock *sk,
out:
tcp_authopt_put_kdf_pool(key, pool);
return err;
}

+struct tcp_v4_authopt_context_data {
+ __be32 saddr;
+ __be32 daddr;
+ __be16 sport;
+ __be16 dport;
+ __be32 sisn;
+ __be32 disn;
+ __be16 digestbits;
+} __packed;
+
+static int tcp_v4_authopt_get_traffic_key_noskb(struct tcp_authopt_key_info *key,
+ __be32 saddr,
+ __be32 daddr,
+ __be16 sport,
+ __be16 dport,
+ __be32 sisn,
+ __be32 disn,
+ u8 *traffic_key)
+{
+ int err;
+ struct tcp_authopt_alg_pool *pool;
+ struct tcp_v4_authopt_context_data data;
+
+ BUILD_BUG_ON(sizeof(data) != 22);
+
+ pool = tcp_authopt_get_kdf_pool(key);
+ if (IS_ERR(pool))
+ return PTR_ERR(pool);
+
+ err = tcp_authopt_setkey(pool, key);
+ if (err)
+ goto out;
+ err = crypto_ahash_init(pool->req);
+ if (err)
+ goto out;
+
+ // RFC5926 section 3.1.1.1
+ // Separate to keep alignment semi-sane
+ err = crypto_ahash_buf(pool->req, "\x01TCP-AO", 7);
+ if (err)
+ goto out;
+ data.saddr = saddr;
+ data.daddr = daddr;
+ data.sport = sport;
+ data.dport = dport;
+ data.sisn = sisn;
+ data.disn = disn;
+ data.digestbits = htons(crypto_ahash_digestsize(pool->tfm) * 8);
+
+ err = crypto_ahash_buf(pool->req, (u8 *)&data, sizeof(data));
+ if (err)
+ goto out;
+ ahash_request_set_crypt(pool->req, NULL, traffic_key, 0);
+ err = crypto_ahash_final(pool->req);
+ if (err)
+ goto out;
+
+out:
+ tcp_authopt_put_kdf_pool(key, pool);
+ return err;
+}
+
static int crypto_ahash_buf_zero(struct ahash_request *req, int len)
{
u8 zeros[TCP_AUTHOPT_MACLEN] = {0};
int buflen, err;

@@ -1198,10 +1260,92 @@ int tcp_authopt_hash(char *hash_location,
memset(hash_location, 0, TCP_AUTHOPT_MACLEN);
return err;
}
EXPORT_SYMBOL(tcp_authopt_hash);

+/**
+ * tcp_v4_authopt_hash_reply - Hash tcp+ipv4 header without SKB
+ *
+ * @hash_location: output buffer
+ * @info: sending socket's tcp_authopt_info
+ * @key: signing key, from tcp_authopt_select_key.
+ * @saddr: source address
+ * @daddr: destination address
+ * @th: Pointer to TCP header and options
+ */
+int tcp_v4_authopt_hash_reply(char *hash_location,
+ struct tcp_authopt_info *info,
+ struct tcp_authopt_key_info *key,
+ __be32 saddr,
+ __be32 daddr,
+ struct tcphdr *th)
+{
+ struct tcp_authopt_alg_pool *pool;
+ u8 macbuf[TCP_AUTHOPT_MAXMACBUF];
+ u8 traffic_key[TCP_AUTHOPT_MAX_TRAFFIC_KEY_LEN];
+ __be32 sne = 0;
+ int err;
+
+ /* Call special code path for computing traffic key without skb
+ * This can be called from tcp_v4_reqsk_send_ack so caching would be
+ * difficult here.
+ */
+ err = tcp_v4_authopt_get_traffic_key_noskb(key, saddr, daddr,
+ th->source, th->dest,
+ htonl(info->src_isn), htonl(info->dst_isn),
+ traffic_key);
+ if (err)
+ goto out_err_traffic_key;
+
+ /* Init mac ahash */
+ pool = tcp_authopt_get_mac_pool(key);
+ if (IS_ERR(pool))
+ return PTR_ERR(pool);
+ err = crypto_ahash_setkey(pool->tfm, traffic_key, key->alg->traffic_key_len);
+ if (err)
+ goto out_err;
+ err = crypto_ahash_init(pool->req);
+ if (err)
+ goto out_err;
+
+ err = crypto_ahash_buf(pool->req, (u8 *)&sne, 4);
+ if (err)
+ goto out_err;
+
+ err = tcp_authopt_hash_tcp4_pseudoheader(pool, saddr, daddr, th->doff * 4);
+ if (err)
+ goto out_err;
+
+ // TCP header with checksum set to zero. Caller ensures this.
+ if (WARN_ON_ONCE(th->check != 0))
+ goto out_err;
+ err = crypto_ahash_buf(pool->req, (u8 *)th, sizeof(*th));
+ if (err)
+ goto out_err;
+
+ // TCP options
+ err = tcp_authopt_hash_opts(pool, th, (struct tcphdr_authopt *)(hash_location - 4),
+ !(key->flags & TCP_AUTHOPT_KEY_EXCLUDE_OPTS));
+ if (err)
+ goto out_err;
+
+ ahash_request_set_crypt(pool->req, NULL, macbuf, 0);
+ err = crypto_ahash_final(pool->req);
+ if (err)
+ goto out_err;
+ memcpy(hash_location, macbuf, TCP_AUTHOPT_MACLEN);
+
+ tcp_authopt_put_mac_pool(key, pool);
+ return 0;
+
+out_err:
+ tcp_authopt_put_mac_pool(key, pool);
+out_err_traffic_key:
+ memset(hash_location, 0, TCP_AUTHOPT_MACLEN);
+ return err;
+}
+
static struct tcp_authopt_key_info *tcp_authopt_lookup_recv(struct sock *sk,
struct sk_buff *skb,
struct tcp_authopt_info *info,
int recv_id,
bool *anykey)
--
2.25.1
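
Note on the traffic key input built by tcp_v4_authopt_get_traffic_key_noskb()
above: the sketch below lays out the same bytes in userspace, assuming the
KDF input is the 7-byte "\x01TCP-AO" prefix followed by the 22-byte packed
IPv4 context, as the patch feeds them. The helper names (put_be16, put_be32,
build_kdf_input_v4) are hypothetical, values are taken in host order and
written big-endian by hand, and the 160-bit output length in the example is
only an illustrative value for a 20-byte digest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KDF_LABEL_LEN 7   /* "\x01TCP-AO" as fed by the patch */
#define KDF_CTX_LEN   22  /* matches BUILD_BUG_ON(sizeof(data) != 22) */
#define KDF_INPUT_LEN (KDF_LABEL_LEN + KDF_CTX_LEN)

/* Write a 16-bit value in network byte order, return bytes written. */
static size_t put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
	return 2;
}

/* Write a 32-bit value in network byte order, return bytes written. */
static size_t put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)(v & 0xff);
	return 4;
}

/* Hypothetical helper: serialize the KDF input for an IPv4 connection.
 * Field order follows struct tcp_v4_authopt_context_data in the patch.
 */
static size_t build_kdf_input_v4(uint8_t out[KDF_INPUT_LEN],
				 uint32_t saddr, uint32_t daddr,
				 uint16_t sport, uint16_t dport,
				 uint32_t sisn, uint32_t disn,
				 uint16_t digest_bits)
{
	size_t n = 0;

	memcpy(out, "\x01TCP-AO", KDF_LABEL_LEN);
	n += KDF_LABEL_LEN;
	n += put_be32(out + n, saddr);
	n += put_be32(out + n, daddr);
	n += put_be16(out + n, sport);
	n += put_be16(out + n, dport);
	n += put_be32(out + n, sisn);
	n += put_be32(out + n, disn);
	n += put_be16(out + n, digest_bits);
	return n;
}

/* Example: 10.0.0.1:179 -> 10.0.0.2:40000 with a 160-bit output length. */
static void kdf_input_example(void)
{
	uint8_t buf[KDF_INPUT_LEN];
	size_t n = build_kdf_input_v4(buf, 0x0a000001u, 0x0a000002u,
				      179, 40000, 0x11223344u, 0x55667788u, 160);

	assert(n == KDF_INPUT_LEN);
	assert(buf[0] == 0x01 && buf[1] == 'T' && buf[6] == 'O');
	assert(buf[7] == 0x0a && buf[10] == 0x01);  /* saddr */
	assert(buf[27] == 0x00 && buf[28] == 160);  /* output length in bits */
}
```

Hashing the prefix and the context as two separate buffers, as the patch
does, gives the same result as hashing this single 29-byte buffer.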


2021-12-08 11:39:49

by Leonard Crestez

Subject: [PATCH v3 11/18] tcp: ipv4: Add AO signing for skb-less replies

The code in tcp_v4_send_ack and tcp_v4_send_reset does not allocate a
full skb so tcp-authopt needs special handling to sign these replies.

Signed-off-by: Leonard Crestez <[email protected]>
---
net/ipv4/tcp_ipv4.c | 82 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 79 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index b16f263c3121..be531e2f52ae 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -644,10 +644,50 @@ void tcp_v4_send_check(struct sock *sk, struct sk_buff *skb)

__tcp_v4_send_check(skb, inet->inet_saddr, inet->inet_daddr);
}
EXPORT_SYMBOL(tcp_v4_send_check);

+#ifdef CONFIG_TCP_AUTHOPT
+/** tcp_v4_authopt_handle_reply - Insert TCPOPT_AUTHOPT if required
+ *
+ * returns number of bytes (always aligned to 4) or zero
+ */
+static int tcp_v4_authopt_handle_reply(const struct sock *sk,
+ struct sk_buff *skb,
+ __be32 *optptr,
+ struct tcphdr *th)
+{
+ struct tcp_authopt_info *info;
+ struct tcp_authopt_key_info *key_info;
+ u8 rnextkeyid;
+
+ if (sk->sk_state == TCP_TIME_WAIT)
+ info = tcp_twsk(sk)->tw_authopt_info;
+ else
+ info = tcp_sk(sk)->authopt_info;
+ if (!info)
+ return 0;
+ key_info = __tcp_authopt_select_key(sk, info, sk, &rnextkeyid);
+ if (!key_info)
+ return 0;
+ *optptr = htonl((TCPOPT_AUTHOPT << 24) |
+ (TCPOLEN_AUTHOPT_OUTPUT << 16) |
+ (key_info->send_id << 8) |
+ (rnextkeyid));
+ /* must update doff before signature computation */
+ th->doff += TCPOLEN_AUTHOPT_OUTPUT / 4;
+ tcp_v4_authopt_hash_reply((char *)(optptr + 1),
+ info,
+ key_info,
+ ip_hdr(skb)->daddr,
+ ip_hdr(skb)->saddr,
+ th);
+
+ return TCPOLEN_AUTHOPT_OUTPUT;
+}
+#endif
+
/*
* This routine will send an RST to the other tcp.
*
* Someone asks: why I NEVER use socket parameters (TOS, TTL etc.)
* for reset.
@@ -659,10 +699,12 @@ EXPORT_SYMBOL(tcp_v4_send_check);
* Exception: precedence violation. We do not implement it in any case.
*/

#ifdef CONFIG_TCP_MD5SIG
#define OPTION_BYTES TCPOLEN_MD5SIG_ALIGNED
+#elif defined(CONFIG_TCP_AUTHOPT)
+#define OPTION_BYTES TCPOLEN_AUTHOPT_OUTPUT
#else
#define OPTION_BYTES sizeof(__be32)
#endif

static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
@@ -712,12 +754,29 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
memset(&arg, 0, sizeof(arg));
arg.iov[0].iov_base = (unsigned char *)&rep;
arg.iov[0].iov_len = sizeof(rep.th);

net = sk ? sock_net(sk) : dev_net(skb_dst(skb)->dev);
-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AUTHOPT)
rcu_read_lock();
+#endif
+#ifdef CONFIG_TCP_AUTHOPT
+ /* Unlike TCP-MD5 the signatures for TCP-AO depend on initial sequence
+ * numbers so we can only handle established and time-wait sockets.
+ */
+ if (tcp_authopt_needed && sk &&
+ sk->sk_state != TCP_NEW_SYN_RECV &&
+ sk->sk_state != TCP_LISTEN) {
+ int tcp_authopt_ret = tcp_v4_authopt_handle_reply(sk, skb, rep.opt, &rep.th);
+
+ if (tcp_authopt_ret) {
+ arg.iov[0].iov_len += tcp_authopt_ret;
+ goto skip_md5sig;
+ }
+ }
+#endif
+#ifdef CONFIG_TCP_MD5SIG
hash_location = tcp_parse_md5sig_option(th);
if (sk && sk_fullsock(sk)) {
const union tcp_md5_addr *addr;
int l3index;

@@ -755,11 +814,10 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
addr = (union tcp_md5_addr *)&ip_hdr(skb)->saddr;
key = tcp_md5_do_lookup(sk1, l3index, addr, AF_INET);
if (!key)
goto out;

-
genhash = tcp_v4_md5_hash_skb(newhash, key, NULL, skb);
if (genhash || memcmp(hash_location, newhash, 16) != 0)
goto out;

}
@@ -775,10 +833,13 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)

tcp_v4_md5_hash_hdr((__u8 *) &rep.opt[1],
key, ip_hdr(skb)->saddr,
ip_hdr(skb)->daddr, &rep.th);
}
+#endif
+#ifdef CONFIG_TCP_AUTHOPT
+skip_md5sig:
#endif
/* Can't co-exist with TCPMD5, hence check rep.opt[0] */
if (rep.opt[0] == 0) {
__be32 mrst = mptcp_reset_option(skb);

@@ -828,12 +889,14 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
ctl_sk->sk_mark = 0;
__TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
__TCP_INC_STATS(net, TCP_MIB_OUTRSTS);
local_bh_enable();

-#ifdef CONFIG_TCP_MD5SIG
+#if defined(CONFIG_TCP_MD5SIG)
out:
+#endif
+#if defined(CONFIG_TCP_MD5SIG) || defined(CONFIG_TCP_AUTHOPT)
rcu_read_unlock();
#endif
}

/* The code following below sending ACKs in SYN-RECV and TIME-WAIT states
@@ -850,10 +913,12 @@ static void tcp_v4_send_ack(const struct sock *sk,
struct {
struct tcphdr th;
__be32 opt[(TCPOLEN_TSTAMP_ALIGNED >> 2)
#ifdef CONFIG_TCP_MD5SIG
+ (TCPOLEN_MD5SIG_ALIGNED >> 2)
+#elif defined(CONFIG_TCP_AUTHOPT)
+ + (TCPOLEN_AUTHOPT_OUTPUT >> 2)
#endif
];
} rep;
struct net *net = sock_net(sk);
struct ip_reply_arg arg;
@@ -881,10 +946,21 @@ static void tcp_v4_send_ack(const struct sock *sk,
rep.th.seq = htonl(seq);
rep.th.ack_seq = htonl(ack);
rep.th.ack = 1;
rep.th.window = htons(win);

+#ifdef CONFIG_TCP_AUTHOPT
+ if (tcp_authopt_needed) {
+ int aoret, offset = (tsecr) ? 3 : 0;
+
+ aoret = tcp_v4_authopt_handle_reply(sk, skb, &rep.opt[offset], &rep.th);
+ if (aoret) {
+ arg.iov[0].iov_len += aoret;
+ key = NULL;
+ }
+ }
+#endif
#ifdef CONFIG_TCP_MD5SIG
if (key) {
int offset = (tsecr) ? 3 : 0;

rep.opt[offset++] = htonl((TCPOPT_NOP << 24) |
--
2.25.1
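
Note on the option header written by tcp_v4_authopt_handle_reply() above: a
small userspace sketch of how the first 32-bit word of the option is packed.
The option kind 29 is the TCP-AO kind from RFC5925; the length of 16 (4
header bytes plus a 12-byte MAC) is an assumption about
TCPOLEN_AUTHOPT_OUTPUT, which is defined elsewhere in the series. The DEMO_
constants and ao_option_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TCPOPT_AUTHOPT         29  /* TCP-AO option kind, RFC5925 */
#define DEMO_TCPOLEN_AUTHOPT_OUTPUT 16  /* assumed: 4 header + 12 MAC bytes */

/* Pack kind, length, keyid and rnextkeyid into one host-order word,
 * mirroring the expression passed to htonl() in the patch.
 */
static uint32_t ao_option_word(uint8_t send_id, uint8_t rnextkeyid)
{
	return ((uint32_t)DEMO_TCPOPT_AUTHOPT << 24) |
	       ((uint32_t)DEMO_TCPOLEN_AUTHOPT_OUTPUT << 16) |
	       ((uint32_t)send_id << 8) |
	       (uint32_t)rnextkeyid;
}

/* Store the word in network byte order (what htonl() plus a store does). */
static void ao_option_bytes(uint8_t out[4], uint8_t send_id, uint8_t rnextkeyid)
{
	uint32_t w = ao_option_word(send_id, rnextkeyid);

	out[0] = (uint8_t)(w >> 24);
	out[1] = (uint8_t)(w >> 16);
	out[2] = (uint8_t)(w >> 8);
	out[3] = (uint8_t)(w & 0xff);
}

static void ao_option_example(void)
{
	uint8_t b[4];

	ao_option_bytes(b, 5, 7);
	/* On the wire: kind, length, keyid, rnextkeyid. */
	assert(b[0] == 29 && b[1] == 16 && b[2] == 5 && b[3] == 7);
	/* doff counts 32-bit words, so the header grows by length / 4. */
	assert(DEMO_TCPOLEN_AUTHOPT_OUTPUT / 4 == 4);
}
```

The MAC itself is written starting at the next word, which is why the patch
passes (optptr + 1) as hash_location and bumps doff before hashing.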


2021-12-08 11:39:50

by Leonard Crestez

Subject: [PATCH v3 12/18] tcp: authopt: Add key selection controls

The RFC requires that TCP can report the keyid and rnextkeyid values
being sent or received; implement this via getsockopt.

The RFC also requires that the user can select the sending key and that
the sending key is automatically switched based on rnextkeyid. These
requirements can conflict so we implement both and add a flag which
specifies whether the user or the peer request takes priority.

Also add an option to control rnextkeyid explicitly from userspace.

Signed-off-by: Leonard Crestez <[email protected]>
---
Documentation/networking/tcp_authopt.rst | 25 ++++++
include/net/tcp_authopt.h | 38 ++++++++-
include/uapi/linux/tcp.h | 31 ++++++++
net/ipv4/tcp_authopt.c | 98 +++++++++++++++++++++++-
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv6/tcp_ipv6.c | 2 +-
6 files changed, 189 insertions(+), 7 deletions(-)

diff --git a/Documentation/networking/tcp_authopt.rst b/Documentation/networking/tcp_authopt.rst
index 484f66f41ad5..cded87a70d05 100644
--- a/Documentation/networking/tcp_authopt.rst
+++ b/Documentation/networking/tcp_authopt.rst
@@ -35,10 +35,35 @@ Keys can be bound to remote addresses in a way that is similar to TCP_MD5.

RFC5925 requires that key ids do not overlap when tcp identifiers (addr/port)
overlap. This is not enforced by linux, configuring ambiguous keys will result
in packet drops and lost connections.

+Key selection
+-------------
+
+On getsockopt(TCP_AUTHOPT) information is provided about keyid/rnextkeyid in
+the last sent packet and about the keyid/rnextkeyid in the last valid received
+packet.
+
+By default the sending keyid is selected to match the "rnextkeyid" value sent
+by the remote side. If that keyid is not available (or for new connections) a
+random matching key is selected.
+
+If the `TCP_AUTHOPT_FLAG_LOCK_KEYID` flag is set then the sending key is selected
+by the `tcp_authopt.send_keyid` field and rnextkeyid is ignored. If no key with
+send_id == send_keyid is configured then a random matching key is
+selected.
+
+The current sending key is cached in the socket and will not change unless
+requested by remote rnextkeyid or by setsockopt.
+
+The rnextkeyid value sent on the wire is usually the recv_id of the current
+key used for sending. If the TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID flag is set in
+`tcp_authopt.flags` the value of `tcp_authopt.send_rnextkeyid` is sent
+instead. This can be used to implement smooth rollover: the peer will switch
+its keyid to the received rnextkeyid when it is available.
+
ABI Reference
=============

.. kernel-doc:: include/uapi/linux/tcp.h
:identifiers: tcp_authopt tcp_authopt_flag tcp_authopt_key tcp_authopt_key_flag tcp_authopt_alg
diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index 411e7a0bdd43..020637265ce9 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -66,10 +66,43 @@ struct tcp_authopt_info {
u32 dst_isn;
/** @rcv_sne: Recv-side Sequence Number Extension tracking tcp_sock.rcv_nxt */
u32 rcv_sne;
/** @snd_sne: Send-side Sequence Number Extension tracking tcp_sock.snd_nxt */
u32 snd_sne;
+
+ /**
+ * @send_keyid: keyid currently being sent
+ *
+ * This is controlled by userspace if TCP_AUTHOPT_FLAG_LOCK_KEYID is set,
+ * otherwise we try to match recv_rnextkeyid
+ */
+ u8 send_keyid;
+ /**
+ * @send_rnextkeyid: rnextkeyid currently being sent
+ *
+ * This is controlled by userspace if TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID is set
+ */
+ u8 send_rnextkeyid;
+ /**
+ * @recv_keyid: last keyid received from remote
+ *
+ * This is reported to userspace but has no other special behavior attached.
+ */
+ u8 recv_keyid;
+ /**
+ * @recv_rnextkeyid: last rnextkeyid received from remote
+ *
+ * Linux tries to honor this unless TCP_AUTHOPT_FLAG_LOCK_KEYID is set
+ */
+ u8 recv_rnextkeyid;
+
+ /**
+ * @send_key: Current key used for sending, cached.
+ *
+ * Once a key is found it only changes by user or remote request.
+ */
+ struct tcp_authopt_key_info *send_key;
};

/* TCP authopt as found in header */
struct tcphdr_authopt {
u8 num;
@@ -91,22 +124,23 @@ int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *key);
int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen);
struct tcp_authopt_key_info *__tcp_authopt_select_key(
const struct sock *sk,
struct tcp_authopt_info *info,
const struct sock *addr_sk,
- u8 *rnextkeyid);
+ u8 *rnextkeyid,
+ bool locked);
static inline struct tcp_authopt_key_info *tcp_authopt_select_key(
const struct sock *sk,
const struct sock *addr_sk,
struct tcp_authopt_info **info,
u8 *rnextkeyid)
{
if (tcp_authopt_needed) {
*info = rcu_dereference(tcp_sk(sk)->authopt_info);

if (*info)
- return __tcp_authopt_select_key(sk, *info, addr_sk, rnextkeyid);
+ return __tcp_authopt_select_key(sk, *info, addr_sk, rnextkeyid, true);
}
return NULL;
}
int tcp_authopt_hash(
char *hash_location,
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 76d7be6b27f4..e02176390519 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -346,10 +346,24 @@ struct tcp_diag_md5sig {

/**
* enum tcp_authopt_flag - flags for `tcp_authopt.flags`
*/
enum tcp_authopt_flag {
+ /**
+ * @TCP_AUTHOPT_FLAG_LOCK_KEYID: keyid controlled by sockopt
+ *
+ * If this is set `tcp_authopt.send_keyid` is used to determine the sending
+ * key. Otherwise a key with send_id == recv_rnextkeyid is preferred.
+ */
+ TCP_AUTHOPT_FLAG_LOCK_KEYID = (1 << 0),
+ /**
+ * @TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID: Override rnextkeyid from userspace
+ *
+ * If this is set then `tcp_authopt.send_rnextkeyid` is sent on outbound
+ * packets. Otherwise the recv_id of the current sending key is sent.
+ */
+ TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID = (1 << 1),
/**
* @TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED:
* Configure behavior of segments with TCP-AO coming from hosts for which no
* key is configured. The default recommended by RFC is to silently accept
* such connections.
@@ -361,10 +375,27 @@ enum tcp_authopt_flag {
* struct tcp_authopt - Per-socket options related to TCP Authentication Option
*/
struct tcp_authopt {
/** @flags: Combination of &enum tcp_authopt_flag */
__u32 flags;
+ /**
+ * @send_keyid: `tcp_authopt_key.send_id` of preferred send key
+ *
+ * This is only used if `TCP_AUTHOPT_FLAG_LOCK_KEYID` is set.
+ */
+ __u8 send_keyid;
+ /**
+ * @send_rnextkeyid: The rnextkeyid to send in packets
+ *
+ * This is controlled by the user iff TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID is
+ * set. Otherwise rnextkeyid is the recv_id of the current key.
+ */
+ __u8 send_rnextkeyid;
+ /** @recv_keyid: A recently-received keyid value. Only for getsockopt. */
+ __u8 recv_keyid;
+ /** @recv_rnextkeyid: A recently-received rnextkeyid value. Only for getsockopt. */
+ __u8 recv_rnextkeyid;
};

/**
* enum tcp_authopt_key_flag - flags for `tcp_authopt.flags`
*
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index a4f3eac20b29..a8950c9a7e84 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -309,20 +309,76 @@ static struct tcp_authopt_key_info *tcp_authopt_lookup_send(struct tcp_authopt_i
*
* @sk: socket
* @info: socket's tcp_authopt_info
* @addr_sk: socket used for address lookup. Same as sk except for synack case
* @rnextkeyid: value of rnextkeyid caller should write in packet
+ * @locked: If we're holding the socket lock. This is false for some timewait and reset cases
*
* Result is protected by RCU and can't be stored, it may only be passed to
* tcp_authopt_hash and only under a single rcu_read_lock.
*/
struct tcp_authopt_key_info *__tcp_authopt_select_key(const struct sock *sk,
struct tcp_authopt_info *info,
const struct sock *addr_sk,
- u8 *rnextkeyid)
+ u8 *rnextkeyid,
+ bool locked)
{
- return tcp_authopt_lookup_send(info, addr_sk, -1);
+ struct tcp_authopt_key_info *key, *new_key = NULL;
+
+ /* Listen sockets don't refer to any specific connection so we don't try
+ * to keep using the same key and ignore any received keyids.
+ */
+ if (sk->sk_state == TCP_LISTEN) {
+ int send_keyid = -1;
+
+ if (info->flags & TCP_AUTHOPT_FLAG_LOCK_KEYID)
+ send_keyid = info->send_keyid;
+ key = tcp_authopt_lookup_send(info, addr_sk, send_keyid);
+ if (key)
+ *rnextkeyid = key->recv_id;
+
+ return key;
+ }
+
+ if (locked)
+ key = rcu_dereference_protected(info->send_key, lockdep_sock_is_held(sk));
+ else
+ key = rcu_dereference(info->send_key);
+
+ /* Try to keep the same sending key unless user or peer requires a different key
+ * User request (via TCP_AUTHOPT_FLAG_LOCK_KEYID) always overrides peer request.
+ */
+ if (info->flags & TCP_AUTHOPT_FLAG_LOCK_KEYID) {
+ int send_keyid = info->send_keyid;
+
+ if (!key || key->send_id != send_keyid)
+ new_key = tcp_authopt_lookup_send(info, addr_sk, send_keyid);
+ } else {
+ if (!key || key->send_id != info->recv_rnextkeyid)
+ new_key = tcp_authopt_lookup_send(info, addr_sk, info->recv_rnextkeyid);
+ }
+ /* If no key found with specific send_id try anything else. */
+ if (!key && !new_key)
+ new_key = tcp_authopt_lookup_send(info, addr_sk, -1);
+
+ /* Update current key only if we hold the socket lock, otherwise we might
+ * store a pointer that goes stale
+ */
+ if (new_key && key != new_key) {
+ key = new_key;
+ if (locked)
+ rcu_assign_pointer(info->send_key, key);
+ }
+
+ if (key) {
+ if (info->flags & TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID)
+ *rnextkeyid = info->send_rnextkeyid;
+ else
+ *rnextkeyid = info->send_rnextkeyid = key->recv_id;
+ }
+
+ return key;
}
EXPORT_SYMBOL(__tcp_authopt_select_key);

static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk)
{
@@ -345,10 +401,12 @@ static struct tcp_authopt_info *__tcp_authopt_info_get_or_create(struct sock *sk

return info;
}

#define TCP_AUTHOPT_KNOWN_FLAGS ( \
+ TCP_AUTHOPT_FLAG_LOCK_KEYID | \
+ TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID | \
TCP_AUTHOPT_FLAG_REJECT_UNEXPECTED)

/* Like copy_from_sockopt except tolerate different optlen for compatibility reasons
*
* If the src is shorter then it's from an old userspace and the rest of dst is
@@ -416,18 +474,23 @@ int tcp_set_authopt(struct sock *sk, sockptr_t optval, unsigned int optlen)
info = __tcp_authopt_info_get_or_create(sk);
if (IS_ERR(info))
return PTR_ERR(info);

info->flags = opt.flags & TCP_AUTHOPT_KNOWN_FLAGS;
+ if (opt.flags & TCP_AUTHOPT_FLAG_LOCK_KEYID)
+ info->send_keyid = opt.send_keyid;
+ if (opt.flags & TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID)
+ info->send_rnextkeyid = opt.send_rnextkeyid;

return 0;
}

int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *opt)
{
struct tcp_sock *tp = tcp_sk(sk);
struct tcp_authopt_info *info;
+ struct tcp_authopt_key_info *send_key;
int err;

memset(opt, 0, sizeof(*opt));
sock_owned_by_me(sk);
err = check_sysctl_tcp_authopt();
@@ -437,10 +500,22 @@ int tcp_get_authopt_val(struct sock *sk, struct tcp_authopt *opt)
info = rcu_dereference_check(tp->authopt_info, lockdep_sock_is_held(sk));
if (!info)
return -ENOENT;

opt->flags = info->flags & TCP_AUTHOPT_KNOWN_FLAGS;
+ /* These keyids might be undefined, for example before connect.
+ * Reporting zero is not strictly correct because there are no reserved
+ * values.
+ */
+ send_key = rcu_dereference_check(info->send_key, lockdep_sock_is_held(sk));
+ if (send_key)
+ opt->send_keyid = send_key->send_id;
+ else
+ opt->send_keyid = 0;
+ opt->send_rnextkeyid = info->send_rnextkeyid;
+ opt->recv_keyid = info->recv_keyid;
+ opt->recv_rnextkeyid = info->recv_rnextkeyid;

return 0;
}

/* Free key nicely, for living sockets */
@@ -448,10 +523,12 @@ static void tcp_authopt_key_del(struct sock *sk,
struct tcp_authopt_info *info,
struct tcp_authopt_key_info *key)
{
sock_owned_by_me(sk);
hlist_del_rcu(&key->node);
+ if (rcu_dereference_protected(info->send_key, lockdep_sock_is_held(sk)) == key)
+ rcu_assign_pointer(info->send_key, NULL);
atomic_sub(sizeof(*key), &sk->sk_omem_alloc);
kfree_rcu(key, rcu);
}

/* Free info and keys.
@@ -1422,11 +1499,11 @@ int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb,
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
print_tcpao_notice("TCP Authentication Unexpected: Rejected", skb);
return -EINVAL;
}
print_tcpao_notice("TCP Authentication Unexpected: Accepted", skb);
- return 0;
+ goto accept;
}
if (opt && !key) {
/* Keys are configured for peer but with different keyid than packet */
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
print_tcpao_notice("TCP Authentication Failed", skb);
@@ -1445,8 +1522,23 @@ int __tcp_authopt_inbound_check(struct sock *sk, struct sk_buff *skb,
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTHOPTFAILURE);
print_tcpao_notice("TCP Authentication Failed", skb);
return -EINVAL;
}

+accept:
+ /* Doing this for all valid packets will result in keyids temporarily
+ * flipping back and forth if packets are reordered or retransmitted
+ * but keys should eventually stabilize.
+ *
+ * This is connection-specific so don't store for listen sockets.
+ *
+ * We could store rnextkeyid from SYN in a request sock and use it for
+ * the SYNACK but we don't.
+ */
+ if (sk->sk_state != TCP_LISTEN) {
+ info->recv_keyid = opt->keyid;
+ info->recv_rnextkeyid = opt->rnextkeyid;
+ }
+
return 1;
}
EXPORT_SYMBOL(__tcp_authopt_inbound_check);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index be531e2f52ae..edfb76f76485 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -664,11 +664,11 @@ static int tcp_v4_authopt_handle_reply(const struct sock *sk,
info = tcp_twsk(sk)->tw_authopt_info;
else
info = tcp_sk(sk)->authopt_info;
if (!info)
return 0;
- key_info = __tcp_authopt_select_key(sk, info, sk, &rnextkeyid);
+ key_info = __tcp_authopt_select_key(sk, info, sk, &rnextkeyid, false);
if (!key_info)
return 0;
*optptr = htonl((TCPOPT_AUTHOPT << 24) |
(TCPOLEN_AUTHOPT_OUTPUT << 16) |
(key_info->send_id << 8) |
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index cd8544d08a36..6ed13f8f489f 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -901,11 +901,11 @@ static int tcp_v6_send_response_init_authopt(const struct sock *sk,
*info = tcp_twsk(sk)->tw_authopt_info;
else
*info = rcu_dereference(tcp_sk(sk)->authopt_info);
if (!*info)
return 0;
- *key = __tcp_authopt_select_key(sk, *info, sk, rnextkeyid);
+ *key = __tcp_authopt_select_key(sk, *info, sk, rnextkeyid, false);
if (*key)
return TCPOLEN_AUTHOPT_OUTPUT;
return 0;
}

--
2.25.1
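
Note on the selection order in __tcp_authopt_select_key() above: the sketch
below models the same priority rules in plain userspace C, assuming a flat
key array instead of the RCU-protected list and ignoring address matching,
the listen-socket case and the "locked" argument. All demo_ names and DEMO_
constants are hypothetical; only the order of the decisions is meant to
mirror the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FLAG_LOCK_KEYID      (1u << 0)
#define DEMO_FLAG_LOCK_RNEXTKEYID (1u << 1)

struct demo_key {
	uint8_t send_id;
	uint8_t recv_id;
};

struct demo_info {
	uint32_t flags;
	uint8_t send_keyid;              /* user choice, used with LOCK_KEYID */
	uint8_t send_rnextkeyid;         /* user choice with LOCK_RNEXTKEYID */
	uint8_t recv_rnextkeyid;         /* last value requested by the peer */
	const struct demo_key *send_key; /* cached current sending key */
	const struct demo_key *keys;
	size_t nkeys;
};

/* Find a key by send_id; a negative send_id accepts any key. */
static const struct demo_key *demo_lookup(const struct demo_info *info, int send_id)
{
	size_t i;

	for (i = 0; i < info->nkeys; i++)
		if (send_id < 0 || info->keys[i].send_id == send_id)
			return &info->keys[i];
	return NULL;
}

static const struct demo_key *demo_select(struct demo_info *info, uint8_t *rnextkeyid)
{
	const struct demo_key *key = info->send_key, *new_key = NULL;
	/* The user request overrides the peer request when LOCK_KEYID is set. */
	int want = (info->flags & DEMO_FLAG_LOCK_KEYID) ?
			info->send_keyid : info->recv_rnextkeyid;

	if (!key || key->send_id != want)
		new_key = demo_lookup(info, want);
	/* Nothing cached and nothing with the wanted id: take any key. */
	if (!key && !new_key)
		new_key = demo_lookup(info, -1);
	if (new_key)
		info->send_key = key = new_key;

	if (key) {
		if (!(info->flags & DEMO_FLAG_LOCK_RNEXTKEYID))
			info->send_rnextkeyid = key->recv_id;
		*rnextkeyid = info->send_rnextkeyid;
	}
	return key;
}

static void demo_select_examples(void)
{
	static const struct demo_key keys[] = { { 1, 11 }, { 2, 12 } };
	struct demo_info info = { .keys = keys, .nkeys = 2, .recv_rnextkeyid = 2 };
	uint8_t rnext = 0;

	/* Peer asked for keyid 2: it is selected and its recv_id is sent. */
	assert(demo_select(&info, &rnext) == &keys[1] && rnext == 12);
	/* Peer asks for an unknown keyid: the cached key is kept. */
	info.recv_rnextkeyid = 9;
	assert(demo_select(&info, &rnext) == &keys[1] && rnext == 12);
	/* User locks keyid 1: this overrides the peer request. */
	info.flags = DEMO_FLAG_LOCK_KEYID;
	info.send_keyid = 1;
	assert(demo_select(&info, &rnext) == &keys[0] && rnext == 11);
}
```

One consequence visible in the second case is that a cached key sticks when
the requested id is not configured, which matches the documented behavior
that the sending key only changes on user or peer request.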


2021-12-08 11:40:04

by Leonard Crestez

Subject: [PATCH v3 18/18] selftests: net/fcnal: Initial tcp_authopt support

Tests are mostly copied from tcp_md5 with minor changes.

VRF support is covered only by binding multiple servers, not by
multiple keys bound to different interfaces.

Also add a -t tcp_authopt option to run only these tests.

Signed-off-by: Leonard Crestez <[email protected]>
---
tools/testing/selftests/net/fcnal-test.sh | 298 ++++++++++++++++++++++
1 file changed, 298 insertions(+)

diff --git a/tools/testing/selftests/net/fcnal-test.sh b/tools/testing/selftests/net/fcnal-test.sh
index 32d5f7bc588e..e99b44f0ae5a 100755
--- a/tools/testing/selftests/net/fcnal-test.sh
+++ b/tools/testing/selftests/net/fcnal-test.sh
@@ -807,10 +807,301 @@ ipv4_ping()
}

################################################################################
# IPv4 TCP

+#
+# TCP Authentication Option Tests
+#
+
+# try to enable tcp_authopt sysctl
+enable_tcp_authopt()
+{
+ if [[ -e /proc/sys/net/ipv4/tcp_authopt ]]; then
+ sysctl -w net.ipv4.tcp_authopt=1
+ fi
+}
+
+# check if tcp_authopt is compiled with a client-side bind test
+has_tcp_authopt()
+{
+ run_cmd_nsb nettest -b -A ${MD5_PW} -r ${NSA_IP}
+}
+
+ipv4_tcp_authopt_novrf()
+{
+ enable_tcp_authopt
+ if ! has_tcp_authopt; then
+ echo "TCP-AO appears to be missing, skip"
+ return 0
+ fi
+
+ log_start
+ run_cmd nettest -s -A ${MD5_PW} -m ${NSB_IP} &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 0 "AO: Single address config"
+
+ log_start
+ run_cmd nettest -s &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 2 "AO: Server no config, client uses password"
+
+ log_start
+ run_cmd nettest -s -A ${MD5_PW} -m ${NSB_IP} &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_WRONG_PW}
+ log_test $? 2 "AO: Client uses wrong password"
+
+ log_start
+ run_cmd nettest -s -A ${MD5_PW} -m ${NSB_LO_IP} &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 2 "AO: Client address does not match address configured on server"
+
+ # client in prefix
+ log_start
+ run_cmd nettest -s -A ${MD5_PW} -m ${NS_NET} &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 0 "AO: Prefix config"
+
+ # client in prefix, wrong password
+ log_start
+ show_hint "Should timeout since client uses wrong password"
+ run_cmd nettest -s -A ${MD5_PW} -m ${NS_NET} &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_WRONG_PW}
+ log_test $? 2 "AO: Prefix config, client uses wrong password"
+
+ # client outside of prefix
+ log_start
+ show_hint "Should timeout since client address is outside of prefix"
+ run_cmd nettest -s -A ${MD5_PW} -m ${NS_NET} &
+ sleep 1
+ run_cmd_nsb nettest -c ${NSB_LO_IP} -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 2 "AO: Prefix config, client address not in configured prefix"
+}
+
+ipv6_tcp_authopt_novrf()
+{
+ enable_tcp_authopt
+ if ! has_tcp_authopt; then
+ echo "TCP-AO appears to be missing, skip"
+ return 0
+ fi
+
+ log_start
+ run_cmd nettest -6 -s -A ${MD5_PW} &
+ sleep 1
+ run_cmd_nsb nettest -6 -r ${NSA_IP6} -A ${MD5_PW}
+ log_test $? 0 "AO: Simple correct config"
+
+ log_start
+ run_cmd nettest -6 -s &
+ sleep 1
+ run_cmd_nsb nettest -6 -r ${NSA_IP6} -A ${MD5_PW}
+ log_test $? 2 "AO: Server no config, client uses password"
+
+ log_start
+ run_cmd nettest -6 -s -A ${MD5_PW} -m ${NSB_IP6} &
+ sleep 1
+ run_cmd_nsb nettest -6 -r ${NSA_IP6} -A ${MD5_WRONG_PW}
+ log_test $? 2 "AO: Client uses wrong password"
+
+ log_start
+ run_cmd nettest -6 -s -A ${MD5_PW} -m ${NSB_LO_IP6} &
+ sleep 1
+ run_cmd_nsb nettest -6 -r ${NSA_IP6} -A ${MD5_PW}
+ log_test $? 2 "AO: Client address does not match address configured on server"
+}
+
+ipv4_tcp_authopt_vrf()
+{
+ enable_tcp_authopt
+ if ! has_tcp_authopt; then
+ echo "TCP-AO appears to be missing, skip"
+ return 0
+ fi
+
+ log_start
+ run_cmd nettest -s -I ${VRF} -A ${MD5_PW} &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 0 "AO: VRF: Simple config"
+
+ #
+ # duplicate config between default VRF and a VRF
+ #
+
+ log_start
+ run_cmd nettest -s -I ${VRF} -A ${MD5_PW} -m ${NSB_IP} &
+ run_cmd nettest -s -A ${MD5_WRONG_PW} -m ${NSB_IP} &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 0 "AO: VRF: Servers in default-VRF and VRF, client in VRF"
+
+ log_start
+ run_cmd nettest -s -I ${VRF} -A ${MD5_PW} -m ${NSB_IP} &
+ run_cmd nettest -s -A ${MD5_WRONG_PW} -m ${NSB_IP} &
+ sleep 1
+ run_cmd_nsc nettest -r ${NSA_IP} -A ${MD5_WRONG_PW}
+ log_test $? 0 "AO: VRF: Servers in default-VRF and VRF, client in default-VRF"
+
+ log_start
+ show_hint "Should timeout since client in default VRF uses VRF password"
+ run_cmd nettest -s -I ${VRF} -A ${MD5_PW} -m ${NSB_IP} &
+ run_cmd nettest -s -A ${MD5_WRONG_PW} -m ${NSB_IP} &
+ sleep 1
+ run_cmd_nsc nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 2 "AO: VRF: Servers in default VRF and VRF, conn in default-VRF with VRF pw"
+
+ log_start
+ show_hint "Should timeout since client in VRF uses default VRF password"
+ run_cmd nettest -s -I ${VRF} -A ${MD5_PW} -m ${NSB_IP} &
+ run_cmd nettest -s -A ${MD5_WRONG_PW} -m ${NSB_IP} &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_WRONG_PW}
+ log_test $? 2 "AO: VRF: Servers in default VRF and VRF, conn in VRF with default-VRF pw"
+
+ test_ipv4_tcp_authopt_vrf__global_server__bind_ifindex0
+}
+
+test_ipv4_tcp_authopt_vrf__global_server__bind_ifindex0()
+{
+ # This particular test needs tcp_l3mdev_accept=1 for Global server to accept VRF connections
+ local old_tcp_l3mdev_accept
+ old_tcp_l3mdev_accept=$(get_sysctl net.ipv4.tcp_l3mdev_accept)
+ set_sysctl net.ipv4.tcp_l3mdev_accept=1
+
+ log_start
+ run_cmd nettest -s -A ${MD5_PW} --force-bind-key-ifindex &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 2 "AO: VRF: Global server, Key bound to ifindex=0 rejects VRF connection"
+
+ log_start
+ run_cmd nettest -s -A ${MD5_PW} --force-bind-key-ifindex &
+ sleep 1
+ run_cmd_nsc nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 0 "AO: VRF: Global server, key bound to ifindex=0 accepts non-VRF connection"
+
+ log_start
+ run_cmd nettest -s -A ${MD5_PW} --no-bind-key-ifindex &
+ sleep 1
+ run_cmd_nsb nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 0 "AO: VRF: Global server, key not bound to ifindex accepts VRF connection"
+
+ log_start
+ run_cmd nettest -s -A ${MD5_PW} --no-bind-key-ifindex &
+ sleep 1
+ run_cmd_nsc nettest -r ${NSA_IP} -A ${MD5_PW}
+ log_test $? 0 "AO: VRF: Global server, key not bound to ifindex accepts non-VRF connection"
+
+ # restore value
+ set_sysctl net.ipv4.tcp_l3mdev_accept="$old_tcp_l3mdev_accept"
+}
+
+ipv6_tcp_authopt_vrf()
+{
+ enable_tcp_authopt
+ if ! has_tcp_authopt; then
+ echo "TCP-AO appears to be missing, skip"
+ return 0
+ fi
+
+ log_start
+ run_cmd nettest -6 -s -I ${VRF} -A ${MD5_PW} &
+ sleep 1
+ run_cmd_nsb nettest -6 -r ${NSA_IP6} -A ${MD5_PW}
+ log_test $? 0 "AO: VRF: Simple config"
+
+ #
+ # duplicate config between default VRF and a VRF
+ #
+
+ log_start
+ run_cmd nettest -6 -s -I ${VRF} -A ${MD5_PW} -m ${NSB_IP6} &
+ run_cmd nettest -6 -s -A ${MD5_WRONG_PW} -m ${NSB_IP6} &
+ sleep 1
+ run_cmd_nsb nettest -6 -r ${NSA_IP6} -A ${MD5_PW}
+ log_test $? 0 "AO: VRF: Servers in default-VRF and VRF, client in VRF"
+
+ log_start
+ run_cmd nettest -6 -s -I ${VRF} -A ${MD5_PW} -m ${NSB_IP6} &
+ run_cmd nettest -6 -s -A ${MD5_WRONG_PW} -m ${NSB_IP6} &
+ sleep 1
+ run_cmd_nsc nettest -6 -r ${NSA_IP6} -A ${MD5_WRONG_PW}
+ log_test $? 0 "AO: VRF: Servers in default-VRF and VRF, client in default-VRF"
+
+ log_start
+ show_hint "Should timeout since client in default VRF uses VRF password"
+ run_cmd nettest -6 -s -I ${VRF} -A ${MD5_PW} -m ${NSB_IP6} &
+ run_cmd nettest -6 -s -A ${MD5_WRONG_PW} -m ${NSB_IP6} &
+ sleep 1
+ run_cmd_nsc nettest -6 -r ${NSA_IP6} -A ${MD5_PW}
+ log_test $? 2 "AO: VRF: Servers in default VRF and VRF, conn in default-VRF with VRF pw"
+
+ log_start
+ show_hint "Should timeout since client in VRF uses default VRF password"
+ run_cmd nettest -6 -s -I ${VRF} -A ${MD5_PW} -m ${NSB_IP6} &
+ run_cmd nettest -6 -s -A ${MD5_WRONG_PW} -m ${NSB_IP6} &
+ sleep 1
+ run_cmd_nsb nettest -6 -r ${NSA_IP6} -A ${MD5_WRONG_PW}
+ log_test $? 2 "AO: VRF: Servers in default VRF and VRF, conn in VRF with default-VRF pw"
+
+ log_start
+ run_cmd nettest -6 -s -I ${VRF} -A ${MD5_PW} -m ${NS_NET6} &
+ run_cmd nettest -6 -s -A ${MD5_WRONG_PW} -m ${NS_NET6} &
+ sleep 1
+ run_cmd_nsb nettest -6 -r ${NSA_IP6} -A ${MD5_PW}
+ log_test $? 0 "AO: VRF: Prefix config in default VRF and VRF, conn in VRF"
+
+ log_start
+ run_cmd nettest -6 -s -I ${VRF} -A ${MD5_PW} -m ${NS_NET6} &
+ run_cmd nettest -6 -s -A ${MD5_WRONG_PW} -m ${NS_NET6} &
+ sleep 1
+ run_cmd_nsc nettest -6 -r ${NSA_IP6} -A ${MD5_WRONG_PW}
+ log_test $? 0 "AO: VRF: Prefix config in default VRF and VRF, conn in default VRF"
+
+ log_start
+ show_hint "Should timeout since client in default VRF uses VRF password"
+ run_cmd nettest -6 -s -I ${VRF} -A ${MD5_PW} -m ${NS_NET6} &
+ run_cmd nettest -6 -s -A ${MD5_WRONG_PW} -m ${NS_NET6} &
+ sleep 1
+ run_cmd_nsc nettest -6 -r ${NSA_IP6} -A ${MD5_PW}
+ log_test $? 2 "AO: VRF: Prefix config in def VRF and VRF, conn in def VRF with VRF pw"
+
+ log_start
+ show_hint "Should timeout since client in VRF uses default VRF password"
+ run_cmd nettest -6 -s -I ${VRF} -A ${MD5_PW} -m ${NS_NET6} &
+ run_cmd nettest -6 -s -A ${MD5_WRONG_PW} -m ${NS_NET6} &
+ sleep 1
+ run_cmd_nsb nettest -6 -r ${NSA_IP6} -A ${MD5_WRONG_PW}
+ log_test $? 2 "AO: VRF: Prefix config in def VRF and VRF, conn in VRF with def VRF pw"
+}
+
+only_tcp_authopt()
+{
+ log_section "TCP Authentication Option"
+
+ setup
+ set_sysctl net.ipv4.tcp_l3mdev_accept=0
+ log_subsection "TCP-AO IPv4 no VRF"
+ ipv4_tcp_authopt_novrf
+ log_subsection "TCP-AO IPv6 no VRF"
+ ipv6_tcp_authopt_novrf
+
+ setup "yes"
+ set_sysctl net.ipv4.tcp_l3mdev_accept=0
+ log_subsection "TCP-AO IPv4 VRF"
+ ipv4_tcp_authopt_vrf
+ log_subsection "TCP-AO IPv6 VRF"
+ ipv6_tcp_authopt_vrf
+}
+
#
# MD5 tests without VRF
#
ipv4_tcp_md5_novrf()
{
@@ -1192,10 +1483,11 @@ ipv4_tcp_novrf()
show_hint "Should fail 'Connection refused'"
run_cmd nettest -d ${NSA_DEV} -r ${a}
log_test_addr ${a} $? 1 "No server, device client, local conn"

ipv4_tcp_md5_novrf
+ ipv4_tcp_authopt_novrf
}

ipv4_tcp_vrf()
{
local a
@@ -1246,10 +1538,12 @@ ipv4_tcp_vrf()
run_cmd nettest -r ${a} -d ${NSA_DEV}
log_test_addr ${a} $? 1 "Global server, local connection"

# run MD5 tests
ipv4_tcp_md5
+ # run AO tests
+ ipv4_tcp_authopt_vrf

#
# enable VRF global server
#
log_subsection "VRF Global server enabled"
@@ -2672,10 +2966,11 @@ ipv6_tcp_novrf()
run_cmd nettest -6 -d ${NSA_DEV} -r ${a}
log_test_addr ${a} $? 1 "No server, device client, local conn"
done

ipv6_tcp_md5_novrf
+ ipv6_tcp_authopt_novrf
}

ipv6_tcp_vrf()
{
local a
@@ -2742,10 +3037,12 @@ ipv6_tcp_vrf()
run_cmd nettest -6 -r ${a} -d ${NSA_DEV}
log_test_addr ${a} $? 1 "Global server, local connection"

# run MD5 tests
ipv6_tcp_md5
+ # run AO tests
+ ipv6_tcp_authopt_vrf

#
# enable VRF global server
#
log_subsection "VRF Global server enabled"
@@ -4102,10 +4399,11 @@ do
ipv6_bind|bind6) ipv6_addr_bind;;
ipv6_runtime) ipv6_runtime;;
ipv6_netfilter) ipv6_netfilter;;

use_cases) use_cases;;
+ tcp_authopt) only_tcp_authopt;;

# setup namespaces and config, but do not run any tests
setup) setup; exit 0;;
vrf_setup) setup "yes"; exit 0;;
esac
--
2.25.1


2021-12-08 11:40:06

by Leonard Crestez

Subject: [PATCH v3 16/18] selftests: nettest: Rename md5_prefix to key_addr_prefix

This is in preparation for reusing the same option for TCP-AO.

Reviewed-by: David Ahern <[email protected]>
Signed-off-by: Leonard Crestez <[email protected]>
---
tools/testing/selftests/net/nettest.c | 50 +++++++++++++--------------
1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/net/nettest.c b/tools/testing/selftests/net/nettest.c
index d9a6fd2cd9d3..3841e5fec7c7 100644
--- a/tools/testing/selftests/net/nettest.c
+++ b/tools/testing/selftests/net/nettest.c
@@ -94,17 +94,17 @@ struct sock_args {
const char *clientns;
const char *serverns;

const char *password;
const char *client_pw;
- /* prefix for MD5 password */
- const char *md5_prefix_str;
+ /* prefix for MD5/AO */
+ const char *key_addr_prefix_str;
union {
struct sockaddr_in v4;
struct sockaddr_in6 v6;
- } md5_prefix;
- unsigned int prefix_len;
+ } key_addr;
+ unsigned int key_addr_prefix_len;
/* 0: default, -1: force off, +1: force on */
int bind_key_ifindex;

/* expected addresses and device index for connection */
const char *expected_dev;
@@ -264,16 +264,16 @@ static int tcp_md5sig(int sd, void *addr, socklen_t alen, struct sock_args *args
int rc;

md5sig.tcpm_keylen = keylen;
memcpy(md5sig.tcpm_key, args->password, keylen);

- if (args->prefix_len) {
+ if (args->key_addr_prefix_len) {
opt = TCP_MD5SIG_EXT;
md5sig.tcpm_flags |= TCP_MD5SIG_FLAG_PREFIX;

- md5sig.tcpm_prefixlen = args->prefix_len;
- addr = &args->md5_prefix;
+ md5sig.tcpm_prefixlen = args->key_addr_prefix_len;
+ addr = &args->key_addr;
}
memcpy(&md5sig.tcpm_addr, addr, alen);

if ((args->ifindex && args->bind_key_ifindex >= 0) || args->bind_key_ifindex >= 1) {
opt = TCP_MD5SIG_EXT;
@@ -309,17 +309,17 @@ static int tcp_md5_remote(int sd, struct sock_args *args)
int alen;

switch (args->version) {
case AF_INET:
sin.sin_port = htons(args->port);
- sin.sin_addr = args->md5_prefix.v4.sin_addr;
+ sin.sin_addr = args->key_addr.v4.sin_addr;
addr = &sin;
alen = sizeof(sin);
break;
case AF_INET6:
sin6.sin6_port = htons(args->port);
- sin6.sin6_addr = args->md5_prefix.v6.sin6_addr;
+ sin6.sin6_addr = args->key_addr.v6.sin6_addr;
addr = &sin6;
alen = sizeof(sin6);
break;
default:
log_error("unknown address family\n");
@@ -705,11 +705,11 @@ enum addr_type {
ADDR_TYPE_LOCAL,
ADDR_TYPE_REMOTE,
ADDR_TYPE_MCAST,
ADDR_TYPE_EXPECTED_LOCAL,
ADDR_TYPE_EXPECTED_REMOTE,
- ADDR_TYPE_MD5_PREFIX,
+ ADDR_TYPE_KEY_PREFIX,
};

static int convert_addr(struct sock_args *args, const char *_str,
enum addr_type atype)
{
@@ -745,32 +745,32 @@ static int convert_addr(struct sock_args *args, const char *_str,
break;
case ADDR_TYPE_EXPECTED_REMOTE:
desc = "expected remote";
addr = &args->expected_raddr;
break;
- case ADDR_TYPE_MD5_PREFIX:
- desc = "md5 prefix";
+ case ADDR_TYPE_KEY_PREFIX:
+ desc = "key addr prefix";
if (family == AF_INET) {
- args->md5_prefix.v4.sin_family = AF_INET;
- addr = &args->md5_prefix.v4.sin_addr;
+ args->key_addr.v4.sin_family = AF_INET;
+ addr = &args->key_addr.v4.sin_addr;
} else if (family == AF_INET6) {
- args->md5_prefix.v6.sin6_family = AF_INET6;
- addr = &args->md5_prefix.v6.sin6_addr;
+ args->key_addr.v6.sin6_family = AF_INET6;
+ addr = &args->key_addr.v6.sin6_addr;
} else
return 1;

sep = strchr(str, '/');
if (sep) {
*sep = '\0';
sep++;
if (str_to_uint(sep, 1, pfx_len_max,
- &args->prefix_len) != 0) {
- fprintf(stderr, "Invalid port\n");
+ &args->key_addr_prefix_len) != 0) {
+ fprintf(stderr, "Invalid prefix\n");
return 1;
}
} else {
- args->prefix_len = 0;
+ args->key_addr_prefix_len = 0;
}
break;
default:
log_error("unknown address type\n");
exit(1);
@@ -835,13 +835,13 @@ static int validate_addresses(struct sock_args *args)

if (args->remote_addr_str &&
convert_addr(args, args->remote_addr_str, ADDR_TYPE_REMOTE) < 0)
return 1;

- if (args->md5_prefix_str &&
- convert_addr(args, args->md5_prefix_str,
- ADDR_TYPE_MD5_PREFIX) < 0)
+ if (args->key_addr_prefix_str &&
+ convert_addr(args, args->key_addr_prefix_str,
+ ADDR_TYPE_KEY_PREFIX) < 0)
return 1;

if (args->expected_laddr_str &&
convert_addr(args, args->expected_laddr_str,
ADDR_TYPE_EXPECTED_LOCAL))
@@ -2020,11 +2020,11 @@ int main(int argc, char *argv[])
break;
case 'X':
args.client_pw = optarg;
break;
case 'm':
- args.md5_prefix_str = optarg;
+ args.key_addr_prefix_str = optarg;
break;
case 'S':
args.use_setsockopt = 1;
break;
case 'f':
@@ -2079,17 +2079,17 @@ int main(int argc, char *argv[])
return 1;
}
}

if (args.password &&
- ((!args.has_remote_ip && !args.md5_prefix_str) ||
+ ((!args.has_remote_ip && !args.key_addr_prefix_str) ||
args.type != SOCK_STREAM)) {
log_error("MD5 passwords apply to TCP only and require a remote ip for the password\n");
return 1;
}

- if (args.md5_prefix_str && !args.password) {
+ if (args.key_addr_prefix_str && !args.password) {
log_error("Prefix range for MD5 protection specified without a password\n");
return 1;
}

if (iter == 0) {
--
2.25.1


2021-12-08 11:40:06

by Leonard Crestez

Subject: [PATCH v3 17/18] selftests: nettest: Initial tcp_authopt support

Add support for configuring TCP Authentication Option. Only a single key
is supported with default options.

Reviewed-by: David Ahern <[email protected]>
Signed-off-by: Leonard Crestez <[email protected]>
---
tools/testing/selftests/net/nettest.c | 75 ++++++++++++++++++++++++---
1 file changed, 69 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/net/nettest.c b/tools/testing/selftests/net/nettest.c
index 3841e5fec7c7..5fe7354c40cf 100644
--- a/tools/testing/selftests/net/nettest.c
+++ b/tools/testing/selftests/net/nettest.c
@@ -104,10 +104,12 @@ struct sock_args {
} key_addr;
unsigned int key_addr_prefix_len;
/* 0: default, -1: force off, +1: force on */
int bind_key_ifindex;

+ const char *authopt_password;
+
/* expected addresses and device index for connection */
const char *expected_dev;
const char *expected_server_dev;
int expected_ifindex;

@@ -254,10 +256,54 @@ static int switch_ns(const char *ns)
close(fd);

return ret;
}

+static int tcp_set_authopt(int sd, struct sock_args *args)
+{
+ struct tcp_authopt_key key;
+ int rc;
+
+ memset(&key, 0, sizeof(key));
+ strcpy((char *)key.key, args->authopt_password);
+ key.keylen = strlen(args->authopt_password);
+ key.alg = TCP_AUTHOPT_ALG_HMAC_SHA_1_96;
+
+ if (args->key_addr_prefix_str) {
+ key.flags |= TCP_AUTHOPT_KEY_ADDR_BIND;
+ switch (args->version) {
+ case AF_INET:
+ memcpy(&key.addr, &args->key_addr.v4, sizeof(args->key_addr.v4));
+ break;
+ case AF_INET6:
+ memcpy(&key.addr, &args->key_addr.v6, sizeof(args->key_addr.v6));
+ break;
+ default:
+ log_error("unknown address family\n");
+ exit(1);
+ }
+ if (args->key_addr_prefix_len) {
+ key.flags |= TCP_AUTHOPT_KEY_PREFIXLEN;
+ key.prefixlen = args->key_addr_prefix_len;
+ }
+ }
+
+ if ((args->ifindex && args->bind_key_ifindex >= 0) || args->bind_key_ifindex >= 1) {
+ key.flags |= TCP_AUTHOPT_KEY_IFINDEX;
+ key.ifindex = args->ifindex;
+ log_msg("TCP_AUTHOPT_KEY_IFINDEX set ifindex=%d\n", key.ifindex);
+ } else {
+ log_msg("TCP_AUTHOPT_KEY_IFINDEX off\n");
+ }
+
+ rc = setsockopt(sd, IPPROTO_TCP, TCP_AUTHOPT_KEY, &key, sizeof(key));
+ if (rc < 0)
+ log_err_errno("setsockopt(TCP_AUTHOPT_KEY)");
+
+ return rc;
+}
+
static int tcp_md5sig(int sd, void *addr, socklen_t alen, struct sock_args *args)
{
int keylen = strlen(args->password);
struct tcp_md5sig md5sig = {};
int opt = TCP_MD5SIG;
@@ -1541,10 +1587,15 @@ static int do_server(struct sock_args *args, int ipc_fd)
if (args->password && tcp_md5_remote(lsd, args)) {
close(lsd);
goto err_exit;
}

+ if (args->authopt_password && tcp_set_authopt(lsd, args)) {
+ close(lsd);
+ goto err_exit;
+ }
+
ipc_write(ipc_fd, 1);
while (1) {
log_msg("waiting for client connection.\n");
FD_ZERO(&rfds);
FD_SET(lsd, &rfds);
@@ -1663,10 +1714,13 @@ static int connectsock(void *addr, socklen_t alen, struct sock_args *args)
goto out;

if (args->password && tcp_md5sig(sd, addr, alen, args))
goto err;

+ if (args->authopt_password && tcp_set_authopt(sd, args))
+ goto err;
+
if (args->bind_test_only)
goto out;

if (connect(sd, addr, alen) < 0) {
if (errno != EINPROGRESS) {
@@ -1852,11 +1906,11 @@ static int ipc_parent(int cpid, int fd, struct sock_args *args)

wait(&status);
return client_status;
}

-#define GETOPT_STR "sr:l:c:p:t:g:P:DRn:M:X:m:d:I:BN:O:SCi6xL:0:1:2:3:Fbqf"
+#define GETOPT_STR "sr:l:c:p:t:g:P:DRn:M:X:m:A:d:I:BN:O:SCi6xL:0:1:2:3:Fbqf"
#define OPT_FORCE_BIND_KEY_IFINDEX 1001
#define OPT_NO_BIND_KEY_IFINDEX 1002

static struct option long_opts[] = {
{"force-bind-key-ifindex", 0, 0, OPT_FORCE_BIND_KEY_IFINDEX},
@@ -1897,14 +1951,15 @@ static void print_usage(char *prog)
" -L len send random message of given length\n"
" -n num number of times to send message\n"
"\n"
" -M password use MD5 sum protection\n"
" -X password MD5 password for client mode\n"
- " -m prefix/len prefix and length to use for MD5 key\n"
- " --no-bind-key-ifindex: Force TCP_MD5SIG_FLAG_IFINDEX off\n"
- " --force-bind-key-ifindex: Force TCP_MD5SIG_FLAG_IFINDEX on\n"
+ " -m prefix/len prefix and length to use for MD5/AO key\n"
+ " --no-bind-key-ifindex: Force disable binding key to ifindex\n"
+ " --force-bind-key-ifindex: Force enable binding key to ifindex\n"
" (default: only if -I is passed)\n"
+ " -A password use RFC5925 TCP Authentication Option with password\n"
"\n"
" -g grp multicast group (e.g., 239.1.1.1)\n"
" -i interactive mode (default is echo and terminate)\n"
"\n"
" -0 addr Expected local address\n"
@@ -2022,10 +2077,13 @@ int main(int argc, char *argv[])
args.client_pw = optarg;
break;
case 'm':
args.key_addr_prefix_str = optarg;
break;
+ case 'A':
+ args.authopt_password = optarg;
+ break;
case 'S':
args.use_setsockopt = 1;
break;
case 'f':
args.use_freebind = 1;
@@ -2085,12 +2143,17 @@ int main(int argc, char *argv[])
args.type != SOCK_STREAM)) {
log_error("MD5 passwords apply to TCP only and require a remote ip for the password\n");
return 1;
}

- if (args.key_addr_prefix_str && !args.password) {
- log_error("Prefix range for MD5 protection specified without a password\n");
+ if (args.key_addr_prefix_str && !args.password && !args.authopt_password) {
+ log_error("Prefix range for authentication requires -M or -A\n");
+ return 1;
+ }
+
+ if (args.key_addr_prefix_len && args.authopt_password) {
+ log_error("TCP-AO does not support prefix match, only full address\n");
return 1;
}

if (iter == 0) {
fprintf(stderr, "Invalid number of messages to send\n");
--
2.25.1
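
The ifindex binding condition in tcp_set_authopt() above follows the
tri-state rule documented for bind_key_ifindex (0: default, -1: force
off, +1: force on). A minimal standalone sketch of that rule follows;
the helper name should_bind_key_ifindex is hypothetical and is not part
of nettest.c.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of nettest.c. It restates the condition
 *   (args->ifindex && args->bind_key_ifindex >= 0) ||
 *    args->bind_key_ifindex >= 1
 * as three explicit cases.
 *
 * ifindex:          non-zero when a device was given with -I
 * bind_key_ifindex: 0 = default, -1 = force off, +1 = force on
 */
static bool should_bind_key_ifindex(int ifindex, int bind_key_ifindex)
{
	if (bind_key_ifindex >= 1)
		return true;	/* --force-bind-key-ifindex */
	if (bind_key_ifindex < 0)
		return false;	/* --no-bind-key-ifindex */
	return ifindex != 0;	/* default: bind only if -I was passed */
}
```

Note that forcing the binding on without -I binds the key to ifindex=0,
which is the case exercised by the "Key bound to ifindex=0" selftests.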


2021-12-08 11:40:07

by Leonard Crestez

Subject: [PATCH v3 14/18] tcp: authopt: Add NOSEND/NORECV flags

Add flags that mark individual keys as invalid for send or recv.
Making keys asymmetric this way is not mentioned in RFC5925 but RFC8177
requires that keys inside a keychain have independent "accept" and
"send" lifetimes.

Flag names are negative so that the default behavior is for keys to be
valid for both send and recv.

Setting both NOSEND and NORECV for a certain peer address on a listen
socket can be used to mean "TCP-AO is required from this peer but no
keys are currently valid".

Signed-off-by: Leonard Crestez <[email protected]>
---
include/uapi/linux/tcp.h | 4 ++++
net/ipv4/tcp_authopt.c | 9 ++++++++-
2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index a7f5f918ed5a..ed27feb93b0e 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -401,16 +401,20 @@ struct tcp_authopt {
*
* @TCP_AUTHOPT_KEY_DEL: Delete the key and ignore non-id fields
* @TCP_AUTHOPT_KEY_EXCLUDE_OPTS: Exclude TCP options from signature
* @TCP_AUTHOPT_KEY_ADDR_BIND: Key only valid for `tcp_authopt.addr`
* @TCP_AUTHOPT_KEY_IFINDEX: Key only valid for `tcp_authopt.ifindex`
+ * @TCP_AUTHOPT_KEY_NOSEND: Key invalid for send (expired)
+ * @TCP_AUTHOPT_KEY_NORECV: Key invalid for recv (expired)
*/
enum tcp_authopt_key_flag {
TCP_AUTHOPT_KEY_DEL = (1 << 0),
TCP_AUTHOPT_KEY_EXCLUDE_OPTS = (1 << 1),
TCP_AUTHOPT_KEY_ADDR_BIND = (1 << 2),
TCP_AUTHOPT_KEY_IFINDEX = (1 << 3),
+ TCP_AUTHOPT_KEY_NOSEND = (1 << 4),
+ TCP_AUTHOPT_KEY_NORECV = (1 << 5),
};

/**
* enum tcp_authopt_alg - Algorithms for TCP Authentication Option
*/
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index b3fd12fcb948..946d598258b1 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -311,10 +311,12 @@ static struct tcp_authopt_key_info *tcp_authopt_lookup_send(struct tcp_authopt_i
int l3index = -1;

hlist_for_each_entry_rcu(key, &info->head, node, 0) {
if (send_id >= 0 && key->send_id != send_id)
continue;
+ if (key->flags & TCP_AUTHOPT_KEY_NOSEND)
+ continue;
if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND)
if (!tcp_authopt_key_match_sk_addr(key, addr_sk))
continue;
if (key->flags & TCP_AUTHOPT_KEY_IFINDEX) {
if (l3index < 0)
@@ -593,11 +595,13 @@ void tcp_authopt_clear(struct sock *sk)

#define TCP_AUTHOPT_KEY_KNOWN_FLAGS ( \
TCP_AUTHOPT_KEY_DEL | \
TCP_AUTHOPT_KEY_EXCLUDE_OPTS | \
TCP_AUTHOPT_KEY_ADDR_BIND | \
- TCP_AUTHOPT_KEY_IFINDEX)
+ TCP_AUTHOPT_KEY_IFINDEX | \
+ TCP_AUTHOPT_KEY_NOSEND | \
+ TCP_AUTHOPT_KEY_NORECV)

int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
{
struct tcp_authopt_key opt;
struct tcp_authopt_info *info;
@@ -1496,10 +1500,13 @@ static struct tcp_authopt_key_info *tcp_authopt_lookup_recv(struct sock *sk,

if (l3index != key->l3index)
continue;
}
*anykey = true;
+ // Keys with the NORECV flag still count towards anykey, so TCP-AO stays required
+ if (key->flags & TCP_AUTHOPT_KEY_NORECV)
+ continue;
if (recv_id >= 0 && key->recv_id != recv_id)
continue;
if (better_key_match(result, key))
result = key;
else if (result)
--
2.25.1
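
The effect of the two flags on key lookup can be sketched in plain
userspace C. This is a simplified illustration under stated assumptions:
struct example_key and the example_lookup_* helpers are hypothetical
stand-ins for struct tcp_authopt_key_info and the kernel lookup
functions, address/ifindex matching is left out, and the first usable
key is returned instead of the best match. Only the flag values are
taken from the uapi hunk above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined in the uapi hunk of this patch */
#define TCP_AUTHOPT_KEY_NOSEND	(1 << 4)
#define TCP_AUTHOPT_KEY_NORECV	(1 << 5)

/* Hypothetical, reduced stand-in for struct tcp_authopt_key_info */
struct example_key {
	unsigned int flags;
	int send_id;
	int recv_id;
};

/* Send side: NOSEND keys are skipped; send_id < 0 matches any id */
static const struct example_key *
example_lookup_send(const struct example_key *keys, size_t n, int send_id)
{
	for (size_t i = 0; i < n; i++) {
		if (send_id >= 0 && keys[i].send_id != send_id)
			continue;
		if (keys[i].flags & TCP_AUTHOPT_KEY_NOSEND)
			continue;
		return &keys[i];
	}
	return NULL;
}

/*
 * Receive side: *anykey is set for every key seen, including NORECV
 * keys, so "no usable key" can be told apart from "no key configured".
 */
static const struct example_key *
example_lookup_recv(const struct example_key *keys, size_t n, int recv_id,
		    bool *anykey)
{
	*anykey = false;
	for (size_t i = 0; i < n; i++) {
		*anykey = true;
		if (keys[i].flags & TCP_AUTHOPT_KEY_NORECV)
			continue;
		if (recv_id >= 0 && keys[i].recv_id != recv_id)
			continue;
		return &keys[i];
	}
	return NULL;
}
```

With a single key carrying both flags, both lookups return NULL while
anykey is true, which corresponds to the "TCP-AO is required from this
peer but no keys are currently valid" case from the commit message.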


2021-12-08 11:40:07

by Leonard Crestez

Subject: [PATCH v3 15/18] tcp: authopt: Add prefixlen support

This allows making a key apply to an addr/prefix instead of just the
full addr. This is enabled through a dedicated flag; the default
behavior is still a full address match.

This is equivalent to TCP_MD5SIG_FLAG_PREFIX from TCP_MD5SIG and has
the same use-cases.

Signed-off-by: Leonard Crestez <[email protected]>
---
include/net/tcp_authopt.h | 2 ++
include/uapi/linux/tcp.h | 10 ++++++
net/ipv4/tcp_authopt.c | 67 ++++++++++++++++++++++++++++++++++++---
3 files changed, 75 insertions(+), 4 deletions(-)

diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index e57bc732f737..6c50395c0412 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -41,10 +41,12 @@ struct tcp_authopt_key_info {
u8 keylen;
/** @key: Same as &tcp_authopt_key.key */
u8 key[TCP_AUTHOPT_MAXKEYLEN];
/** @l3index: Same as &tcp_authopt_key.ifindex */
int l3index;
+ /** @prefixlen: Length of addr match (default full) */
+ int prefixlen;
/** @addr: Same as &tcp_authopt_key.addr */
struct sockaddr_storage addr;
/** @alg: Algorithm implementation matching alg_id */
struct tcp_authopt_alg_imp *alg;
};
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index ed27feb93b0e..b1063e1e1b9f 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -403,18 +403,21 @@ struct tcp_authopt {
* @TCP_AUTHOPT_KEY_EXCLUDE_OPTS: Exclude TCP options from signature
* @TCP_AUTHOPT_KEY_ADDR_BIND: Key only valid for `tcp_authopt.addr`
* @TCP_AUTHOPT_KEY_IFINDEX: Key only valid for `tcp_authopt.ifindex`
* @TCP_AUTHOPT_KEY_NOSEND: Key invalid for send (expired)
* @TCP_AUTHOPT_KEY_NORECV: Key invalid for recv (expired)
+ * @TCP_AUTHOPT_KEY_PREFIXLEN: Valid value in `tcp_authopt.prefixlen`, otherwise
+ * match full address length
*/
enum tcp_authopt_key_flag {
TCP_AUTHOPT_KEY_DEL = (1 << 0),
TCP_AUTHOPT_KEY_EXCLUDE_OPTS = (1 << 1),
TCP_AUTHOPT_KEY_ADDR_BIND = (1 << 2),
TCP_AUTHOPT_KEY_IFINDEX = (1 << 3),
TCP_AUTHOPT_KEY_NOSEND = (1 << 4),
TCP_AUTHOPT_KEY_NORECV = (1 << 5),
+ TCP_AUTHOPT_KEY_PREFIXLEN = (1 << 6),
};

/**
* enum tcp_authopt_alg - Algorithms for TCP Authentication Option
*/
@@ -465,10 +468,17 @@ struct tcp_authopt_key {
* connections through this interface. Interface must be an vrf master.
*
* This is similar to `tcp_msg5sig.tcpm_ifindex`
*/
int ifindex;
+ /**
+ * @prefixlen: length of prefix to match
+ *
+ * Without the TCP_AUTHOPT_KEY_PREFIXLEN flag this is ignored and a full
+ * address match is performed.
+ */
+ int prefixlen;
};

/* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */

#define TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT 0x1
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index 946d598258b1..5069bb80054e 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -1,8 +1,13 @@
// SPDX-License-Identifier: GPL-2.0-or-later

+#include "linux/in6.h"
+#include "linux/inetdevice.h"
#include "linux/net.h"
+#include "linux/socket.h"
+#include "linux/tcp.h"
+#include "net/ipv6.h"
#include <linux/kernel.h>
#include <net/tcp.h>
#include <net/tcp_authopt.h>
#include <crypto/hash.h>

@@ -219,10 +224,14 @@ static bool tcp_authopt_key_match_exact(struct tcp_authopt_key_info *info,
return false;
if ((info->flags & TCP_AUTHOPT_KEY_IFINDEX) != (key->flags & TCP_AUTHOPT_KEY_IFINDEX))
return false;
if ((info->flags & TCP_AUTHOPT_KEY_IFINDEX) && info->l3index != key->ifindex)
return false;
+ if ((info->flags & TCP_AUTHOPT_KEY_PREFIXLEN) != (key->flags & TCP_AUTHOPT_KEY_PREFIXLEN))
+ return false;
+ if ((info->flags & TCP_AUTHOPT_KEY_PREFIXLEN) && info->prefixlen != key->prefixlen)
+ return false;
if ((info->flags & TCP_AUTHOPT_KEY_ADDR_BIND) != (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND))
return false;
if (info->flags & TCP_AUTHOPT_KEY_ADDR_BIND)
if (!ipvx_addr_match(&info->addr, &key->addr))
return false;
@@ -236,17 +245,20 @@ static bool tcp_authopt_key_match_skb_addr(struct tcp_authopt_key_info *key,
u16 keyaf = key->addr.ss_family;
struct iphdr *iph = (struct iphdr *)skb_network_header(skb);

if (keyaf == AF_INET && iph->version == 4) {
struct sockaddr_in *key_addr = (struct sockaddr_in *)&key->addr;
+ u32 mask = inet_make_mask(key->prefixlen);

- return iph->saddr == key_addr->sin_addr.s_addr;
+ return (iph->saddr & mask) == key_addr->sin_addr.s_addr;
} else if (keyaf == AF_INET6 && iph->version == 6) {
struct ipv6hdr *ip6h = (struct ipv6hdr *)skb_network_header(skb);
struct sockaddr_in6 *key_addr = (struct sockaddr_in6 *)&key->addr;

- return ipv6_addr_equal(&ip6h->saddr, &key_addr->sin6_addr);
+ return ipv6_prefix_equal(&ip6h->saddr,
+ &key_addr->sin6_addr,
+ key->prefixlen);
}

/* This actually happens with ipv6-mapped-ipv4-addresses
* IPv6 listen sockets will be asked to validate ipv4 packets.
*/
@@ -262,16 +274,19 @@ static bool tcp_authopt_key_match_sk_addr(struct tcp_authopt_key_info *key,
if (keyaf != addr_sk->sk_family)
return false;

if (keyaf == AF_INET) {
struct sockaddr_in *key_addr = (struct sockaddr_in *)&key->addr;
+ u32 mask = inet_make_mask(key->prefixlen);

- return addr_sk->sk_daddr == key_addr->sin_addr.s_addr;
+ return (addr_sk->sk_daddr & mask) == key_addr->sin_addr.s_addr;
} else if (keyaf == AF_INET6) {
struct sockaddr_in6 *key_addr = (struct sockaddr_in6 *)&key->addr;

- return ipv6_addr_equal(&addr_sk->sk_v6_daddr, &key_addr->sin6_addr);
+ return ipv6_prefix_equal(&addr_sk->sk_v6_daddr,
+ &key_addr->sin6_addr,
+ key->prefixlen);
}

return false;
}

@@ -296,10 +311,16 @@ static bool better_key_match(struct tcp_authopt_key_info *old, struct tcp_authop
/* l3index always overrides non-l3index */
if (old->l3index && new->l3index == 0)
return false;
if (old->l3index == 0 && new->l3index)
return true;
+ /* Full address match overrides match by prefixlen */
+ if (!(new->flags & TCP_AUTHOPT_KEY_PREFIXLEN) && (old->flags & TCP_AUTHOPT_KEY_PREFIXLEN))
+ return false;
+ /* Longer prefixes are better matches */
+ if (new->prefixlen > old->prefixlen)
+ return true;

return false;
}

static struct tcp_authopt_key_info *tcp_authopt_lookup_send(struct tcp_authopt_info *info,
@@ -596,20 +617,31 @@ void tcp_authopt_clear(struct sock *sk)
#define TCP_AUTHOPT_KEY_KNOWN_FLAGS ( \
TCP_AUTHOPT_KEY_DEL | \
TCP_AUTHOPT_KEY_EXCLUDE_OPTS | \
TCP_AUTHOPT_KEY_ADDR_BIND | \
TCP_AUTHOPT_KEY_IFINDEX | \
+ TCP_AUTHOPT_KEY_PREFIXLEN | \
TCP_AUTHOPT_KEY_NOSEND | \
TCP_AUTHOPT_KEY_NORECV)

+static bool ipv6_addr_is_prefix(struct in6_addr *addr, int plen)
+{
+ struct in6_addr copy;
+
+ ipv6_addr_prefix(&copy, addr, plen);
+
+ return !memcmp(&copy, addr, sizeof(*addr));
+}
+
int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
{
struct tcp_authopt_key opt;
struct tcp_authopt_info *info;
struct tcp_authopt_key_info *key_info, *old_key_info;
struct tcp_authopt_alg_imp *alg;
int l3index = 0;
+ int prefixlen;
int err;

sock_owned_by_me(sk);
err = check_sysctl_tcp_authopt();
if (err)
@@ -641,10 +673,36 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
if (opt.flags & TCP_AUTHOPT_KEY_ADDR_BIND) {
if (sk->sk_family != opt.addr.ss_family)
return -EINVAL;
}

+ /* check prefixlen */
+ if (opt.flags & TCP_AUTHOPT_KEY_PREFIXLEN) {
+ prefixlen = opt.prefixlen;
+ if (sk->sk_family == AF_INET) {
+ if (prefixlen < 0 || prefixlen > 32)
+ return -EINVAL;
+ if (((struct sockaddr_in *)&opt.addr)->sin_addr.s_addr &
+ ~inet_make_mask(prefixlen))
+ return -EINVAL;
+ }
+ if (sk->sk_family == AF_INET6) {
+ if (prefixlen < 0 || prefixlen > 128)
+ return -EINVAL;
+ if (!ipv6_addr_is_prefix(&((struct sockaddr_in6 *)&opt.addr)->sin6_addr,
+ prefixlen))
+ return -EINVAL;
+ }
+ } else {
+ if (sk->sk_family == AF_INET)
+ prefixlen = 32;
+ else if (sk->sk_family == AF_INET6)
+ prefixlen = 128;
+ else
+ return -EINVAL;
+ }
+
/* Initialize tcp_authopt_info if not already set */
info = __tcp_authopt_info_get_or_create(sk);
if (IS_ERR(info))
return PTR_ERR(info);

@@ -688,10 +746,11 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
key_info->alg = alg;
key_info->keylen = opt.keylen;
memcpy(key_info->key, opt.key, opt.keylen);
memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr));
key_info->l3index = l3index;
+ key_info->prefixlen = prefixlen;
hlist_add_head_rcu(&key_info->node, &info->head);

return 0;
}

--
2.25.1
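
The two prefix checks in this patch (rejecting a key whose address has
bits set beyond prefixlen, and matching a peer address against the
configured prefix) can be sketched in portable userspace C. This is an
illustration only: the example_* names are hypothetical, addresses are
in network byte order as in struct in_addr, and callers are assumed to
pass 0..32 for IPv4 and 0..128 for IPv6.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Network-order mask for an IPv4 prefix length, similar in spirit to
 * the kernel's inet_make_mask().
 */
static uint32_t example_make_mask(int prefixlen)
{
	if (prefixlen <= 0)
		return 0;
	return htonl(~UINT32_C(0) << (32 - prefixlen));
}

/* True if no bits are set beyond the prefix (valid key address) */
static bool example_v4_is_prefix(uint32_t addr, int prefixlen)
{
	return (addr & ~example_make_mask(prefixlen)) == 0;
}

/* True if peer falls inside key_addr/prefixlen */
static bool example_v4_prefix_match(uint32_t key_addr, int prefixlen,
				    uint32_t peer)
{
	return (peer & example_make_mask(prefixlen)) == key_addr;
}

/* IPv6 variant: zero everything past the prefix, then compare */
static bool example_v6_is_prefix(const struct in6_addr *addr, int plen)
{
	struct in6_addr copy;
	int bytes = plen / 8, bits = plen % 8;

	memset(&copy, 0, sizeof(copy));
	memcpy(copy.s6_addr, addr->s6_addr, bytes);
	if (bits)
		copy.s6_addr[bytes] = addr->s6_addr[bytes] & (0xff00 >> bits);

	return memcmp(&copy, addr, sizeof(copy)) == 0;
}
```

For example, 172.16.1.0 with prefixlen 24 is a valid key address and
matches 172.16.1.2, while 172.16.1.2 with prefixlen 24 would be
rejected because host bits are set.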


2021-12-08 11:40:11

by Leonard Crestez

Subject: [PATCH v3 13/18] tcp: authopt: Add initial l3index support

This is a parallel feature to tcp_md5sig.tcpm_ifindex support and allows
applications to server multiple VRFs with a single socket.

The ifindex argument must be the ifindex of a VRF device and must match
exactly: keys with ifindex == 0 (outside of VRF) will not match
connections inside a VRF.

Keys without the TCP_AUTHOPT_KEY_IFINDEX flag ignore ifindex and match
both inside and outside a VRF.
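
The matching rule described above can be sketched in plain C as
follows. The struct ex_key type and the EX_KEY_IFINDEX constant are
hypothetical stand-ins for tcp_authopt_key_info and
TCP_AUTHOPT_KEY_IFINDEX; the sketch only models the l3index comparison
and ignores address and key id matching.

```c
#include <stdbool.h>

/* Hypothetical flag for illustration; the patch defines
 * TCP_AUTHOPT_KEY_IFINDEX as (1 << 3). */
#define EX_KEY_IFINDEX (1 << 3)

struct ex_key {
	unsigned int flags;
	int l3index; /* ifindex of the VRF device, 0 means outside any VRF */
};

/* conn_l3index is the VRF ifindex of the connection, 0 when the
 * connection is not inside a VRF. */
static bool ex_key_matches_l3index(const struct ex_key *key, int conn_l3index)
{
	/* Keys without the flag ignore the VRF entirely. */
	if (!(key->flags & EX_KEY_IFINDEX))
		return true;
	/* Keys with the flag must match exactly, including 0. */
	return key->l3index == conn_l3index;
}
```

A key with the flag set and l3index == 0 therefore matches only
connections outside any VRF, while a key without the flag matches
everywhere.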

Signed-off-by: Leonard Crestez <[email protected]>
---
include/net/tcp_authopt.h | 2 ++
include/uapi/linux/tcp.h | 11 ++++++
net/ipv4/tcp_authopt.c | 71 ++++++++++++++++++++++++++++++++++++---
3 files changed, 79 insertions(+), 5 deletions(-)

diff --git a/include/net/tcp_authopt.h b/include/net/tcp_authopt.h
index 020637265ce9..e57bc732f737 100644
--- a/include/net/tcp_authopt.h
+++ b/include/net/tcp_authopt.h
@@ -39,10 +39,12 @@ struct tcp_authopt_key_info {
u8 alg_id;
/** @keylen: Same as &tcp_authopt_key.keylen */
u8 keylen;
/** @key: Same as &tcp_authopt_key.key */
u8 key[TCP_AUTHOPT_MAXKEYLEN];
+ /** @l3index: Same as &tcp_authopt_key.ifindex */
+ int l3index;
/** @addr: Same as &tcp_authopt_key.addr */
struct sockaddr_storage addr;
/** @alg: Algorithm implementation matching alg_id */
struct tcp_authopt_alg_imp *alg;
};
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index e02176390519..a7f5f918ed5a 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -400,15 +400,17 @@ struct tcp_authopt {
* enum tcp_authopt_key_flag - flags for `tcp_authopt.flags`
*
* @TCP_AUTHOPT_KEY_DEL: Delete the key and ignore non-id fields
* @TCP_AUTHOPT_KEY_EXCLUDE_OPTS: Exclude TCP options from signature
* @TCP_AUTHOPT_KEY_ADDR_BIND: Key only valid for `tcp_authopt.addr`
+ * @TCP_AUTHOPT_KEY_IFINDEX: Key only valid for `tcp_authopt.ifindex`
*/
enum tcp_authopt_key_flag {
TCP_AUTHOPT_KEY_DEL = (1 << 0),
TCP_AUTHOPT_KEY_EXCLUDE_OPTS = (1 << 1),
TCP_AUTHOPT_KEY_ADDR_BIND = (1 << 2),
+ TCP_AUTHOPT_KEY_IFINDEX = (1 << 3),
};

/**
* enum tcp_authopt_alg - Algorithms for TCP Authentication Option
*/
@@ -450,10 +452,19 @@ struct tcp_authopt_key {
* @addr: Key is only valid for this address
*
* Ignored unless TCP_AUTHOPT_KEY_ADDR_BIND flag is set
*/
struct __kernel_sockaddr_storage addr;
+ /**
+ * @ifindex: ifindex of vrf (l3mdev_master) interface
+ *
+ * If the TCP_AUTHOPT_KEY_IFINDEX flag is set then key only applies for
+ * connections through this interface. Interface must be a vrf master.
+ *
+ * This is similar to `tcp_md5sig.tcpm_ifindex`
+ */
+ int ifindex;
};

/* setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) */

#define TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT 0x1
diff --git a/net/ipv4/tcp_authopt.c b/net/ipv4/tcp_authopt.c
index a8950c9a7e84..b3fd12fcb948 100644
--- a/net/ipv4/tcp_authopt.c
+++ b/net/ipv4/tcp_authopt.c
@@ -1,7 +1,8 @@
// SPDX-License-Identifier: GPL-2.0-or-later

+#include <linux/net.h>
#include <linux/kernel.h>
#include <net/tcp.h>
#include <net/tcp_authopt.h>
#include <crypto/hash.h>

@@ -214,10 +215,14 @@ static bool tcp_authopt_key_match_exact(struct tcp_authopt_key_info *info,
{
if (info->send_id != key->send_id)
return false;
if (info->recv_id != key->recv_id)
return false;
+ if ((info->flags & TCP_AUTHOPT_KEY_IFINDEX) != (key->flags & TCP_AUTHOPT_KEY_IFINDEX))
+ return false;
+ if ((info->flags & TCP_AUTHOPT_KEY_IFINDEX) && info->l3index != key->ifindex)
+ return false;
if ((info->flags & TCP_AUTHOPT_KEY_ADDR_BIND) != (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND))
return false;
if (info->flags & TCP_AUTHOPT_KEY_ADDR_BIND)
if (!ipvx_addr_match(&info->addr, &key->addr))
return false;
@@ -281,26 +286,49 @@ static struct tcp_authopt_key_info *tcp_authopt_key_lookup_exact(const struct so
return key_info;

return NULL;
}

+static bool better_key_match(struct tcp_authopt_key_info *old, struct tcp_authopt_key_info *new)
+{
+ if (!old)
+ return true;
+
+ /* l3index always overrides non-l3index */
+ if (old->l3index && new->l3index == 0)
+ return false;
+ if (old->l3index == 0 && new->l3index)
+ return true;
+
+ return false;
+}
+
static struct tcp_authopt_key_info *tcp_authopt_lookup_send(struct tcp_authopt_info *info,
const struct sock *addr_sk,
int send_id)
{
struct tcp_authopt_key_info *result = NULL;
struct tcp_authopt_key_info *key;
+ int l3index = -1;

hlist_for_each_entry_rcu(key, &info->head, node, 0) {
if (send_id >= 0 && key->send_id != send_id)
continue;
if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND)
if (!tcp_authopt_key_match_sk_addr(key, addr_sk))
continue;
- if (result && net_ratelimit())
- pr_warn("ambiguous tcp authentication keys configured for send\n");
- result = key;
+ if (key->flags & TCP_AUTHOPT_KEY_IFINDEX) {
+ if (l3index < 0)
+ l3index = l3mdev_master_ifindex_by_index(sock_net(addr_sk),
+ addr_sk->sk_bound_dev_if);
+ if (l3index != key->l3index)
+ continue;
+ }
+ if (better_key_match(result, key))
+ result = key;
+ else if (result)
+ net_warn_ratelimited("ambiguous tcp authentication keys configured for send\n");
}

return result;
}

@@ -564,18 +592,20 @@ void tcp_authopt_clear(struct sock *sk)
}

#define TCP_AUTHOPT_KEY_KNOWN_FLAGS ( \
TCP_AUTHOPT_KEY_DEL | \
TCP_AUTHOPT_KEY_EXCLUDE_OPTS | \
- TCP_AUTHOPT_KEY_ADDR_BIND)
+ TCP_AUTHOPT_KEY_ADDR_BIND | \
+ TCP_AUTHOPT_KEY_IFINDEX)

int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
{
struct tcp_authopt_key opt;
struct tcp_authopt_info *info;
struct tcp_authopt_key_info *key_info, *old_key_info;
struct tcp_authopt_alg_imp *alg;
+ int l3index = 0;
int err;

sock_owned_by_me(sk);
err = check_sysctl_tcp_authopt();
if (err)
@@ -622,10 +652,24 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
return -EINVAL;
err = tcp_authopt_alg_require(alg);
if (err)
return err;

+ /* check ifindex is valid (zero is always valid) */
+ if (opt.flags & TCP_AUTHOPT_KEY_IFINDEX && opt.ifindex) {
+ struct net_device *dev;
+
+ rcu_read_lock();
+ dev = dev_get_by_index_rcu(sock_net(sk), opt.ifindex);
+ if (dev && netif_is_l3_master(dev))
+ l3index = dev->ifindex;
+ rcu_read_unlock();
+
+ if (!l3index)
+ return -EINVAL;
+ }
+
key_info = sock_kmalloc(sk, sizeof(*key_info), GFP_KERNEL | __GFP_ZERO);
if (!key_info)
return -ENOMEM;
/* If an old key exists with exact ID then remove and replace.
* RCU-protected readers might observe both and pick any.
@@ -639,10 +683,11 @@ int tcp_set_authopt_key(struct sock *sk, sockptr_t optval, unsigned int optlen)
key_info->alg_id = opt.alg;
key_info->alg = alg;
key_info->keylen = opt.keylen;
memcpy(key_info->key, opt.key, opt.keylen);
memcpy(&key_info->addr, &opt.addr, sizeof(key_info->addr));
+ key_info->l3index = l3index;
hlist_add_head_rcu(&key_info->node, &info->head);

return 0;
}

@@ -1427,21 +1472,37 @@ static struct tcp_authopt_key_info *tcp_authopt_lookup_recv(struct sock *sk,
int recv_id,
bool *anykey)
{
struct tcp_authopt_key_info *result = NULL;
struct tcp_authopt_key_info *key;
+ int l3index = -1;

*anykey = false;
/* multiple matches will cause occasional failures */
hlist_for_each_entry_rcu(key, &info->head, node, 0) {
if (key->flags & TCP_AUTHOPT_KEY_ADDR_BIND &&
!tcp_authopt_key_match_skb_addr(key, skb))
continue;
+ if (key->flags & TCP_AUTHOPT_KEY_IFINDEX) {
+ if (l3index < 0) {
+ if (skb->protocol == htons(ETH_P_IP)) {
+ l3index = inet_sdif(skb) ? inet_iif(skb) : 0;
+ } else if (skb->protocol == htons(ETH_P_IPV6)) {
+ l3index = inet6_sdif(skb) ? inet6_iif(skb) : 0;
+ } else {
+ WARN_ONCE(1, "unexpected skb->protocol=%x", skb->protocol);
+ continue;
+ }
+ }
+
+ if (l3index != key->l3index)
+ continue;
+ }
*anykey = true;
if (recv_id >= 0 && key->recv_id != recv_id)
continue;
- if (!result)
+ if (better_key_match(result, key))
result = key;
else if (result)
net_warn_ratelimited("ambiguous tcp authentication keys configured for recv\n");
}

--
2.25.1


2021-12-13 10:40:47

by Leonard Crestez

Subject: Re: [PATCH v3 00/18] tcp: Initial support for RFC5925 auth option

On 12/8/21 1:37 PM, Leonard Crestez wrote:
> This is similar to TCP MD5 in functionality but it's sufficiently
> different that wire formats are incompatible. Compared to TCP-MD5 more
> algorithms are supported and multiple keys can be used on the same
> connection but there is still no negotiation mechanism.
>
> Expected use-case is protecting long-duration BGP/LDP connections
> between routers using pre-shared keys. The goal of this series is to
> allow routers using the linux TCP stack to interoperate with vendors
> such as Cisco and Juniper.
>
> Both algorithms described in RFC5926 are implemented but the code is not
> very easily extensible beyond that. In particular there are several code
> paths making stack allocations based on RFC5926 maximum, those would
> have to be increased. Support for arbitrary algorithms was requested
> in reply to previous posts but I believe there is no real use case for
> that.
>
> The current implementation is somewhat loose regarding configuration:
> * Overlaping MKTs can be configured despite what RFC5925 says
> * Current key can be deleted
> * If multiple keys are valid for a destination the kernel picks one
> in an unpredictable manner (this can be overridden).
> These conditions could be tightened but it is not clear the kernel
> should prevent misconfiguration from userspace.
>
> This version implements prefixlen and incorporates comments from v2 as
> well as some unrelated fixes. Here are some known flaws and limitations:
>
> * Crypto API is used with buffers on the stack and inside struct sock,
> this might not work on all arches. I'm currently only testing x64 VMs
> * Interaction with TCP-MD5 not tested in all corners.
> * Interaction with FASTOPEN not tested and unlikely to work because
> sequence number assumptions for syn/ack.
> * Not clear if crypto_ahash_setkey might sleep. If some implementation
> do that then maybe they could be excluded through alloc flags.
> * Traffic key is not cached (reducing performance)
> * There is no useful way to list keys, making userspace debug difficult.
>
> Some testing support is included in nettest and fcnal-test.sh, similar
> to the current level of tcp-md5 testing.
>
> A more elaborate test suite using pytest and scapy is available out of
> tree: https://github.com/cdleonard/tcp-authopt-test That test suite is
> much larger that the kernel code and did not receive many comments so
> I will attempt to push it separately (if at all).
>
> Changes for frr (old): https://github.com/FRRouting/frr/pull/9442
> That PR was made early for ABI feedback, it has many issues.
>
> Changes for yabgp (old): https://github.com/cdleonard/yabgp/commits/tcp_authopt
> This can be use for easy interoperability testing with cisco/juniper/etc.
>
> Changes since PATCH v2:
> * Protect tcp_authopt_alg_get/put_tfm with local_bh_disable instead of
> preempt_disable. This caused signature corruption when send path executing
> with BH enabled was interrupted by recv.
> * Fix accepted keyids not configured locally as "unexpected". If any key
> is configured that matches the peer then traffic MUST be signed.
> * Fix issues related to sne rollover during handshake itself. (Francesco)
> * Implement and test prefixlen (David)
> * Replace shash with ahash and reuse some of the MD5 code (Dmitry)
> * Parse md5+ao options only once in the same function (Dmitry)
> * Pass tcp_authopt_info into inbound check path, this avoids second rcu
> dereference for same packet.
> * Pass tcp_request_socket into inbound check path instead of just listen
> socket. This is required for SNE rollover during handshake and clearifies
> ISN handling.
> * Do not allow disabling via sysctl after enabling once, this is difficult
> to support well (David)
> * Verbose check for sysctl_tcp_authopt (Dmitry)
> * Use netif_index_is_l3_master (David)
> * Cleanup ipvx_addr_match (David)
> * Add a #define tcp_authopt_needed to wrap static key usage because it looks
> nicer.
> * Replace rcu_read_lock with rcu_dereference_protected in SNE updates (Eric)
> Link: https://lore.kernel.org/netdev/[email protected]/
>
> Changes since PATCH v1:
> * Implement Sequence Number Extension
> * Implement l3index for vrf: TCP_AUTHOPT_KEY_IFINDEX as equivalent of
> TCP_MD5SIG_FLAG_IFINDEX
> * Expand TCP-AO tests in fcnal-test.sh to near-parity with md5.
> * Show addr/port on failure similar to md5
> * Remove tox dependency from test suite (create venv directly)
> * Switch default pytest output format to TAP (kselftest standard)
> * Fix _copy_from_sockptr_tolerant stack corruption on short sockopts.
> This was covered in test but error was invisible without STACKPROTECTOR=y
> * Fix sysctl_tcp_authopt check in tcp_get_authopt_val before memset. This
> was harmless because error code is checked in getsockopt anyway.
> * Fix dropping md5 packets on all sockets with AO enabled
> * Fix checking (key->recv_id & TCP_AUTHOPT_KEY_ADDR_BIND) instead of
> key->flags in tcp_authopt_key_match_exact
> * Fix PATCH 1/19 not compiling due to missing "int err" declaration
> * Add ratelimited message for AO and MD5 both present
> * Export all symbols required by CONFIG_IPV6=m (again)
> * Fix compilation with CONFIG_TCP_AUTHOPT=y CONFIG_TCP_MD5SIG=n
> * Fix checkpatch issues
> * Pass -rrequirements.txt to tox to avoid dependency variation.
> Link: https://lore.kernel.org/netdev/[email protected]/
>
> Changes since RFCv3:
> * Implement TCP_AUTHOPT handling for timewait and reset replies. Write
> tests to execute these paths by injecting packets with scapy
> * Handle combining md5 and authopt: if both are configured use authopt.
> * Fix locking issues around send_key, introduced in on of the later patches.
> * Handle IPv4-mapped-IPv6 addresses: it used to be that an ipv4 SYN sent
> to an ipv6 socket with TCP-AO triggered WARN
> * Implement un-namespaced sysctl disabled this feature by default
> * Allocate new key before removing any old one in setsockopt (Dmitry)
> * Remove tcp_authopt_key_info.local_id because it's no longer used (Dmitry)
> * Propagate errors from TCP_AUTHOPT getsockopt (Dmitry)
> * Fix no-longer-correct TCP_AUTHOPT_KEY_DEL docs (Dmitry)
> * Simplify crypto allocation (Eric)
> * Use kzmalloc instead of __GFP_ZERO (Eric)
> * Add static_key_false tcp_authopt_needed (Eric)
> * Clear authopt_info copied from oldsk in __tcp_authopt_openreq (Eric)
> * Replace memcmp in ipv4 and ipv6 addr comparisons (Eric)
> * Export symbols for CONFIG_IPV6=m (kernel test robot)
> * Mark more functions static (kernel test robot)
> * Fix build with CONFIG_PROVE_RCU_LIST=y (kernel test robot)
> Link: https://lore.kernel.org/netdev/[email protected]/
>
> Changes since RFCv2:
> * Removed local_id from ABI and match on send_id/recv_id/addr
> * Add all relevant out-of-tree tests to tools/testing/selftests
> * Return an error instead of ignoring unknown flags, hopefully this makes
> it easier to extend.
> * Check sk_family before __tcp_authopt_info_get_or_create in tcp_set_authopt_key
> * Use sock_owned_by_me instead of WARN_ON(!lockdep_sock_is_held(sk))
> * Fix some intermediate build failures reported by kbuild robot
> * Improve documentation
> Link: https://lore.kernel.org/netdev/[email protected]/
>
> Changes since RFC:
> * Split into per-topic commits for ease of review. The intermediate
> commits compile with a few "unused function" warnings and don't do
> anything useful by themselves.
> * Add ABI documention including kernel-doc on uapi
> * Fix lockdep warnings from crypto by creating pools with one shash for
> each cpu
> * Accept short options to setsockopt by padding with zeros; this
> approach allows increasing the size of the structs in the future.
> * Support for aes-128-cmac-96
> * Support for binding addresses to keys in a way similar to old tcp_md5
> * Add support for retrieving received keyid/rnextkeyid and controling
> the keyid/rnextkeyid being sent.
> Link: https://lore.kernel.org/netdev/01383a8751e97ef826ef2adf93bfde3a08195a43.1626693859.git.cdleonard@gmail.com/
> ```
>
> Leonard Crestez (18):
> tcp: authopt: Initial support and key management
> docs: Add user documentation for tcp_authopt
> tcp: authopt: Add crypto initialization
> tcp: md5: Refactor tcp_sig_hash_skb_data for AO
> tcp: authopt: Compute packet signatures
> tcp: authopt: Hook into tcp core
> tcp: authopt: Disable via sysctl by default
> tcp: authopt: Implement Sequence Number Extension
> tcp: ipv6: Add AO signing for tcp_v6_send_response
> tcp: authopt: Add support for signing skb-less replies
> tcp: ipv4: Add AO signing for skb-less replies
> tcp: authopt: Add key selection controls
> tcp: authopt: Add initial l3index support
> tcp: authopt: Add NOSEND/NORECV flags
> tcp: authopt: Add prefixlen support
> selftests: nettest: Rename md5_prefix to key_addr_prefix
> selftests: nettest: Initial tcp_authopt support
> selftests: net/fcnal: Initial tcp_authopt support
>
> Documentation/networking/index.rst | 1 +
> Documentation/networking/ip-sysctl.rst | 6 +
> Documentation/networking/tcp_authopt.rst | 69 +
> include/linux/tcp.h | 9 +
> include/net/tcp.h | 27 +-
> include/net/tcp_authopt.h | 316 ++++
> include/uapi/linux/snmp.h | 1 +
> include/uapi/linux/tcp.h | 137 ++
> net/ipv4/Kconfig | 14 +
> net/ipv4/Makefile | 1 +
> net/ipv4/proc.c | 1 +
> net/ipv4/sysctl_net_ipv4.c | 39 +
> net/ipv4/tcp.c | 68 +-
> net/ipv4/tcp_authopt.c | 1671 +++++++++++++++++++++
> net/ipv4/tcp_input.c | 41 +-
> net/ipv4/tcp_ipv4.c | 136 +-
> net/ipv4/tcp_minisocks.c | 12 +
> net/ipv4/tcp_output.c | 86 +-
> net/ipv6/tcp_ipv6.c | 108 +-
> tools/testing/selftests/net/fcnal-test.sh | 298 ++++
> tools/testing/selftests/net/nettest.c | 123 +-
> 21 files changed, 3085 insertions(+), 79 deletions(-)
> create mode 100644 Documentation/networking/tcp_authopt.rst
> create mode 100644 include/net/tcp_authopt.h
> create mode 100644 net/ipv4/tcp_authopt.c
>
>
> base-commit: 1fe5b01262844be03de98afdd56d1d393df04d7e


One issue that I realized recently is that there is an ugly race with
the following steps:

1) remote connect
2) server key delete on listen socket
3) server accept

Under the current implementation the new file descriptor created at step
3 has a set of keys that is essentially unpredictable. This is because
the list of keys is copied from the listen socket to the newly accepted
socket at "tcp_create_openreq_child" time, which happens after the 3-way
handshake is completed (with an ACK from the client) but before accept().

The simplest incorrect behavior is that an incoming connection using a
key that is being deleted may be accepted and it will last indefinitely.
But it's worse than that: if server sockets have an arbitrarily old
version of the key chain then userspace can't reliably implement key
management.
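
The race can be modeled with a small userspace sketch. Everything here
is hypothetical: struct ex_sock only tracks key ids, and
ex_create_child stands in for the copy done at
tcp_create_openreq_child time. It is meant to show the ordering
problem, not the kernel data structures.

```c
#include <stdbool.h>
#include <string.h>

#define EX_MAX_KEYS 4

/* Hypothetical per-socket key list; only key ids are tracked. */
struct ex_sock {
	int key_ids[EX_MAX_KEYS];
	int nkeys;
};

static void ex_key_add(struct ex_sock *sk, int id)
{
	if (sk->nkeys < EX_MAX_KEYS)
		sk->key_ids[sk->nkeys++] = id;
}

static void ex_key_del(struct ex_sock *sk, int id)
{
	for (int i = 0; i < sk->nkeys; i++) {
		if (sk->key_ids[i] == id) {
			/* Move the last entry into the freed slot. */
			sk->key_ids[i] = sk->key_ids[--sk->nkeys];
			return;
		}
	}
}

static bool ex_key_present(const struct ex_sock *sk, int id)
{
	for (int i = 0; i < sk->nkeys; i++)
		if (sk->key_ids[i] == id)
			return true;
	return false;
}

/* Stand-in for the snapshot taken when the handshake completes. */
static void ex_create_child(struct ex_sock *child, const struct ex_sock *listener)
{
	memcpy(child, listener, sizeof(*child));
}
```

If ex_create_child runs before ex_key_del on the listener, the child
keeps the deleted key even though accept() has not returned yet, which
is the behavior described in the steps above.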

Possible solutions:

1. Make linux TCP copy the keys list from the listen socket on "accept"
specifically. Not clear where in the code.
2. Add a sockopt to copy keys on demand from listen to accept. Like
solution 1 but dumped on userspace.
3. Make userspace check if current key on accepted sockets is expired?
Not clear how exactly and right now there isn't even an interface to
list current keys on a socket.
4. Make keys global, per-namespace instead of per-socket.

Solution 4 is quite a large change to make so late but it would also
simplify userspace in other ways, for example there would no longer be a
need to do the same keyadd/keydel on all the established sockets.
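
A rough sketch of what solution 4 could look like, assuming a fixed
size per-namespace table that every socket points to. All names
(ex_netns_keys, ex_ns_sock and the helpers) are hypothetical and no
locking or RCU is modeled.

```c
#include <stdbool.h>
#include <stddef.h>

#define EX_NS_MAX_KEYS 4

/* Hypothetical per-namespace key table shared by all sockets in it. */
struct ex_netns_keys {
	int key_ids[EX_NS_MAX_KEYS];
	bool used[EX_NS_MAX_KEYS];
};

/* A socket holds no keys of its own, only a pointer to its namespace. */
struct ex_ns_sock {
	struct ex_netns_keys *ns;
};

static bool ex_ns_key_add(struct ex_netns_keys *ns, int id)
{
	for (size_t i = 0; i < EX_NS_MAX_KEYS; i++) {
		if (!ns->used[i]) {
			ns->used[i] = true;
			ns->key_ids[i] = id;
			return true;
		}
	}
	return false; /* table full */
}

static void ex_ns_key_del(struct ex_netns_keys *ns, int id)
{
	for (size_t i = 0; i < EX_NS_MAX_KEYS; i++)
		if (ns->used[i] && ns->key_ids[i] == id)
			ns->used[i] = false;
}

static bool ex_ns_sock_has_key(const struct ex_ns_sock *sk, int id)
{
	for (size_t i = 0; i < EX_NS_MAX_KEYS; i++)
		if (sk->ns->used[i] && sk->ns->key_ids[i] == id)
			return true;
	return false;
}
```

Because listen and accepted sockets consult the same table, a single
delete is visible to all of them at once and there is no snapshot that
can go stale.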

Can you think of other solutions? TCP_MD5SIG does not suffer from this
very much because it only ever supports one key for each established socket.

---

I also had a recent conversation with Philip Paeps, who plans to
implement TCP-AO for FreeBSD and various open-source userspace daemons,
hopefully in a way that requires minimal OS-specific ifdefs.

His initial idea was to use PF_KEY for a global list of keys but I don't
think it has any clear advantage over just doing SOL_TCP setsockopts on
any random TCP socket. Even if an interface such as PF_KEY, xfrm or
netlink were supported, it would still make sense to keep an internal
list of custom struct tcp_authopt_key. So why not just implement
the most specific interface?

Global keys would also be vaguely aligned to this suggestion from FRR:
(https://github.com/FRRouting/frr/pull/9442#issuecomment-904766419)

In theory it would be possible to support both socket-specific and
global keys, but I can't think of a reason to do so other than
compatibility with older versions of this patch series.

Can anyone think of a realistic scenario that can only be made to work
with per-socket keys?

---

It would also be possible to push all key rollover policy into the
kernel. Current proprietary implementations provide something similar
to RFC8177: optional time intervals for "accept" and "send". This logic
is very simple: it just requires extending the key struct with 4
wall-time fields.
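
As a sketch of how small that logic could be, assuming four
hypothetical fields and the convention that an end time of 0 means
"never expires" (neither is part of this series):

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical extension of a key with four wall-clock fields.
 * An end time of 0 is treated here as "never expires". */
struct ex_key_lifetime {
	time_t send_start;
	time_t send_end;
	time_t accept_start;
	time_t accept_end;
};

/* Half-open interval check: start <= now < end. */
static bool ex_in_interval(time_t now, time_t start, time_t end)
{
	return now >= start && (end == 0 || now < end);
}

static bool ex_key_can_send(const struct ex_key_lifetime *k, time_t now)
{
	return ex_in_interval(now, k->send_start, k->send_end);
}

static bool ex_key_can_accept(const struct ex_key_lifetime *k, time_t now)
{
	return ex_in_interval(now, k->accept_start, k->accept_end);
}
```

Keeping the accept interval wider than the send interval on both ends
would let peers with slightly different clocks roll over without
dropping traffic, but that policy choice is outside this sketch.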

This is not, however, required for making keys global, so it can be a
separate discussion.

I also received a bunch of kernel build robot complaints that I need to fix.

It's also not clear that making the tcp_authopt_info member in twsk a
plain pointer is correct; it might need RCU annotations.

--
Regards,
Leonard