Subject: [PATCH net-next v2 0/6] net/tcp: TCP-AO and TCP-MD5 tracepoints

Signed-off-by: Dmitry Safonov <[email protected]>
---
Changes in v2:
- Fix the build with CONFIG_IPV6=m (Eric Dumazet)
- Move unused keyid/rnext/maclen later in the series to the patch
that uses them (Simon Horman)
- Reworked tcp_ao selftest lib to allow async tracing non-tcp events
(was working on a stress-test that needs trace_kfree_skb() event,
not in this series).
- Separated selftest changes from kernel, as they now have a couple
of unrelated to tracepoints changes
- Wrote a few lines of Documentation/
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Dmitry Safonov (6):
net/tcp: Use static_branch_tcp_{md5,ao} to drop ifdefs
net/tcp: Add a helper tcp_ao_hdr_maclen()
net/tcp: Move tcp_inbound_hash() from headers
net/tcp: Add tcp-md5 and tcp-ao tracepoints
net/tcp: Remove tcp_hash_fail()
Documentation/tcp-ao: Add a few lines on tracepoints

Documentation/networking/tcp_ao.rst | 9 +
include/net/tcp.h | 79 +--------
include/net/tcp_ao.h | 42 +----
include/trace/events/tcp.h | 317 ++++++++++++++++++++++++++++++++++++
net/ipv4/tcp.c | 87 ++++++++--
net/ipv4/tcp_ao.c | 24 +--
net/ipv4/tcp_input.c | 8 +-
net/ipv4/tcp_ipv4.c | 8 +-
net/ipv4/tcp_output.c | 2 +
9 files changed, 434 insertions(+), 142 deletions(-)
---
base-commit: a6ba5125f10bd7307e775e585ad21a8f7eda1b59
change-id: 20240531-tcp_ao-tracepoints-fa2e14e1f0dd

Best regards,
--
Dmitry Safonov <[email protected]>




Subject: [PATCH net-next v2 2/6] net/tcp: Add a helper tcp_ao_hdr_maclen()

From: Dmitry Safonov <[email protected]>

It's going to be used more in TCP-AO tracepoints.

Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp_ao.h | 5 +++++
net/ipv4/tcp_ao.c | 2 +-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 471e177362b4..6501ed1dfa1e 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -19,6 +19,11 @@ struct tcp_ao_hdr {
u8 rnext_keyid;
};

+static inline u8 tcp_ao_hdr_maclen(const struct tcp_ao_hdr *aoh)
+{
+ return aoh->length - sizeof(struct tcp_ao_hdr);
+}
+
struct tcp_ao_counters {
atomic64_t pkt_good;
atomic64_t pkt_bad;
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 781b67a52571..7c9e90e531e3 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -884,8 +884,8 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb,
const struct tcp_ao_hdr *aoh, struct tcp_ao_key *key,
u8 *traffic_key, u8 *phash, u32 sne, int l3index)
{
- u8 maclen = aoh->length - sizeof(struct tcp_ao_hdr);
const struct tcphdr *th = tcp_hdr(skb);
+ u8 maclen = tcp_ao_hdr_maclen(aoh);
void *hash_buf = NULL;

if (maclen != tcp_ao_maclen(key)) {

--
2.42.0



Subject: [PATCH net-next v2 3/6] net/tcp: Move tcp_inbound_hash() from headers

From: Dmitry Safonov <[email protected]>

Two reasons:
1. It's grown up enough
2. In order to not do header spaghetti by including
<trace/events/tcp.h>, which is necessary for TCP tracepoints.

Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp.h | 65 ++++---------------------------------------------------
net/ipv4/tcp.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+), 61 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index f6dd035e0fa9..ba594ef70c2d 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2806,66 +2806,9 @@ static inline bool tcp_ao_required(struct sock *sk, const void *saddr,
return false;
}

-/* Called with rcu_read_lock() */
-static inline enum skb_drop_reason
-tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
- const struct sk_buff *skb,
- const void *saddr, const void *daddr,
- int family, int dif, int sdif)
-{
- const struct tcphdr *th = tcp_hdr(skb);
- const struct tcp_ao_hdr *aoh;
- const __u8 *md5_location;
- int l3index;
-
- /* Invalid option or two times meet any of auth options */
- if (tcp_parse_auth_options(th, &md5_location, &aoh)) {
- tcp_hash_fail("TCP segment has incorrect auth options set",
- family, skb, "");
- return SKB_DROP_REASON_TCP_AUTH_HDR;
- }
-
- if (req) {
- if (tcp_rsk_used_ao(req) != !!aoh) {
- NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD);
- tcp_hash_fail("TCP connection can't start/end using TCP-AO",
- family, skb, "%s",
- !aoh ? "missing AO" : "AO signed");
- return SKB_DROP_REASON_TCP_AOFAILURE;
- }
- }
-
- /* sdif set, means packet ingressed via a device
- * in an L3 domain and dif is set to the l3mdev
- */
- l3index = sdif ? dif : 0;
-
- /* Fast path: unsigned segments */
- if (likely(!md5_location && !aoh)) {
- /* Drop if there's TCP-MD5 or TCP-AO key with any rcvid/sndid
- * for the remote peer. On TCP-AO established connection
- * the last key is impossible to remove, so there's
- * always at least one current_key.
- */
- if (tcp_ao_required(sk, saddr, family, l3index, true)) {
- tcp_hash_fail("AO hash is required, but not found",
- family, skb, "L3 index %d", l3index);
- return SKB_DROP_REASON_TCP_AONOTFOUND;
- }
- if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) {
- NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
- tcp_hash_fail("MD5 Hash not found",
- family, skb, "L3 index %d", l3index);
- return SKB_DROP_REASON_TCP_MD5NOTFOUND;
- }
- return SKB_NOT_DROPPED_YET;
- }
-
- if (aoh)
- return tcp_inbound_ao_hash(sk, skb, family, req, l3index, aoh);
-
- return tcp_inbound_md5_hash(sk, skb, saddr, daddr, family,
- l3index, md5_location);
-}
+enum skb_drop_reason tcp_inbound_hash(struct sock *sk,
+ const struct request_sock *req, const struct sk_buff *skb,
+ const void *saddr, const void *daddr,
+ int family, int dif, int sdif);

#endif /* _TCP_H */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5fa68e7f6ddb..99fea9919b08 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4512,6 +4512,69 @@ EXPORT_SYMBOL(tcp_inbound_md5_hash);

#endif

+/* Called with rcu_read_lock() */
+enum skb_drop_reason
+tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
+ const struct sk_buff *skb,
+ const void *saddr, const void *daddr,
+ int family, int dif, int sdif)
+{
+ const struct tcphdr *th = tcp_hdr(skb);
+ const struct tcp_ao_hdr *aoh;
+ const __u8 *md5_location;
+ int l3index;
+
+ /* Invalid option or two times meet any of auth options */
+ if (tcp_parse_auth_options(th, &md5_location, &aoh)) {
+ tcp_hash_fail("TCP segment has incorrect auth options set",
+ family, skb, "");
+ return SKB_DROP_REASON_TCP_AUTH_HDR;
+ }
+
+ if (req) {
+ if (tcp_rsk_used_ao(req) != !!aoh) {
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD);
+ tcp_hash_fail("TCP connection can't start/end using TCP-AO",
+ family, skb, "%s",
+ !aoh ? "missing AO" : "AO signed");
+ return SKB_DROP_REASON_TCP_AOFAILURE;
+ }
+ }
+
+ /* sdif set, means packet ingressed via a device
+ * in an L3 domain and dif is set to the l3mdev
+ */
+ l3index = sdif ? dif : 0;
+
+ /* Fast path: unsigned segments */
+ if (likely(!md5_location && !aoh)) {
+ /* Drop if there's TCP-MD5 or TCP-AO key with any rcvid/sndid
+ * for the remote peer. On TCP-AO established connection
+ * the last key is impossible to remove, so there's
+ * always at least one current_key.
+ */
+ if (tcp_ao_required(sk, saddr, family, l3index, true)) {
+ tcp_hash_fail("AO hash is required, but not found",
+ family, skb, "L3 index %d", l3index);
+ return SKB_DROP_REASON_TCP_AONOTFOUND;
+ }
+ if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) {
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
+ tcp_hash_fail("MD5 Hash not found",
+ family, skb, "L3 index %d", l3index);
+ return SKB_DROP_REASON_TCP_MD5NOTFOUND;
+ }
+ return SKB_NOT_DROPPED_YET;
+ }
+
+ if (aoh)
+ return tcp_inbound_ao_hash(sk, skb, family, req, l3index, aoh);
+
+ return tcp_inbound_md5_hash(sk, skb, saddr, daddr, family,
+ l3index, md5_location);
+}
+EXPORT_SYMBOL_GPL(tcp_inbound_hash);
+
void tcp_done(struct sock *sk)
{
struct request_sock *req;

--
2.42.0



Subject: [PATCH net-next v2 5/6] net/tcp: Remove tcp_hash_fail()

From: Dmitry Safonov <[email protected]>

Now there are tracepoints, that cover all functionality of
tcp_hash_fail(), but also wire up missing places
They are also faster, can be disabled and provide filtering.

This potentially may create a regression if a userspace depends on dmesg
logs. Fingers crossed, let's see if anyone complains in reality.

Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp_ao.h | 37 -------------------------------------
net/ipv4/tcp.c | 25 -------------------------
net/ipv4/tcp_ao.c | 9 ---------
3 files changed, 71 deletions(-)

diff --git a/include/net/tcp_ao.h b/include/net/tcp_ao.h
index 6501ed1dfa1e..ebc6d4e3c073 100644
--- a/include/net/tcp_ao.h
+++ b/include/net/tcp_ao.h
@@ -148,43 +148,6 @@ extern struct static_key_false_deferred tcp_ao_needed;
#define static_branch_tcp_ao() false
#endif

-static inline bool tcp_hash_should_produce_warnings(void)
-{
- return static_branch_tcp_md5() || static_branch_tcp_ao();
-}
-
-#define tcp_hash_fail(msg, family, skb, fmt, ...) \
-do { \
- const struct tcphdr *th = tcp_hdr(skb); \
- char hdr_flags[6]; \
- char *f = hdr_flags; \
- \
- if (!tcp_hash_should_produce_warnings()) \
- break; \
- if (th->fin) \
- *f++ = 'F'; \
- if (th->syn) \
- *f++ = 'S'; \
- if (th->rst) \
- *f++ = 'R'; \
- if (th->psh) \
- *f++ = 'P'; \
- if (th->ack) \
- *f++ = '.'; \
- *f = 0; \
- if ((family) == AF_INET) { \
- net_info_ratelimited("%s for %pI4.%d->%pI4.%d [%s] " fmt "\n", \
- msg, &ip_hdr(skb)->saddr, ntohs(th->source), \
- &ip_hdr(skb)->daddr, ntohs(th->dest), \
- hdr_flags, ##__VA_ARGS__); \
- } else { \
- net_info_ratelimited("%s for [%pI6c].%d->[%pI6c].%d [%s]" fmt "\n", \
- msg, &ipv6_hdr(skb)->saddr, ntohs(th->source), \
- &ipv6_hdr(skb)->daddr, ntohs(th->dest), \
- hdr_flags, ##__VA_ARGS__); \
- } \
-} while (0)
-
#ifdef CONFIG_TCP_AO
/* TCP-AO structures and functions */
struct tcp4_ao_context {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index a9b6ab19eda8..8f7d93da4782 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4475,7 +4475,6 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,

if (!key && hash_location) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
- tcp_hash_fail("Unexpected MD5 Hash found", family, skb, "");
trace_tcp_hash_md5_unexpected(sk, skb);
return SKB_DROP_REASON_TCP_MD5UNEXPECTED;
}
@@ -4491,21 +4490,6 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
NULL, skb);
if (genhash || memcmp(hash_location, newhash, 16) != 0) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5FAILURE);
- if (family == AF_INET) {
- tcp_hash_fail("MD5 Hash failed", AF_INET, skb, "%s L3 index %d",
- genhash ? "tcp_v4_calc_md5_hash failed"
- : "", l3index);
- } else {
- if (genhash) {
- tcp_hash_fail("MD5 Hash failed",
- AF_INET6, skb, "L3 index %d",
- l3index);
- } else {
- tcp_hash_fail("MD5 Hash mismatch",
- AF_INET6, skb, "L3 index %d",
- l3index);
- }
- }
trace_tcp_hash_md5_mismatch(sk, skb);
return SKB_DROP_REASON_TCP_MD5FAILURE;
}
@@ -4529,8 +4513,6 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req,

/* Invalid option or two times meet any of auth options */
if (tcp_parse_auth_options(th, &md5_location, &aoh)) {
- tcp_hash_fail("TCP segment has incorrect auth options set",
- family, skb, "");
trace_tcp_hash_bad_header(sk, skb);
return SKB_DROP_REASON_TCP_AUTH_HDR;
}
@@ -4548,9 +4530,6 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
}

NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD);
- tcp_hash_fail("TCP connection can't start/end using TCP-AO",
- family, skb, "%s",
- !aoh ? "missing AO" : "AO signed");
trace_tcp_ao_handshake_failure(sk, skb, keyid, rnext, maclen);
return SKB_DROP_REASON_TCP_AOFAILURE;
}
@@ -4569,15 +4548,11 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
* always at least one current_key.
*/
if (tcp_ao_required(sk, saddr, family, l3index, true)) {
- tcp_hash_fail("AO hash is required, but not found",
- family, skb, "L3 index %d", l3index);
trace_tcp_hash_ao_required(sk, skb);
return SKB_DROP_REASON_TCP_AONOTFOUND;
}
if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
- tcp_hash_fail("MD5 Hash not found",
- family, skb, "L3 index %d", l3index);
trace_tcp_hash_md5_required(sk, skb);
return SKB_DROP_REASON_TCP_MD5NOTFOUND;
}
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 87c5d39dc105..f0f4203fdfed 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -893,9 +893,6 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb,
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD);
atomic64_inc(&info->counters.pkt_bad);
atomic64_inc(&key->pkt_bad);
- tcp_hash_fail("AO hash wrong length", family, skb,
- "%u != %d L3index: %d", maclen,
- tcp_ao_maclen(key), l3index);
trace_tcp_ao_wrong_maclen(sk, skb, aoh->keyid,
aoh->rnext_keyid, maclen);
return SKB_DROP_REASON_TCP_AOFAILURE;
@@ -912,8 +909,6 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb,
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD);
atomic64_inc(&info->counters.pkt_bad);
atomic64_inc(&key->pkt_bad);
- tcp_hash_fail("AO hash mismatch", family, skb,
- "L3index: %d", l3index);
trace_tcp_ao_mismatch(sk, skb, aoh->keyid,
aoh->rnext_keyid, maclen);
kfree(hash_buf);
@@ -944,8 +939,6 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
info = rcu_dereference(tcp_sk(sk)->ao_info);
if (!info) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND);
- tcp_hash_fail("AO key not found", family, skb,
- "keyid: %u L3index: %d", aoh->keyid, l3index);
trace_tcp_ao_key_not_found(sk, skb, aoh->keyid,
aoh->rnext_keyid, maclen);
return SKB_DROP_REASON_TCP_AOUNEXPECTED;
@@ -1052,8 +1045,6 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
key_not_found:
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND);
atomic64_inc(&info->counters.key_not_found);
- tcp_hash_fail("Requested by the peer AO key id not found",
- family, skb, "L3index: %d", l3index);
trace_tcp_ao_key_not_found(sk, skb, aoh->keyid,
aoh->rnext_keyid, maclen);
return SKB_DROP_REASON_TCP_AOKEYNOTFOUND;

--
2.42.0



Subject: [PATCH net-next v2 6/6] Documentation/tcp-ao: Add a few lines on tracepoints

From: Dmitry Safonov <[email protected]>

Signed-off-by: Dmitry Safonov <[email protected]>
---
Documentation/networking/tcp_ao.rst | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/Documentation/networking/tcp_ao.rst b/Documentation/networking/tcp_ao.rst
index 8a58321acce7..e96e62d1dab3 100644
--- a/Documentation/networking/tcp_ao.rst
+++ b/Documentation/networking/tcp_ao.rst
@@ -337,6 +337,15 @@ TCP-AO per-socket counters are also duplicated with per-netns counters,
exposed with SNMP. Those are ``TCPAOGood``, ``TCPAOBad``, ``TCPAOKeyNotFound``,
``TCPAORequired`` and ``TCPAODroppedIcmps``.

+For monitoring purposes, there are following TCP-AO trace events:
+``tcp_hash_bad_header``, ``tcp_hash_ao_required``, ``tcp_ao_handshake_failure``,
+``tcp_ao_wrong_maclen``, ``tcp_ao_wrong_maclen``, ``tcp_ao_key_not_found``,
+``tcp_ao_rnext_request``, ``tcp_ao_synack_no_key``, ``tcp_ao_snd_sne_update``,
+``tcp_ao_rcv_sne_update``. It's possible to separately enable any of them and
+one can filter them by net-namespace, 4-tuple, family, L3 index, and TCP header
+flags. If a segment has a TCP-AO header, the filters may also include
+keyid, rnext, and maclen. SNE updates include the rolled-over numbers.
+
RFC 5925 very permissively specifies how TCP port matching can be done for
MKTs::


--
2.42.0



Subject: [PATCH net-next v2 4/6] net/tcp: Add tcp-md5 and tcp-ao tracepoints

From: Dmitry Safonov <[email protected]>

Instead of forcing userspace to parse dmesg (that's what currently is
happening, at least in codebase of my current company), provide a better
way, that can be enabled/disabled in runtime.

Currently, there are already tcp events, add hashing related ones there,
too. Rasdaemon currently exercises net_dev_xmit_timeout,
devlink_health_report, but it'll be trivial to teach it to deal with
failed hashes. Otherwise, BGP may trace/log them itself. Especially
exciting for possible investigations is key rotation (RNext_key
requests).

Suggested-by: Jakub Kicinski <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
---
include/trace/events/tcp.h | 317 +++++++++++++++++++++++++++++++++++++++++++++
net/ipv4/tcp.c | 17 +++
net/ipv4/tcp_ao.c | 13 ++
net/ipv4/tcp_input.c | 8 +-
net/ipv4/tcp_output.c | 2 +
5 files changed, 355 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 49b5ee091cf6..1c8bd8e186b8 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -411,6 +411,323 @@ TRACE_EVENT(tcp_cong_state_set,
__entry->cong_state)
);

+DECLARE_EVENT_CLASS(tcp_hash_event,
+
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
+
+ TP_ARGS(sk, skb),
+
+ TP_STRUCT__entry(
+ __field(__u64, net_cookie)
+ __field(const void *, skbaddr)
+ __field(const void *, skaddr)
+ __field(int, state)
+
+ /* sockaddr_in6 is always bigger than sockaddr_in */
+ __array(__u8, saddr, sizeof(struct sockaddr_in6))
+ __array(__u8, daddr, sizeof(struct sockaddr_in6))
+ __field(int, l3index)
+
+ __field(__u16, sport)
+ __field(__u16, dport)
+ __field(__u16, family)
+
+ __field(bool, fin)
+ __field(bool, syn)
+ __field(bool, rst)
+ __field(bool, psh)
+ __field(bool, ack)
+ ),
+
+ TP_fast_assign(
+ const struct tcphdr *th = (const struct tcphdr *)skb->data;
+
+ __entry->net_cookie = sock_net(sk)->net_cookie;
+ __entry->skbaddr = skb;
+ __entry->skaddr = sk;
+ __entry->state = sk->sk_state;
+
+ memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));
+ memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));
+ TP_STORE_ADDR_PORTS_SKB(skb, th, __entry->saddr, __entry->daddr);
+ __entry->l3index = inet_sdif(skb) ? inet_iif(skb) : 0;
+
+ /* For filtering use */
+ __entry->sport = ntohs(th->source);
+ __entry->dport = ntohs(th->dest);
+ __entry->family = sk->sk_family;
+
+ __entry->fin = th->fin;
+ __entry->syn = th->syn;
+ __entry->rst = th->rst;
+ __entry->psh = th->psh;
+ __entry->ack = th->ack;
+ ),
+
+ TP_printk("net=%llu state=%s family=%s src=%pISpc dest=%pISpc L3index=%d [%c%c%c%c%c]",
+ __entry->net_cookie,
+ show_tcp_state_name(__entry->state),
+ show_family_name(__entry->family),
+ __entry->saddr, __entry->daddr,
+ __entry->l3index,
+ __entry->fin ? 'F' : ' ',
+ __entry->syn ? 'S' : ' ',
+ __entry->rst ? 'R' : ' ',
+ __entry->psh ? 'P' : ' ',
+ __entry->ack ? '.' : ' ')
+);
+
+DEFINE_EVENT(tcp_hash_event, tcp_hash_bad_header,
+
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
+ TP_ARGS(sk, skb)
+);
+
+DEFINE_EVENT(tcp_hash_event, tcp_hash_md5_required,
+
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
+ TP_ARGS(sk, skb)
+);
+
+DEFINE_EVENT(tcp_hash_event, tcp_hash_md5_unexpected,
+
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
+ TP_ARGS(sk, skb)
+);
+
+DEFINE_EVENT(tcp_hash_event, tcp_hash_md5_mismatch,
+
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
+ TP_ARGS(sk, skb)
+);
+
+DEFINE_EVENT(tcp_hash_event, tcp_hash_ao_required,
+
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb),
+ TP_ARGS(sk, skb)
+);
+
+DECLARE_EVENT_CLASS(tcp_ao_event,
+
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb,
+ const __u8 keyid, const __u8 rnext, const __u8 maclen),
+
+ TP_ARGS(sk, skb, keyid, rnext, maclen),
+
+ TP_STRUCT__entry(
+ __field(__u64, net_cookie)
+ __field(const void *, skbaddr)
+ __field(const void *, skaddr)
+ __field(int, state)
+
+ /* sockaddr_in6 is always bigger than sockaddr_in */
+ __array(__u8, saddr, sizeof(struct sockaddr_in6))
+ __array(__u8, daddr, sizeof(struct sockaddr_in6))
+ __field(int, l3index)
+
+ __field(__u16, sport)
+ __field(__u16, dport)
+ __field(__u16, family)
+
+ __field(bool, fin)
+ __field(bool, syn)
+ __field(bool, rst)
+ __field(bool, psh)
+ __field(bool, ack)
+
+ __field(__u8, keyid)
+ __field(__u8, rnext)
+ __field(__u8, maclen)
+ ),
+
+ TP_fast_assign(
+ const struct tcphdr *th = (const struct tcphdr *)skb->data;
+
+ __entry->net_cookie = sock_net(sk)->net_cookie;
+ __entry->skbaddr = skb;
+ __entry->skaddr = sk;
+ __entry->state = sk->sk_state;
+
+ memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));
+ memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));
+ TP_STORE_ADDR_PORTS_SKB(skb, th, __entry->saddr, __entry->daddr);
+ __entry->l3index = inet_sdif(skb) ? inet_iif(skb) : 0;
+
+ /* For filtering use */
+ __entry->sport = ntohs(th->source);
+ __entry->dport = ntohs(th->dest);
+ __entry->family = sk->sk_family;
+
+ __entry->fin = th->fin;
+ __entry->syn = th->syn;
+ __entry->rst = th->rst;
+ __entry->psh = th->psh;
+ __entry->ack = th->ack;
+
+ __entry->keyid = keyid;
+ __entry->rnext = rnext;
+ __entry->maclen = maclen;
+ ),
+
+ TP_printk("net=%llu state=%s family=%s src=%pISpc dest=%pISpc L3index=%d [%c%c%c%c%c] keyid=%u rnext=%u maclen=%u",
+ __entry->net_cookie,
+ show_tcp_state_name(__entry->state),
+ show_family_name(__entry->family),
+ __entry->saddr, __entry->daddr,
+ __entry->l3index,
+ __entry->fin ? 'F' : ' ',
+ __entry->syn ? 'S' : ' ',
+ __entry->rst ? 'R' : ' ',
+ __entry->psh ? 'P' : ' ',
+ __entry->ack ? '.' : ' ',
+ __entry->keyid, __entry->rnext, __entry->maclen)
+);
+
+DEFINE_EVENT(tcp_ao_event, tcp_ao_handshake_failure,
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb,
+ const __u8 keyid, const __u8 rnext, const __u8 maclen),
+ TP_ARGS(sk, skb, keyid, rnext, maclen)
+);
+
+DEFINE_EVENT(tcp_ao_event, tcp_ao_wrong_maclen,
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb,
+ const __u8 keyid, const __u8 rnext, const __u8 maclen),
+ TP_ARGS(sk, skb, keyid, rnext, maclen)
+);
+
+DEFINE_EVENT(tcp_ao_event, tcp_ao_mismatch,
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb,
+ const __u8 keyid, const __u8 rnext, const __u8 maclen),
+ TP_ARGS(sk, skb, keyid, rnext, maclen)
+);
+
+DEFINE_EVENT(tcp_ao_event, tcp_ao_key_not_found,
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb,
+ const __u8 keyid, const __u8 rnext, const __u8 maclen),
+ TP_ARGS(sk, skb, keyid, rnext, maclen)
+);
+
+DEFINE_EVENT(tcp_ao_event, tcp_ao_rnext_request,
+ TP_PROTO(const struct sock *sk, const struct sk_buff *skb,
+ const __u8 keyid, const __u8 rnext, const __u8 maclen),
+ TP_ARGS(sk, skb, keyid, rnext, maclen)
+);
+
+DECLARE_EVENT_CLASS(tcp_ao_event_sk,
+
+ TP_PROTO(const struct sock *sk, const __u8 keyid, const __u8 rnext),
+
+ TP_ARGS(sk, keyid, rnext),
+
+ TP_STRUCT__entry(
+ __field(__u64, net_cookie)
+ __field(const void *, skaddr)
+ __field(int, state)
+
+ /* sockaddr_in6 is always bigger than sockaddr_in */
+ __array(__u8, saddr, sizeof(struct sockaddr_in6))
+ __array(__u8, daddr, sizeof(struct sockaddr_in6))
+
+ __field(__u16, sport)
+ __field(__u16, dport)
+ __field(__u16, family)
+
+ __field(__u8, keyid)
+ __field(__u8, rnext)
+ ),
+
+ TP_fast_assign(
+ const struct inet_sock *inet = inet_sk(sk);
+
+ __entry->net_cookie = sock_net(sk)->net_cookie;
+ __entry->skaddr = sk;
+ __entry->state = sk->sk_state;
+
+ memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));
+ memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));
+ TP_STORE_ADDR_PORTS(__entry, inet, sk);
+
+ /* For filtering use */
+ __entry->sport = ntohs(inet->inet_sport);
+ __entry->dport = ntohs(inet->inet_dport);
+ __entry->family = sk->sk_family;
+
+ __entry->keyid = keyid;
+ __entry->rnext = rnext;
+ ),
+
+ TP_printk("net=%llu state=%s family=%s src=%pISpc dest=%pISpc keyid=%u rnext=%u",
+ __entry->net_cookie,
+ show_tcp_state_name(__entry->state),
+ show_family_name(__entry->family),
+ __entry->saddr, __entry->daddr,
+ __entry->keyid, __entry->rnext)
+);
+
+DEFINE_EVENT(tcp_ao_event_sk, tcp_ao_synack_no_key,
+ TP_PROTO(const struct sock *sk, const __u8 keyid, const __u8 rnext),
+ TP_ARGS(sk, keyid, rnext)
+);
+
+DECLARE_EVENT_CLASS(tcp_ao_event_sne,
+
+ TP_PROTO(const struct sock *sk, __u32 new_sne),
+
+ TP_ARGS(sk, new_sne),
+
+ TP_STRUCT__entry(
+ __field(__u64, net_cookie)
+ __field(const void *, skaddr)
+ __field(int, state)
+
+ /* sockaddr_in6 is always bigger than sockaddr_in */
+ __array(__u8, saddr, sizeof(struct sockaddr_in6))
+ __array(__u8, daddr, sizeof(struct sockaddr_in6))
+
+ __field(__u16, sport)
+ __field(__u16, dport)
+ __field(__u16, family)
+
+ __field(__u32, new_sne)
+ ),
+
+ TP_fast_assign(
+ const struct inet_sock *inet = inet_sk(sk);
+
+ __entry->net_cookie = sock_net(sk)->net_cookie;
+ __entry->skaddr = sk;
+ __entry->state = sk->sk_state;
+
+ memset(__entry->saddr, 0, sizeof(struct sockaddr_in6));
+ memset(__entry->daddr, 0, sizeof(struct sockaddr_in6));
+ TP_STORE_ADDR_PORTS(__entry, inet, sk);
+
+ /* For filtering use */
+ __entry->sport = ntohs(inet->inet_sport);
+ __entry->dport = ntohs(inet->inet_dport);
+ __entry->family = sk->sk_family;
+
+ __entry->new_sne = new_sne;
+ ),
+
+ TP_printk("net=%llu state=%s family=%s src=%pISpc dest=%pISpc sne=%u",
+ __entry->net_cookie,
+ show_tcp_state_name(__entry->state),
+ show_family_name(__entry->family),
+ __entry->saddr, __entry->daddr,
+ __entry->new_sne)
+);
+
+DEFINE_EVENT(tcp_ao_event_sne, tcp_ao_snd_sne_update,
+ TP_PROTO(const struct sock *sk, __u32 new_sne),
+ TP_ARGS(sk, new_sne)
+);
+
+DEFINE_EVENT(tcp_ao_event_sne, tcp_ao_rcv_sne_update,
+ TP_PROTO(const struct sock *sk, __u32 new_sne),
+ TP_ARGS(sk, new_sne)
+);
+
#endif /* _TRACE_TCP_H */

/* This part must be outside protection */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 99fea9919b08..a9b6ab19eda8 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -282,6 +282,7 @@
#include <asm/ioctls.h>
#include <net/busy_poll.h>
#include <net/hotdata.h>
+#include <trace/events/tcp.h>
#include <net/rps.h>

/* Track pending CMSGs. */
@@ -4475,6 +4476,7 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
if (!key && hash_location) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
tcp_hash_fail("Unexpected MD5 Hash found", family, skb, "");
+ trace_tcp_hash_md5_unexpected(sk, skb);
return SKB_DROP_REASON_TCP_MD5UNEXPECTED;
}

@@ -4504,6 +4506,7 @@ tcp_inbound_md5_hash(const struct sock *sk, const struct sk_buff *skb,
l3index);
}
}
+ trace_tcp_hash_md5_mismatch(sk, skb);
return SKB_DROP_REASON_TCP_MD5FAILURE;
}
return SKB_NOT_DROPPED_YET;
@@ -4528,15 +4531,27 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
if (tcp_parse_auth_options(th, &md5_location, &aoh)) {
tcp_hash_fail("TCP segment has incorrect auth options set",
family, skb, "");
+ trace_tcp_hash_bad_header(sk, skb);
return SKB_DROP_REASON_TCP_AUTH_HDR;
}

if (req) {
if (tcp_rsk_used_ao(req) != !!aoh) {
+ u8 keyid, rnext, maclen;
+
+ if (aoh) {
+ keyid = aoh->keyid;
+ rnext = aoh->rnext_keyid;
+ maclen = tcp_ao_hdr_maclen(aoh);
+ } else {
+ keyid = rnext = maclen = 0;
+ }
+
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOBAD);
tcp_hash_fail("TCP connection can't start/end using TCP-AO",
family, skb, "%s",
!aoh ? "missing AO" : "AO signed");
+ trace_tcp_ao_handshake_failure(sk, skb, keyid, rnext, maclen);
return SKB_DROP_REASON_TCP_AOFAILURE;
}
}
@@ -4556,12 +4571,14 @@ tcp_inbound_hash(struct sock *sk, const struct request_sock *req,
if (tcp_ao_required(sk, saddr, family, l3index, true)) {
tcp_hash_fail("AO hash is required, but not found",
family, skb, "L3 index %d", l3index);
+ trace_tcp_hash_ao_required(sk, skb);
return SKB_DROP_REASON_TCP_AONOTFOUND;
}
if (unlikely(tcp_md5_do_lookup(sk, l3index, saddr, family))) {
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
tcp_hash_fail("MD5 Hash not found",
family, skb, "L3 index %d", l3index);
+ trace_tcp_hash_md5_required(sk, skb);
return SKB_DROP_REASON_TCP_MD5NOTFOUND;
}
return SKB_NOT_DROPPED_YET;
diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
index 7c9e90e531e3..87c5d39dc105 100644
--- a/net/ipv4/tcp_ao.c
+++ b/net/ipv4/tcp_ao.c
@@ -16,6 +16,7 @@
#include <net/tcp.h>
#include <net/ipv6.h>
#include <net/icmp.h>
+#include <trace/events/tcp.h>

DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ);

@@ -895,6 +896,8 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb,
tcp_hash_fail("AO hash wrong length", family, skb,
"%u != %d L3index: %d", maclen,
tcp_ao_maclen(key), l3index);
+ trace_tcp_ao_wrong_maclen(sk, skb, aoh->keyid,
+ aoh->rnext_keyid, maclen);
return SKB_DROP_REASON_TCP_AOFAILURE;
}

@@ -911,6 +914,8 @@ tcp_ao_verify_hash(const struct sock *sk, const struct sk_buff *skb,
atomic64_inc(&key->pkt_bad);
tcp_hash_fail("AO hash mismatch", family, skb,
"L3index: %d", l3index);
+ trace_tcp_ao_mismatch(sk, skb, aoh->keyid,
+ aoh->rnext_keyid, maclen);
kfree(hash_buf);
return SKB_DROP_REASON_TCP_AOFAILURE;
}
@@ -927,6 +932,7 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
int l3index, const struct tcp_ao_hdr *aoh)
{
const struct tcphdr *th = tcp_hdr(skb);
+ u8 maclen = tcp_ao_hdr_maclen(aoh);
u8 *phash = (u8 *)(aoh + 1); /* hash goes just after the header */
struct tcp_ao_info *info;
enum skb_drop_reason ret;
@@ -940,6 +946,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAOKEYNOTFOUND);
tcp_hash_fail("AO key not found", family, skb,
"keyid: %u L3index: %d", aoh->keyid, l3index);
+ trace_tcp_ao_key_not_found(sk, skb, aoh->keyid,
+ aoh->rnext_keyid, maclen);
return SKB_DROP_REASON_TCP_AOUNEXPECTED;
}

@@ -979,6 +987,9 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
current_key = READ_ONCE(info->current_key);
/* Key rotation: the peer asks us to use new key (RNext) */
if (unlikely(aoh->rnext_keyid != current_key->sndid)) {
+ trace_tcp_ao_rnext_request(sk, skb, current_key->sndid,
+ aoh->rnext_keyid,
+ tcp_ao_hdr_maclen(aoh));
/* If the key is not found we do nothing. */
key = tcp_ao_established_key(info, aoh->rnext_keyid, -1);
if (key)
@@ -1043,6 +1054,8 @@ tcp_inbound_ao_hash(struct sock *sk, const struct sk_buff *skb,
atomic64_inc(&info->counters.key_not_found);
tcp_hash_fail("Requested by the peer AO key id not found",
family, skb, "L3index: %d", l3index);
+ trace_tcp_ao_key_not_found(sk, skb, aoh->keyid,
+ aoh->rnext_keyid, maclen);
return SKB_DROP_REASON_TCP_AOKEYNOTFOUND;
}

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 212b6fd0caf7..8281ec710240 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3578,8 +3578,10 @@ static void tcp_snd_sne_update(struct tcp_sock *tp, u32 ack)

ao = rcu_dereference_protected(tp->ao_info,
lockdep_sock_is_held((struct sock *)tp));
- if (ao && ack < tp->snd_una)
+ if (ao && ack < tp->snd_una) {
ao->snd_sne++;
+ trace_tcp_ao_snd_sne_update((struct sock *)tp, ao->snd_sne);
+ }
#endif
}

@@ -3604,8 +3606,10 @@ static void tcp_rcv_sne_update(struct tcp_sock *tp, u32 seq)

ao = rcu_dereference_protected(tp->ao_info,
lockdep_sock_is_held((struct sock *)tp));
- if (ao && seq < tp->rcv_nxt)
+ if (ao && seq < tp->rcv_nxt) {
ao->rcv_sne++;
+ trace_tcp_ao_rcv_sne_update((struct sock *)tp, ao->rcv_sne);
+ }
#endif
}

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f97e098f18a5..c7f43b0af368 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3768,6 +3768,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
#ifdef CONFIG_TCP_AO
struct tcp_ao_key *ao_key = NULL;
u8 keyid = tcp_rsk(req)->ao_keyid;
+ u8 rnext = tcp_rsk(req)->ao_rcv_next;

ao_key = tcp_sk(sk)->af_specific->ao_lookup(sk, req_to_sk(req),
keyid, -1);
@@ -3777,6 +3778,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
* ao_keyid (RFC5925 RNextKeyID), so let's keep it simple here.
*/
if (unlikely(!ao_key)) {
+ trace_tcp_ao_synack_no_key(sk, keyid, rnext);
rcu_read_unlock();
kfree_skb(skb);
net_warn_ratelimited("TCP-AO: the keyid %u from SYN packet is not present - not sending SYNACK\n",

--
2.42.0



2024-06-05 08:08:12

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH net-next v2 3/6] net/tcp: Move tcp_inbound_hash() from headers

On Wed, Jun 5, 2024 at 4:20 AM Dmitry Safonov via B4 Relay
<[email protected]> wrote:
>
> From: Dmitry Safonov <[email protected]>
>
> Two reasons:
> 1. It's grown up enough
> 2. In order to not do header spaghetti by including
> <trace/events/tcp.h>, which is necessary for TCP tracepoints.
>
> Signed-off-by: Dmitry Safonov <[email protected]>

Then we probably do not need EXPORT_SYMBOL(tcp_inbound_md5_hash); anymore ?

2024-06-05 08:11:27

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH net-next v2 5/6] net/tcp: Remove tcp_hash_fail()

On Wed, Jun 5, 2024 at 4:20 AM Dmitry Safonov via B4 Relay
<[email protected]> wrote:
>
> From: Dmitry Safonov <[email protected]>
>
> Now there are tracepoints, that cover all functionality of
> tcp_hash_fail(), but also wire up missing places
> They are also faster, can be disabled and provide filtering.
>
> This potentially may create a regression if a userspace depends on dmesg
> logs. Fingers crossed, let's see if anyone complains in reality.
>
> Signed-off-by: Dmitry Safonov <[email protected]>

Reviewed-by: Eric Dumazet <[email protected]>

2024-06-05 17:35:47

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH net-next v2 3/6] net/tcp: Move tcp_inbound_hash() from headers

Hi Eric,

Thanks for the review,

On Wed, 5 Jun 2024 at 09:07, Eric Dumazet <[email protected]> wrote:
>
> On Wed, Jun 5, 2024 at 4:20 AM Dmitry Safonov via B4 Relay
> <[email protected]> wrote:
> >
> > From: Dmitry Safonov <[email protected]>
> >
> > Two reasons:
> > 1. It's grown up enough
> > 2. In order to not do header spaghetti by including
> > <trace/events/tcp.h>, which is necessary for TCP tracepoints.
> >
> > Signed-off-by: Dmitry Safonov <[email protected]>
>
> Then we probably do not need EXPORT_SYMBOL(tcp_inbound_md5_hash); anymore ?

Certainly, my bad. I will remove that in v3.

Thanks,
Dmitry

2024-06-05 17:41:22

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH net-next v2 3/6] net/tcp: Move tcp_inbound_hash() from headers

On Wed, Jun 5, 2024 at 7:35 PM Dmitry Safonov <[email protected]> wrote:
>
> Hi Eric,
>
> Thanks for the review,
>
> On Wed, 5 Jun 2024 at 09:07, Eric Dumazet <[email protected]> wrote:
> >
> > On Wed, Jun 5, 2024 at 4:20 AM Dmitry Safonov via B4 Relay
> > <[email protected]> wrote:
> > >
> > > From: Dmitry Safonov <[email protected]>
> > >
> > > Two reasons:
> > > 1. It's grown up enough
> > > 2. In order to not do header spaghetti by including
> > > <trace/events/tcp.h>, which is necessary for TCP tracepoints.
> > >
> > > Signed-off-by: Dmitry Safonov <[email protected]>
> >
> > Then we probably do not need EXPORT_SYMBOL(tcp_inbound_md5_hash); anymore ?
>
> Certainly, my bad. I will remove that in v3.
>

No problem, also make it static, in case this was not clear from my comment.

2024-06-06 01:06:17

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH net-next v2 3/6] net/tcp: Move tcp_inbound_hash() from headers

On Wed, 5 Jun 2024 at 18:37, Eric Dumazet <[email protected]> wrote:
>
> On Wed, Jun 5, 2024 at 7:35 PM Dmitry Safonov <[email protected]> wrote:
> >
> > Hi Eric,
> >
> > Thanks for the review,
> >
> > On Wed, 5 Jun 2024 at 09:07, Eric Dumazet <[email protected]> wrote:
[..]
> > > Then we probably do not need EXPORT_SYMBOL(tcp_inbound_md5_hash); anymore ?
> >
> > Certainly, my bad. I will remove that in v3.
> >
>
> No problem, also make it static, in case this was not clear from my comment.

Thanks!

--
Dmitry