When the ipv6 stack output a GSO packet, if its gso_size is larger than
dst MTU, then all segments would be fragmented. However, it is possible
for a GSO packet to have a trailing segment with smaller actual size
than both gso_size as well as the MTU, which leads to an "atomic
fragment". Atomic fragments are considered harmful in RFC-8021. An
Existing report from APNIC also shows that atomic fragments are more
likely to be dropped even it is equivalent to a no-op [1].
The series contains following changes on IPv6 output:
* drop dst_allfrag check, which is always false now
* refactor __ip6_finish_output code to separate GSO and non-GSO packet processing,
mirroring IPv4 side logic
* avoid generating atomic fragment on GSO packets
Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
change log:
V2 -> V3: split the changes to separate commits as Willem de Bruijn suggested
V1 is incorrect and omitted
V2: https://lore.kernel.org/netdev/ZS1%[email protected]/
Yan Zhai (3):
ipv6: remove dst_allfrag test on ipv6 output
ipv6: refactor ip6_finish_output for GSO handling
ipv6: avoid atomic fragment on GSO packets
net/ipv6/ip6_output.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)
--
2.30.2
When the ipv6 stack output a GSO packet, if its gso_size is larger than
dst MTU, then all segments would be fragmented. However, it is possible
for a GSO packet to have a trailing segment with smaller actual size
than both gso_size as well as the MTU, which leads to an "atomic
fragment". Atomic fragments are considered harmful in RFC-8021. An
Existing report from APNIC also shows that atomic fragments are more
likely to be dropped even it is equivalent to a no-op [1].
Add an extra check in the GSO slow output path. For each segment from
the original over-sized packet, if it fits with the path MTU, then avoid
generating an atomic fragment.
Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
Reported-by: David Wragg <[email protected]>
Signed-off-by: Yan Zhai <[email protected]>
---
net/ipv6/ip6_output.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3270d56b5c37..3d4e8edaa10b 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -162,7 +162,13 @@ ip6_finish_output_gso_slowpath_drop(struct net *net, struct sock *sk,
int err;
skb_mark_not_on_list(segs);
- err = ip6_fragment(net, sk, segs, ip6_finish_output2);
+ /* Last GSO segment can be smaller than gso_size (and MTU).
+ * Adding a fragment header would produce an "atomic fragment",
+ * which is considered harmful (RFC-8021). Avoid that.
+ */
+ err = segs->len > mtu ?
+ ip6_fragment(net, sk, segs, ip6_finish_output2) :
+ ip6_finish_output2(net, sk, segs);
if (err && ret == 0)
ret = err;
}
--
2.30.2