From: Yan Zhai
Date: Mon, 16 Oct 2023 13:27:06 -0500
Subject: Re: [PATCH v2 net-next] ipv6: avoid atomic fragment on GSO packets
To: netdev@vger.kernel.org
Cc: "David S. Miller", David Ahern, Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, Aya Levin, Tariq Toukan, linux-kernel@vger.kernel.org,
    kernel-team@cloudflare.com, Florian Westphal, Willem de Bruijn

On Mon, Oct 16, 2023 at 1:23 PM Yan Zhai wrote:
>
> GSO packets can contain a trailing segment that is smaller than
> gso_size. When examining the dst MTU for such packet, if its gso_size is
> too large, then all segments would be fragmented. However, there is a
> good chance the trailing segment has smaller actual size than both
> gso_size as well as the MTU, which leads to an "atomic fragment". It is
> considered harmful in RFC-8021. An Existing report from APNIC also shows
> that atomic fragments are more likely to be dropped even it is
> equivalent to a no-op [1].
>
> Refactor __ip6_finish_output code to separate GSO and non-GSO packet
> processing. It mirrors __ip_finish_output logic now. Add an extra check
> in GSO handling to avoid atomic fragments. Lastly, drop dst_allfrag
> check, which is no longer true since commit 9d289715eb5c ("ipv6: stop
> sending PTB packets for MTU < 1280").
>
> Link: https://www.potaroo.net/presentations/2022-03-01-ipv6-frag.pdf [1]
> Fixes: b210de4f8c97 ("net: ipv6: Validate GSO SKB before finish IPv6 processing")
> Suggested-by: Florian Westphal
> Reported-by: David Wragg
> Signed-off-by: Yan Zhai
> ---

Forgot to link the v1 thread:
https://lore.kernel.org/lkml/20231002171146.GB9274@breakpoint.cc/
That version was implemented incorrectly, though: it did not consider
frag_max_size for non-GSO packets, so it is not really useful.

>  net/ipv6/ip6_output.c | 33 +++++++++++++++++++++++----------
>  1 file changed, 23 insertions(+), 10 deletions(-)
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index a471c7e91761..1de6f3c11655 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -162,7 +162,14 @@ ip6_finish_output_gso_slowpath_drop(struct net *net, struct sock *sk,
>                 int err;
>
>                 skb_mark_not_on_list(segs);
> -               err = ip6_fragment(net, sk, segs, ip6_finish_output2);
> +               /* Last gso segment might be smaller than actual MTU. Adding
> +                * a fragment header to it would produce an "atomic fragment",
> +                * which is considered harmful (RFC-8021)
> +                */
> +               err = segs->len > mtu ?
> +                       ip6_fragment(net, sk, segs, ip6_finish_output2) :
> +                       ip6_finish_output2(net, sk, segs);
> +
>                 if (err && ret == 0)
>                         ret = err;
>         }
> @@ -170,10 +177,19 @@ ip6_finish_output_gso_slowpath_drop(struct net *net, struct sock *sk,
>         return ret;
>  }
>
> +static int ip6_finish_output_gso(struct net *net, struct sock *sk,
> +                                 struct sk_buff *skb, unsigned int mtu)
> +{
> +       if (!(IP6CB(skb)->flags & IP6SKB_FAKEJUMBO) &&
> +           !skb_gso_validate_network_len(skb, mtu))
> +               return ip6_finish_output_gso_slowpath_drop(net, sk, skb, mtu);
> +
> +       return ip6_finish_output2(net, sk, skb);
> +}
> +
>  static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff *skb)
>  {
>         unsigned int mtu;
> -
>  #if defined(CONFIG_NETFILTER) && defined(CONFIG_XFRM)
>         /* Policy lookup after SNAT yielded a new policy */
>         if (skb_dst(skb)->xfrm) {
> @@ -183,17 +199,14 @@ static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff
>  #endif
>
>         mtu = ip6_skb_dst_mtu(skb);
> -       if (skb_is_gso(skb) &&
> -           !(IP6CB(skb)->flags & IP6SKB_FAKEJUMBO) &&
> -           !skb_gso_validate_network_len(skb, mtu))
> -               return ip6_finish_output_gso_slowpath_drop(net, sk, skb, mtu);
> +       if (skb_is_gso(skb))
> +               return ip6_finish_output_gso(net, sk, skb, mtu);
>
> -       if ((skb->len > mtu && !skb_is_gso(skb)) ||
> -           dst_allfrag(skb_dst(skb)) ||
> +       if (skb->len > mtu ||
>             (IP6CB(skb)->frag_max_size && skb->len > IP6CB(skb)->frag_max_size))
>                 return ip6_fragment(net, sk, skb, ip6_finish_output2);
> -       else
> -               return ip6_finish_output2(net, sk, skb);
> +
> +       return ip6_finish_output2(net, sk, skb);
>  }
>
>  static int ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> --
> 2.30.2
>
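
The per-segment decision added to the GSO slow path can be modeled in
plain userspace C. This is only a sketch and not kernel code: the
names seg_action, seg_decide and count_fragmented are hypothetical and
exist purely for illustration. The two enum values stand in for the
ip6_finish_output2() and ip6_fragment() branches of the new ternary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome of the per-segment decision; not a kernel type. */
enum seg_action {
	SEG_SEND_AS_IS,	/* models the ip6_finish_output2() branch */
	SEG_FRAGMENT	/* models the ip6_fragment() branch */
};

/* Only a segment that actually exceeds the MTU takes the fragment branch. */
static enum seg_action seg_decide(unsigned int seg_len, unsigned int mtu)
{
	return seg_len > mtu ? SEG_FRAGMENT : SEG_SEND_AS_IS;
}

/* Count how many of the n segment lengths would be fragmented. */
static size_t count_fragmented(const unsigned int *seg_len, size_t n,
			       unsigned int mtu)
{
	size_t i, count = 0;

	assert(seg_len != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (seg_decide(seg_len[i], mtu) == SEG_FRAGMENT)
			count++;
	return count;
}
```

With made-up segment lengths of 1500, 1500 and 300 bytes and an MTU of
1400, count_fragmented() returns 2: the 300-byte trailing segment is
sent unmodified instead of gaining a fragment header it does not need.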