From: Neal Cardwell
Date: Fri, 1 Apr 2022 11:39:24 -0400
Subject: Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
To: Jaco Kroon
Cc: Eric Dumazet, LKML, Netdev, Yuchung Cheng, Wei Wang
References: <10c1e561-8f01-784f-c4f4-a7c551de0644@uls.co.za> <5f1bbeb2-efe4-0b10-bc76-37eff30ea905@uls.co.za>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 29, 2022 at 9:03 PM Jaco wrote:
...
> Connection setup:
>
> 00:56:17.055481 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [S], seq 956633779, win 62580, options [mss 8940,nop,nop,TS val 3687705482 ecr 0,nop,wscale 7,tfo cookie f025dd84b6122510,nop,nop], length 0
>
> 00:56:17.217747 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [S.], seq 726465675, ack 956633780, win 65535, options [mss 1440,nop,nop,TS val 3477429218 ecr 3687705482,nop,wscale 8], length 0
>
> 00:56:17.218628 IP6 2a00:1450:400c:c07::1b.25 > 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110: Flags [P.], seq 726465676:726465760, ack 956633780, win 256, options [nop,nop,TS val 3477429220 ecr 3687705482], length 84: SMTP: 220 mx.google.com ESMTP e16-20020a05600c4e5000b0038c77be9b2dsi226281wmq.72 - gsmtp
>
> 00:56:17.218663 IP6 2c0f:f720:0:3:d6ae:52ff:feb8:f27b.59110 > 2a00:1450:400c:c07::1b.25: Flags [.], ack 726465760, win 489, options [nop,nop,TS val 3687705645 ecr 3477429220], length 0
>
> This is pretty normal, we advertise an MSS of 8940 and the return is 1440, thus
> we shouldn't send segments larger than that, and they "can't". I need to
> determine if this is some form of offloading or they really are sending >1500
> byte frames (which I know won't pass our firewalls without fragmentation so
> probably some form of NIC offloading - which if it was active on older 5.8
> kernels did not cause problems):

Jaco, was there some previous kernel version on these client machines
where this problem did not show up? Perhaps the v5.8 version you
mention here? Can you please share the exact version number?

If so, a hypothesis would be:

(1) There is a bug in netfilter's handling of TFO connections where
the server sends a data packet after a TFO SYNACK, before the client
ACKs anything (as we see in this trace).
This bug is perhaps similar in character to the bug fixed by Yuchung's
2013 commit that Eric mentioned:

  356d7d88e088687b6578ca64601b0a2c9d145296
  netfilter: nf_conntrack: fix tcp_in_window for Fast Open

(2) With kernel v5.8, TFO blackhole detection detected that in your
workload there were TFO connections that died due to apparent
blackholing (like what's shown in the trace), and dynamically disabled
TFO on your machines. This allowed mail traffic to flow, because the
netfilter bug was no longer tickled. This worked around the netfilter
bug.

(3) You upgraded your client-side machine from v5.8 to v5.17, which
has the following commit from v5.14, which disables TFO blackhole
logic by default:

  213ad73d0607 tcp: disable TFO blackhole logic by default

(4) Due to (3), the blackhole detection logic was no longer operative,
and when the netfilter bug blackholed the connection, TFO stayed
enabled. This caused mail traffic to Google to stall.

This hypothesis would explain why:
  o disabling TFO fixes this problem
  o you are seeing this with a newer kernel (and apparently not with a
    kernel before v5.14?)

With this hypothesis, we need several pieces to trigger this:

(a) client side software that tries TFO to a server that supports TFO
(like the exim mail transfer agent you are using, connecting to
Google)

(b) a client-side Linux kernel running buggy netfilter code (you are
running netfilter)

(c) a client-side Linux kernel with TFO support but no blackhole
detection logic active (e.g. v5.14 or later, like your v5.17.1)

That's probably a rare combination, so would explain why we have not
had this report before.
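
To make conditions (b) and (c) a bit more concrete, here is a rough
sketch of how one might check whether a given client machine is in the
exposed configuration. The helper names are made up for illustration,
and using /proc/sys/net/netfilter/nf_conntrack_count as a sign that
conntrack is loaded is an assumption on my part. Condition (a) depends
on the application and is not checked here.

```shell
#!/bin/sh
# Sketch only: checks conditions (b) and (c) from the list above on this host.
# All function names are hypothetical.

# Client-side TFO is on when bit 0x1 of net.ipv4.tcp_fastopen is set.
tfo_client_enabled() {
    [ $(( $1 & 1 )) -eq 1 ]
}

# Blackhole detection is inactive when the timeout is 0
# (the default since 213ad73d0607 in v5.14).
blackhole_detection_off() {
    [ "$1" -eq 0 ]
}

# Print the first line of a file, or the given fallback if it is unreadable.
read_or() {
    if [ -r "$1" ]; then head -n 1 "$1"; else echo "$2"; fi
}

tfo=$(read_or /proc/sys/net/ipv4/tcp_fastopen 0)
timeout=$(read_or /proc/sys/net/ipv4/tcp_fastopen_blackhole_timeout_sec 0)

# Assumption: this sysctl file exists only while nf_conntrack is loaded.
if [ -e /proc/sys/net/netfilter/nf_conntrack_count ]; then
    conntrack=yes
else
    conntrack=no
fi

echo "tcp_fastopen=$tfo blackhole_timeout_sec=$timeout conntrack=$conntrack"

if tfo_client_enabled "$tfo" && blackhole_detection_off "$timeout" \
        && [ "$conntrack" = yes ]; then
    echo "matches (b) and (c): client TFO on, no blackhole detection, conntrack loaded"
else
    echo "does not match (b) and (c)"
fi
```

A match only says the machine is in the configuration the hypothesis
needs; it does not show that the netfilter bug is actually being hit.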

Jaco, to provide some evidence for this hypothesis, can you please
re-enable fastopen but also enable the TFO blackhole detection that
was disabled in v5.14 (213ad73d0607), with something like:

  sysctl -w net.ipv4.tcp_fastopen=1
  sysctl -w net.ipv4.tcp_fastopen_blackhole_timeout_sec=3600

And then after a few hours, check to see if this blackholing behavior
has been detected:

  nstat -az | grep -i blackhole

And see if TFO FastOpenActive attempts have been cut to a super-low rate:

  nstat -az | grep -i fastopenactive

thanks,
neal
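
If it is easier to check repeatedly, the two nstat commands above can be
wrapped in a small script along these lines. This is only a sketch: the
function name counter_value is made up, and it assumes the usual
three-column `nstat -az` output (counter name, value, rate) and the
TcpExt counter names TCPFastOpenBlackhole, TCPFastOpenActive and
TCPFastOpenActiveFail.

```shell
#!/bin/sh
# counter_value NAME: read `nstat -az` style text on stdin and print the
# value column for the counter called NAME (prints nothing if absent).
# The function name is hypothetical.
counter_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

if command -v nstat >/dev/null 2>&1; then
    # -a: absolute counter values, -z: include counters that are zero.
    stats=$(nstat -az 2>/dev/null) || stats=""
    for c in TcpExtTCPFastOpenBlackhole \
             TcpExtTCPFastOpenActive \
             TcpExtTCPFastOpenActiveFail; do
        echo "$c: $(printf '%s\n' "$stats" | counter_value "$c")"
    done
else
    echo "nstat not found (it ships with iproute2)"
fi
```

A nonzero TcpExtTCPFastOpenBlackhole together with a TcpExtTCPFastOpenActive
count that stops growing would be consistent with the detection logic
having kicked in.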