Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp130758pxb; Wed, 20 Oct 2021 18:13:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxdSV9qGM/ADlc5GrT6iTxc9yEaMZSZ5+rfVw0JVVrkacyJr5yK9F2l9gp376D6TwM3QGfY X-Received: by 2002:a17:906:646:: with SMTP id t6mr3403162ejb.197.1634778837688; Wed, 20 Oct 2021 18:13:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1634778837; cv=none; d=google.com; s=arc-20160816; b=LYU2rlTKxvnIW7mKa/IZPpL6szgIB3wLYIRjMM/R0AsUvcuRFgrydZqxV0gv6n7zGT kTRqze51P0RqTCdy0Dq65Dmrsgw7OOkiro8uPo2zwwD3PT65N4vvuDnkpZoHTywWO8mL 5ocx6LHvZcNSEOU7hBsLtKlhtAGC6I000xQBdypce4N2yzog8vHJGw5jLVgI0qOoM+6t sy77O03EHWVl1o4iy+Gge2KlW3qI+u/v87U+6MtcmGEdRc/8Uo8Dx1OLuN8KBhVi58pz IIjESODgKDBWb2Ry5DPuQwPqpJwm5woWqGEwktl4E1HkPnwQUhTgh/+p1e+YAya9J1eY ztTg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=59+7vYZvVeE11Od3F0Lfi0fRFVyktXzOm+nlhxMfCew=; b=D7oysMoAjulgTQbknpgaxRrBD8POQgmIsnQxnDQBlbz0V2u4CVv/i5iqVgNCk3A8CW jYfcEs2lLpFnL/qNLX58XiB83midEVTJGaGfOAhtiYcTMiRvjPjKnNlC/FLzgjux2ACn GnHy7KT+7Fmzh0uoUoWlSgECUtbF2UArdRhlGbEkaRp50wNSQwjL+lVJdArD/vu08DbI ezuKX+0de+cvUFOSzvLaNTR5MScXD0aSm4p2YCe9lRwF9AEB//ByPZ/LnRE7oqzUEuJ5 gmi0SHLZij5eDuq8Gm/Ye+wfXg9AfTdLvdQByRrhajNbpKiHyjq2l+tYWBjMvWHeoIkt Z9aQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=UYB4ksJC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id sd37si5884799ejc.13.2021.10.20.18.13.26; Wed, 20 Oct 2021 18:13:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=UYB4ksJC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231137AbhJUBNZ (ORCPT + 99 others); Wed, 20 Oct 2021 21:13:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56498 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229702AbhJUBNX (ORCPT ); Wed, 20 Oct 2021 21:13:23 -0400 Received: from mail-yb1-xb33.google.com (mail-yb1-xb33.google.com [IPv6:2607:f8b0:4864:20::b33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6D5AC061749 for ; Wed, 20 Oct 2021 18:11:06 -0700 (PDT) Received: by mail-yb1-xb33.google.com with SMTP id b9so4292085ybc.5 for ; Wed, 20 Oct 2021 18:11:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=59+7vYZvVeE11Od3F0Lfi0fRFVyktXzOm+nlhxMfCew=; b=UYB4ksJC2CmvyFAuJCYud6lA9hfnrEV6XZHX0QgNcPhyM5FELde1LLAtnhak5w3aW+ 1cyOIj1hA/EzOuaNHEYcA7BU6v7oP3BWfKwdWEJEDVxCPYR2LslnfUJEh0K++9UjFrWe d585HpM1Zxmf5Ww012naPzNOkB5NvVOv9dmphh4e8hFnqCjVnf5UH2wY95QbSi7xgrL+ 1M1ck4miCcnCml8A2s0HXVhaE5xDLqv38IVzEdT1m4YESSf7aKuZboeDj1en2ag0l0Cp TFApW+hyLKxGt42UW3f9j8ptkFI53Bxssxk3aaNky5hb8R07Ovls+0agdWqwpjjPlQXV tXOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=59+7vYZvVeE11Od3F0Lfi0fRFVyktXzOm+nlhxMfCew=; b=vz2bqI5mVAuYvN8qK0EcG8ufhMHu6X8Ae/o+WcIr9l2SWpH5CVm+kFCPGJch1vfwwE LsG/MmyJGeCYBDUB3gbjKRMqgKMkEyE5fGDkHefrvOd0ZgwlcLBL/acbLvd07EJ1sY09 Aw81boRnxvLMAv3aUcN+Tf4KHet0Xar7Arn6JiIL+fnjdpMzl3J4J7LIFNBQbLMdP/+T 2Wq9wMUWZpQj3bKPScohmSvjH0S0q9MO1QCdNAR5gZQTNipWAE+OL7/YB4lEXKXJUPpQ vHyZze8z9SOhOPFjhK22JpC5VLO8DQZhHQeCFSIXeAP/Ck6hDQ000SxEYtVsF42Zz17L 0SQw== X-Gm-Message-State: AOAM5324PqMU3Z0mrSGOXkCKPjUh8/739+2cQiw5HUEI/UtS5aMUHbUz u4eiMsAhwvnR11B5nYE7ak77cudwSye30OZFbenP4w== X-Received: by 2002:a25:918e:: with SMTP id w14mr2654210ybl.225.1634778665329; Wed, 20 Oct 2021 18:11:05 -0700 (PDT) MIME-Version: 1.0 References: <20211020232447.9548-1-jmaxwell37@gmail.com> In-Reply-To: <20211020232447.9548-1-jmaxwell37@gmail.com> From: Eric Dumazet Date: Wed, 20 Oct 2021 18:10:54 -0700 Message-ID: Subject: Re: [net-next] tcp: don't free a FIN sk_buff in tcp_remove_empty_skb() To: Jon Maxwell Cc: David Miller , Hideaki YOSHIFUJI , David Ahern , Jakub Kicinski , netdev , LKML Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 20, 2021 at 4:25 PM Jon Maxwell wrote: > > A customer reported sockets stuck in the CLOSING state. A Vmcore revealed= that > the write_queue was not empty as determined by tcp_write_queue_empty() bu= t the > sk_buff containing the FIN flag had been freed and the socket was zombied= in > that state. Corresponding pcaps show no FIN from the Linux kernel on the = wire. > > Some instrumentation was added to the kernel and it was found that there = is a > timing window where tcp_sendmsg() can run after tcp_send_fin(). > > tcp_sendmsg() will hit an error, for example: > > 1269 =E2=96=B9 if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))= =E2=86=A9 > 1270 =E2=96=B9 =E2=96=B9 goto do_error;=E2=86=A9 > > tcp_remove_empty_skb() will then free the FIN sk_buff as "skb->len =3D=3D= 0". The > TCP socket is now wedged in the FIN-WAIT-1 state because the FIN is never= sent. > > If the other side sends a FIN packet the socket will transition to CLOSIN= G and > remain that way until the system is rebooted. > > Fix this by checking for the FIN flag in the sk_buff and don't free it if= that > is the case. Testing confirmed that fixed the issue. > > Fixes: fdfc5c8594c2 ("tcp: remove empty skb from write queue in error cas= es") > Signed-off-by: Jon Maxwell > --- > net/ipv4/tcp.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c > index c2d9830136d2..d2b06d8f0c37 100644 > --- a/net/ipv4/tcp.c > +++ b/net/ipv4/tcp.c > @@ -938,7 +938,7 @@ int tcp_send_mss(struct sock *sk, int *size_goal, int= flags) > */ > void tcp_remove_empty_skb(struct sock *sk, struct sk_buff *skb) > { > - if (skb && !skb->len) { > + if (skb && !skb->len && !TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)= { > tcp_unlink_write_queue(skb, sk); > if (tcp_write_queue_empty(sk)) > tcp_chrono_stop(sk, TCP_CHRONO_BUSY); > Very nice catch ! The FIN flag is a really special case here. What we need is to make sure the skb is 'empty' . What about using a single condition ? if (skb && TCP_SKB_CB(skb)->seq =3D=3D TCP_SKB_CB(skb)->end_seq)