From: Eric Dumazet
Date: Mon, 30 May 2022 08:28:23 -0700
Subject: Re: [RFC] EADDRINUSE from bind() on application restart after killing
To: Muhammad Usama Anjum
Cc: "David S. Miller", Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski, Paolo Abeni, Gabriel Krisman Bertazi, open list, Collabora Kernel ML, Paul Gofman, "open list:NETWORKING [TCP]", Sami Farin
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 30, 2022 at 6:15 AM Muhammad Usama Anjum wrote:
>
> Hi,
>
> Thank you for your reply.
>
> On 5/25/22 3:13 AM, Eric Dumazet wrote:
> > On Tue, May 24, 2022 at 1:19 AM Muhammad Usama Anjum
> > wrote:
> >>
> >> Hello,
> >>
> >> We have a set of processes which talk with each other through a local
> >> TCP socket. If the process(es) are killed (through SIGKILL) and
> >> restarted at once, bind() fails with an EADDRINUSE error. This error
> >> only appears if the application is restarted at once, without waiting
> >> for 60 seconds or more. It seems that there is a timeout of 60 seconds
> >> during which the previous TCP connection remains alive, waiting to be
> >> closed completely. If we try to connect again within that period, we
> >> get the error.
> >>
> >> We can avoid this error by setting the SO_REUSEADDR option on the
> >> socket as a hack. But this hack cannot be added to the application
> >> process, as we don't own it.
> >>
> >> I've looked at the TCP connection states after killing processes in
> >> different ways. The TCP connection ends up in 2 different states with
> >> timeouts:
> >>
> >> (1) Timeout associated with the FIN_WAIT_2 state, which is set through
> >> `tcp_fin_timeout` in procfs (60 seconds by default)
> >>
> >> (2) Timeout associated with the TIME_WAIT state, which cannot be
> >> changed. It seems like this timeout has come from RFC 1337.
> >>
> >> The timeout in (1) can be changed. The timeout in (2) cannot be
> >> changed. It also doesn't seem feasible to change the timeout of the
> >> TIME_WAIT state, as the RFC mentions several hazards. But we are
> >> talking about a local TCP connection, where maybe those hazards aren't
> >> directly applicable? Is it possible to change the timeout of the
> >> TIME_WAIT state for local connections only, without any hazards?
> >>
> >> We have tested a hack where we take the timeout of the TIME_WAIT state
> >> from a value in procfs for local connections. This solves our problem,
> >> and the application starts to work without any modifications to it.
> >>
> >> The question is: what is the best possible solution here? Any
> >> thoughts will be very helpful.
> >>
> >
> > One solution would be to extend TCP diag to support killing TIME_WAIT sockets.
> > (This has been raised recently anyway)
> I think this has been raised here:
> https://lore.kernel.org/netdev/ba65f579-4e69-ae0d-4770-bc6234beb428@gmail.com/
>
> >
> > Then you could zap all sockets, before re-starting your program.
> >
> > ss -K -ta src :listen_port
> >
> > Untested patch:
> The following command and patch work for my use case. Sockets in the
> FIN_WAIT_2 or TIME_WAIT state are closed when zapped.
>
> Can you please upstream this patch?

Yes, I will when net-next reopens, thanks for testing it.

>
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 9984d23a7f3e1353d2e1fc9053d98c77268c577e..1b7bde889096aa800b2994c64a3a68edf3b62434 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -4519,6 +4519,15 @@ int tcp_abort(struct sock *sk, int err)
> >  		local_bh_enable();
> >  		return 0;
> >  	}
> > +	if (sk->sk_state == TCP_TIME_WAIT) {
> > +		struct inet_timewait_sock *tw = inet_twsk(sk);
> > +
> > +		refcount_inc(&tw->tw_refcnt);
> > +		local_bh_disable();
> > +		inet_twsk_deschedule_put(tw);
> > +		local_bh_enable();
> > +		return 0;
> > +	}
> >  	return -EOPNOTSUPP;
> >  }
>
> --
> Muhammad Usama Anjum
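
For readers who do control the application source, the SO_REUSEADDR workaround mentioned in the original report could look roughly like the userspace sketch below. This is only an illustration under assumptions: the helper name `listen_loopback_reuseaddr` and the choice of the IPv4 loopback address are hypothetical and do not come from the thread or from the application being discussed.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): create a TCP listener on
 * 127.0.0.1:port with SO_REUSEADDR set before bind(), so that bind()
 * is not refused only because old connections on that port are still
 * in TIME_WAIT.
 *
 * Returns the listening fd, or -1 on failure with errno set.
 */
int listen_loopback_reuseaddr(uint16_t port)
{
	struct sockaddr_in addr;
	int one = 1;
	int saved_errno;
	int fd;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* The option must be set before bind() to have any effect. */
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0)
		goto fail;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(port);

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto fail;
	if (listen(fd, SOMAXCONN) < 0)
		goto fail;

	return fd;

fail:
	/* Preserve the original error across close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return -1;
}
```

A caller would replace its plain socket()/bind()/listen() sequence with this helper. It does not help in the situation described in this thread, where the application cannot be modified, which is why zapping the leftover sockets with `ss -K` is discussed instead.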