From: Olga Kornievskaia
Date: Sat, 30 Sep 2023 18:36:45 -0400
Subject: Re: [PATCH] SUNRPC: Don't retry using the same source port if connection failed
To: Trond Myklebust
Cc: "linux-nfs@vger.kernel.org"
References: <20230927192712.317799-1-trondmy@kernel.org>

On Fri, Sep 29, 2023 at 10:57 PM Trond Myklebust wrote:
>
> On Thu, 2023-09-28 at 10:58 -0400, Olga Kornievskaia wrote:
> > On Wed, Sep 27, 2023 at 3:35 PM wrote:
> > >
> > > From: Trond Myklebust
> > >
> > > If the TCP connection attempt fails without ever establishing a
> > > connection, then assume the problem may be the server is rejecting us
> > > due to port reuse.
> >
> > Doesn't this break 4.0 replay cache? Seems too general to assume that
> > any unsuccessful SYN was due to a server reboot and it's ok for the
> > client to change the port.
>
> This is where things get interesting. Yes, if we change the port
> number, then it will almost certainly break NFSv3 and NFSv4.0 replay
> caching on the server.
>
> However the problem is that once we get stuck in the situation where we
> cannot connect, then each new connection attempt is just causing the
> server's TCP layer to push back and recall that the connection from
> this port was closed.
> IOW: the problem is that once we're in this situation, we cannot easily
> exit without doing one of the following. Either we have to
>
>    1. Change the port number, so that the TCP layer allows us to
>       connect.
>    2. Or.. Wait for long enough that the TCP layer has forgotten
>       altogether about the previous connection.
>
> The problem is that option (2) is subject to livelock, and so has a
> potential infinite time out. I've seen this livelock in action, and I'm
> not seeing a solution that has predictable results.
>
> So unless there is a solution for the problems in (2), I don't see how
> we can avoid defaulting to option (1) at some point, in which case the
> only question is "when do we switch ports?".

I'm not sure how one can justify that the regression that will come out
of #1 will be less of a problem than the problem in #2. I think I'm
still not grasping why the NFS server would (legitimately) be closing a
connection that is re-using the port. Can you present a sequence of
events that would lead to this?

But can't we at least guard against unnecessarily breaking the replay
cache by imposing some timeout/number of retries before resetting the
port? If the client has been unsuccessfully retrying to re-establish the
connection for a (fixed) while, then the 4.0 client's lease would
expire, and switching the port after the lease expires makes no
difference. There isn't a solution in v3, unfortunately. But a
time-based approach would at least separate these 'peculiar' servers
from normal servers. And if this is a 4.1 client, we can reset the port
without a timeout.

Am I correct that every unsuccessful SYN causes a new source port to be
taken? If so, then a server reboot where multiple SYNs are sent prior to
connection re-establishment (times the number of mounts) might cause
source port exhaustion?

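To make the time-based idea concrete, here is a rough, untested
userspace sketch of the decision I have in mind. None of these names
(struct reconnect_state, should_reset_srcport(), fallback_secs) exist in
xprtsock.c; they are purely illustrative. The v3 fallback timeout in
particular is an assumption on my part: it only bounds the wait and
still risks breaking the replay cache.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical protocol tag; not a kernel type. */
enum nfs_proto {
	PROTO_NFS_V3,
	PROTO_NFS_V4_0,
	PROTO_NFS_V4_1,
};

/* Hypothetical per-transport bookkeeping; not part of struct sock_xprt. */
struct reconnect_state {
	enum nfs_proto proto;
	uint64_t first_failure_secs;	/* time of the first failed connect */
	uint64_t lease_secs;		/* NFSv4.0 lease time */
	uint64_t fallback_secs;		/* assumed fixed timeout for v3 */
};

/*
 * Decide whether a failed connect should drop the current source port.
 *  - 4.1: reset immediately, no timeout needed.
 *  - 4.0: keep the port until the lease would have expired anyway.
 *  - v3:  no safe point exists; fall back to a fixed timeout.
 */
static inline bool should_reset_srcport(const struct reconnect_state *st,
					uint64_t now_secs)
{
	uint64_t elapsed = 0;

	/* Guard against a clock value earlier than the first failure. */
	if (now_secs > st->first_failure_secs)
		elapsed = now_secs - st->first_failure_secs;

	switch (st->proto) {
	case PROTO_NFS_V4_1:
		return true;
	case PROTO_NFS_V4_0:
		return elapsed >= st->lease_secs;
	case PROTO_NFS_V3:
	default:
		return elapsed >= st->fallback_secs;
	}
}
```

With something like this, a 4.0 mount with a 90 second lease would keep
retrying from the same port for 90 seconds after the first failure and
only then pick a new one, while a 4.1 mount would switch right away.
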
> > >
> > > Signed-off-by: Trond Myklebust
> > > ---
> > >  net/sunrpc/xprtsock.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > index 71848ab90d13..1a96777f0ed5 100644
> > > --- a/net/sunrpc/xprtsock.c
> > > +++ b/net/sunrpc/xprtsock.c
> > > @@ -62,6 +62,7 @@
> > >  #include "sunrpc.h"
> > >
> > >  static void xs_close(struct rpc_xprt *xprt);
> > > +static void xs_reset_srcport(struct sock_xprt *transport);
> > >  static void xs_set_srcport(struct sock_xprt *transport, struct socket *sock);
> > >  static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
> > >  		struct socket *sock);
> > > @@ -1565,8 +1566,10 @@ static void xs_tcp_state_change(struct sock *sk)
> > >  		break;
> > >  	case TCP_CLOSE:
> > >  		if (test_and_clear_bit(XPRT_SOCK_CONNECTING,
> > > -					&transport->sock_state))
> > > +					&transport->sock_state)) {
> > > +			xs_reset_srcport(transport);
> > >  			xprt_clear_connecting(xprt);
> > > +		}
> > >  		clear_bit(XPRT_CLOSING, &xprt->state);
> > >  		/* Trigger the socket release */
> > >  		xs_run_error_worker(transport, XPRT_SOCK_WAKE_DISCONNECT);
> > > @@ -1722,6 +1725,11 @@ static void xs_set_port(struct rpc_xprt *xprt, unsigned short port)
> > >  	xs_update_peer_port(xprt);
> > >  }
> > >
> > > +static void xs_reset_srcport(struct sock_xprt *transport)
> > > +{
> > > +	transport->srcport = 0;
> > > +}
> > > +
> > >  static void xs_set_srcport(struct sock_xprt *transport, struct socket *sock)
> > >  {
> > >  	if (transport->srcport == 0 && transport->xprt.reuseport)
> > > --
> > > 2.41.0
> > >
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com