From: NeilBrown
To: Tom Talpey, Olga Kornievskaia
Cc: Chuck Lever, Anna Schumaker, Trond Myklebust, linux-nfs
Date: Fri, 31 May 2019 12:31:02 +1000
Subject: Re: [PATCH 0/9] Multiple network connections for a single NFS mount.
Message-ID: <87ef4fxsm1.fsf@notabene.neil.brown.name>

On Thu, May 30 2019, Tom Talpey wrote:

> On 5/30/2019 6:38 PM, NeilBrown wrote:
>> On Thu, May 30 2019, Tom Talpey wrote:
>>
>>> On 5/30/2019 1:20 PM, Olga Kornievskaia wrote:
>>>> On Thu, May 30, 2019 at 1:05 PM Tom Talpey wrote:
>>>>>
>>>>> On 5/29/2019 8:41 PM, NeilBrown wrote:
>>>>>> I've also re-arranged the patches a bit, merged two, and removed
>>>>>> the restriction to TCP and NFSv4.x, x>=1. Discussions seemed to
>>>>>> suggest these restrictions were not needed; I can see no need.
>>>>>
>>>>> I believe the need is for the correctness of retries. Because NFSv2,
>>>>> NFSv3 and NFSv4.0 have no exactly-once semantics of their own, server
>>>>> duplicate request caches are important (although often imperfect).
>>>>> These caches use client XIDs, source ports and addresses, sometimes
>>>>> in addition to other methods, to detect a retry. Existing clients are
>>>>> careful to reconnect with the same source port, to ensure this. And
>>>>> existing servers won't change.
>>>>
>>>> Retries are already bound to the same connection, so there shouldn't
>>>> be an issue of a retransmission coming from a different source port.
>>>
>>> So, there's no path redundancy? If any connection is lost and can't
>>> be re-established, the requests on that connection will time out?
>>
>> Path redundancy happens lower down in the stack.
>> Presumably a bonding
>> driver will divert flows to a working path when one path fails.
>> NFS doesn't see paths at all. It just sees TCP connections - each with
>> the same source and destination address. How these are associated,
>> from time to time, with different hardware is completely transparent
>> to NFS.
>
> But you don't propose to constrain this to bonded connections. So NFS
> will create connections on whatever collection of NICs is locally
> available, and if these aren't bonded, well, the issues become visible.

If a client had multiple network interfaces with different addresses, and several of them had routes to the selected server IP, then this might result in the multiple connections to the server having different local addresses (as well as different local ports). I don't know the network layer well enough to be sure this is possible, but it seems credible.

If one of those interfaces then went down, and there was no automatic routing reconfiguration in place to restore connectivity through a different interface, then the TCP connection would time out and break. The xprt would then try to reconnect using the same source port and destination address; it doesn't provide an explicit source address, but lets the network layer choose one. This would presumably result in a connection with a different source address, so requests would continue to flow on the xprt, but they might miss the DRC, as the source address would be different.

If you have a configuration like this (a multi-homed client with multiple interfaces that can reach the server with equal weight), then you already have a possible problem of missing the DRC when one interface goes down and a new connection is established from another one. nconnect doesn't change that. So I still don't see any problem.

If I've misunderstood you, please provide a detailed description of the sort of configuration where you think a problem might arise.

> BTW, RDMA NICs are never bonded.
I've come across the concept of "Multi-Rail", but I cannot say that I fully understand it yet. I suspect you would need more than nconnect to make proper use of multi-rail RDMA.

Thanks,
NeilBrown