Return-Path:
Received: from mail-lf0-f41.google.com ([209.85.215.41]:37036 "EHLO
        mail-lf0-f41.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751668AbdHRUPn (ORCPT );
        Fri, 18 Aug 2017 16:15:43 -0400
Received: by mail-lf0-f41.google.com with SMTP id f7so27129379lfg.4
        for ; Fri, 18 Aug 2017 13:15:42 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <1503068252.44656.2.camel@primarydata.com>
References: <1503068252.44656.2.camel@primarydata.com>
From: Bennett Amodio
Date: Fri, 18 Aug 2017 13:15:40 -0700
Message-ID:
Subject: Re: [RFC v3 0/2] NFSv3 and NFSv4 Multipathing
To: Trond Myklebust
Cc: "linux-nfs@vger.kernel.org", "juchang@purestorage.com",
        "anna.schumaker@netapp.com", "vas@purestorage.com",
        "igor@purestorage.com"
Content-Type: text/plain; charset="UTF-8"
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Fri, Aug 18, 2017 at 7:57 AM, Trond Myklebust wrote:
> On Tue, 2017-08-15 at 17:46 -0700, Bennett Amodio wrote:
>> After seeing Trond’s patches for NFS multipathing on NFSv4.1, we
>> decided to try using the same concept for NFSv3/4. The primary issue
>> we identified was XID collision in the duplicate request cache
>> (replay cache) for NFSv3/4. In NFSv3/4, entries are hashed based on XID
>> instead of the slot ID and sequence ID that NFSv4.1 uses. Since the
>> XIDs are generated by the RPC transports, and Trond’s patches create
>> multiple transports for multipathing, different transports can end up
>> using an overlapping set of XIDs.
>
> Why is that a problem? You should end up with connections that show
> different combinations of source IP+port and/or destination IP+port. It
> should be trivial to distinguish between XIDs.

Although the Linux NFS server hashes cache entries based on source IP
and source port as well as XID, this is not a requirement of the
NFSv3/v4 specification, so NFS server implementations may exist which
hash only based on source IP and XID.
In practice, is this uncommon enough that it's not worth addressing?

> Quite frankly, I do not want to start carving up the XID space, since a
> 32-bit number is really not that big in these days of 100GigE networks.

This is a good point, and we also think that carving up the XID space
is not a great solution. If XID collision is a problem, another
solution could be an atomic XID shared between transports which belong
to the same client. If there's no problem in the first place, that's
even better.

We thought when you said "I don't feel comfortable subjecting NFSv3/v4
replay caches to this treatment yet" that you were referring to XID
collision. Is there another potential issue with multipathing and
replay caches?

Cheers!
Bennett Amodio
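
For illustration only, here is a minimal Python sketch of the collision
discussed above and of the shared-XID idea. It is not taken from the
patches or from any real client or server: Transport,
SharedXidAllocator, the two drc_key_* functions and count_collisions
are hypothetical names, and both private counters are deliberately
started at the same value to force the overlap.

```python
import itertools
import threading


class Transport:
    """Hypothetical RPC transport with its own private XID counter."""

    def __init__(self, src_ip, src_port, first_xid=1):
        self.src_ip = src_ip
        self.src_port = src_port
        self._xids = itertools.count(first_xid)

    def next_xid(self):
        # XIDs are 32-bit values on the wire.
        return next(self._xids) & 0xFFFFFFFF


class SharedXidAllocator:
    """Hypothetical single 32-bit XID counter shared by all transports
    that belong to one client."""

    def __init__(self, first_xid=1):
        self._next = first_xid & 0xFFFFFFFF
        self._lock = threading.Lock()

    def next_xid(self):
        # The lock stands in for an atomic increment.
        with self._lock:
            xid = self._next
            self._next = (self._next + 1) & 0xFFFFFFFF
            return xid


def drc_key_ip_port_xid(src_ip, src_port, xid):
    """Replay-cache key that includes the source port."""
    return (src_ip, src_port, xid)


def drc_key_ip_xid(src_ip, src_port, xid):
    """Replay-cache key of an assumed server that ignores the port."""
    return (src_ip, xid)


def count_collisions(key_fn, requests):
    """Count requests whose key is already present in the cache."""
    seen = set()
    collisions = 0
    for req in requests:
        key = key_fn(*req)
        if key in seen:
            collisions += 1
        seen.add(key)
    return collisions


# Two transports from the same client IP, each with a private counter.
t1 = Transport("10.0.0.1", 700)
t2 = Transport("10.0.0.1", 701)
private = [(t.src_ip, t.src_port, t.next_xid())
           for _ in range(3) for t in (t1, t2)]

# The same two transports drawing XIDs from one shared allocator.
alloc = SharedXidAllocator()
shared = [(t.src_ip, t.src_port, alloc.next_xid())
          for _ in range(3) for t in (t1, t2)]

print(count_collisions(drc_key_ip_port_xid, private))
print(count_collisions(drc_key_ip_xid, private))
print(count_collisions(drc_key_ip_xid, shared))
```

With private counters, the key that includes the source port keeps the
two transports apart, while the key that ignores it treats each request
from the second transport as a replay of the first. With the shared
allocator neither key collides in this run. A shared counter does not
enlarge the 32-bit XID space; it only keeps the transports of one
client from handing out the same value until the counter wraps.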