From: Bennett Amodio
Date: Tue, 15 Aug 2017 17:46:07 -0700
Subject: [RFC v3 0/2] NFSv3 and NFSv4 Multipathing
To: linux-nfs@vger.kernel.org
Cc: anna.schumaker@netapp.com, trond.myklebust@primarydata.com,
    Igor Ostrovsky, Vas Chellappa, Jui-Yu Chang

After seeing Trond’s patches for NFS multipathing on NFSv4.1, we
decided to try using the same concept for NFSv3/4.

The primary issue we identified was XID collision in the duplicate
request cache (replay cache) for NFSv3/4. In NFSv3/4, entries are
hashed based on XID instead of the slot ID and sequence ID that NFSv4.1
uses. Since the XIDs are generated by the RPC transports, and Trond’s
patches create multiple transports for multipathing, different
transports can end up using an overlapping set of XIDs.

To fix this, we apply a mask to XIDs. Each transport is constrained to
its own segment of the total XID range, and they can never overlap. In
terms of loss of entropy, by masking out just enough bits from the XID,
we are convinced that the probability of XID wraparound or collision on
NFS client restart has not increased to a problematic level (so long as
the RPCs are distributed round-robin, as in Trond’s patches).

We tested multipathing out and discovered that it enables NFS to get
more bandwidth on a bonded interface (instead of using only one
physical link, it can use multiple). Specifically, we tested on a setup
where the client was connected to the server via 4 bonded 10Gb/s links.
Without multipathing, the client could only achieve 10Gb/s (using one
physical link). With multipathing, the client was able to achieve a
maximum of close to 40Gb/s.

However, although the maximum performance was close to 40Gb/s,
achieving an average throughput of even 30Gb/s required many
connections. The performance of individual trials had a high variance.
We traced this uneven performance to colliding network paths. With
round-robin distribution of RPCs, no single TCP connection can exceed
the performance of the slowest one. If the connections are distributed
unevenly across network paths, some connections can bottleneck others.
To solve this problem, we are currently working on patches to provide
load-balancing as an alternative to round-robin for distributing RPCs.

To use these patches, you first have to apply Trond's 5 patches
(available at https://www.spinics.net/lists/linux-nfs/msg63368.html).

Let us know what you think or if you have any ideas for improving this.

Jui-Yu Chang (1):
  NFS: Allow multiple connections to NFSv3 and NFSv4.0 servers

Bennett Amodio (1):
  SUNRPC: Mask XIDs to prevent replay cache collision

 fs/nfs/client.c             |  3 +++
 fs/nfs/nfs4client.c         |  2 +-
 include/linux/sunrpc/xprt.h |  5 +++++
 net/sunrpc/clnt.c           |  8 ++++++++
 net/sunrpc/xprt.c           | 14 ++++++--------
 5 files changed, 23 insertions(+), 9 deletions(-)

-- 
1.9.1
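
For readers who want a concrete picture of the XID masking described
in this cover letter, here is a minimal userspace sketch. It is not the
code from patch 2/2. The struct, the function names, the choice of the
high bits for the transport segment, and the width of 4 bits are all
assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: number of high XID bits reserved to name the transport. */
#define XID_TRANSPORT_BITS 4u
#define XID_COUNTER_BITS   (32u - XID_TRANSPORT_BITS)
#define XID_COUNTER_MASK   ((UINT32_C(1) << XID_COUNTER_BITS) - 1u)

/* Hypothetical stand-in for the per-transport state that hands out XIDs. */
struct sketch_xprt {
    uint32_t index;   /* transport number, must fit in XID_TRANSPORT_BITS */
    uint32_t counter; /* free-running counter; a real client would seed it randomly */
};

/*
 * Allocate the next XID for this transport. The low bits come from the
 * counter (masked so they cannot spill into the segment bits) and the
 * high bits are fixed to the transport index, so two transports with
 * different indexes can never produce the same XID.
 */
static uint32_t sketch_alloc_xid(struct sketch_xprt *xprt)
{
    assert(xprt->index < (UINT32_C(1) << XID_TRANSPORT_BITS));

    uint32_t low = xprt->counter++ & XID_COUNTER_MASK;
    return (xprt->index << XID_COUNTER_BITS) | low;
}

/* Recover the segment (transport index) that an XID belongs to. */
static uint32_t sketch_xid_segment(uint32_t xid)
{
    return xid >> XID_COUNTER_BITS;
}
```

Under these assumptions, 4 reserved bits allow up to 16 transports, and
each transport cycles through 2^28 XIDs before its counter wraps. That
is the trade-off mentioned in the cover letter: every bit given to the
segment halves the per-transport XID space.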