Return-Path:
Received: from mails1n2-route0.email.arizona.edu ([128.196.130.79]:12176
        "EHLO mails1n2-route0.email.arizona.edu" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1725908AbeGMWtO (ORCPT );
        Fri, 13 Jul 2018 18:49:14 -0400
Subject: Re: RDMA connection closed and not re-opened
References: <4A72535B-E6D2-4E8A-B6DB-BF09856A41EB@gmail.com>
        <19cd3809-669b-2d63-d453-ed553c9e01a9@genome.arizona.edu>
        <57cf42c5-d12d-fff3-fd77-0d191d32111e@genome.arizona.edu>
        <9b0802b9-ad7c-0969-6087-9f2aef703143@genome.arizona.edu>
        <0423D037-63F9-4BA6-882A-CBD9EBC630F2@oracle.com>
To: Linux NFS Mailing List
From: admin@genome.arizona.edu
Message-ID: <5b08ea1b-4cde-c432-92cc-04eff469ed54@genome.arizona.edu>
Date: Fri, 13 Jul 2018 15:32:34 -0700
MIME-Version: 1.0
In-Reply-To: <0423D037-63F9-4BA6-882A-CBD9EBC630F2@oracle.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

Chuck Lever wrote on 07/13/2018 07:36 AM:
> You should be able to mount using "proto=tcp" with your mlx4 cards.
> That avoids the use of NFS/RDMA but would enable the use of the
> higher bandwidth network fabric.

Thanks, I could definitely try that. IPoIB has its own set of issues, though, but I can cross that bridge when I get to it.

> Can you diagram your full configuration during the backup?

The main server in relation to this issue, which is named "pac" in the log files, has several local storage devices which are exported over the Ethernet and InfiniBand interfaces. In addition, it has several other mounts over Ethernet to some of our other NFS servers. The rsnapshot/backup job uses rsync to read from the local storage and sends the data to the NFS mounts on another server using standard 1Gb Ethernet and the TCP protocol.

So the answer to your second question,

> Does the
> NFS client mount the NFS server on this same host?

I believe, is "yes".

> Does it use
> NFS/RDMA or can it use ssh instead of NFS?

Currently it just uses NFS/TCP over a 1Gb Ethernet link.
rsnapshot does have the ability to use SSH.

> I'm not familiar with the CentOS bug database. If there's an "NFS"
> category, I would go with that.

There is no "NFS" category, only nfs-utils, nfs-utils-lib, and nfs4-acl-tools. So I'm guessing if we want to report against NFS then "kernel" would be the category?

> Before filing, you should search that database to see if there are
> similar bugs. Simply Googling "peername failed!" brings up several
> NFSD related entries right at the top of the list that appear
> similar to your circumstance (and there is no mention of NFS/RDMA).

Thanks, I will be checking that out.