Date: Fri, 14 Apr 2017 15:33:38 -0400
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: List Linux RDMA Mailing <linux-rdma@vger.kernel.org>,
        Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v3 09/14] svcrdma: Report Write/Reply chunk overruns
Message-ID: <20170414193338.GA10208@fieldses.org>
References: <20170409163820.15073.43257.stgit@klimt.1015granger.net>
 <20170409170641.15073.82788.stgit@klimt.1015granger.net>
 <20170414155634.GC5362@fieldses.org>
 <FCC0F429-0799-4416-9A61-ABD19C773AA9@oracle.com>
 <20170414175216.GA8290@fieldses.org>
 <809B3E7B-8991-4358-804F-B2D0101723FF@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <809B3E7B-8991-4358-804F-B2D0101723FF@oracle.com>
Sender: linux-nfs-owner@vger.kernel.org

On Fri, Apr 14, 2017 at 03:07:20PM -0400, Chuck Lever wrote:
> 
> > On Apr 14, 2017, at 1:52 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > 
> > On Fri, Apr 14, 2017 at 12:10:03PM -0400, Chuck Lever wrote:
> >> 
> >>> On Apr 14, 2017, at 11:56 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
> >>> 
> >>> On Sun, Apr 09, 2017 at 01:06:41PM -0400, Chuck Lever wrote:
> >>>> Observed at Connectathon 2017.
> >>>> 
> >>>> If a client has underestimated the size of a Write or Reply chunk,
> >>>> the Linux server writes as much payload data as it can, then it
> >>>> recognizes there was a problem and closes the connection without
> >>>> sending the transport header.
> >>> 
> >>> Why would the client underestimate?  Is this a client-side bug?
> >> 
> >> It can be a bug, and the behavior in this case is that the
> >> client retransmits indefinitely and deadlocks the transport,
> >> because the client's upper layer never sees a reply.
> >> 
> >> But as you know there are some NFS operations where the client
> >> cannot predict in advance how large the reply will be. In
> >> particular, the upper bound on the size of an NFSACL GETACL reply or
> >> of certain NFSv4 GETATTR attributes is not predictable.
> > 
> > Oh, I'd forgotten about those cases.
> > 
> >> These
> >> I might categorize as protocol bugs.
> >> 
> >> A client can do its best by posting a very large reply buffer
> >> for such operations, but since these situations are rare in
> >> practice while NFSv4 GETATTR can be a relatively common
> >> operation, clients post a few dozen KB for the reply buffer
> >> and call it a day.
> >> 
> >> In these cases (if they should ever fail IRL), returning an
> >> error is polite and allows operation of other RPCs on that
> >> transport to continue.
> > 
> > Got it, thanks.  (I assume this is documented somewhere in the specs?)
> 
> I've written about it in rfc5667bis-09. It's a short document,
> review comments welcome.

Oh, look, right there in
https://tools.ietf.org/html/draft-ietf-nfsv4-rfc5667bis-09#section-2.1
Thanks!  And apologies for not keeping up with stuff.

--b.