From: Chuck Lever
Subject: Re: Should truncated READDIR replies return -EIO?
Date: Fri, 8 Feb 2008 14:25:52 -0500
Message-ID: <036E8879-FFE0-41B6-80FE-78568812BF86@oracle.com>
References: <1202483082-5334-1-git-send-email-jlayton@redhat.com>
 <1202483596.8914.13.camel@heimdal.trondhjem.org>
 <47AC77C6.80503@redhat.com>
 <4B54CC40-B164-4B8D-A5D7-74CE2B684955@oracle.com>
 <47AC9C71.4090306@redhat.com>
Mime-Version: 1.0 (Apple Message framework v753)
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
Cc: Trond Myklebust, Jeff Layton, linux-nfs@vger.kernel.org
To: Peter Staubach
Return-path:
Received: from agminet01.oracle.com ([141.146.126.228]:16639 "EHLO agminet01.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1761802AbYBHT1n (ORCPT ); Fri, 8 Feb 2008 14:27:43 -0500
In-Reply-To: <47AC9C71.4090306@redhat.com>
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Feb 8, 2008, at 1:16 PM, Peter Staubach wrote:
> Chuck Lever wrote:
>> On Feb 8, 2008, at 10:39 AM, Peter Staubach wrote:
>>> Trond Myklebust wrote:
>>>> On Fri, 2008-02-08 at 10:04 -0500, Jeff Layton wrote:
>>>>
>>>>> Recently, I ran across a server-side bug that caused the server
>>>>> to send truncated READDIR replies. The server would send a valid
>>>>> RPC response to a READDIR call, but the contents of it were
>>>>> basically missing (everything after the status).
>>>>>
>>>>> The server problem had long been patched in mainline kernels,
>>>>> but the interesting bit was that clients didn't return an error
>>>>> in this situation. The XDR decoders for readdir calls are
>>>>> supposed to check the validity of the response, but in this
>>>>> situation it just fudges the contents of the pagecache to make
>>>>> it look like a completely empty directory.
>>>>>
>>>>> Shouldn't the client return an error in this situation? The
>>>>> response obviously isn't valid so it seems like it shouldn't
>>>>> pretend that it is.
>>>>> If so, would something like the following patch make sense?
>>>>>
>>>>
>>>> It is quite valid (though silly!) for a server to return a
>>>> READDIR reply with no entries. AFAICR there were servers that
>>>> actually did this at one point (though I shall refrain from
>>>> naming and shaming).
>>>>
>>>> So whereas I agree that it might be correct to flag a READDIR
>>>> reply that contains no entries due to XDR encoding bugs, I'm not
>>>> sure that we should be flagging errors in the case where the XDR
>>>> is correct.
>>>
>>> In this case, I believe that the response was malformed. Pretty
>>> much everything after the status was missing, including the EOF
>>> indicator. I would agree that it would be silly to return a
>>> response with no error indicated, no entries, and the eof
>>> indication set to false.
>>>
>>> This really boils down to how do we handle malformed responses?
>>> Is there a general policy to retransmit the request? This would
>>> seem to be the right thing because a malformed response would
>>> result from many things including the TCP connection getting
>>> dropped in the middle of receiving the response from a timeout
>>> and other things. However, in this situation, retransmitting
>>> the request would just have resulted in the same, broken response
>>> from the server. This was due to a server bug, which has since
>>> been fixed, but exists still out in nature.
>>
>>
>> Replies that are malformed network or RPC level packets are
>> dropped by the RPC client, and the matching requests are
>> retransmitted by the RPC client after a timeout. Network events
>> (like your TCP connection example) result in a malformed RPC level
>> packet that the RPC client never delivers to the XDR layer, and
>> are thus retransmitted by the RPC client.
>>
>> Replies that have malformed XDR are treated by the NFS client as
>> errors.
>> The problem is the decoders (on Linux) are not terribly
>> careful about checking the correctness of the server's XDR
>> encoding, especially in cases like READDIR (Not to mention
>> compound RPCs!) where the decoding can be complex. Olaf has
>> mentioned the Linux XDR layer was hand-coded rather than
>> constructed with rpcgen to keep the decoders simple and efficient.
>>
>> Network-related corruption is likely to be caught by the lower
>> layers. I tend to think that malformed XDR is nearly always a
>> genuine software defect on the server, and thus not worth
>> retransmitting (especially if it's an idempotent request!).
>
> What happens if a response is interrupted in the middle by the
> TCP connection being broken? Is this caught at the RPC layer
> and then rejected?

As I understand it, xs_tcp_read_request() checks for a truncated TCP
read, and discards the reply by not invoking xprt_complete_rqst(). If
the TCP layer stops calling the RPC client back with more bytes on the
socket, then xprt_complete_rqst() is never invoked to mark the RPC
request as complete.

So, ostensibly, the RPC client will discard a partially received RPC
reply and at some later point, time out the pending request and
retransmit it.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
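
For readers who want to see the truncated-reply behavior described above in miniature, here is a toy model in Python. It is only a sketch under stated assumptions, not the kernel code: the names PendingRequest, ToyTransport, data_ready, connection_lost, timeout and frame are all hypothetical, and it assumes each reply arrives as a single record-marked fragment (a 4-byte big-endian marker whose high bit flags the last fragment and whose low 31 bits give the length, as in ONC RPC record marking over TCP). The point it illustrates is that a request is marked complete only once the whole record has arrived, so a reply cut short by a dropped connection is discarded and the request is retransmitted after a timeout.

```python
import struct

LAST_FRAGMENT = 0x80000000  # high bit of the 4-byte record marker


def frame(payload):
    """Hypothetical helper: wrap a reply payload in a single-fragment record."""
    return struct.pack(">I", LAST_FRAGMENT | len(payload)) + payload


class PendingRequest:
    """Hypothetical stand-in for one outstanding RPC request."""

    def __init__(self, xid):
        self.xid = xid
        self.completed = False  # set only once a full reply has arrived
        self.retransmits = 0


class ToyTransport:
    """Toy model of reply reassembly; assumes one fragment per reply."""

    def __init__(self, request):
        self.request = request
        self.buf = b""

    def data_ready(self, chunk):
        """Called each time the socket delivers more bytes."""
        self.buf += chunk
        if len(self.buf) < 4:
            return  # record marker not fully received yet
        (marker,) = struct.unpack(">I", self.buf[:4])
        length = marker & 0x7FFFFFFF  # low 31 bits: fragment length
        if len(self.buf) - 4 < length:
            return  # reply still truncated: do not complete the request
        self.request.completed = True

    def connection_lost(self):
        """The connection broke: drop whatever partial reply was buffered."""
        self.buf = b""

    def timeout(self):
        """Retransmit only if no complete reply was ever delivered."""
        if not self.request.completed:
            self.request.retransmits += 1


# Usage: half a reply arrives, the connection drops, the request times out
# and is retransmitted; the second reply arrives whole and completes it.
req = PendingRequest(xid=1)
xprt = ToyTransport(req)
wire = frame(b"x" * 20)
xprt.data_ready(wire[:10])
xprt.connection_lost()
xprt.timeout()
xprt.data_ready(wire)
```

Note that this models only the transport-level case. A reply that is complete at the record level but carries malformed XDR, like the truncated READDIR reply that started this thread, would pass this check and reach the decoders.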