Return-Path: linux-nfs-owner@vger.kernel.org
Received: from cmexedge1.ext.emulex.com ([138.239.224.99]:14096 "EHLO CMEXEDGE1.ext.emulex.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750858AbaETFm0 convert rfc822-to-8bit (ORCPT ); Tue, 20 May 2014 01:42:26 -0400
From: Devesh Sharma
To: Steve Wise, "'J. Bruce Fields'"
CC: "linux-nfs@vger.kernel.org", "linux-rdma@vger.kernel.org", "tom@opengridcomputing.com"
Subject: RE: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic
Date: Tue, 20 May 2014 05:42:24 +0000
Message-ID:
References: <20140506174621.18208.24242.stgit@build.ogc.int> <20140506192730.GK18281@fieldses.org> <53694F72.9010007@opengridcomputing.com> <00c301cf7396$878f0720$96ad1560$@opengridcomputing.com>
In-Reply-To: <00c301cf7396$878f0720$96ad1560$@opengridcomputing.com>
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

Hi Steve,

> -----Original Message-----
> From: Steve Wise [mailto:swise@opengridcomputing.com]
> Sent: Tuesday, May 20, 2014 12:44 AM
> To: Devesh Sharma; 'J. Bruce Fields'
> Cc: linux-nfs@vger.kernel.org; linux-rdma@vger.kernel.org;
> tom@opengridcomputing.com
> Subject: RE: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic
>
>
>
> > -----Original Message-----
> > From: linux-nfs-owner@vger.kernel.org
> > [mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of Devesh Sharma
> > Sent: Monday, May 19, 2014 2:07 PM
> > To: Steve Wise; J. Bruce Fields
> > Cc: linux-nfs@vger.kernel.org; linux-rdma@vger.kernel.org;
> > tom@opengridcomputing.com
> > Subject: RE: [PATCH V2 RFC 0/3] svcrdma: refactor marshalling logic
> >
> > While testing with ocrdma driver I am finding server side SQ full.
> > Following is the log, yet to identify why it's happening. Once this
> > is reported Client side crashes due to some reason. My kdump is not
> > working properly therefore I am not able to analyze the situation
> > properly.
> >
> > May 19 23:47:02 neo01-el64 kernel: svcrdma: RDMA_WRITE rmr=8008b12, to=45a2d790c, xdr_off=0, write_len=68, vec->sge=ffff88086cb4a0c8, vec->count=2
> > May 19 23:47:02 neo01-el64 kernel: svcrdma: send_reply returns 0
> > May 19 23:47:02 neo01-el64 kernel: svc: server ffff88086409a000 waiting for data (to = 3600000)
> > May 19 23:47:02 neo01-el64 kernel: svc: transport ffff88087dfa2400 served by daemon ffff88086409a000
> > May 19 23:47:02 neo01-el64 kernel: svc: server ffff88086409a000, pool 0, transport ffff88087dfa2400, inuse=18
> > May 19 23:47:02 neo01-el64 kernel: svcrdma: rqstp=ffff88086409a000
> > May 19 23:47:02 neo01-el64 kernel: svcrdma: processing ctxt=ffff880866754540 on xprt=ffff88087dfa2400, rqstp=ffff88086409a000, status=0
> > May 19 23:47:02 neo01-el64 kernel: svcrdma: failed to post SQ WR rc=-22, sc_sq_count=0, sc_sq_depth=128
> > May 19 23:47:02 neo01-el64 kernel: svcrdma: Error -22 posting RDMA_READ
>
> Hey Devesh,
>
> Looking at ocrdma_post_send(), -22 (-EINVAL) is returned when the QP is
> not in RTS. If the SQ is full, -ENOMEM is returned. So I think the send
> error is a downstream error because the connection got knocked down. You
> should try and figure out what kicked the QP out of RTS.

Oh wow! I completely missed that. Let me go through the logs once again and update you.

>
>
> Steve.
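
For reference, the distinction Steve draws above can be written down as a small lookup. This is only a sketch: the enum and both helper names are hypothetical and do not exist in svcrdma or ocrdma. The mapping of -EINVAL to "QP not in RTS" and -ENOMEM to "SQ full" is taken from Steve's reading of ocrdma_post_send() and may not hold for other drivers. It is consistent with the log above, which reports sc_sq_count=0 against sc_sq_depth=128 at the time of the failure.

```c
#include <errno.h>

/* Hypothetical diagnosis codes; not part of any kernel API. */
enum sq_post_diag {
    SQ_POST_OK,          /* post succeeded */
    SQ_POST_QP_NOT_RTS,  /* -EINVAL: QP is not in RTS (per Steve's note on ocrdma) */
    SQ_POST_SQ_FULL,     /* -ENOMEM: send queue is full (per Steve's note on ocrdma) */
    SQ_POST_OTHER        /* any other error code */
};

/* Map the rc printed by "failed to post SQ WR rc=..." to a likely cause. */
static enum sq_post_diag classify_post_send_rc(int rc)
{
    switch (rc) {
    case 0:
        return SQ_POST_OK;
    case -EINVAL:
        return SQ_POST_QP_NOT_RTS;
    case -ENOMEM:
        return SQ_POST_SQ_FULL;
    default:
        return SQ_POST_OTHER;
    }
}

/* Human-readable form of the diagnosis, for log messages. */
static const char *sq_post_diag_str(enum sq_post_diag d)
{
    switch (d) {
    case SQ_POST_OK:
        return "ok";
    case SQ_POST_QP_NOT_RTS:
        return "QP not in RTS";
    case SQ_POST_SQ_FULL:
        return "SQ full";
    default:
        return "other error";
    }
}
```

With the rc=-22 from the log, classify_post_send_rc() gives SQ_POST_QP_NOT_RTS rather than SQ_POST_SQ_FULL, which points at the connection teardown and not at queue depth.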