Return-Path: Received: from userp1050.oracle.com ([156.151.31.82]:17614 "EHLO userp1050.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750721AbcHSEcF (ORCPT ); Fri, 19 Aug 2016 00:32:05 -0400 Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) by userp1050.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id u7IM00KV015839 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Thu, 18 Aug 2016 22:00:00 GMT Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) Subject: Re: WARN_ON added to rpc_create() From: Chuck Lever In-Reply-To: <20160818215642.GB25052@fieldses.org> Date: Thu, 18 Aug 2016 17:59:12 -0400 Cc: "J. Bruce Fields" , Linux NFS Mailing List Message-Id: References: <42D0C152-58F9-4467-B86D-2A7A25544CE4@oracle.com> <20160803174724.GA5993@fieldses.org> <5E7D6A55-B7F3-411D-A74B-E8BCE04BCF02@oracle.com> <20F73E14-8CEC-4ED1-8A04-203343839860@oracle.com> <20160818215642.GB25052@fieldses.org> To: "J. Bruce Fields" Sender: linux-nfs-owner@vger.kernel.org List-ID: > On Aug 18, 2016, at 5:56 PM, J. Bruce Fields wrote: > > On Wed, Aug 10, 2016 at 02:01:20PM -0400, Chuck Lever wrote: >> Following up. >> >> >>> On Aug 3, 2016, at 3:40 PM, Chuck Lever wrote: >>> >>>> >>>> On Aug 3, 2016, at 1:47 PM, bfields@fieldses.org wrote: >>>> >>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: >>>>> Hi Bruce- >>>>> >>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f >>>>> Author: J. Bruce Fields >>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400 >>>>> >>>>> rpc: share one xps between all backchannels >>>>> >>>>> has added this piece of code: >>>>> >>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, >>>>> struct rpc_clnt *clnt = NULL; >>>>> struct rpc_xprt_switch *xps; >>>>> >>>>> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); >>>>> - if (xps == NULL) { >>>>> - xprt_put(xprt); >>>>> - return ERR_PTR(-ENOMEM); >>>>> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { >>>>> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>>>> + xps = args->bc_xprt->xpt_bc_xps; >>>>> + xprt_switch_get(xps); >>>>> + } else { >>>>> >>>>> >>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. >>>>> >>>>> Can you say why it was added? Is there something RPC/RDMA needs to >>>>> do to make the code safe? >>>> >>>> What is args->protocol in this case? >>>> >>>> Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as >>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying >>>> transport. That makes sense. >>>> >>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant >>>> was "is this a backchannel". >>>> >>>> Does that fix the problem? >>> >>> This simple fix eliminates the log noise: >>> >>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c >>> index 2808d55..f94caf7 100644 >>> --- a/net/sunrpc/clnt.c >>> +++ b/net/sunrpc/clnt.c >>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) >>> char servername[48]; >>> >>> if (args->bc_xprt) { >>> - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>> + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); >>> xprt = args->bc_xprt->xpt_bc_xprt; >>> if (xprt) { >>> xprt_get(xprt); >>> >>> >>> This code seems to come from: >>> >>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 >>> Author: J. Bruce Fields >>> AuthorDate: Mon May 16 17:03:42 2016 -0400 >>> >>> nfsd4/rpc: move backchannel create logic into rpc code >>> >>> >>> Where it may have been copied from: >>> >>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) >>> -{ >>> - struct rpc_xprt *xprt; >>> - >>> - if (args->protocol != XPRT_TRANSPORT_BC_TCP) >>> - return rpc_create(args); >>> - >>> - xprt = args->bc_xprt->xpt_bc_xprt; >>> - if (xprt) { >>> - xprt_get(xprt); >>> - return rpc_create_xprt(args, xprt); >>> - } >>> - >>> - return rpc_create(args); >>> -} >>> >>> There's no warning here. In fact, protocol != BC_TCP seems to >>> be expected. >>> >>> I'm wondering if the warning is needed at all? >> >> Using NFSv4.1/RDMA against my v4.7 NFS server seems to result >> in a system deadlock in short order on the server. I haven't >> looked further into this because you mentioned you were going >> to have a look at these commits that change the backchannel >> code. > > I'm not seeing an obvious bug in those commits, for what it's worth. Thanks for checking. I've tracked the misbehavior down to a DMA mapping mismatch. It's not related to the backchannel at all. I'm putting together a fix right now. But I would like to use NFSv4.1/RDMA without that warning firing. Any reason to keep it (in either place) ? -- Chuck Lever