Received: by 2002:a05:6a10:17d3:0:0:0:0 with SMTP id hz19csp2777255pxb; Mon, 19 Apr 2021 13:44:47 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyOc0pZ2N3ybvgjw5VOF3u76Er2XVEZcqqNQBdaWUFv1c/eKBEPZIa2EG7knNjqu6+4FN7R X-Received: by 2002:a17:90a:bd13:: with SMTP id y19mr1045061pjr.181.1618865086924; Mon, 19 Apr 2021 13:44:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1618865086; cv=none; d=google.com; s=arc-20160816; b=0ZQWikYzD8IBOYj5dR6uPbF7bB/gfNkFfXLb8g3Mz2lEi5H0rbncI2LEVZx/EV99CP pLBfd1x4ulkfBKCuk+uEjYCrufY6r+skjYc/KQleb2875j7hszjW9bluDcbDC8nj71+Y eP11BmvZLKBU9LvAzpcjtdgpLOd1mi5RCqrKniEopcpPppVnBQXFSEVHd7l2U0Gfe0R3 COWrWGdX9alvxKoEBCm8FxRvD0bET6jpkHESc2Gbhlk0a0JnE2rWIQgG8Foya1rTkNv8 oYuDB5SVtG3huVtdP/VqhsTEE5/CcVcvpmkzimOcqNV4sq65/TtVzvzaYRMx2TKswslZ lw3Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:cc:to:from :subject; bh=XylJ6xWoRa0I/tEF4+gwDx0f6Ey5vO60aIxwcYEgnwY=; b=y3CPmy57WTwAKKgdGgF6P2cz+4sd9CCCWR6Y4aC/cbI29KJMyio3+vfWDDdA2sRbOh X5W+/bD5r6WgVWQn7QNpPm6k/8GQCnz9wY4SsIXgvaKkihOvXBgsLVj8GrDmWV/xET07 jP8LiCwvva7MgYXoBwgb8DACfJzMXflFHnhqPFvAMa7kjEF0l5vPjMfu7bi00KFOSNUy ygE9xiWSThTc9S/6Ni4/zMSIXEwqnH5I6BRqOQT94iE/YgtSC6RAeWVBLDOKf21zrTLo GrYZF73g/fgbIkpzFL3tAYxlMI/Oy4Rjdz2TCyMAAAXwgrFchJvoKLSsNAwmgwbHRpqT YR8Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id 33si11356644pld.182.2021.04.19.13.44.27; Mon, 19 Apr 2021 13:44:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240433AbhDSSDN (ORCPT + 99 others); Mon, 19 Apr 2021 14:03:13 -0400 Received: from mail.kernel.org ([198.145.29.99]:41036 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240510AbhDSSDM (ORCPT ); Mon, 19 Apr 2021 14:03:12 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 3C20061107; Mon, 19 Apr 2021 18:02:42 +0000 (UTC) Subject: [PATCH v3 10/26] xprtrdma: Fix cwnd update ordering From: Chuck Lever To: trondmy@hammerspace.com Cc: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org Date: Mon, 19 Apr 2021 14:02:41 -0400 Message-ID: <161885536143.38598.9604745938847259595.stgit@manet.1015granger.net> In-Reply-To: <161885481568.38598.16682844600209775665.stgit@manet.1015granger.net> References: <161885481568.38598.16682844600209775665.stgit@manet.1015granger.net> User-Agent: StGit/0.23-29-ga622f1 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org After a reconnect, the reply handler is opening the cwnd (and thus enabling more RPC Calls to be sent) /before/ rpcrdma_post_recvs() can post enough Receive WRs to receive their replies. This causes an RNR and the new connection is lost immediately. The race is most clearly exposed when KASAN and disconnect injection are enabled. This slows down rpcrdma_rep_create() enough to allow the send side to post a bunch of RPC Calls before the Receive completion handler can invoke ib_post_recv(). Fixes: 2ae50ad68cd7 ("xprtrdma: Close window between waking RPC senders and posting Receives") Signed-off-by: Chuck Lever --- net/sunrpc/xprtrdma/rpc_rdma.c | 3 ++- net/sunrpc/xprtrdma/verbs.c | 10 +++++----- net/sunrpc/xprtrdma/xprt_rdma.h | 2 +- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c index 292f066d006e..21ddd78a8c35 100644 --- a/net/sunrpc/xprtrdma/rpc_rdma.c +++ b/net/sunrpc/xprtrdma/rpc_rdma.c @@ -1430,9 +1430,10 @@ void rpcrdma_reply_handler(struct rpcrdma_rep *rep) credits = 1; /* don't deadlock */ else if (credits > r_xprt->rx_ep->re_max_requests) credits = r_xprt->rx_ep->re_max_requests; + rpcrdma_post_recvs(r_xprt, credits + (buf->rb_bc_srv_max_requests << 1), + false); if (buf->rb_credits != credits) rpcrdma_update_cwnd(r_xprt, credits); - rpcrdma_post_recvs(r_xprt, false); req = rpcr_to_rdmar(rqst); if (unlikely(req->rl_reply)) diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c index 0ade69501061..5a2871c4561f 100644 --- a/net/sunrpc/xprtrdma/verbs.c +++ b/net/sunrpc/xprtrdma/verbs.c @@ -544,7 +544,7 @@ int rpcrdma_xprt_connect(struct rpcrdma_xprt *r_xprt) * outstanding Receives. */ rpcrdma_ep_get(ep); - rpcrdma_post_recvs(r_xprt, true); + rpcrdma_post_recvs(r_xprt, 1, true); rc = rdma_connect(ep->re_id, &ep->re_remote_cma); if (rc) @@ -1395,21 +1395,21 @@ int rpcrdma_post_sends(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req) /** * rpcrdma_post_recvs - Refill the Receive Queue * @r_xprt: controlling transport instance - * @temp: mark Receive buffers to be deleted after use + * @needed: current credit grant + * @temp: mark Receive buffers to be deleted after one use * */ -void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, bool temp) +void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp) { struct rpcrdma_buffer *buf = &r_xprt->rx_buf; struct rpcrdma_ep *ep = r_xprt->rx_ep; struct ib_recv_wr *wr, *bad_wr; struct rpcrdma_rep *rep; - int needed, count, rc; + int count, rc; rc = 0; count = 0; - needed = buf->rb_credits + (buf->rb_bc_srv_max_requests << 1); if (likely(ep->re_receive_count > needed)) goto out; needed -= ep->re_receive_count; diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h index 31404326f29f..2504f67af63e 100644 --- a/net/sunrpc/xprtrdma/xprt_rdma.h +++ b/net/sunrpc/xprtrdma/xprt_rdma.h @@ -462,7 +462,7 @@ int rpcrdma_xprt_connect(struct rpcrdma_xprt *r_xprt); void rpcrdma_xprt_disconnect(struct rpcrdma_xprt *r_xprt); int rpcrdma_post_sends(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req); -void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, bool temp); +void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp); /* * Buffer calls - xprtrdma/verbs.c