Subject: [PATCH v3 05/26] xprtrdma: Do not refresh Receive Queue while it is draining
From: Chuck Lever
To: trondmy@hammerspace.com
Cc: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Date: Mon, 19 Apr 2021 14:02:09 -0400
Message-ID: <161885532997.38598.5957438962258396970.stgit@manet.1015granger.net>
In-Reply-To: <161885481568.38598.16682844600209775665.stgit@manet.1015granger.net>
References: <161885481568.38598.16682844600209775665.stgit@manet.1015granger.net>
User-Agent: StGit/0.23-29-ga622f1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

Currently the Receive completion handler refreshes the Receive Queue
whenever a successful Receive completion occurs.

On disconnect, xprtrdma drains the Receive Queue. The first few
Receive completions after a disconnect are typically successful,
until the first flushed Receive.

This means the Receive completion handler continues to post more
Receive WRs after the drain sentinel has been posted. The late-posted
Receives flush after the drain sentinel has completed, leading to a
crash later in rpcrdma_xprt_disconnect().

To prevent this crash, xprtrdma has to ensure that the Receive handler
stops posting Receives before ib_drain_rq() posts its drain sentinel.

Suggested-by: Tom Talpey
Signed-off-by: Chuck Lever
---
 net/sunrpc/xprtrdma/verbs.c     |   13 +++++++++++++
 net/sunrpc/xprtrdma/xprt_rdma.h |    1 +
 2 files changed, 14 insertions(+)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index ec912cf9c618..d8ed69442219 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -101,6 +101,12 @@ static void rpcrdma_xprt_drain(struct rpcrdma_xprt *r_xprt)
 	struct rpcrdma_ep *ep = r_xprt->rx_ep;
 	struct rdma_cm_id *id = ep->re_id;
 
+	/* Wait for rpcrdma_post_recvs() to leave its critical
+	 * section.
+	 */
+	if (atomic_inc_return(&ep->re_receiving) > 1)
+		wait_for_completion(&ep->re_done);
+
 	/* Flush Receives, then wait for deferred Reply work
 	 * to complete.
 	 */
@@ -414,6 +420,7 @@ static int rpcrdma_ep_create(struct rpcrdma_xprt *r_xprt)
 	__module_get(THIS_MODULE);
 	device = id->device;
 	ep->re_id = id;
+	reinit_completion(&ep->re_done);
 
 	ep->re_max_requests = r_xprt->rx_xprt.max_reqs;
 	ep->re_inline_send = xprt_rdma_max_inline_write;
@@ -1385,6 +1392,9 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, bool temp)
 	if (!temp)
 		needed += RPCRDMA_MAX_RECV_BATCH;
 
+	if (atomic_inc_return(&ep->re_receiving) > 1)
+		goto out;
+
 	/* fast path: all needed reps can be found on the free list */
 	wr = NULL;
 	while (needed) {
@@ -1410,6 +1420,9 @@ void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, bool temp)
 
 	rc = ib_post_recv(ep->re_id->qp, wr,
 			  (const struct ib_recv_wr **)&bad_wr);
+	if (atomic_dec_return(&ep->re_receiving) > 0)
+		complete(&ep->re_done);
+
 out:
 	trace_xprtrdma_post_recvs(r_xprt, count, rc);
 	if (rc) {
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index fe3be985e239..31404326f29f 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -83,6 +83,7 @@ struct rpcrdma_ep {
 	unsigned int		re_max_inline_recv;
 	int			re_async_rc;
 	int			re_connect_status;
+	atomic_t		re_receiving;
 	atomic_t		re_force_disconnect;
 	struct ib_qp_init_attr	re_attr;
 	wait_queue_head_t       re_connect_wait;
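
For readers following the re_receiving handshake, here is a minimal
userspace sketch of the counter gate, not the kernel code itself. It
assumes C11 atomics in place of atomic_t and a plain flag in place of
complete(&ep->re_done). The names ep_model, model_post_begin,
model_post_end, model_drain_begin, and model_demo are hypothetical and
exist only for illustration. The sequence is driven from one thread so
that the ordering is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fields the patch touches. */
struct ep_model {
	atomic_int receiving;	/* models ep->re_receiving */
	bool done_signaled;	/* models complete(&ep->re_done) */
	int posted;		/* number of batches "posted" */
};

/* Userspace equivalents of atomic_inc_return()/atomic_dec_return(). */
static int inc_return(atomic_int *v) { return atomic_fetch_add(v, 1) + 1; }
static int dec_return(atomic_int *v) { return atomic_fetch_sub(v, 1) - 1; }

/* Entry gate of the poster: true means it may post Receives.
 * On false the poster bails out without decrementing, as the
 * "goto out" path in the patch does.
 */
static bool model_post_begin(struct ep_model *ep)
{
	return inc_return(&ep->receiving) == 1;
}

/* Exit of the poster's critical section: if a drainer bumped the
 * counter in the meantime, signal it.
 */
static void model_post_end(struct ep_model *ep)
{
	ep->posted++;
	if (dec_return(&ep->receiving) > 0)
		ep->done_signaled = true;
}

/* Gate at the top of the drain path: true means a poster is still
 * inside its critical section and the drainer has to wait.
 */
static bool model_drain_begin(struct ep_model *ep)
{
	return inc_return(&ep->receiving) > 1;
}

/* One interleaving: the drain starts while a post is in flight. */
static int model_demo(void)
{
	struct ep_model ep = { 0 };

	assert(model_post_begin(&ep));	/* poster enters */
	assert(model_drain_begin(&ep));	/* drainer must wait */
	model_post_end(&ep);		/* poster leaves, signals drainer */
	assert(ep.done_signaled);
	assert(!model_post_begin(&ep));	/* later posts are refused */
	return ep.posted;
}
```

Once the drainer has incremented the counter it never drops back to
zero in this model, so every later call to model_post_begin() is
refused. That is the property the commit message asks for: no
Receives are posted after the drain has begun.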