Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S966525AbcCPIRP (ORCPT ); Wed, 16 Mar 2016 04:17:15 -0400
Received: from mail.kernel.org ([198.145.29.136]:60716 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S966478AbcCPIM5 (ORCPT ); Wed, 16 Mar 2016 04:12:57 -0400
From: lizf@kernel.org
To: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Neil Brown, Trond Myklebust, Zefan Li
Subject: [PATCH 3.4 096/107] SUNRPC: never enqueue a ->rq_cong request on ->sending
Date: Wed, 16 Mar 2016 16:06:30 +0800
Message-Id: <1458115601-5762-96-git-send-email-lizf@kernel.org>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1458115541-5712-1-git-send-email-lizf@kernel.org>
References: <1458115541-5712-1-git-send-email-lizf@kernel.org>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1966
Lines: 53

From: Neil Brown

3.4.111-rc1 review patch.  If anyone has any objections, please let me know.

------------------

commit 298073181112a6ab6c30fe7971b99de968daf81e upstream.

If the sending queue has a task without ->rq_cong set at the front,
and then a number of tasks with ->rq_cong set such that they use
the entire congestion window, then the queue deadlocks.  The first
entry cannot be processed until later entries complete.

This scenario has been seen with a client using UDP to access a server,
and the network connection breaking for a period of time - it doesn't
recover.

It never really makes sense for an ->rq_cong request to be on the
->sending queue, but it can happen when a request is being retried
and finds that the transport is locked (XPRT_LOCKED).
In this case we simply call __xprt_put_cong() and the deadlock goes away.

Signed-off-by: NeilBrown
Signed-off-by: Trond Myklebust
Signed-off-by: Zefan Li
---
 net/sunrpc/xprt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index f1a63c1..fe57284 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -66,6 +66,7 @@ static void xprt_init(struct rpc_xprt *xprt, struct net *net);
 static void xprt_request_init(struct rpc_task *, struct rpc_xprt *);
 static void xprt_connect_status(struct rpc_task *task);
 static int __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
+static void __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
 static void xprt_destroy(struct rpc_xprt *xprt);
 
 static DEFINE_SPINLOCK(xprt_list_lock);
@@ -269,6 +270,8 @@ int xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task)
 	}
 	xprt_clear_locked(xprt);
 out_sleep:
+	if (req)
+		__xprt_put_cong(xprt, req);
 	dprintk("RPC: %5u failed to lock transport %p\n", task->tk_pid, xprt);
 	task->tk_timeout = 0;
 	task->tk_status = -EAGAIN;
-- 
1.9.1
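
For anyone reviewing this without the SUNRPC code in front of them, the
deadlock from the changelog can be reproduced in a small user-space toy
model. This is only a sketch under stated assumptions, not the kernel
implementation: every toy_* name, the CWND value of 2 and the array-based
FIFO are hypothetical and exist only for illustration. The one thing it
tries to mirror is the idea of the patch: a request that goes to sleep on
the sending queue should give up its congestion slot first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CWND      2	/* hypothetical congestion window, in request slots */
#define QUEUE_MAX 8	/* hypothetical queue capacity for the toy model */

struct toy_req {
	bool rq_cong;			/* true: request holds a congestion slot */
};

struct toy_xprt {
	int cong;			/* slots currently in use */
	struct toy_req *queue[QUEUE_MAX];	/* simplified FIFO "sending" queue */
	size_t len;
};

/* Take a congestion slot if one is free (loosely like __xprt_get_cong()). */
static bool toy_get_cong(struct toy_xprt *x, struct toy_req *r)
{
	if (r->rq_cong)
		return true;
	if (x->cong >= CWND)
		return false;
	r->rq_cong = true;
	x->cong++;
	return true;
}

/* Give the slot back (loosely like __xprt_put_cong()). */
static void toy_put_cong(struct toy_xprt *x, struct toy_req *r)
{
	if (!r->rq_cong)
		return;
	r->rq_cong = false;
	x->cong--;
}

/* Park a request on the queue; release_on_sleep models the patched path. */
static void toy_sleep_on_sending(struct toy_xprt *x, struct toy_req *r,
				 bool release_on_sleep)
{
	assert(x->len < QUEUE_MAX);
	if (release_on_sleep)
		toy_put_cong(x, r);
	x->queue[x->len++] = r;
}

/* Only the head of the FIFO is considered; it needs a slot to proceed. */
static bool toy_head_can_proceed(struct toy_xprt *x)
{
	return x->len > 0 && toy_get_cong(x, x->queue[0]);
}

/* Changelog scenario: returns true if the head request can make progress. */
bool toy_scenario(bool release_on_sleep)
{
	struct toy_xprt x = { 0 };
	struct toy_req head = { false }, a = { false }, b = { false };

	/* a and b are retries that already hold the whole window. */
	toy_get_cong(&x, &a);
	toy_get_cong(&x, &b);

	/* head has no slot and sits in front of the two retries. */
	toy_sleep_on_sending(&x, &head, release_on_sleep);
	toy_sleep_on_sending(&x, &a, release_on_sleep);
	toy_sleep_on_sending(&x, &b, release_on_sleep);

	return toy_head_can_proceed(&x);
}
```

In this model toy_scenario(false) leaves the head request stuck behind two
requests that own the entire window, while toy_scenario(true) lets it
proceed because the retries dropped their slots before being queued.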