From: Long Li
To: Steve French, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org,
	linux-kernel@vger.kernel.org
Cc: Long Li
Subject: [PATCH 1/2] cifs: smbd: avoid reconnect lockup
Date: Fri, 30 Mar 2018 15:16:35 -0700
Message-Id: <20180330221636.32512-1-longli@linuxonhyperv.com>
X-Mailer: git-send-email 2.15.1
Reply-To: longli@microsoft.com
X-Mailing-List: linux-kernel@vger.kernel.org

From: Long Li

During transport reconnect, other processes may have registered memory
and be blocked on the transport. This creates a deadlock: the transport
resources can't be freed, so the reconnect is blocked.

Fix this by returning to the upper layer on timeout. Before returning,
the transport status is set to reconnecting so that other processes
release their memory registration resources. The upper layer will retry
the reconnect.

This is not in the fast I/O path, so set the timeout to 5 seconds.

Signed-off-by: Long Li
---
 fs/cifs/smbdirect.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index 5aa0b54..3f7883e 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -1498,8 +1498,8 @@ int smbd_reconnect(struct TCP_Server_Info *server)
 	log_rdma_event(INFO, "reconnecting rdma session\n");
 
 	if (!server->smbd_conn) {
-		log_rdma_event(ERR, "rdma session already destroyed\n");
-		return -EINVAL;
+		log_rdma_event(INFO, "rdma session already destroyed\n");
+		goto create_conn;
 	}
 
 	/*
@@ -1512,15 +1512,19 @@ int smbd_reconnect(struct TCP_Server_Info *server)
 	}
 
 	/* wait until the transport is destroyed */
-	wait_event(server->smbd_conn->wait_destroy,
-		server->smbd_conn->transport_status == SMBD_DESTROYED);
+	if (!wait_event_timeout(server->smbd_conn->wait_destroy,
+		server->smbd_conn->transport_status == SMBD_DESTROYED, 5*HZ))
+		return -EAGAIN;
 
 	destroy_workqueue(server->smbd_conn->workqueue);
 	kfree(server->smbd_conn);
 
+create_conn:
 	log_rdma_event(INFO, "creating rdma session\n");
 	server->smbd_conn = smbd_get_connection(
 		server, (struct sockaddr *) &server->dstaddr);
+	log_rdma_event(INFO, "created rdma session info=%p\n",
+		server->smbd_conn);
 
 	return server->smbd_conn ? 0 : -ENOENT;
 }
-- 
2.7.4
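
For readers who want to try the bounded-wait-and-retry idea outside the kernel, here is a rough userspace sketch. It is not the smbdirect code: it uses POSIX pthread_cond_timedwait() as a stand-in for wait_event_timeout(), and struct transport, wait_destroyed_timeout() and reconnect_once() are hypothetical names made up for illustration. It assumes the teardown path sets the status to DESTROYED and signals the condition variable.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins for the transport states named in the patch. */
enum transport_status { CONNECTED, RECONNECTING, DESTROYED };

/* Hypothetical userspace analogue of the connection object. */
struct transport {
	pthread_mutex_t lock;
	pthread_cond_t wait_destroy;
	enum transport_status status;
};

/*
 * Wait at most timeout_ms for the status to become DESTROYED.
 * Returns true if the transport was destroyed in time, false on timeout.
 */
static bool wait_destroyed_timeout(struct transport *t, long timeout_ms)
{
	struct timespec deadline;
	bool destroyed;

	/* Condition variables use CLOCK_REALTIME unless configured otherwise. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&t->lock);
	while (t->status != DESTROYED) {
		/* Any non-zero return (normally ETIMEDOUT) ends the wait. */
		if (pthread_cond_timedwait(&t->wait_destroy, &t->lock,
					   &deadline) != 0)
			break;
	}
	destroyed = (t->status == DESTROYED);
	pthread_mutex_unlock(&t->lock);
	return destroyed;
}

/*
 * One reconnect attempt. Returns 0 on success, or -EAGAIN if the old
 * transport was not torn down in time; the caller is expected to retry.
 */
static int reconnect_once(struct transport *t, long timeout_ms)
{
	/* Mark the transport as reconnecting so that other users can
	 * notice and release what they hold. */
	pthread_mutex_lock(&t->lock);
	if (t->status != DESTROYED)
		t->status = RECONNECTING;
	pthread_mutex_unlock(&t->lock);

	if (!wait_destroyed_timeout(t, timeout_ms))
		return -EAGAIN;

	/* The old transport is gone; a real implementation would create
	 * the new connection here. */
	pthread_mutex_lock(&t->lock);
	t->status = CONNECTED;
	pthread_mutex_unlock(&t->lock);
	return 0;
}
```

A caller would invoke reconnect_once() in a loop and treat -EAGAIN as "try again later" rather than as a hard failure. The point of the bounded wait is that the reconnect path never sleeps indefinitely while other users still hold resources it needs freed.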