Subject: Re: [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP
To: Hans Westgaard Ry, Doug Ledford, Sean Hefty, Hal Rosenstock,
    Matan Barak, Erez Shitrit, Bart Van Assche, Ira Weiny, Or Gerlitz,
    Hakon Bugge, Yuval Shaia, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org
References: <1477653269-27359-1-git-send-email-hans.westgaard.ry@oracle.com>
From: Sagi Grimberg
Message-ID: <8df2c90e-8581-4ce8-fd18-92ac57eb6160@grimberg.me>
Date: Sun, 30 Oct 2016 23:06:33 +0200
In-Reply-To: <1477653269-27359-1-git-send-email-hans.westgaard.ry@oracle.com>

> from "InfiniBand Architecture Specifications Volume 1":
>
> A QP is said to have a stale connection when only one side has
> connection information. A stale connection may result if the remote CM
> had dropped the connection and sent a DREQ but the DREQ was never
> received by the local CM. Alternatively the remote CM may have lost
> all record of past connections because its node crashed and rebooted,
> while the local CM did not become aware of the remote node's reboot
> and therefore did not clean up stale connections.
>
> and:
>
> A local CM may receive a REQ/REP for a stale connection. It shall
> abort the connection issuing REJ to the REQ/REP. It shall then issue
> DREQ with "DREQ:remote QPN" set to the remote QPN from the REQ/REP.
>
> This patch solves a problem with reuse of QPN. Current codebase, that
> is IPoIB, relies on a REAP-mechanism to do cleanup of the structures
> in CM. A problem with this is the time constants governing this
> mechanism; they are up to 768 seconds and the interface may look
> unresponsive in that period. Issuing a DREQ (and receiving a DREP)
> does the necessary cleanup and the interface comes up.

I like this fix, so,

Reviewed-by: Sagi Grimberg

But I think the CM layer is still buggy in this area. In vol 1 the
state transition table specifically states that DREP timeouts should
move the cm_id to timewait state, but the CM doesn't seem to maintain
response timeouts on disconnect requests. If the DREQ happens to fail
(send error completion) things are fine, but if the DREQ makes it to
the peer and the peer doesn't reply, then no one will take care of it
(i.e. we will never see a TIMEWAIT event from this cm_id)...

I recall a debugging session with Hal in this area about a year ago
with a new iser target (which didn't reply to DREQs on reboot
sequences). The iser initiator waits for DISCONNECTED/TIMEWAIT events
before destroying the cm_id (which never happened because of the
above). I think I ended up working around that in iser by just going
ahead and destroying the cm_id after issuing a DREQ (but now I realize
the workaround was never included, so I'll probably dig it up again
soon).
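
To make the missing transition concrete, here is a minimal sketch of the disconnect path as a tiny state machine. It is not the kernel's ib_cm code: every name in it (conn_state, conn_event, conn_handle_event, and so on) is hypothetical, and it assumes that a send error completion also ends up in timewait, which is how I read "things are fine" above. The point is only that a DREP timeout must be an event the state machine handles, otherwise a silent peer leaves the connection stuck in the DREQ-sent state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states, loosely modeled on the disconnect path.
 * These are illustrative names, not the kernel's ib_cm enums. */
enum conn_state {
	CONN_ESTABLISHED,
	CONN_DREQ_SENT,
	CONN_TIMEWAIT,
};

/* Hypothetical events that can arrive while disconnecting. */
enum conn_event {
	EV_SEND_DREQ,        /* local side starts a disconnect */
	EV_DREQ_SEND_ERROR,  /* DREQ failed with a send error completion */
	EV_DREP_RECEIVED,    /* peer answered the DREQ */
	EV_DREP_TIMEOUT,     /* peer got the DREQ but never replied */
};

struct conn {
	enum conn_state state;
	bool timewait_reported; /* stands in for delivering a TIMEWAIT event */
};

void conn_init(struct conn *c)
{
	assert(c != NULL);
	c->state = CONN_ESTABLISHED;
	c->timewait_reported = false;
}

static void conn_enter_timewait(struct conn *c)
{
	c->state = CONN_TIMEWAIT;
	c->timewait_reported = true;
}

/* Returns true if the event is valid in the current state. */
bool conn_handle_event(struct conn *c, enum conn_event ev)
{
	assert(c != NULL);

	switch (c->state) {
	case CONN_ESTABLISHED:
		if (ev == EV_SEND_DREQ) {
			c->state = CONN_DREQ_SENT;
			return true;
		}
		return false;
	case CONN_DREQ_SENT:
		switch (ev) {
		case EV_DREQ_SEND_ERROR: /* assumed to reach timewait */
		case EV_DREP_RECEIVED:
		case EV_DREP_TIMEOUT:    /* the case the mail says is missing */
			conn_enter_timewait(c);
			return true;
		default:
			return false;
		}
	case CONN_TIMEWAIT:
		return false; /* nothing more to do in this sketch */
	}
	return false;
}
```

A consumer that waits for the timewait notification before tearing down its cm_id (as the iser initiator does) would only make progress with a silent peer if something feeds EV_DREP_TIMEOUT into the state machine; dropping that case from the inner switch reproduces the hang described above.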