From: Sowmini Varadhan
To: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: chien.yen@oracle.com, davem@davemloft.net, rds-devel@oss.oracle.com, ajaykumar.hotchandani@oracle.com, Sowmini Varadhan
Subject: [PATCH 0/2] net/rds: RDS-TCP robustness fixes
Date: Sat, 2 May 2015 07:55:07 -0400

This patch set contains bug fixes for state recovery at the RDS layer
when the underlying transport is TCP and the TCP state at one of the
endpoints is reset, e.g., due to a "modprobe -r rds_tcp" or a reboot.

When that happens, the existing code does not correctly clean up RDS
socket state for stale connections. The result is unstable,
timing-dependent behavior on the wire, including an endless
back-and-forth exchange of TCP three-way handshakes (3WHs), so that
RDS state may never converge.

Test cases used to verify the changes in this set are:

1. Start rds client/server applications on two participating nodes,
   node1 and node2. After at least one packet has been sent (to
   establish the TCP connection), restart the rds_tcp module on the
   client, and then resend packets. Tcpdump should show the server
   sending a FIN for the "old" client port, and clean connection
   establishment/exchange for the new client port.

2. At the end of step 1, restart the rds server on node2 and start the
   client on node1. Use tcpdump and 'netstat -an|grep 16385' to make
   sure that packets flow correctly.
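
The netstat check in test case 2 can be scripted. The following is only
a minimal sketch, not part of the patches: it assumes the six-column
layout that net-tools "netstat -an" prints on Linux (Proto, Recv-Q,
Send-Q, Local Address, Foreign Address, State), and the helper names
rds_tcp_connections and count_by_state are hypothetical. The port number
16385 is the one used in the grep above.

```python
RDS_TCP_PORT = 16385  # port taken from the 'netstat -an|grep 16385' check


def rds_tcp_connections(netstat_output, port=RDS_TCP_PORT):
    """Return (local, remote, state) tuples for TCP lines involving `port`.

    Assumption: each TCP line has the net-tools column layout
    Proto Recv-Q Send-Q Local-Address Foreign-Address State.
    """
    suffix = ":%d" % port
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, non-TCP sockets, and lines without a State column.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.endswith(suffix) or remote.endswith(suffix):
            matches.append((local, remote, state))
    return matches


def count_by_state(connections):
    """Count connections per TCP state, e.g. to spot lingering stale sockets."""
    counts = {}
    for _local, _remote, state in connections:
        counts[state] = counts.get(state, 0) + 1
    return counts
```

Feeding it the captured output of "netstat -an" on each node after the
restart gives a per-state count. More than one connection between the
same pair of hosts on port 16385, or sockets that stay in a non-
ESTABLISHED state, would suggest that the old connection was not cleaned
up.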

Sowmini Varadhan (2):
  RDS-TCP: Always create a new rds_sock for an incoming connection.
  RDS-TCP: only initiate reconnect attempt on outgoing TCP socket.

 net/rds/connection.c  |   17 ++++++++++++++-
 net/rds/tcp.c         |   51 ++++++++++++++++++++++++++++++++++++++++++++++++-
 net/rds/tcp.h         |    5 +++-
 net/rds/tcp_connect.c |    2 +-
 net/rds/tcp_listen.c  |   13 +++++++++++-
 5 files changed, 82 insertions(+), 6 deletions(-)