Subject: Re: [PATCH] rds: rds_cong_queue_updates needs to defer the congestion update transmission
From: Chuck Lever
Date: Tue, 10 Feb 2015 11:50:39 -0500
To: Sowmini Varadhan
Cc: Chien Yen, davem@davemloft.net, rds-devel@oss.oracle.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20150210142214.GO337@oracle.com>
References: <20150210142214.GO337@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Feb 10, 2015, at 9:22 AM, Sowmini Varadhan wrote:

>
> This patch fixes a sock_lock deadlock in the rds_cong_queue_update path.
> Note that the deadlock appears to exist only with TCP transports.
> We cannot inline the call to rds_send_xmit from rds_cong_queue_update
> because
> (a) we are already holding the sock_lock in the recv path, and
>     will deadlock when tcp_setsockopt/tcp_sendmsg try to get the sock
>     lock
> (b) cong_queue_update does an irqsave on the rds_cong_lock, and this
>     will trigger warnings (for a good reason) from functions called
>     out of sock_lock.
>
> Signed-off-by: Sowmini Varadhan
> ---
>  net/rds/cong.c |   16 +++++++++++++++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
>
> diff --git a/net/rds/cong.c b/net/rds/cong.c
> index e5b65ac..765d18f 100644
> --- a/net/rds/cong.c
> +++ b/net/rds/cong.c
> @@ -221,7 +221,21 @@ void rds_cong_queue_updates(struct rds_cong_map *map)
>  	list_for_each_entry(conn, &map->m_conn_list, c_map_item) {
>  		if (!test_and_set_bit(0, &conn->c_map_queued)) {
>  			rds_stats_inc(s_cong_update_queued);
> -			rds_send_xmit(conn);
> +			/* We cannot inline the call to rds_send_xmit() here
> +			 * for two reasons:
> +			 * 1. When we get here from the receive path, we
> +			 *    are already holding the sock_lock (held by
> +			 *    tcp_v4_rcv()). So inlining calls to
> +			 *    tcp_setsockopt and/or tcp_sendmsg will deadlock
> +			 *    when it tries to get the sock_lock())
> +			 * 2. Interrupts are masked so that we can mark the
> +			 *    the port congested from both send and recv paths.
> +			 *    (See comment around declaration of rds_cong_lock).
> +			 *    An attempt to get the sock_lock() here will
> +			 *    therefore trigger warnings.
> +			 * Defer the xmit to rds_send_worker() instead.
> +			 */
> +			queue_delayed_work(rds_wq, &conn->c_send_w, 0);
>  		}
>  	}
>
> --
> 1.7.1
>

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
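
The deferral that the quoted patch describes can be illustrated outside the kernel. What follows is only a userspace sketch under stated assumptions, not the RDS code: a boolean flag stands in for the non-recursive socket lock, a fixed-size array stands in for rds_wq, and every name (sock_trylock, queue_work_sketch, run_worker, send_xmit, simulate) is hypothetical. A failed trylock is counted where a real blocking lock would deadlock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket lock: one owner, not recursive. */
static bool sock_locked;

static bool sock_trylock(void)
{
	if (sock_locked)
		return false;	/* a blocking lock would deadlock here */
	sock_locked = true;
	return true;
}

static void sock_unlock(void)
{
	sock_locked = false;
}

/* Toy work queue, an assumption standing in for rds_wq. */
typedef void (*work_fn)(void);
#define MAX_WORK 8
static work_fn pending[MAX_WORK];
static size_t n_pending;

int xmit_ok;			/* transmissions that took the lock */
int xmit_would_deadlock;	/* transmissions attempted under the lock */

static void queue_work_sketch(work_fn fn)
{
	if (n_pending < MAX_WORK)
		pending[n_pending++] = fn;
}

/* Runs later, in a context that does not hold the socket lock. */
static void run_worker(void)
{
	for (size_t i = 0; i < n_pending; i++)
		pending[i]();
	n_pending = 0;
}

/* The transmit path needs the socket lock itself. */
static void send_xmit(void)
{
	if (!sock_trylock()) {
		xmit_would_deadlock++;
		return;
	}
	xmit_ok++;
	sock_unlock();
}

/* Receive path: holds the lock while the congestion update is raised. */
static void receive_path(bool defer)
{
	if (!sock_trylock())
		return;
	if (defer)
		queue_work_sketch(send_xmit);	/* hand off to the worker */
	else
		send_xmit();			/* inline call under the lock */
	sock_unlock();
}

/* Returns how many transmit attempts would have deadlocked. */
int simulate(bool defer)
{
	sock_locked = false;
	n_pending = 0;
	xmit_ok = 0;
	xmit_would_deadlock = 0;
	receive_path(defer);
	run_worker();
	return xmit_would_deadlock;
}
```

In this sketch, simulate(false) models the inline call and reports one attempt that would have deadlocked, while simulate(true) models the deferred call and completes the transmission from the worker after the lock has been released.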