Received: by 10.223.185.116 with SMTP id b49csp6357094wrg; Wed, 28 Feb 2018 08:06:34 -0800 (PST) X-Google-Smtp-Source: AH8x225lhgKZ2X1Yr8NNQLX8/bppO61PlnOOpMYmz4XMxdYnV9VifV8vb+QiFHH9RtmZ7ZkC+Nh9 X-Received: by 10.101.73.197 with SMTP id t5mr14823038pgs.426.1519833994827; Wed, 28 Feb 2018 08:06:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1519833994; cv=none; d=google.com; s=arc-20160816; b=Y1iAKMHdexQuxK8cwZ0yZLqcxAB+QX3SZQmVBe4ZUDcDNKCwPPR9EccdX+LQYFp9cB QOd4uQkkvz9OQAGP+wPlKsdi44FcYT+/GFpGAqqoTEOOSk2bMMxhL9kdd/VGObvVA/S1 tncYK2rUsZCoi6StY9xnbgejC1Uu+WMN5Li9C7lJfOTfo+p89Mfbjx94vWDTVLbiWSsX KrX2J01tHSxz/UVsVIPjIoaYXgiAQVvaH56tm8Ht3p6jsCf15fbW0xFSDp34mMhJxYAd v/l09xO8BZ2CKiidXu1DCQM/0kKesP0SgjMdra2ZvhdX7JBY6892I+WoIRNU5VHGOYAP 4clQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:subject:message-id:date:cc:to :from:mime-version:content-transfer-encoding:content-disposition :arc-authentication-results; bh=zvS/b25yPtYxItPjp476mzJ9wscTDFaeUMjr8iAQ5/c=; b=ZwTXPriOczcX7z/uwFPt/ssrvRWlK67Y+1y75Bb1b9x+QvvRwUhhuKGdaw5o3VMqvi Js7UnirDp0J1gB6wraSq/idcAiRgFIsmbY3f7eJ4lnjOr+61CL19v5XZR9dXV2NIC6aN j9froT4meVsZgb71a1bDdxqgqLFIy+HsWT0HGbVL67mUmyDZrhWcAYdbdZxx8SAKH1zx 6jEzhGMnvHxTu3cVC3EG2YKUeHP0vrkXsYrAMILFz20Lze2Xlrp5S4yo7whWVIIL1Pb0 g3v3hYK8K1RraTlVkaKAIoXSR3jge1NHB2FfY9KisbJmcJpSgAxyHXZB3yj4JTfXjQ4A y86A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o5si1135022pgv.774.2018.02.28.08.06.19; Wed, 28 Feb 2018 08:06:34 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934484AbeB1QFS (ORCPT + 99 others); Wed, 28 Feb 2018 11:05:18 -0500 Received: from shadbolt.e.decadent.org.uk ([88.96.1.126]:34875 "EHLO shadbolt.e.decadent.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934460AbeB1QFQ (ORCPT ); Wed, 28 Feb 2018 11:05:16 -0500 Received: from [2a02:8011:400e:2:6f00:88c8:c921:d332] (helo=deadeye) by shadbolt.decadent.org.uk with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1er3Yr-0006Xf-Aa; Wed, 28 Feb 2018 15:22:29 +0000 Received: from ben by deadeye with local (Exim 4.90_1) (envelope-from ) id 1er3Yi-0000Ar-6T; Wed, 28 Feb 2018 15:22:20 +0000 Content-Type: text/plain; charset="UTF-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit MIME-Version: 1.0 From: Ben Hutchings To: linux-kernel@vger.kernel.org, stable@vger.kernel.org CC: akpm@linux-foundation.org, "Erez Shitrit" , "Jason Gunthorpe" , "Alex Vesker" , "Leon Romanovsky" Date: Wed, 28 Feb 2018 15:20:18 +0000 Message-ID: X-Mailer: LinuxStableQueue (scripts by bwh) Subject: [PATCH 3.16 156/254] IB/ipoib: Fix race condition in neigh creation In-Reply-To: X-SA-Exim-Connect-IP: 2a02:8011:400e:2:6f00:88c8:c921:d332 X-SA-Exim-Mail-From: ben@decadent.org.uk X-SA-Exim-Scanned: No (on shadbolt.decadent.org.uk); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 3.16.55-rc1 review patch. If anyone has any objections, please let me know. ------------------ From: Erez Shitrit commit 16ba3defb8bd01a9464ba4820a487f5b196b455b upstream. When using enhanced mode for IPoIB, two threads may execute xmit in parallel to two different TX queues while the target is the same. In this case, both of them will add the same neighbor to the path's neigh link list and we might see the following message: list_add double add: new=ffff88024767a348, prev=ffff88024767a348... WARNING: lib/list_debug.c:31__list_add_valid+0x4e/0x70 ipoib_start_xmit+0x477/0x680 [ib_ipoib] dev_hard_start_xmit+0xb9/0x3e0 sch_direct_xmit+0xf9/0x250 __qdisc_run+0x176/0x5d0 __dev_queue_xmit+0x1f5/0xb10 __dev_queue_xmit+0x55/0xb10 Analysis: Two SKB are scheduled to be transmitted from two cores. In ipoib_start_xmit, both gets NULL when calling ipoib_neigh_get. Two calls to neigh_add_path are made. One thread takes the spin-lock and calls ipoib_neigh_alloc which creates the neigh structure, then (after the __path_find) the neigh is added to the path's neigh link list. When the second thread enters the critical section it also calls ipoib_neigh_alloc but in this case it gets the already allocated ipoib_neigh structure, which is already linked to the path's neigh link list and adds it again to the list. Which beside of triggering the list, it creates a loop in the linked list. This loop leads to endless loop inside path_rec_completion. Solution: Check list_empty(&neigh->list) before adding to the list. Add a similar fix in "ipoib_multicast.c::ipoib_mcast_send" Fixes: b63b70d87741 ('IPoIB: Use a private hash table for path lookup in xmit path') Signed-off-by: Erez Shitrit Reviewed-by: Alex Vesker Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings --- drivers/infiniband/ulp/ipoib/ipoib_main.c | 25 ++++++++++++++++++------- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 5 ++++- 2 files changed, 22 insertions(+), 8 deletions(-) --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c @@ -615,8 +615,8 @@ static int path_rec_start(struct net_dev return 0; } -static void neigh_add_path(struct sk_buff *skb, u8 *daddr, - struct net_device *dev) +static struct ipoib_neigh *neigh_add_path(struct sk_buff *skb, u8 *daddr, + struct net_device *dev) { struct ipoib_dev_priv *priv = netdev_priv(dev); struct ipoib_path *path; @@ -629,7 +629,15 @@ static void neigh_add_path(struct sk_buf spin_unlock_irqrestore(&priv->lock, flags); ++dev->stats.tx_dropped; dev_kfree_skb_any(skb); - return; + return NULL; + } + + /* To avoid race condition, make sure that the + * neigh will be added only once. + */ + if (unlikely(!list_empty(&neigh->list))) { + spin_unlock_irqrestore(&priv->lock, flags); + return neigh; } path = __path_find(dev, daddr + 4); @@ -665,7 +673,7 @@ static void neigh_add_path(struct sk_buf spin_unlock_irqrestore(&priv->lock, flags); ipoib_send(dev, skb, path->ah, IPOIB_QPN(daddr)); ipoib_neigh_put(neigh); - return; + return NULL; } } else { neigh->ah = NULL; @@ -678,7 +686,7 @@ static void neigh_add_path(struct sk_buf spin_unlock_irqrestore(&priv->lock, flags); ipoib_neigh_put(neigh); - return; + return NULL; err_path: ipoib_neigh_free(neigh); @@ -688,6 +696,8 @@ err_drop: spin_unlock_irqrestore(&priv->lock, flags); ipoib_neigh_put(neigh); + + return NULL; } static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev, @@ -784,8 +794,9 @@ static int ipoib_start_xmit(struct sk_bu case htons(ETH_P_TIPC): neigh = ipoib_neigh_get(dev, cb->hwaddr); if (unlikely(!neigh)) { - neigh_add_path(skb, cb->hwaddr, dev); - return NETDEV_TX_OK; + neigh = neigh_add_path(skb, cb->hwaddr, dev); + if (likely(!neigh)) + return NETDEV_TX_OK; } break; case htons(ETH_P_ARP): --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -728,7 +728,10 @@ out: spin_lock_irqsave(&priv->lock, flags); if (!neigh) { neigh = ipoib_neigh_alloc(daddr, dev); - if (neigh) { + /* Make sure that the neigh will be added only + * once to mcast list. + */ + if (neigh && list_empty(&neigh->list)) { kref_get(&mcast->ah->ref); neigh->ah = mcast->ah; list_add_tail(&neigh->list, &mcast->neigh_list);