From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Feras Daoud, Leon Romanovsky, Doug Ledford, Sasha Levin
Subject: [PATCH 4.4 31/97] IB/ipoib: Fix deadlock between ipoib_stop and mcast join flow
Date: Fri, 23 Mar 2018 10:54:18 +0100
Message-Id: <20180323094159.381404262@linuxfoundation.org>
In-Reply-To: <20180323094157.535925724@linuxfoundation.org>
References: <20180323094157.535925724@linuxfoundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Feras Daoud

[ Upstream commit 3e31a490e01a6e67cbe9f6e1df2f3ff0fbf48972 ]

Before calling ipoib_stop, rtnl_lock should be taken, then
the flow clears the IPOIB_FLAG_ADMIN_UP and IPOIB_FLAG_OPER_UP
flags, and waits for mcast completion if IPOIB_MCAST_FLAG_BUSY
is set.

On the other hand, the flow of multicast join task initializes
a mcast completion, sets the IPOIB_MCAST_FLAG_BUSY and calls
ipoib_mcast_join. If IPOIB_FLAG_OPER_UP flag is not set, this
call returns EINVAL without setting the mcast completion and
leads to a deadlock.

    ipoib_stop                          |
        |                               |
    clear_bit(IPOIB_FLAG_ADMIN_UP)      |
        |                               |
    Context Switch                      |
        |                       ipoib_mcast_join_task
        |                               |
        |                       spin_lock_irq(lock)
        |                               |
        |                       init_completion(mcast)
        |                               |
        |                       set_bit(IPOIB_MCAST_FLAG_BUSY)
        |                               |
        |                       Context Switch
        |                               |
    clear_bit(IPOIB_FLAG_OPER_UP)       |
        |                               |
    spin_lock_irqsave(lock)             |
        |                               |
    Context Switch                      |
        |                       ipoib_mcast_join
        |                       return (-EINVAL)
        |                               |
        |                       spin_unlock_irq(lock)
        |                               |
        |                       Context Switch
        |                               |
    ipoib_mcast_dev_flush               |
    wait_for_completion(mcast)          |

ipoib_stop will wait for mcast completion for ever, and will
not release the rtnl_lock. As a result panic occurs with the
following trace:

    [13441.639268] Call Trace:
    [13441.640150]  [] schedule+0x29/0x70
    [13441.641038]  [] schedule_timeout+0x239/0x2d0
    [13441.641914]  [] ? complete+0x47/0x50
    [13441.642765]  [] ? flush_workqueue_prep_pwqs+0x16d/0x200
    [13441.643580]  [] wait_for_completion+0x116/0x170
    [13441.644434]  [] ? wake_up_state+0x20/0x20
    [13441.645293]  [] ipoib_mcast_dev_flush+0x150/0x190 [ib_ipoib]
    [13441.646159]  [] ipoib_ib_dev_down+0x37/0x60 [ib_ipoib]
    [13441.647013]  [] ipoib_stop+0x75/0x150 [ib_ipoib]

Fixes: 08bc327629cb ("IB/ipoib: fix for rare multicast join race condition")
Signed-off-by: Feras Daoud
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |   11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -473,6 +473,9 @@ static int ipoib_mcast_join(struct net_d
 	    !test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
 		return -EINVAL;
 
+	init_completion(&mcast->done);
+	set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);
+
 	ipoib_dbg_mcast(priv, "joining MGID %pI6\n", mcast->mcmember.mgid.raw);
 
 	rec.mgid = mcast->mcmember.mgid;
@@ -631,8 +634,6 @@ void ipoib_mcast_join_task(struct work_s
 			if (mcast->backoff == 1 ||
 			    time_after_eq(jiffies, mcast->delay_until)) {
 				/* Found the next unjoined group */
-				init_completion(&mcast->done);
-				set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);
 				if (ipoib_mcast_join(dev, mcast)) {
 					spin_unlock_irq(&priv->lock);
 					return;
@@ -652,11 +653,9 @@ out:
 		queue_delayed_work(priv->wq, &priv->mcast_task,
 				   delay_until - jiffies);
 	}
-	if (mcast) {
-		init_completion(&mcast->done);
-		set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags);
+	if (mcast)
 		ipoib_mcast_join(dev, mcast);
-	}
+
 	spin_unlock_irq(&priv->lock);
 }
 
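
For reviewers who want to see the ordering change in isolation, here is a
minimal userspace sketch, not kernel code. All demo_* names and both structs
are hypothetical stand-ins: plain booleans replace IPOIB_FLAG_OPER_UP,
IPOIB_MCAST_FLAG_BUSY and the completion, and no locking is modeled. It
assumes, as the trace above suggests, that the flush path waits only while
the busy flag is set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state, not the kernel structures. */
struct demo_priv {
	bool oper_up;	/* models IPOIB_FLAG_OPER_UP */
};

struct demo_mcast {
	bool busy;	/* models IPOIB_MCAST_FLAG_BUSY */
	bool done;	/* models the completion having been signalled */
};

/* Pre-patch ordering: the group is marked busy before the state check. */
int demo_join_old(struct demo_priv *priv, struct demo_mcast *mcast)
{
	assert(priv != NULL && mcast != NULL);
	mcast->done = false;	/* stands in for init_completion() */
	mcast->busy = true;	/* stands in for set_bit(BUSY) */
	if (!priv->oper_up)
		return -EINVAL;	/* early exit, nothing will signal done */
	return 0;
}

/* Post-patch ordering: the state check runs first, so a failed join
 * leaves the group untouched. */
int demo_join_new(struct demo_priv *priv, struct demo_mcast *mcast)
{
	assert(priv != NULL && mcast != NULL);
	if (!priv->oper_up)
		return -EINVAL;
	mcast->done = false;
	mcast->busy = true;
	return 0;
}

/* Models the flush path: true when it would call wait_for_completion(). */
bool demo_flush_must_wait(const struct demo_mcast *mcast)
{
	assert(mcast != NULL);
	return mcast->busy && !mcast->done;
}
```

With oper_up cleared, demo_join_old() fails yet leaves the group busy, so the
modeled flush would wait on a completion that nobody signals. demo_join_new()
fails without touching the group, so the flush has nothing to wait for.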