Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755099AbcKNCLn (ORCPT ); Sun, 13 Nov 2016 21:11:43 -0500 Received: from shadbolt.e.decadent.org.uk ([88.96.1.126]:46282 "EHLO shadbolt.e.decadent.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753189AbcKNCGU (ORCPT ); Sun, 13 Nov 2016 21:06:20 -0500 Content-Type: text/plain; charset="UTF-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit MIME-Version: 1.0 From: Ben Hutchings To: linux-kernel@vger.kernel.org, stable@vger.kernel.org CC: akpm@linux-foundation.org, "Leon Romanovsky" , "Doug Ledford" , "Alex Vesker" Date: Mon, 14 Nov 2016 00:14:07 +0000 Message-ID: X-Mailer: LinuxStableQueue (scripts by bwh) Subject: [PATCH 3.2 128/152] IB/ipoib: Don't allow MC joins during light MC flush In-Reply-To: X-SA-Exim-Connect-IP: 2a02:8011:400e:2:6f00:88c8:c921:d332 X-SA-Exim-Mail-From: ben@decadent.org.uk X-SA-Exim-Scanned: No (on shadbolt.decadent.org.uk); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4073 Lines: 83 3.2.84-rc1 review patch. If anyone has any objections, please let me know. ------------------ From: Alex Vesker commit 344bacca8cd811809fc33a249f2738ab757d327f upstream. This fix solves a race between light flush and on the fly joins. Light flush doesn't set the device to down and unset IPOIB_OPER_UP flag, this means that if while flushing we have a MC join in progress and the QP was attached to BC MGID we can have a mismatches when re-attaching a QP to the BC MGID. The light flush would set the broadcast group to NULL causing an on the fly join to rejoin and reattach to the BC MCG as well as adding the BC MGID to the multicast list. The flush process would later on remove the BC MGID and detach it from the QP. On the next flush the BC MGID is present in the multicast list but not found when trying to detach it because of the previous double attach and single detach. [18332.714265] ------------[ cut here ]------------ [18332.717775] WARNING: CPU: 6 PID: 3767 at drivers/infiniband/core/verbs.c:280 ib_dealloc_pd+0xff/0x120 [ib_core] ... [18332.775198] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011 [18332.779411] 0000000000000000 ffff8800b50dfbb0 ffffffff813fed47 0000000000000000 [18332.784960] 0000000000000000 ffff8800b50dfbf0 ffffffff8109add1 0000011832f58300 [18332.790547] ffff880226a596c0 ffff880032482000 ffff880032482830 ffff880226a59280 [18332.796199] Call Trace: [18332.798015] [] dump_stack+0x63/0x8c [18332.801831] [] __warn+0xd1/0xf0 [18332.805403] [] warn_slowpath_null+0x1d/0x20 [18332.809706] [] ib_dealloc_pd+0xff/0x120 [ib_core] [18332.814384] [] ipoib_transport_dev_cleanup+0xfc/0x1d0 [ib_ipoib] [18332.820031] [] ipoib_ib_dev_cleanup+0x98/0x110 [ib_ipoib] [18332.825220] [] ipoib_dev_cleanup+0x2d8/0x550 [ib_ipoib] [18332.830290] [] ipoib_uninit+0x2f/0x40 [ib_ipoib] [18332.834911] [] rollback_registered_many+0x1aa/0x2c0 [18332.839741] [] rollback_registered+0x31/0x40 [18332.844091] [] unregister_netdevice_queue+0x48/0x80 [18332.848880] [] ipoib_vlan_delete+0x1fb/0x290 [ib_ipoib] [18332.853848] [] delete_child+0x7d/0xf0 [ib_ipoib] [18332.858474] [] dev_attr_store+0x18/0x30 [18332.862510] [] sysfs_kf_write+0x3a/0x50 [18332.866349] [] kernfs_fop_write+0x120/0x170 [18332.870471] [] __vfs_write+0x28/0xe0 [18332.874152] [] ? percpu_down_read+0x1f/0x50 [18332.878274] [] vfs_write+0xa2/0x1a0 [18332.881896] [] SyS_write+0x46/0xa0 [18332.885632] [] do_syscall_64+0x57/0xb0 [18332.889709] [] entry_SYSCALL64_slow_path+0x25/0x25 [18332.894727] ---[ end trace 09ebbe31f831ef17 ]--- Fixes: ee1e2c82c245 ("IPoIB: Refresh paths instead of flushing them on SM change events") Signed-off-by: Alex Vesker Signed-off-by: Leon Romanovsky Signed-off-by: Doug Ledford [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings --- drivers/infiniband/ulp/ipoib/ipoib_ib.c | 9 +++++++++ 1 file changed, 9 insertions(+) --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c @@ -973,8 +973,17 @@ static void __ipoib_ib_dev_flush(struct } if (level == IPOIB_FLUSH_LIGHT) { + int oper_up; ipoib_mark_paths_invalid(dev); + /* Set IPoIB operation as down to prevent races between: + * the flush flow which leaves MCG and on the fly joins + * which can happen during that time. mcast restart task + * should deal with join requests we missed. + */ + oper_up = test_and_clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags); ipoib_mcast_dev_flush(dev); + if (oper_up) + set_bit(IPOIB_FLAG_OPER_UP, &priv->flags); } if (level >= IPOIB_FLUSH_NORMAL)