From: Hans Westgaard Ry
To: Doug Ledford, Jason Gunthorpe, Hakon Bugge, Parav Pandit,
    Jack Morgenstein, Pravin Shedge, Matthew Wilcox, Andrew Morton,
    Jeff Layton, Wei Wang, Chris Mi, Eric Biggers, Rasmus Villemoes,
    Mel Gorman, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/2] IB/mad: Use ID allocator routines to allocate agent number
Date: Thu, 7 Jun 2018 12:52:08 +0200
Message-Id: <20180607105208.16332-3-hans.westgaard.ry@oracle.com>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180607105208.16332-1-hans.westgaard.ry@oracle.com>
References: <20180529073808.27735-1-hans.westgaard.ry@oracle.com>
 <20180607105208.16332-1-hans.westgaard.ry@oracle.com>

The agent TID is a 64 bit value split in two dwords. The least
significant dword is the TID running counter. The most significant
dword is the agent number.

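The dword split described above can be illustrated with a small user-space sketch. The helper names (make_tid, tid_agent_number, tid_counter) are hypothetical and not part of the kernel; the sketch works on host-order integers only and assumes nothing about how the MAD layer stores the value.

```c
#include <stdint.h>

/* Hypothetical helper (not kernel API): agent number goes in the most
 * significant dword, the running counter in the least significant dword. */
static uint64_t make_tid(uint32_t agent_number, uint32_t counter)
{
	return ((uint64_t)agent_number << 32) | counter;
}

/* Most significant dword: the agent number. */
static uint32_t tid_agent_number(uint64_t tid)
{
	return (uint32_t)(tid >> 32);
}

/* Least significant dword: the TID running counter. */
static uint32_t tid_counter(uint64_t tid)
{
	return (uint32_t)tid;
}
```

For instance, make_tid(0x00ABCDEF, 42) yields 0x00ABCDEF0000002A, and the two accessors recover 0x00ABCDEF and 42 from it.
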
In the CX-3 shared port model, the mlx4 driver uses the most
significant byte of the agent number to store the slave number, making
agent numbers greater than or equal to 2^24 (anything that does not
fit in 3 bytes) unusable. The current codebase uses a variable that is
incremented atomically for each new agent number, so agent numbers
eventually grow too large.

The IDA set of functions is used instead of the simple counter
approach. This allows re-use of agent numbers.

The signature of the bug is a MAD layer that stops working while the
console is flooded with messages like:

  mlx4_ib: egress mad has non-null tid msb:1 class:4 slave:0

Orabug: 25571450
Signed-off-by: Hans Westgaard Ry
---
 drivers/infiniband/core/mad.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index b28452a55a08..c01a2d63ffa2 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "mad_priv.h"
@@ -59,8 +60,7 @@ module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
 MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests");
 
 static struct list_head ib_mad_port_list;
-static atomic_t ib_mad_client_id = ATOMIC_INIT(0);
-
+static DEFINE_IDA(ib_mad_client_ids);
 /* Port list lock */
 static DEFINE_SPINLOCK(ib_mad_port_list_lock);
 
@@ -212,7 +212,7 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
 	int ret2, qpn;
 	unsigned long flags;
 	u8 mgmt_class, vclass;
-
+	int ib_mad_client_id;
 	/* Validate parameters */
 	qpn = get_spl_qp_index(qp_type);
 	if (qpn == -1) {
@@ -375,9 +375,19 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
 		ret = ERR_PTR(ret2);
 		goto error4;
 	}
+	ib_mad_client_id = ida_simple_get_cyclic(&ib_mad_client_ids,
+						 1,
+						 BIT(24) - 1,
+						 GFP_KERNEL);
+	if (ib_mad_client_id < 0) {
+		pr_err("Couldn't allocate agent tid; errcode: %#x\n",
+		       ib_mad_client_id);
+		ret = ERR_PTR(ib_mad_client_id);
+		goto error4;
+	}
+	mad_agent_priv->agent.hi_tid = ib_mad_client_id;
 
 	spin_lock_irqsave(&port_priv->reg_lock, flags);
-	mad_agent_priv->agent.hi_tid = atomic_inc_return(&ib_mad_client_id);
 
 	/*
 	 * Make sure MAD registration (if supplied)
@@ -428,6 +438,8 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
 error5:
 	spin_unlock_irqrestore(&port_priv->reg_lock, flags);
 	ib_mad_agent_security_cleanup(&mad_agent_priv->agent);
+	ida_simple_remove(&ib_mad_client_ids, ib_mad_client_id);
+
 error4:
 	kfree(reg_req);
 error3:
@@ -576,6 +588,7 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
 {
 	struct ib_mad_port_private *port_priv;
 	unsigned long flags;
+	u32 ib_mad_client_id;
 
 	/* Note that we could still be handling received MADs */
 
@@ -587,6 +600,8 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
 	port_priv = mad_agent_priv->qp_info->port_priv;
 	cancel_delayed_work(&mad_agent_priv->timed_work);
 
+	ib_mad_client_id = mad_agent_priv->agent.hi_tid;
+
 	spin_lock_irqsave(&port_priv->reg_lock, flags);
 	remove_mad_reg_req(mad_agent_priv);
 	list_del(&mad_agent_priv->agent_list);
@@ -602,6 +617,7 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
 
 	kfree(mad_agent_priv->reg_req);
 	kfree(mad_agent_priv);
+	ida_simple_remove(&ib_mad_client_ids, ib_mad_client_id);
 }
 
 static void unregister_mad_snoop(struct ib_mad_snoop_private *mad_snoop_priv)
@@ -3347,4 +3363,5 @@ int ib_mad_init(void)
 void ib_mad_cleanup(void)
 {
 	ib_unregister_client(&mad_client);
+	ida_destroy(&ib_mad_client_ids);
 }
-- 
2.14.3
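
The reuse argument in the commit message can be tried out in user space with a rough sketch. Everything below is an assumption made for illustration: struct id_pool, id_pool_get_cyclic, id_pool_put and agent_top_byte are hypothetical names, and the bitmap pool is a simplified stand-in, not the kernel IDA and not the ida_simple_get_cyclic helper from this series. It only mirrors the range used in the patch (1 to BIT(24) - 1) and shows that freed IDs become available again after a wrap, so the top byte of an allocated agent number stays zero.

```c
#include <stdint.h>

#define AGENT_ID_MIN 1u
#define AGENT_ID_MAX ((1u << 24) - 1u)	/* same upper bound as BIT(24) - 1 */
#define AGENT_ID_SPAN (AGENT_ID_MAX - AGENT_ID_MIN + 1u)

/* Hypothetical user-space pool for illustration; NOT the kernel IDA. */
struct id_pool {
	uint8_t used[(AGENT_ID_MAX + 1u) / 8u];	/* one bit per ID, about 2 MiB */
	uint32_t next_off;	/* offset from AGENT_ID_MIN where the next search starts */
};

/* Byte that, per the commit message, mlx4 uses for the slave number. */
static uint8_t agent_top_byte(uint32_t agent_number)
{
	return (uint8_t)(agent_number >> 24);
}

/*
 * Hand out the next free ID at or after the cursor, wrapping around at
 * AGENT_ID_MAX. Returns -1 when every ID is in use. A zero-initialized
 * pool is a valid empty pool.
 */
static int id_pool_get_cyclic(struct id_pool *p)
{
	for (uint32_t i = 0; i < AGENT_ID_SPAN; i++) {
		uint32_t id = AGENT_ID_MIN + (p->next_off + i) % AGENT_ID_SPAN;
		uint8_t bit = (uint8_t)(1u << (id % 8u));

		if (!(p->used[id / 8u] & bit)) {
			p->used[id / 8u] |= bit;
			p->next_off = (id - AGENT_ID_MIN + 1u) % AGENT_ID_SPAN;
			return (int)id;
		}
	}
	return -1;
}

/* Release an ID so that a later wrap-around can hand it out again. */
static void id_pool_put(struct id_pool *p, uint32_t id)
{
	if (id >= AGENT_ID_MIN && id <= AGENT_ID_MAX)
		p->used[id / 8u] &= (uint8_t)~(1u << (id % 8u));
}
```

Because the bitmap is about 2 MiB, a pool is best declared static or allocated on the heap. By contrast, a plain counter that reaches 1 << 24 gives agent_top_byte() == 1, which is the situation the commit message describes as unusable.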