Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by userv0022.oracle.com (8.14.4/8.14.4) with ESMTP id w4T7cOuq026131 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 29 May 2018 07:38:24 GMT
Received: from abhmp0017.oracle.com (abhmp0017.oracle.com [141.146.116.23]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id w4T7cNlv027119; Tue, 29 May 2018 07:38:23 GMT
Received: from lab02.no.oracle.com (/10.172.144.56) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 29 May 2018 00:38:22 -0700
From: Hans Westgaard Ry
To: Doug Ledford, Jason Gunthorpe, Hakon Bugge, Jack Morgenstein, Daniel Jurgens, Parav Pandit, Pravin Shedge, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] IB/mad: Use ID allocator routines to allocate agent number
Date: Tue, 29 May 2018 09:38:08 +0200
Message-Id: <20180529073808.27735-1-hans.westgaard.ry@oracle.com>
X-Mailer: git-send-email 2.13.6
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=8907 signatures=668702
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1711220000 definitions=main-1805290091
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

The agent TID is a 64-bit value split into two dwords. The least
significant dword is the TID running counter. The most significant
dword is the agent number. In the CX-3 shared port model, the mlx4
driver uses the most significant byte of the agent number to store the
slave number, making agent numbers greater than or equal to 2^24
(3 bytes) unusable.

The current codebase uses a variable that is incremented atomically
for each new agent number, so agent numbers eventually grow too large.
Use the IDA set of functions instead of the simple counter. This
allows re-use of agent numbers. A sysctl variable is also introduced
to control the maximum agent number.

The signature of the bug is a MAD layer that stops working and a
console flooded with messages like:

  mlx4_ib: egress mad has non-null tid msb:1 class:4 slave:0

Signed-off-by: Hans Westgaard Ry
---
 drivers/infiniband/core/mad.c | 50 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index b28452a55a08..adce6cd5fc41 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -41,6 +41,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 
 #include "mad_priv.h"
@@ -57,9 +59,27 @@ module_param_named(send_queue_size, mad_sendq_size, int, 0444);
 MODULE_PARM_DESC(send_queue_size, "Size of send queue in number of work requests");
 module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
 MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests");
+/* Sysctl variable to set largest mad agent number */
+static u32 ib_mad_sysctl_min_client_id_max;
+static u32 ib_mad_sysctl_max_client_id_max;
+static u32 ib_mad_sysctl_client_id_max;
+static struct ctl_table_header *ib_mad_sysctl_hdr;
+
+static struct ctl_table ib_mad_sysctl_table[] = {
+	{
+		.procname = "client_id_max",
+		.data = &ib_mad_sysctl_client_id_max,
+		.maxlen = sizeof(ib_mad_sysctl_client_id_max),
+		.mode = 0644,
+		.proc_handler = &proc_douintvec_minmax,
+		.extra1 = &ib_mad_sysctl_min_client_id_max,
+		.extra2 = &ib_mad_sysctl_max_client_id_max,
+	},
+	{ }
+};
 
 static struct list_head ib_mad_port_list;
-static atomic_t ib_mad_client_id = ATOMIC_INIT(0);
+DEFINE_IDA(ib_mad_client_ids);
 
 /* Port list lock */
 static DEFINE_SPINLOCK(ib_mad_port_list_lock);
@@ -212,6 +232,7 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
 	int ret2, qpn;
 	unsigned long flags;
 	u8 mgmt_class, vclass;
+	int ib_mad_client_id;
 
 	/* Validate parameters */
 	qpn = get_spl_qp_index(qp_type);
@@ -376,8 +397,18 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
 		goto error4;
 	}
 
+	ib_mad_client_id = ida_simple_get(&ib_mad_client_ids,
+					  0,
+					  ib_mad_sysctl_client_id_max,
+					  GFP_KERNEL);
+	if (ib_mad_client_id < 0) {
+		pr_err("Couldn't allocate agent tid; errcode: %#x\n",
+		       ib_mad_client_id);
+		ret = ERR_PTR(ib_mad_client_id);
+		goto error4;
+	}
+	mad_agent_priv->agent.hi_tid = ib_mad_client_id;
 	spin_lock_irqsave(&port_priv->reg_lock, flags);
-	mad_agent_priv->agent.hi_tid = atomic_inc_return(&ib_mad_client_id);
 
 	/*
 	 * Make sure MAD registration (if supplied)
@@ -428,6 +459,8 @@ struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device,
 error5:
 	spin_unlock_irqrestore(&port_priv->reg_lock, flags);
 	ib_mad_agent_security_cleanup(&mad_agent_priv->agent);
+	ida_simple_remove(&ib_mad_client_ids, ib_mad_client_id);
+
 error4:
 	kfree(reg_req);
 error3:
@@ -588,6 +621,7 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
 	cancel_delayed_work(&mad_agent_priv->timed_work);
 
 	spin_lock_irqsave(&port_priv->reg_lock, flags);
+	ida_simple_remove(&ib_mad_client_ids, mad_agent_priv->agent.hi_tid);
 	remove_mad_reg_req(mad_agent_priv);
 	list_del(&mad_agent_priv->agent_list);
 	spin_unlock_irqrestore(&port_priv->reg_lock, flags);
@@ -3341,10 +3375,22 @@ int ib_mad_init(void)
 		return -EINVAL;
 	}
 
+	ib_mad_sysctl_min_client_id_max = 1 << 10;
+	ib_mad_sysctl_max_client_id_max = 1 << 23;
+	ib_mad_sysctl_client_id_max = 1 << 18;
+	ib_mad_sysctl_hdr = register_net_sysctl(&init_net, "net/ibmad",
+						ib_mad_sysctl_table);
+	if (!ib_mad_sysctl_hdr) {
+		pr_err("%s: register_net_sysctl failed\n", __func__);
+		ib_mad_cleanup();
+		return -EINVAL;
+	}
 	return 0;
 }
 
 void ib_mad_cleanup(void)
 {
 	ib_unregister_client(&mad_client);
+	ida_destroy(&ib_mad_client_ids);
+	unregister_net_sysctl_table(ib_mad_sysctl_hdr);
 }
-- 
2.13.6
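
Illustration only, not part of the patch: the TID layout described in the
commit message can be sketched in plain userspace C. The helper names
below (make_tid, tid_agent_number, tid_counter, agent_number_msb,
agent_number_usable) are hypothetical and exist neither in the kernel nor
in mlx4. The sketch assumes only what the commit message states: the low
dword is the running counter, the high dword is the agent number, and the
top byte of the agent number must stay zero in the CX-3 shared port model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack agent number (high dword) and counter (low dword). */
static uint64_t make_tid(uint32_t agent_number, uint32_t counter)
{
	return ((uint64_t)agent_number << 32) | counter;
}

/* Hypothetical helper: most significant dword of the TID. */
static uint32_t tid_agent_number(uint64_t tid)
{
	return (uint32_t)(tid >> 32);
}

/* Hypothetical helper: least significant dword of the TID. */
static uint32_t tid_counter(uint64_t tid)
{
	return (uint32_t)(tid & 0xffffffffu);
}

/* Top byte of the agent number; the commit message says mlx4 stores the slave number here. */
static uint8_t agent_number_msb(uint32_t agent_number)
{
	return (uint8_t)(agent_number >> 24);
}

/* An agent number is usable in the shared port model only if it fits in 3 bytes. */
static bool agent_number_usable(uint32_t agent_number)
{
	return agent_number < (1u << 24);
}
```

With these helpers, agent number (1 << 24) - 1 leaves the top byte at zero,
while agent number 1 << 24 sets it to 1, which is consistent with the
"msb:1" value in the log line quoted above. The sysctl upper bound of
1 << 23 chosen in the patch keeps every allocated agent number below that
limit.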