From: Qing Huang
To: tariqt@mellanox.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, Qing Huang
Subject: [PATCH] mlx4_core: allocate 4KB ICM chunks
Date: Thu, 10 May 2018 16:31:43 -0700
Message-Id: <20180510233143.7236-1-qing.huang@oracle.com>
X-Mailer: git-send-email 2.9.3

When a system is under memory pressure (high usage with fragmentation),
the original 256KB ICM chunk allocations will likely push kernel memory
management into the slow path, doing memory compaction/migration in
order to complete the high-order allocations.

When that happens, user processes calling uverbs APIs can easily get
stuck for more than 120s, even though plenty of free pages are
available in smaller chunks.

Syslog:
...
Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
oracle_205573_e:205573 blocked for more than 120 seconds.
...

With a 4KB ICM chunk size, the above issue is fixed.

However, in order to support a 4KB ICM chunk size, we need to fix
another issue with large kcalloc allocations.

E.g. setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM
chunk size, each ICM chunk can only hold 512 mtt entries (8 bytes per
mtt entry). So we need a 16MB allocation for the table->icm pointer
array to hold 2M pointers, which can easily cause kcalloc to fail.

The solution is to replace kcalloc with vzalloc. A driver metadata
structure does not need physically contiguous pages (no DMA ops are
needed).

Signed-off-by: Qing Huang
Acked-by: Daniel Jurgens
---
 drivers/net/ethernet/mellanox/mlx4/icm.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
index a822f7a..2b17a4b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/icm.c
+++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
@@ -43,12 +43,12 @@
 #include "fw.h"
 
 /*
- * We allocate in as big chunks as we can, up to a maximum of 256 KB
- * per chunk.
+ * We allocate in 4KB page size chunks to avoid high order memory
+ * allocations in fragmented/high usage memory situation.
  */
 enum {
-	MLX4_ICM_ALLOC_SIZE = 1 << 18,
-	MLX4_TABLE_CHUNK_SIZE = 1 << 18
+	MLX4_ICM_ALLOC_SIZE = 1 << 12,
+	MLX4_TABLE_CHUNK_SIZE = 1 << 12
 };
 
 static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
@@ -400,7 +400,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
 	obj_per_chunk = MLX4_TABLE_CHUNK_SIZE / obj_size;
 	num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
 
-	table->icm = kcalloc(num_icm, sizeof(*table->icm), GFP_KERNEL);
+	table->icm = vzalloc(num_icm * sizeof(*table->icm));
 	if (!table->icm)
 		return -ENOMEM;
 	table->virt = virt;
@@ -446,7 +446,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
 			mlx4_free_icm(dev, table->icm[i], use_coherent);
 		}
 
-	kfree(table->icm);
+	vfree(table->icm);
 
 	return -ENOMEM;
 }
@@ -462,5 +462,5 @@ void mlx4_cleanup_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table)
 			mlx4_free_icm(dev, table->icm[i], table->coherent);
 		}
 
-	kfree(table->icm);
+	vfree(table->icm);
 }
-- 
2.9.3
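
For reference, the sizing arithmetic in the changelog (1G mtt entries, 512 entries per 4KB chunk, 2M pointers, 16MB array) can be checked with a small userspace sketch. It is not part of the patch. The helper names icm_num_chunks() and icm_ptr_array_bytes() are hypothetical, and the pointer size is assumed to be 8 bytes (a 64-bit kernel). Only the round-up expression mirrors the num_icm calculation shown in mlx4_init_icm_table().

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the changelog; all names here are illustrative only. */
#define MTT_ENTRY_SIZE	8u		/* bytes per mtt entry */
#define OLD_CHUNK_SIZE	(1u << 18)	/* 256KB, before the patch */
#define NEW_CHUNK_SIZE	(1u << 12)	/* 4KB, after the patch */
#define ASSUMED_PTR_SIZE 8u		/* assumption: 64-bit pointers */

/* Number of ICM chunks for nobj objects, using the same round-up
 * as the num_icm computation in mlx4_init_icm_table(). */
static uint64_t icm_num_chunks(uint64_t nobj, uint32_t obj_size,
			       uint32_t chunk_size)
{
	uint64_t obj_per_chunk;

	assert(obj_size != 0 && obj_size <= chunk_size);
	obj_per_chunk = chunk_size / obj_size;

	return (nobj + obj_per_chunk - 1) / obj_per_chunk;
}

/* Bytes needed for a pointer array with one entry per chunk. */
static uint64_t icm_ptr_array_bytes(uint64_t num_icm)
{
	return num_icm * ASSUMED_PTR_SIZE;
}
```

With log_num_mtt=30, icm_num_chunks(1ULL << 30, MTT_ENTRY_SIZE, NEW_CHUNK_SIZE) gives 2M chunks, and icm_ptr_array_bytes() of that gives 16MB. The same call with OLD_CHUNK_SIZE gives 32K chunks and a 256KB array, which is why the pointer array only becomes a problem for kcalloc once the chunk size drops to 4KB.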