From: Qing Huang
To: tariqt@mellanox.com, davem@davemloft.net, haakon.bugge@oracle.com,
    yanjun.zhu@oracle.com
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, gi-oh.kim@profitbricks.com, Qing Huang
Subject: [PATCH V4] mlx4_core: allocate ICM memory in page size chunks
Date: Wed, 23 May 2018 16:22:46 -0700
Message-Id: <20180523232246.20445-1-qing.huang@oracle.com>
X-Mailer: git-send-email 2.9.3
X-Mailing-List: linux-kernel@vger.kernel.org

When a system is under memory pressure (high usage with fragmentation),
the original 256KB ICM chunk allocations will likely force the kernel
memory manager into its slow path, doing memory compaction/migration in
order to complete the high-order allocations.

When that happens, user processes calling uverbs APIs can easily get
stuck for more than 120s, even though plenty of free pages are available
in smaller chunks.

Syslog:
...
Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
oracle_205573_e:205573 blocked for more than 120 seconds.
...

With a 4KB ICM chunk size on x86_64, the above issue is fixed.

However, to support the smaller ICM chunk size, we need to fix another
issue with large kcalloc allocations.

E.g., setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM
chunk size, each ICM chunk can hold only 512 mtt entries (8 bytes per
mtt entry). So the table->icm pointer array must hold 2M pointers, which
is a 16MB allocation that can easily cause kcalloc to fail.

The solution is to replace kcalloc with kvzalloc, which falls back to
vmalloc automatically if kmalloc fails.

Signed-off-by: Qing Huang
Acked-by: Daniel Jurgens
Reviewed-by: Zhu Yanjun
---
v4: use kvzalloc instead of vzalloc
    add one err condition check
    don't include vmalloc.h any more

v3: use PAGE_SIZE instead of PAGE_SHIFT
    add comma to the end of enum variables
    include vmalloc.h header file to avoid build issues on Sparc

v2: adjusted chunk size to reflect different architectures

 drivers/net/ethernet/mellanox/mlx4/icm.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
index a822f7a..685337d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/icm.c
+++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
@@ -43,12 +43,12 @@
 #include "fw.h"
 
 /*
- * We allocate in as big chunks as we can, up to a maximum of 256 KB
- * per chunk.
+ * We allocate in page size (default 4KB on many archs) chunks to avoid high
+ * order memory allocations in fragmented/high usage memory situation.
  */
 enum {
-	MLX4_ICM_ALLOC_SIZE	= 1 << 18,
-	MLX4_TABLE_CHUNK_SIZE	= 1 << 18
+	MLX4_ICM_ALLOC_SIZE	= PAGE_SIZE,
+	MLX4_TABLE_CHUNK_SIZE	= PAGE_SIZE,
 };
 
 static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
@@ -398,9 +398,11 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
 	u64 size;
 
 	obj_per_chunk = MLX4_TABLE_CHUNK_SIZE / obj_size;
+	if (WARN_ON(!obj_per_chunk))
+		return -EINVAL;
 	num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
 
-	table->icm = kcalloc(num_icm, sizeof(*table->icm), GFP_KERNEL);
+	table->icm = kvzalloc(num_icm * sizeof(*table->icm), GFP_KERNEL);
 	if (!table->icm)
 		return -ENOMEM;
 	table->virt = virt;
@@ -446,7 +448,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
 			mlx4_free_icm(dev, table->icm[i], use_coherent);
 		}
 
-	kfree(table->icm);
+	kvfree(table->icm);
 
 	return -ENOMEM;
 }
@@ -462,5 +464,5 @@ void mlx4_cleanup_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table)
 			mlx4_free_icm(dev, table->icm[i], table->coherent);
 		}
 
-	kfree(table->icm);
+	kvfree(table->icm);
 }
-- 
2.9.3
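
The sizing argument in the commit message (1G mtt entries, 512 entries per
4KB chunk, 2M pointers, 16MB pointer array) can be checked with a small
userspace sketch. This is not driver code: the helper and macro names below
are hypothetical, the 8-byte pointer size assumes a 64-bit build, and only
the round-up division mirrors the num_icm computation shown in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; values are taken from the commit message. */
#define CHUNK_SIZE_4K	4096ULL		/* PAGE_SIZE on x86_64 */
#define CHUNK_SIZE_256K	(1ULL << 18)	/* old MLX4_TABLE_CHUNK_SIZE */
#define MTT_ENTRY_SIZE	8ULL		/* bytes per mtt entry */
#define POINTER_SIZE	8ULL		/* assumed sizeof(void *) on 64-bit */

/*
 * Number of ICM chunks needed for nobj objects, using the same round-up
 * division as the num_icm computation in mlx4_init_icm_table().
 */
static inline uint64_t icm_num_chunks(uint64_t nobj, uint64_t obj_size,
				      uint64_t chunk_size)
{
	uint64_t obj_per_chunk = chunk_size / obj_size;

	/* The patch returns -EINVAL for this case; the sketch just asserts. */
	assert(obj_per_chunk > 0);
	return (nobj + obj_per_chunk - 1) / obj_per_chunk;
}

/* Size in bytes of the pointer array that tracks those chunks. */
static inline uint64_t icm_pointer_array_bytes(uint64_t nobj, uint64_t obj_size,
					       uint64_t chunk_size)
{
	return icm_num_chunks(nobj, obj_size, chunk_size) * POINTER_SIZE;
}
```

With nobj = 1ULL << 30 (log_num_mtt=30), the 4KB chunk size gives 2M chunks
and a 16MB pointer array, while the old 256KB chunk size gives 32K chunks and
a 256KB array. That difference is why the array allocation only becomes a
problem once the chunk size shrinks.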