From: Qing Huang
To: tariqt@mellanox.com, davem@davemloft.net, haakon.bugge@oracle.com, yanjun.zhu@oracle.com
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, Qing Huang
Subject: [PATCH V2] mlx4_core: allocate ICM memory in page size chunks
Date: Fri, 11 May 2018 12:23:18 -0700
Message-Id: <20180511192318.22342-1-qing.huang@oracle.com>
X-Mailer: git-send-email 2.9.3
X-Mailing-List: linux-kernel@vger.kernel.org

When a system is under memory pressure (high usage with fragmentation), the
original 256KB ICM chunk allocations will likely push kernel memory
management into the slow path, performing memory compaction/migration
operations in order to complete the high-order allocations.

When that happens, user processes calling uverb APIs may easily get stuck
for more than 120s, even though plenty of free pages are available in the
system in smaller chunks.

Syslog:
...
Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task oracle_205573_e:205573 blocked for more than 120 seconds.
...

With 4KB ICM chunk size on x86_64 arch, the above issue is fixed.

However, in order to support a smaller ICM chunk size, we need to fix
another issue with large kcalloc allocations.

E.g. setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
entry). So we need a 16MB allocation for the table->icm pointer array to
hold 2M pointers, which can easily cause kcalloc to fail.

The solution is to use vzalloc to replace kcalloc. There is no need for
contiguous memory pages for a driver metadata structure (it is not used
for DMA ops).

Signed-off-by: Qing Huang
Acked-by: Daniel Jurgens
Reviewed-by: Zhu Yanjun
---
v1 -> v2: adjusted chunk size to reflect different architectures.

 drivers/net/ethernet/mellanox/mlx4/icm.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/icm.c b/drivers/net/ethernet/mellanox/mlx4/icm.c
index a822f7a..ccb62b8 100644
--- a/drivers/net/ethernet/mellanox/mlx4/icm.c
+++ b/drivers/net/ethernet/mellanox/mlx4/icm.c
@@ -43,12 +43,12 @@
 #include "fw.h"
 
 /*
- * We allocate in as big chunks as we can, up to a maximum of 256 KB
- * per chunk.
+ * We allocate in page size (default 4KB on many archs) chunks to avoid high
+ * order memory allocations in fragmented/high usage memory situation.
  */
 enum {
-	MLX4_ICM_ALLOC_SIZE = 1 << 18,
-	MLX4_TABLE_CHUNK_SIZE = 1 << 18
+	MLX4_ICM_ALLOC_SIZE = 1 << PAGE_SHIFT,
+	MLX4_TABLE_CHUNK_SIZE = 1 << PAGE_SHIFT
 };
 
 static void mlx4_free_icm_pages(struct mlx4_dev *dev, struct mlx4_icm_chunk *chunk)
@@ -400,7 +400,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
 	obj_per_chunk = MLX4_TABLE_CHUNK_SIZE / obj_size;
 	num_icm = (nobj + obj_per_chunk - 1) / obj_per_chunk;
 
-	table->icm = kcalloc(num_icm, sizeof(*table->icm), GFP_KERNEL);
+	table->icm = vzalloc(num_icm * sizeof(*table->icm));
 	if (!table->icm)
 		return -ENOMEM;
 	table->virt = virt;
@@ -446,7 +446,7 @@ int mlx4_init_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table,
 			mlx4_free_icm(dev, table->icm[i], use_coherent);
 		}
 
-	kfree(table->icm);
+	vfree(table->icm);
 
 	return -ENOMEM;
 }
@@ -462,5 +462,5 @@ void mlx4_cleanup_icm_table(struct mlx4_dev *dev, struct mlx4_icm_table *table)
 			mlx4_free_icm(dev, table->icm[i], table->coherent);
 		}
 
-	kfree(table->icm);
+	vfree(table->icm);
 }
-- 
2.9.3
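
The sizing argument in the commit message (log_num_mtt=30, 8-byte mtt entries, 4KB chunks, a 16MB pointer array) can be checked with a small userspace sketch. This is not driver code: the helper names num_icm_chunks() and icm_array_bytes() are hypothetical, and the pointer size is passed in explicitly on the assumption of 8-byte pointers on a 64-bit kernel. The rounding follows the num_icm computation visible in the patch context.

```c
#include <stdint.h>

/*
 * Number of ICM chunks needed to hold nobj objects of obj_size bytes,
 * rounding up, in the same way as the num_icm computation in
 * mlx4_init_icm_table(). Hypothetical helper, for illustration only.
 */
static uint64_t num_icm_chunks(uint64_t nobj, uint64_t chunk_size,
			       uint64_t obj_size)
{
	uint64_t obj_per_chunk = chunk_size / obj_size;

	return (nobj + obj_per_chunk - 1) / obj_per_chunk;
}

/*
 * Bytes needed for the array of per-chunk pointers (table->icm in the
 * driver). ptr_size is an assumption: 8 on a 64-bit kernel.
 */
static uint64_t icm_array_bytes(uint64_t nobj, uint64_t chunk_size,
				uint64_t obj_size, uint64_t ptr_size)
{
	return num_icm_chunks(nobj, chunk_size, obj_size) * ptr_size;
}
```

With nobj = 1 << 30, obj_size = 8 and ptr_size = 8, a 4KB chunk size gives 2M chunks and a 16MB pointer array, while the old 256KB chunk size gives 32768 chunks and a 256KB array. That difference is why the array allocation only becomes a problem once the chunk size shrinks.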