Date: Fri, 25 May 2018 10:23:21 -0400 (EDT)
Message-Id: <20180525.102321.858995452200286788.davem@davemloft.net>
To: qing.huang@oracle.com
Cc: tariqt@mellanox.com, haakon.bugge@oracle.com, yanjun.zhu@oracle.com,
    netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, gi-oh.kim@profitbricks.com
Subject: Re: [PATCH V4] mlx4_core: allocate ICM memory in page size chunks
From: David Miller
In-Reply-To: <20180523232246.20445-1-qing.huang@oracle.com>
References: <20180523232246.20445-1-qing.huang@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Qing Huang
Date: Wed, 23 May 2018 16:22:46 -0700

> When a system is under memory pressure (high usage with fragments),
> the original 256KB ICM chunk allocations will likely trigger kernel
> memory management to enter slow path doing memory compact/migration
> ops in order to complete high order memory allocations.
>
> When that happens, user processes calling uverb APIs may get stuck
> for more than 120s easily even though there are a lot of free pages
> in smaller chunks available in the system.
>
> Syslog:
> ...
> Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
> oracle_205573_e:205573 blocked for more than 120 seconds.
> ...
>
> With 4KB ICM chunk size on x86_64 arch, the above issue is fixed.
>
> However in order to support smaller ICM chunk size, we need to fix
> another issue in large size kcalloc allocations.
>
> E.g.
> Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
> size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
> entry). So we need a 16MB allocation for a table->icm pointer array to
> hold 2M pointers which can easily cause kcalloc to fail.
>
> The solution is to use kvzalloc to replace kcalloc which will fall back
> to vmalloc automatically if kmalloc fails.
>
> Signed-off-by: Qing Huang
> Acked-by: Daniel Jurgens
> Reviewed-by: Zhu Yanjun

Applied, thanks.
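
The sizing arithmetic in the quoted commit message (log_num_mtt=30, 8-byte mtt entries, 4KB chunks, a 16MB pointer array) can be checked with a short sketch. This is not driver code: the function name and constants below are hypothetical, and it assumes 8-byte pointers (a 64-bit kernel) and one pointer per ICM chunk, as the commit message implies.

```python
# Sizing sketch for the table->icm pointer array described in the commit message.
# The numeric inputs come from the quoted text; all names here are hypothetical.

MTT_ENTRY_SIZE = 8   # bytes per mtt entry (stated in the commit message)
POINTER_SIZE = 8     # bytes per pointer, ASSUMING a 64-bit (x86_64) kernel


def icm_pointer_array_bytes(log_num_mtt: int, chunk_size: int) -> int:
    """Return the bytes needed for an array holding one pointer per ICM chunk."""
    num_entries = 1 << log_num_mtt                     # e.g. 2**30 = 1G entries
    entries_per_chunk = chunk_size // MTT_ENTRY_SIZE   # 4096 // 8 = 512
    num_chunks = -(-num_entries // entries_per_chunk)  # ceiling division
    return num_chunks * POINTER_SIZE


if __name__ == "__main__":
    # Compare the original 256KB chunk size with the 4KB chunk size.
    for chunk in (256 * 1024, 4 * 1024):
        size = icm_pointer_array_bytes(30, chunk)
        print(f"chunk={chunk // 1024}KB -> pointer array = {size // 1024}KB")
```

Under these assumptions the 4KB case needs 2M pointers, or 16MB, matching the figure in the commit message. The same formula gives a far smaller array for 256KB chunks, which is consistent with the large kcalloc only becoming a concern once the chunk size shrinks.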