From: Pavel Tatashin
To: steven.sistare@oracle.com, daniel.m.jordan@oracle.com,
    pasha.tatashin@oracle.com, m.mizuma@jp.fujitsu.com,
    akpm@linux-foundation.org, mhocko@suse.com, catalin.marinas@arm.com,
    takahiro.akashi@linaro.org, gi-oh.kim@profitbricks.com,
    heiko.carstens@de.ibm.com, baiyaowei@cmss.chinamobile.com,
    richard.weiyang@gmail.com, paul.burton@mips.com,
    miles.chen@mediatek.com, vbabka@suse.cz, mgorman@suse.de,
    hannes@cmpxchg.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [v5 0/2] initialize pages on demand during boot
Date: Fri, 9 Mar 2018 17:08:05 -0500
Message-Id: <20180309220807.24961-1-pasha.tatashin@oracle.com>
X-Mailer: git-send-email 2.16.2
X-Mailing-List: linux-kernel@vger.kernel.org

Change log:
    v4 - v5
    - Fix issue reported by Vlastimil Babka:
      > I've noticed that this function first disables the
      > on-demand initialization, and then runs the kthreads.
      > Doesn't that leave a window where allocations can fail? The
      > chances are probably small, but I think it would be better
      > to avoid it completely, rare failures suck.
      >
      > Fixing that probably means rethinking the whole
      > synchronization more dramatically though :/
    - Introduced a new patch that uses the node resize lock to
      synchronize on-demand deferred page initialization with regular
      deferred page initialization.

    v3 - v4
    - Fix !CONFIG_NUMA issue.

    v2 - v3
    Andrew Morton's comments:
    - Moved read of pgdat->first_deferred_pfn into
      deferred_zone_grow_lock, thus got rid of READ_ONCE()/WRITE_ONCE()
    - Replaced spin_lock() with spin_lock_irqsave() in
      deferred_grow_zone
    - Updated comments for deferred_zone_grow_lock
    - Updated comment before deferred_grow_zone() explaining return
      value, and also noinline specifier.
    - Fixed comment before _deferred_grow_zone().

    v1 - v2
    Added Tested-by: Masayoshi Mizuma

This change helps for three reasons:

1. Insufficient amount of reserved memory due to arguments provided by
   the user. The user may request some buffers, increased hash table
   sizes, etc. Currently, the machine panics during boot if it cannot
   allocate memory because too little memory was reserved. With this
   change, it will be able to grow the zone before deferred pages are
   initialized.

   One observed example is described in the linked discussion [1],
   where Mel Gorman writes:

   "
   Yasuaki Ishimatsu reported a premature OOM when trace_buf_size=100m
   was specified on a machine with many CPUs. The kernel tried to
   allocate 38.4GB but only 16GB was available due to deferred memory
   initialisation.
   "

   The allocations in the above scenario happen per-cpu in smp_init(),
   before deferred pages are initialized. So there is no way to predict
   how much memory we should put aside to boot successfully with the
   deferred page initialization feature compiled in.

2. The second reason is future-proofing. The kernel memory requirements
   may change, and we do not want to constantly update
   reset_deferred_meminit() to satisfy the new requirements.
   In addition, this function is currently in common code, but it would
   potentially need to be split into arch-specific variants as more
   arches start taking advantage of the deferred page initialization
   feature.

3. On-demand initialization of reserved pages guarantees that early in
   boot, using only one thread, we initialize only as many pages as
   needed; the rest are efficiently initialized in parallel.

[1] https://www.spinics.net/lists/linux-mm/msg139087.html

Pavel Tatashin (2):
  mm: disable interrupts while initializing deferred pages
  mm: initialize pages on demand during boot

 include/linux/memblock.h       |  10 --
 include/linux/memory_hotplug.h |  73 ++++++++++-----
 include/linux/mmzone.h         |   5 +-
 mm/memblock.c                  |  23 -----
 mm/page_alloc.c                | 205 +++++++++++++++++++++++++++++++----------
 5 files changed, 206 insertions(+), 110 deletions(-)

-- 
2.16.2