Subject: [PATCH v4 1/5] mm: Provide kernel parameter to allow disabling page init poisoning
From: Alexander Duyck
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com,
 mingo@kernel.org, dave.hansen@intel.com, jglisse@redhat.com,
 akpm@linux-foundation.org, logang@deltatee.com, dan.j.williams@intel.com,
 kirill.shutemov@linux.intel.com
Date: Thu, 20 Sep 2018 15:26:36 -0700
Message-ID: <20180920222415.19464.38400.stgit@localhost.localdomain>
In-Reply-To: <20180920215824.19464.8884.stgit@localhost.localdomain>
References: <20180920215824.19464.8884.stgit@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On systems with a large amount of memory it can take a significant amount
of time to initialize all of the page structs with the PAGE_POISON_PATTERN
value. I have seen it take over 2 minutes to initialize a system with
over 12TB of RAM.

In order to work around the issue I had to disable CONFIG_DEBUG_VM and then
the boot time returned to something much more reasonable as the
arch_add_memory call completed in milliseconds versus seconds. However in
doing that I had to disable all of the other VM debugging on the system.

In order to work around a kernel that might have CONFIG_DEBUG_VM enabled on
a system that has a large amount of memory I have added a new kernel
parameter named "vm_debug" that can be set to "-" in order to disable it.

Signed-off-by: Alexander Duyck
---

v3: Switched from kernel config option to parameter
v4: Added comment to parameter handler to record when option is disabled
    Updated parameter description based on feedback from Michal Hocko
    Fixed GB vs TB typo in patch description.
    Switch to vm_debug option similar to slub_debug

 Documentation/admin-guide/kernel-parameters.txt |   12 ++++++
 include/linux/page-flags.h                      |    8 ++++
 mm/debug.c                                      |   46 +++++++++++++++++++++++
 mm/memblock.c                                   |    5 +--
 mm/sparse.c                                     |    4 +-
 5 files changed, 69 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index dfe3d7b99abf..ee257b5b584f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4811,6 +4811,18 @@
 			This is actually a boot loader parameter; the value is
 			passed to the kernel using a special protocol.
 
+	vm_debug[=options]	[KNL] Available with CONFIG_DEBUG_VM=y.
+			May slow down system boot speed, especially when
+			enabled on systems with a large amount of memory.
+			All options are enabled by default, and this
+			interface is meant to allow for selectively
+			enabling or disabling specific virtual memory
+			debugging features.
+
+			Available options are:
+			  P	Enable page structure init time poisoning
+			  -	Disable all of the above options
+
 	vmalloc=nn[KMG]	[KNL,BOOT] Forces the vmalloc area to have an exact
 			size of <nn>. This can be used to increase the
 			minimum size (128MB on x86). It can also be used to
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 4d99504f6496..934f91ef3f54 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -163,6 +163,14 @@ static inline int PagePoisoned(const struct page *page)
 	return page->flags == PAGE_POISON_PATTERN;
 }
 
+#ifdef CONFIG_DEBUG_VM
+void page_init_poison(struct page *page, size_t size);
+#else
+static inline void page_init_poison(struct page *page, size_t size)
+{
+}
+#endif
+
 /*
  * Page flags policies wrt compound pages
  *
diff --git a/mm/debug.c b/mm/debug.c
index bd10aad8539a..cdacba12e09a 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -13,6 +13,7 @@
 #include <trace/events/mmflags.h>
 #include <linux/migrate.h>
 #include <linux/page_owner.h>
+#include <linux/ctype.h>
 
 #include "internal.h"
 
@@ -175,4 +176,49 @@ void dump_mm(const struct mm_struct *mm)
 	);
 }
 
+static bool page_init_poisoning __read_mostly = true;
+
+static int __init setup_vm_debug(char *str)
+{
+	bool __page_init_poisoning = true;
+
+	/*
+	 * Calling vm_debug with no arguments is equivalent to requesting
+	 * to enable all debugging options we can control.
+	 */
+	if (*str++ != '=' || !*str)
+		goto out;
+
+	__page_init_poisoning = false;
+	if (*str == '-')
+		goto out;
+
+	while (*str) {
+		switch (tolower(*str)) {
+		case 'p':
+			__page_init_poisoning = true;
+			break;
+		default:
+			pr_err("vm_debug option '%c' unknown. skipped\n",
+			       *str);
+		}
+
+		str++;
+	}
+out:
+	if (page_init_poisoning && !__page_init_poisoning)
+		pr_warn("Page struct poisoning disabled by kernel command line option 'vm_debug'\n");
+
+	page_init_poisoning = __page_init_poisoning;
+
+	return 1;
+}
+__setup("vm_debug", setup_vm_debug);
+
+void page_init_poison(struct page *page, size_t size)
+{
+	if (page_init_poisoning)
+		memset(page, PAGE_POISON_PATTERN, size);
+}
+EXPORT_SYMBOL_GPL(page_init_poison);
 #endif /* CONFIG_DEBUG_VM */
diff --git a/mm/memblock.c b/mm/memblock.c
index f7981098537b..b1017ec1b167 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1495,10 +1495,9 @@ void * __init memblock_virt_alloc_try_nid_raw(
 
 	ptr = memblock_virt_alloc_internal(size, align,
 					   min_addr, max_addr, nid);
-#ifdef CONFIG_DEBUG_VM
 	if (ptr && size > 0)
-		memset(ptr, PAGE_POISON_PATTERN, size);
-#endif
+		page_init_poison(ptr, size);
+
 	return ptr;
 }
 
diff --git a/mm/sparse.c b/mm/sparse.c
index 10b07eea9a6e..67ad061f7fb8 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -696,13 +696,11 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat,
 		goto out;
 	}
 
-#ifdef CONFIG_DEBUG_VM
 	/*
 	 * Poison uninitialized struct pages in order to catch invalid flags
 	 * combinations.
 	 */
-	memset(memmap, PAGE_POISON_PATTERN, sizeof(struct page) * PAGES_PER_SECTION);
-#endif
+	page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION);
 
 	section_mark_present(ms);
 	sparse_init_one_section(ms, section_nr, memmap, usemap);
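
Illustration only, not part of the patch: the option parsing in
setup_vm_debug() can be modelled in userspace to see which command-line
spellings leave page init poisoning enabled. This is a rough sketch under
stated assumptions. parse_vm_debug() is a hypothetical name, its argument
stands for the text that follows "vm_debug" on the command line, and
fprintf() stands in for pr_err().

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace model of the vm_debug parsing logic (hypothetical helper,
 * not kernel code). Returns true if page init poisoning stays enabled.
 */
static bool parse_vm_debug(const char *str)
{
	bool poisoning = true;

	/* Bare "vm_debug" or "vm_debug=" keeps every option enabled. */
	if (*str++ != '=' || !*str)
		return poisoning;

	/* An explicit option list starts from "everything off". */
	poisoning = false;
	if (*str == '-')
		return poisoning;

	for (; *str; str++) {
		switch (tolower((unsigned char)*str)) {
		case 'p':
			poisoning = true;
			break;
		default:
			/* Unknown letters are reported and skipped. */
			fprintf(stderr,
				"vm_debug option '%c' unknown. skipped\n",
				*str);
		}
	}

	return poisoning;
}
```

With this model, "" and "=" leave poisoning on, "=P" and "=p" turn it on
explicitly, and "=-" turns it off. An option string containing only
unrecognized letters, such as "=x", also leaves poisoning off, because
the explicit list starts from the all-off state.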