From: Dan Williams
Date: Sun, 16 Jun 2019 08:42:22 -0700
Subject: Re: [PATCH -next] mm/hotplug: skip bad PFNs from pfn_to_online_page()
To: Qian Cai
Cc: "Aneesh Kumar K.V", Andrew Morton, Oscar Salvador, Linux MM, Linux Kernel Mailing List
In-Reply-To: <1560544982.5154.24.camel@lca.pw>

On Fri, Jun 14, 2019 at 1:43 PM Qian Cai wrote:
>
> On Fri, 2019-06-14 at 12:48 -0700, Dan Williams wrote:
> > On Fri, Jun 14, 2019 at 12:40 PM Qian Cai wrote:
> > >
> > > On Fri, 2019-06-14 at 11:57 -0700, Dan Williams wrote:
> > > > On Fri, Jun 14, 2019 at 11:03 AM Dan Williams wrote:
> > > > >
> > > > > On Fri, Jun 14, 2019 at 7:59 AM Qian Cai wrote:
> > > > > >
> > > > > > On Fri, 2019-06-14 at 14:28 +0530, Aneesh Kumar K.V wrote:
> > > > > > > Qian Cai writes:
> > > > > > >
> > > > > > >
> > > > > > > > 1) offline is busted [1]. It looks like test_pages_in_a_zone()
> > > > > > > > missed the same pfn_section_valid() check.
> > > > > > > >
> > > > > > > > 2) powerpc booting is generating endless warnings [2]. In
> > > > > > > > vmemmap_populated() at arch/powerpc/mm/init_64.c, I tried to
> > > > > > > > change PAGES_PER_SECTION to PAGES_PER_SUBSECTION, but it alone
> > > > > > > > seems not enough.
> > > > > > > >
> > > > > > >
> > > > > > > Can you check with this change on ppc64. I haven't reviewed this
> > > > > > > series yet. I did limited testing with change . Before merging
> > > > > > > this I need to go through the full series again. The vmemmap
> > > > > > > populate on ppc64 needs to handle two translation modes (hash and
> > > > > > > radix). With respect to vmemmap, hash doesn't set up a translation
> > > > > > > in the Linux page table. Hence we need to make sure we don't try
> > > > > > > to set up a mapping for a range which is already covered by an
> > > > > > > existing mapping.
> > > > > >
> > > > > > It works fine.
> > > > >
> > > > > Strange... it would only change behavior if valid_section() is true
> > > > > when pfn_valid() is not or vice versa. They "should" be identical
> > > > > because subsection-size == section-size on PowerPC, at least with the
> > > > > current definition of SUBSECTION_SHIFT. I suspect maybe
> > > > > free_area_init_nodes() is too late to call subsection_map_init() for
> > > > > PowerPC.
> > > >
> > > > Can you give the attached incremental patch a try? This will break
> > > > support for doing sub-section hot-add in a section that was only
> > > > partially populated early at init, but that can be repaired later in
> > > > the series. First things first, don't regress.
> > > >
> > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > index 874eb22d22e4..520c83aa0fec 100644
> > > > --- a/mm/page_alloc.c
> > > > +++ b/mm/page_alloc.c
> > > > @@ -7286,12 +7286,10 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
> > > >
> > > >         /* Print out the early node map */
> > > >         pr_info("Early memory node ranges\n");
> > > > -       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
> > > > +       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
> > > >                 pr_info(" node %3d: [mem %#018Lx-%#018Lx]\n", nid,
> > > >                         (u64)start_pfn << PAGE_SHIFT,
> > > >                         ((u64)end_pfn << PAGE_SHIFT) - 1);
> > > > -               subsection_map_init(start_pfn, end_pfn - start_pfn);
> > > > -       }
> > > >
> > > >         /* Initialise every node */
> > > >         mminit_verify_pageflags_layout();
> > > > diff --git a/mm/sparse.c b/mm/sparse.c
> > > > index 0baa2e55cfdd..bca8e6fa72d2 100644
> > > > --- a/mm/sparse.c
> > > > +++ b/mm/sparse.c
> > > > @@ -533,6 +533,7 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
> > > >                 }
> > > >                 check_usemap_section_nr(nid, usage);
> > > >                 sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage);
> > > > +               subsection_map_init(section_nr_to_pfn(pnum), PAGES_PER_SECTION);
> > > >                 usage = (void *) usage + mem_section_usage_size();
> > > >         }
> > > >         sparse_buffer_fini();
> > >
> > > It works fine except it starts to trigger slab debugging errors during
> > > boot. Not sure if it is related yet.
> >
> > If you want you can give this branch a try if you suspect something
> > else in -next is triggering the slab warning.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=subsection-v9
> >
> > It's the original v9 patchset + dependencies backported to v5.2-rc4.
> >
> > I otherwise don't see how subsections would affect slab caches.
>
> It works fine there.

Much appreciated, Qian!
Does this change modulate the x86 failures?