From: Dan Williams
Date: Mon, 17 Jun 2019 15:32:45 -0700
Subject: Re: [PATCH v9 02/12] mm/sparsemem: Add helpers track active portions of a section at boot
To: Wei Yang
Cc: Andrew Morton, Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Oscar Salvador, Pavel Tatashin, Jane Chu, Linux MM, linux-nvdimm, Linux Kernel Mailing List
References: <155977186863.2443951.9036044808311959913.stgit@dwillia2-desk3.amr.corp.intel.com>
 <155977187919.2443951.8925592545929008845.stgit@dwillia2-desk3.amr.corp.intel.com>
 <20190617222156.v6eaujbdrmkz35wr@master>
In-Reply-To: <20190617222156.v6eaujbdrmkz35wr@master>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 17, 2019 at 3:22 PM Wei Yang wrote:
>
> On Wed, Jun 05, 2019 at 02:57:59PM -0700, Dan Williams wrote:
> >Prepare for hot{plug,remove} of sub-ranges of a section by tracking a
> >sub-section active bitmask, each bit representing a PMD_SIZE span of the
> >architecture's memory hotplug section size.
> >
> >The implications of a partially populated section is that pfn_valid()
> >needs to go beyond a valid_section() check and read the sub-section
> >active ranges from the bitmask.
> >The expectation is that the bitmask
> >(subsection_map) fits in the same cacheline as the valid_section() data,
> >so the incremental performance overhead to pfn_valid() should be
> >negligible.
> >
> >Cc: Michal Hocko
> >Cc: Vlastimil Babka
> >Cc: Logan Gunthorpe
> >Cc: Oscar Salvador
> >Cc: Pavel Tatashin
> >Tested-by: Jane Chu
> >Signed-off-by: Dan Williams
> >---
> > include/linux/mmzone.h |   29 ++++++++++++++++++++++++++++-
> > mm/page_alloc.c        |    4 +++-
> > mm/sparse.c            |   35 +++++++++++++++++++++++++++++++++++
> > 3 files changed, 66 insertions(+), 2 deletions(-)
> >
> >diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >index ac163f2f274f..6dd52d544857 100644
> >--- a/include/linux/mmzone.h
> >+++ b/include/linux/mmzone.h
> >@@ -1199,6 +1199,8 @@ struct mem_section_usage {
> > 	unsigned long pageblock_flags[0];
> > };
> >
> >+void subsection_map_init(unsigned long pfn, unsigned long nr_pages);
> >+
> > struct page;
> > struct page_ext;
> > struct mem_section {
> >@@ -1336,12 +1338,36 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn)
> >
> > extern int __highest_present_section_nr;
> >
> >+static inline int subsection_map_index(unsigned long pfn)
> >+{
> >+	return (pfn & ~(PAGE_SECTION_MASK)) / PAGES_PER_SUBSECTION;
> >+}
> >+
> >+#ifdef CONFIG_SPARSEMEM_VMEMMAP
> >+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> >+{
> >+	int idx = subsection_map_index(pfn);
> >+
> >+	return test_bit(idx, ms->usage->subsection_map);
> >+}
> >+#else
> >+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> >+{
> >+	return 1;
> >+}
> >+#endif
> >+
> > #ifndef CONFIG_HAVE_ARCH_PFN_VALID
> > static inline int pfn_valid(unsigned long pfn)
> > {
> >+	struct mem_section *ms;
> >+
> > 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> > 		return 0;
> >-	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
> >+	ms = __nr_to_section(pfn_to_section_nr(pfn));
> >+	if (!valid_section(ms))
> >+		return 0;
> >+	return pfn_section_valid(ms, pfn);
> > }
> > #endif
> >
> >@@ -1373,6 +1399,7 @@ void sparse_init(void);
> > #define sparse_init() do {} while (0)
> > #define sparse_index_init(_sec, _nid) do {} while (0)
> > #define pfn_present pfn_valid
> >+#define subsection_map_init(_pfn, _nr_pages) do {} while (0)
> > #endif /* CONFIG_SPARSEMEM */
> >
> > /*
> >diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >index c6d8224d792e..bd773efe5b82 100644
> >--- a/mm/page_alloc.c
> >+++ b/mm/page_alloc.c
> >@@ -7292,10 +7292,12 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
> >
> > 	/* Print out the early node map */
> > 	pr_info("Early memory node ranges\n");
> >-	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
> >+	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
> > 		pr_info(" node %3d: [mem %#018Lx-%#018Lx]\n", nid,
> > 			(u64)start_pfn << PAGE_SHIFT,
> > 			((u64)end_pfn << PAGE_SHIFT) - 1);
> >+		subsection_map_init(start_pfn, end_pfn - start_pfn);
> >+	}
>
> Just curious about why we set subsection here?
>
> Function free_area_init_nodes() mostly handles pgdat, if I am correct. Setup
> subsection here looks like touching some lower level system data structure.

Correct, I'm not sure how it ended up there, but it was the source of
a bug that was fixed with this change:

https://lore.kernel.org/lkml/CAPcyv4hjvBPDYKpp2Gns3-cc2AQ0AVS1nLk-K3fwXeRUvvzQLg@mail.gmail.com/
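
For anyone following the index arithmetic in the quoted hunk, here is a
rough user-space sketch of how a per-section sub-section bitmap could be
filled and then consulted. It is not the kernel implementation (the
mm/sparse.c part of the patch is not quoted above). The constants assume
x86-64 (4 KiB pages, 128 MiB sections, 2 MiB PMD_SIZE), and struct
toy_section, toy_subsection_fill() and toy_pfn_section_valid() are
hypothetical names made up for illustration. Only subsection_map_index()
mirrors the helper in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 values; other architectures use different sizes. */
#define PAGE_SHIFT		12	/* 4 KiB pages */
#define SECTION_SIZE_BITS	27	/* 128 MiB sections */
#define SUBSECTION_SHIFT	21	/* PMD_SIZE, 2 MiB */

#define PAGES_PER_SECTION	(1UL << (SECTION_SIZE_BITS - PAGE_SHIFT))
#define PAGES_PER_SUBSECTION	(1UL << (SUBSECTION_SHIFT - PAGE_SHIFT))
#define PAGE_SECTION_MASK	(~(PAGES_PER_SECTION - 1))

/* Hypothetical stand-in for one section: 64 bits, one per 2 MiB span. */
struct toy_section {
	uint64_t subsection_map;
};

/* Same arithmetic as the helper in the quoted patch. */
static int subsection_map_index(unsigned long pfn)
{
	return (int)((pfn & ~PAGE_SECTION_MASK) / PAGES_PER_SUBSECTION);
}

/*
 * Hypothetical helper: mark every sub-section touched by
 * [pfn, pfn + nr_pages). The range must stay inside one section.
 */
static void toy_subsection_fill(struct toy_section *ms, unsigned long pfn,
				unsigned long nr_pages)
{
	int idx, end;

	assert(nr_pages > 0);
	assert((pfn & PAGE_SECTION_MASK) ==
	       ((pfn + nr_pages - 1) & PAGE_SECTION_MASK));

	idx = subsection_map_index(pfn);
	end = subsection_map_index(pfn + nr_pages - 1);
	for (; idx <= end; idx++)
		ms->subsection_map |= UINT64_C(1) << idx;
}

/* Hypothetical analogue of pfn_section_valid(): test the pfn's bit. */
static int toy_pfn_section_valid(const struct toy_section *ms,
				 unsigned long pfn)
{
	return (int)((ms->subsection_map >> subsection_map_index(pfn)) & 1);
}
```

With these assumed sizes a section holds 64 sub-sections, so the whole
map fits in one 64-bit word, which is in line with the cacheline
expectation stated in the changelog. Filling pfns 512..1535 of an empty
section sets bits 1 and 2 only, so a lookup for pfn 0 or pfn 1536 in
that section would report the page as not valid.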