Date: Thu, 1 Dec 2016 16:45:39 +0000
From: Will Deacon
To: Robert Richter
Cc: Catalin Marinas, Ard Biesheuvel, David Daney, Mark Rutland,
    Hanjun Guo, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] arm64: mm: Fix memmap to be initialized for the entire section
Message-ID: <20161201164538.GB1236@arm.com>
In-Reply-To: <1480530091-1092-1-git-send-email-rrichter@cavium.com>

On Wed, Nov 30, 2016 at 07:21:31PM +0100, Robert Richter wrote:
> On ThunderX systems with certain memory configurations we see the
> following BUG_ON():
>
>   kernel BUG at mm/page_alloc.c:1848!
>
> This happens for some configs with 64k page size enabled. The
> BUG_ON() checks that the start and end pages of a memmap range belong
> to the same zone.
>
> The check fails if a memory zone contains NOMAP regions. In that case
> the node information of those pages is not initialized, leaving the
> page links with wrong zone and node information: NOMAP pages from
> node 1 still point to the memory zone of node 0 and carry the wrong
> nid.
> The reason for the mis-configuration is a change in pfn_valid(),
> which reports pages marked NOMAP as invalid:
>
>   68709f45385a arm64: only consider memblocks with NOMAP cleared for
>                linear mapping
>
> As a result, pages marked NOMAP are no longer reassigned to the new
> zone in memmap_init_zone() via __init_single_pfn().
>
> Fix this by restoring the old behaviour of pfn_valid(), using
> memblock_is_memory(). Also change users of pfn_valid() in arm64 code
> to use memblock_is_map_memory() where necessary. This only affects
> code in ioremap.c; the code in mmu.c can still use the new version of
> pfn_valid().
>
> As a consequence, pfn_valid() can no longer be used to check whether
> a physical page is RAM. It only checks whether there is an underlying
> memmap with a valid struct page. Moreover, for performance reasons
> the whole memmap (with pageblock_nr_pages number of pages) has valid
> pfns (SPARSEMEM config): the memory range is extended to fit the
> alignment of the memmap. Thus, pfn_valid() may return true for pfns
> that do not map to physical memory. Those pages are simply not
> reported to the mm; they are neither marked reserved nor added to the
> list of free pages. Other functions such as page_is_ram() or
> memblock_is_map_memory() must be used to check for memory and whether
> the page can be mapped with the linear mapping.
>
> Since NOMAP memory ranges may need to be mapped with different memory
> attributes (e.g. read-only or non-caching), we cannot use the linear
> mapping for them. The use of memblock_is_memory() in pfn_valid() does
> not break this behaviour. Since commit:
>
>   e7cd190385d1 arm64: mark reserved memblock regions explicitly in
>                iomem
>
> NOMAP memory resources are no longer marked as system RAM
> (IORESOURCE_SYSTEM_RAM), so page_is_ram() and region_intersects()
> (see memremap()) do not detect NOMAP memory as system RAM, and NOMAP
> memory is not added to the linear mapping the way system RAM is.
> v2:
>
>  * Added Ack
>  * Updated description to reflect the discussion
>
> Acked-by: Ard Biesheuvel
> Signed-off-by: Robert Richter
> ---
>  arch/arm64/mm/init.c    | 2 +-
>  arch/arm64/mm/ioremap.c | 5 +++--
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 212c4d1e2f26..166911f4a2e6 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -147,7 +147,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  int pfn_valid(unsigned long pfn)
>  {
> -	return memblock_is_map_memory(pfn << PAGE_SHIFT);
> +	return memblock_is_memory(pfn << PAGE_SHIFT);
>  }
>  EXPORT_SYMBOL(pfn_valid);
>  #endif
> diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
> index 01e88c8bcab0..c17c220b0c48 100644
> --- a/arch/arm64/mm/ioremap.c
> +++ b/arch/arm64/mm/ioremap.c
> @@ -21,6 +21,7 @@
>   */
>
>  #include
> +#include <linux/memblock.h>
>  #include
>  #include
>  #include
> @@ -55,7 +56,7 @@ static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
>  	/*
>  	 * Don't allow RAM to be mapped.
>  	 */
> -	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
> +	if (WARN_ON(memblock_is_map_memory(phys_addr)))
>  		return NULL;
>
>  	area = get_vm_area_caller(size, VM_IOREMAP, caller);
> @@ -96,7 +97,7 @@ EXPORT_SYMBOL(__iounmap);
>  void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
>  {
>  	/* For normal memory we already have a cacheable mapping. */
> -	if (pfn_valid(__phys_to_pfn(phys_addr)))
> +	if (memblock_is_map_memory(phys_addr))
>  		return (void __iomem *)__phys_to_virt(phys_addr);

Thanks for sending out the new patch. Whilst I'm still a bit worried
about changing pfn_valid() like this, I guess we'll just have to fix up
any callers which suffer from this change.

Acked-by: Will Deacon

I'd like to see this sit in -next for a bit before we send it further.

Will