Received: by 2002:ac0:a594:0:0:0:0:0 with SMTP id m20-v6csp4792291imm; Fri, 18 May 2018 10:43:15 -0700 (PDT) X-Google-Smtp-Source: AB8JxZpJeh2gv4w0ylH7XEu0TKCAnmOzcrbfUlmxZjU444SsozCjfv3G81nWhkfX+8NU4Fa4NqID X-Received: by 2002:a62:5281:: with SMTP id g123-v6mr10353640pfb.22.1526665395876; Fri, 18 May 2018 10:43:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1526665395; cv=none; d=google.com; s=arc-20160816; b=yUINBG+XA7G9VyuxNA+OHvoPcke1LAYH6bfQTL3YbslOUWGN3SeIwvse9t3OF4j3Kd SSHhGM2yKTRsVKwV6lMbkQ/UNP0h5W6lsGiFaplglbhu4ycOKqBZfeUgG0KvT5yY8yUj 9NNJt1R6oatdmuqZuAdtxidx9b6tghUBsptven7giLPTLP0fmomTnYd5el93vTl4htVZ 1e6t4L+Doe930jPRhAZMtBrRg9Y4qccZV3uwpg6f2vdQHZQiPNvJtUqCZl1sRnNiqj7O +d9gkeJLZ724s6HOxFxvmeSNMCxfVdvd37Yp8fUKrEG22bwyt9u7JbanRUZd9sqgoJvO qQVQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=0UcQpqnT5xmygH9Q2qvC7Srqn78xi0920gfPZ/fC92k=; b=NCCNKfI+jdn/VOdNqU5qqRRP2ZwVoZ79YYgSQK43xKASS3nB5GSOj3Wfq3i6jhRvwv YLG4vb61A2mIiRNTziV96LfxZYIQNXBozNnE3/ZgB8k9E0m6SM3Gm/pt02c5tQFxs6Ks UjMQAhJvxeZI+TZi05k/jOBnO0d52H7HOtNSj4uZb9g0zJKGzJsP25gRk1ir6i/k6jHM Rz2s5LWceLsTCLsPYYmpfnFjQUI1VlUf4+7/S+SUbpn+BLPZQccdX094E06egSNOXEnk +/UW8MPdUkD0vejxDyUCGYDMQusJqYu3A6nvkyHruz6DWT/FOCYgYvn/IEVaAD0MzQVd TPvA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 1-v6si7658745plj.122.2018.05.18.10.43.01; Fri, 18 May 2018 10:43:15 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752215AbeERRma (ORCPT + 99 others); Fri, 18 May 2018 13:42:30 -0400 Received: from mail-oi0-f68.google.com ([209.85.218.68]:44221 "EHLO mail-oi0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751920AbeERRm2 (ORCPT ); Fri, 18 May 2018 13:42:28 -0400 Received: by mail-oi0-f68.google.com with SMTP id e80-v6so7801991oig.11 for ; Fri, 18 May 2018 10:42:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=0UcQpqnT5xmygH9Q2qvC7Srqn78xi0920gfPZ/fC92k=; b=nGI4WUwxOMHZFth/CwqDGEwKfUx4qokj3WXQSEZzQrsgdPSnkArZSdQWmtPYETLcNG /5p3gTTITKA09QOkzPfl/N1l+K8xqRBRv+aZ79g9QLv6qrl7ReRSvEhoZWa2LvYyqhmY JFGnfo4im3ILP8Xf1qZYob0NE71gyJZ8MQKZdcAktccsBFuR9a+MaVLX0yV7qIgZOMgW 7XiVOziHnZEyL0k4OS3KWKp6bSjFJkEuOlT4DYpJq4BkWgVSnpe9cGkYHASrkEIzrMnx B9gm+/MuH4Sf8QQTelVY6mx49EQ4jSEPjHkLXspDduC5ZI35juXQRPFL8wmHqV61mbZj 7ERA== X-Gm-Message-State: ALKqPwd8mQYzmMbjHGUwADquGsyQqJe/vWAAO+o3kW9qMViWtg3ownky cRFRwgWfVK1BK8CpocSZQ8s1VmSzhzw= X-Received: by 2002:aca:5b89:: with SMTP id p131-v6mr5947656oib.8.1526665347078; Fri, 18 May 2018 10:42:27 -0700 (PDT) Received: from ?IPv6:2601:602:9802:a8dc::d2dd? ([2601:602:9802:a8dc::d2dd]) by smtp.gmail.com with ESMTPSA id q12-v6sm4042466otg.56.2018.05.18.10.42.24 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 18 May 2018 10:42:25 -0700 (PDT) Subject: Re: [PATCH] Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE" To: Joonsoo Kim Cc: Michal Hocko , =?UTF-8?B?VmlsbGUgU3lyasOkbMOk?= , "Aneesh Kumar K . V" , Tony Lindgren , Vlastimil Babka , Johannes Weiner , Laura Abbott , Marek Szyprowski , Mel Gorman , Michal Nazarewicz , Minchan Kim , Rik van Riel , Russell King , Will Deacon , Andrew Morton , Linus Torvalds , linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <20180517125959.8095-1-ville.syrjala@linux.intel.com> <20180517132109.GU12670@dhcp22.suse.cz> <20180517133629.GH23723@intel.com> <20180517135832.GI23723@intel.com> <20180517164947.GV12670@dhcp22.suse.cz> <20180517170816.GW12670@dhcp22.suse.cz> <20180518040104.GA17433@js1304-desktop> From: Laura Abbott Message-ID: <913e9450-da8b-8d4a-d1bd-06f1e0f49340@redhat.com> Date: Fri, 18 May 2018 10:42:23 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <20180518040104.GA17433@js1304-desktop> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 05/17/2018 09:01 PM, Joonsoo Kim wrote: > On Thu, May 17, 2018 at 10:53:32AM -0700, Laura Abbott wrote: >> On 05/17/2018 10:08 AM, Michal Hocko wrote: >>> On Thu 17-05-18 18:49:47, Michal Hocko wrote: >>>> On Thu 17-05-18 16:58:32, Ville Syrjälä wrote: >>>>> On Thu, May 17, 2018 at 04:36:29PM +0300, Ville Syrjälä wrote: >>>>>> On Thu, May 17, 2018 at 03:21:09PM +0200, Michal Hocko wrote: >>>>>>> On Thu 17-05-18 15:59:59, Ville Syrjala wrote: >>>>>>>> From: Ville Syrjälä >>>>>>>> >>>>>>>> This reverts commit bad8c6c0b1144694ecb0bc5629ede9b8b578b86e. >>>>>>>> >>>>>>>> Make x86 with HIGHMEM=y and CMA=y boot again. >>>>>>> >>>>>>> Is there any bug report with some more details? It is much more >>>>>>> preferable to fix the issue rather than to revert the whole thing >>>>>>> right away. >>>>>> >>>>>> The machine I have in front of me right now didn't give me anything. >>>>>> Black screen, and netconsole was silent. No serial port on this >>>>>> machine unfortunately. >>>>> >>>>> Booted on another machine with serial: >>>> >>>> Could you provide your .config please? >>>> >>>> [...] >>>>> [ 0.000000] cma: Reserved 4 MiB at 0x0000000037000000 >>>> [...] >>>>> [ 0.000000] BUG: Bad page state in process swapper pfn:377fe >>>>> [ 0.000000] page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0 >>>> >>>> OK, so this looks the be the source of the problem. -128 would be a >>>> buddy page but I do not see anything that would set the counter to -127 >>>> and the real map count updates shouldn't really happen that early. >>>> >>>> Maybe CONFIG_DEBUG_VM and CONFIG_DEBUG_HIGHMEM will tell us more. >>> >>> Looking closer, I _think_ that the bug is in set_highmem_pages_init->is_highmem >>> and zone_movable_is_highmem might force CMA pages in the zone movable to >>> be initialized as highmem. And that sounds supicious to me. Joonsoo? >>> >> >> For a point of reference, arm with this configuration doesn't hit this bug >> because highmem pages are freed via the memblock interface only instead >> of iterating through each zone. It looks like the x86 highmem code >> assumes only a single highmem zone and/or it's disjoint? > > Good point! Reason of the crash is that the span of MOVABLE_ZONE is > extended to whole node span for future CMA initialization, and, > normal memory is wrongly freed here. > > Here goes the fix. Ville, Could you test below patch? > I re-generated the issue on my side and this patch fixed it. > Reviewed-by: Laura Abbott > Thanks. > > ------------>8------------- > From 569899a4dbd28cebb8d350d3d1ebb590d88b2629 Mon Sep 17 00:00:00 2001 > From: Joonsoo Kim > Date: Fri, 18 May 2018 10:52:05 +0900 > Subject: [PATCH] x86/32/highmem: check if the zone is matched when free > highmem pages on init > > If CONFIG_CMA is enabled, it extends the span of the MOVABLE_ZONE > to manage the CMA memory later. And, in this case, the span of the > MOVABLE_ZONE could overlap the other zone's memory. We need to > avoid freeing this overlapped memory here since it would be the > memory of the other zone. Therefore, this patch adds a check > whether the page is indeed on the requested zone or not. Skipped > page will be freed when the memory of the matched zone is freed. > > Reported-by: Ville Syrjälä > Signed-off-by: Joonsoo Kim > --- > arch/x86/include/asm/highmem.h | 4 ++-- > arch/x86/mm/highmem_32.c | 5 ++++- > arch/x86/mm/init_32.c | 25 +++++++++++++++++++++---- > 3 files changed, 27 insertions(+), 7 deletions(-) > > diff --git a/arch/x86/include/asm/highmem.h b/arch/x86/include/asm/highmem.h > index a805993..e383f57 100644 > --- a/arch/x86/include/asm/highmem.h > +++ b/arch/x86/include/asm/highmem.h > @@ -72,8 +72,8 @@ void *kmap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot); > > #define flush_cache_kmaps() do { } while (0) > > -extern void add_highpages_with_active_regions(int nid, unsigned long start_pfn, > - unsigned long end_pfn); > +extern void add_highpages_with_active_regions(int nid, struct zone *zone, > + unsigned long start_pfn, unsigned long end_pfn); > > #endif /* __KERNEL__ */ > > diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c > index 6d18b70..bf9f5b8 100644 > --- a/arch/x86/mm/highmem_32.c > +++ b/arch/x86/mm/highmem_32.c > @@ -120,6 +120,9 @@ void __init set_highmem_pages_init(void) > if (!is_highmem(zone)) > continue; > > + if (!populated_zone(zone)) > + continue; > + > zone_start_pfn = zone->zone_start_pfn; > zone_end_pfn = zone_start_pfn + zone->spanned_pages; > > @@ -127,7 +130,7 @@ void __init set_highmem_pages_init(void) > printk(KERN_INFO "Initializing %s for node %d (%08lx:%08lx)\n", > zone->name, nid, zone_start_pfn, zone_end_pfn); > > - add_highpages_with_active_regions(nid, zone_start_pfn, > + add_highpages_with_active_regions(nid, zone, zone_start_pfn, > zone_end_pfn); > } > } > diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c > index 8008db2..f646072 100644 > --- a/arch/x86/mm/init_32.c > +++ b/arch/x86/mm/init_32.c > @@ -431,7 +431,7 @@ static void __init permanent_kmaps_init(pgd_t *pgd_base) > pkmap_page_table = pte; > } > > -void __init add_highpages_with_active_regions(int nid, > +void __init add_highpages_with_active_regions(int nid, struct zone *zone, > unsigned long start_pfn, unsigned long end_pfn) > { > phys_addr_t start, end; > @@ -442,9 +442,26 @@ void __init add_highpages_with_active_regions(int nid, > start_pfn, end_pfn); > unsigned long e_pfn = clamp_t(unsigned long, PFN_DOWN(end), > start_pfn, end_pfn); > - for ( ; pfn < e_pfn; pfn++) > - if (pfn_valid(pfn)) > - free_highmem_page(pfn_to_page(pfn)); > + for ( ; pfn < e_pfn; pfn++) { > + struct page *page; > + > + if (!pfn_valid(pfn)) > + continue; > + > + page = pfn_to_page(pfn); > + > + /* > + * If CONFIG_CMA is enabled, it extends the span of > + * the MOVABLE_ZONE to manage the CMA memory > + * in the future. And, in this case, the span of the > + * MOVABLE_ZONE could overlap the other zone's memory. > + * We need to avoid freeing this memory here. > + */ > + if (IS_ENABLED(CONFIG_CMA) && page_zone(page) != zone) > + continue; > + > + free_highmem_page(pfn_to_page(pfn)); > + } > } > } > #else >