Date: Fri, 22 May 2020 15:25:24 +0800
From: Baoquan He
To: Mike Rapoport, mgorman@suse.de
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, cai@lca.pw, mhocko@kernel.org
Subject: Re: [PATCH] mm/compaction: Fix the incorrect hole in fast_isolate_freepages()
Message-ID: <20200522072524.GF26955@MiWiFi-R3L-srv>
References: <20200521014407.29690-1-bhe@redhat.com>
 <20200521092612.GP1059226@linux.ibm.com>
 <20200521155225.GA20045@MiWiFi-R3L-srv>
 <20200521171836.GU1059226@linux.ibm.com>
 <20200522070114.GE26955@MiWiFi-R3L-srv>
In-Reply-To: <20200522070114.GE26955@MiWiFi-R3L-srv>

On 05/22/20 at 03:01pm, Baoquan He wrote:
> > > As I said, the unavailable range includes firmware reserved ranges, and
> > > holes inside one boot memory section, if that boot memory section has
> > > usable memory range, and firmware reserved ranges, and holes. Adding
> > > them all into memblock seems a little unreasonable, since they are never
> > > used by system in memblock, buddy or high level memory allocator. But I
> > > can see that adding them into memblock may have the same effect as the
> > > old code which was there before your patchset was applied. Let's see if
> > > Mel or other people have something to say. I personally would not suggest
> > > doing it like this though.
> >
> > Adding reserved regions to memblock.memory will not have the same effect
> > as the old code.
> > We anyway have to initialize struct page for these
> > areas, but unlike the old code we don't need to run them by the
> > early_pfn_in_nid() checks and we still get rid of the
> > CONFIG_NODES_SPAN_OTHER_NODES option.
>
> Hmm, I mean adding them to memblock will let us have the same result,
> they are added into the node, zone where they should be, and marked as
> reserved, just as the old code did.
>
> Thinking about this again, adding them into memblock seems doable. But
> we may not need to add them from e820 reserved range, since that will
> skip hole ranges which share the same section with usable range, and may
> need to change code in different ARCHes. How about this:
>
> We add them into memblock in init_unavailable_range(), memmap_init() will
> add them into the right node and zone, reserve_bootmem_region() will
> initialize them and mark them as Reserved.
>
>
> From d019d0f9e7c958542dfcb142f93d07fcce6c7c22 Mon Sep 17 00:00:00 2001
> From: Baoquan He
> Date: Fri, 22 May 2020 14:36:13 +0800
> Subject: [PATCH] mm/page_alloc.c: Add unavailable ranges into memblock
>
> These unavailable ranges share the same section with the usable range
> in boot memory, e.g. the firmware reserved ranges, and holes.
>
> Previously, they were added into node 0, zone 0 in function
> init_unavailable_range(), and marked as Reserved. Later, in function
> memmap_init(), they were added to the appropriate node and zone, where
> they are covered.
>
> However, after the patchset ("mm: rework free_area_init*() funcitons")
> is applied, we change to iterate over memblock regions. These unavailable
> ranges are skipped, and the node and zone adjustment won't be done any
> more as the old code did. This causes a crash in compaction which is
> triggered by VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn)).
>
> So let's add these unavailable ranges into memblock and reserve them
> in init_unavailable_range() instead.
> With this change, they will be added
> into appropriate node and zone in memmap_init(), and initialized in
> reserve_bootmem_region() just like any other memblock reserved regions.

Seems this is not right. They can't get nid in init_unavailable_range().
Adding e820 ranges may let them get nid. But the hole range won't be
added to memblock, and still has the issue.

Nack this one for now, still considering.

>
> Signed-off-by: Baoquan He
> ---
>  mm/page_alloc.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 603187800628..3973b5fdfe3f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6925,7 +6925,7 @@ static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn)
>  static void __init init_unavailable_mem(void)
>  {
>  	phys_addr_t start, end;
> -	u64 i, pgcnt;
> +	u64 i, pgcnt, size;
>  	phys_addr_t next = 0;
>  
>  	/*
> @@ -6934,9 +6934,11 @@ static void __init init_unavailable_mem(void)
>  	pgcnt = 0;
>  	for_each_mem_range(i, &memblock.memory, NULL,
>  			NUMA_NO_NODE, MEMBLOCK_NONE, &start, &end, NULL) {
> -		if (next < start)
> -			pgcnt += init_unavailable_range(PFN_DOWN(next),
> -							PFN_UP(start));
> +		if (next < start) {
> +			size = PFN_UP(start) - PFN_DOWN(next);
> +			memblock_add(PFN_DOWN(next), size);
> +			memblock_reserve(PFN_DOWN(next), size);
> +		}
>  		next = end;
>  	}
>  
> @@ -6947,8 +6949,11 @@ static void __init init_unavailable_mem(void)
>  	 * considered initialized. Make sure that memmap has a well defined
>  	 * state.
>  	 */
> -	pgcnt += init_unavailable_range(PFN_DOWN(next),
> -					round_up(max_pfn, PAGES_PER_SECTION));
> +	size = round_up(max_pfn, PAGES_PER_SECTION) - PFN_DOWN(next);
> +	if (size) {
> +		memblock_add(PFN_DOWN(next), size);
> +		memblock_reserve(PFN_DOWN(next), size);
> +	}
>  
>  	/*
>  	 * Struct pages that do not have backing memory. This could be because
> -- 
> 2.17.2
>
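
For anyone following the arithmetic, the gap walk in the quoted
init_unavailable_mem() hunk can be modelled in userspace roughly as below.
This is only a sketch under assumptions, not kernel code: struct range,
struct pfn_gap and find_gaps() are hypothetical names, PAGE_SHIFT is assumed
to be 12 (4 KiB pages), PFN_DOWN()/PFN_UP() are local copies of the kernel
macros, and end_pfn stands in for round_up(max_pfn, PAGES_PER_SECTION). It
keeps the gaps in PFN units and does not call anything from memblock, which
takes physical addresses and byte sizes; it also says nothing about which
node or zone a gap belongs to, which is the open problem above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define PFN_DOWN(x) ((uint64_t)(x) >> PAGE_SHIFT)
#define PFN_UP(x) (((uint64_t)(x) + (1ULL << PAGE_SHIFT) - 1) >> PAGE_SHIFT)

/* Hypothetical stand-in for one memblock.memory region: [start, end) in bytes. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* One unavailable gap, in page frame numbers: [spfn, epfn). */
struct pfn_gap {
	uint64_t spfn;
	uint64_t epfn;
};

/*
 * Walk sorted, non-overlapping memory ranges and record every gap between
 * them, plus the tail from the last range up to end_pfn. Returns the number
 * of gaps written to out (at most max).
 */
static size_t find_gaps(const struct range *mem, size_t n, uint64_t end_pfn,
			struct pfn_gap *out, size_t max)
{
	uint64_t next = 0; /* end of the previous range, in bytes */
	size_t cnt = 0;
	size_t i;

	assert(mem != NULL && out != NULL);

	for (i = 0; i < n; i++) {
		/* Same test as "if (next < start)" in the quoted loop. */
		if (next < mem[i].start && cnt < max) {
			out[cnt].spfn = PFN_DOWN(next);
			out[cnt].epfn = PFN_UP(mem[i].start);
			cnt++;
		}
		next = mem[i].end;
	}

	/* Tail: pages between the last range and the section-aligned end. */
	if (PFN_DOWN(next) < end_pfn && cnt < max) {
		out[cnt].spfn = PFN_DOWN(next);
		out[cnt].epfn = end_pfn;
		cnt++;
	}

	return cnt;
}
```

Calling find_gaps() on two ranges with a hole between them returns the hole
before the first range, the hole between the two, and the tail, each rounded
outwards to page boundaries the same way PFN_DOWN(next) and PFN_UP(start) do
in the hunk.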