In-Reply-To: <1521619796-3846-2-git-send-email-hejianet@gmail.com>
References: <1521619796-3846-1-git-send-email-hejianet@gmail.com> <1521619796-3846-2-git-send-email-hejianet@gmail.com>
From: Daniel Vacek
Date: Wed, 21 Mar 2018 11:14:14 +0100
Subject: Re: [PATCH 1/4] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()
To: Jia He
Cc: Andrew Morton, Michal Hocko, Catalin Marinas, Mel Gorman, Will Deacon,
 Mark Rutland, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
 Pavel Tatashin, Daniel Jordan, AKASHI Takahiro, Gioh Kim, Steven Sistare,
 Eugeniu Rosca, Vlastimil Babka, open list, linux-mm@kvack.org,
 James Morse, Ard Biesheuvel, Steve Capper, x86@kernel.org,
 Greg Kroah-Hartman, Kate Stewart, Philippe Ombredanne, Johannes Weiner,
 Kemi Wang, Petr Tesarik, YASUAKI ISHIMATSU, Andrey Ryabinin,
 Nikolay Borisov, Jia He
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 21, 2018 at 9:09 AM, Jia He wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But there is
> still some room for improvement. E.g. if pfn and pfn+1 are in the same
> memblock region, we can simply pfn++ instead of doing the binary search
> in memblock_next_valid_pfn.

There is a revert-mm-page_alloc-skip-over-regions-of-invalid-pfns-where-possible.patch
in -mm reverting b92df1de5d289c0b, as it is fundamentally wrong by
design, causing system panics on some machines with rare but still
valid mappings. Basically, it skips valid pfns which are outside of
usable memory ranges (outside of memblock memory regions). So I guess
if you want to optimize boot-up time, you will have to come up with a
different solution in memmap_init_zone(), most likely using mem
sections instead of memblock. Sorry.
--nX

> Signed-off-by: Jia He
> ---
>  include/linux/memblock.h |  3 +--
>  mm/memblock.c            | 23 +++++++++++++++++++----
>  mm/page_alloc.c          |  3 ++-
>  3 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index b7aa3ff..9471db4 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -203,8 +203,7 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
>              i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
>  #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>
> -unsigned long memblock_next_valid_pfn(unsigned long pfn);
> -
> +unsigned long memblock_next_valid_pfn(unsigned long pfn, int *last_idx);
>  /**
>   * for_each_free_mem_range - iterate through free memblock areas
>   * @i: u64 used as loop variable
> diff --git a/mm/memblock.c b/mm/memblock.c
> index c87924d..a9e8da4 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1133,13 +1133,26 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
>  }
>  #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
>
> -unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
> +unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
> +                                                     int *last_idx)
>  {
>         struct memblock_type *type = &memblock.memory;
>         unsigned int right = type->cnt;
>         unsigned int mid, left = 0;
> +       unsigned long start_pfn, end_pfn;
>         phys_addr_t addr = PFN_PHYS(++pfn);
>
> +       /* fast path, return pfn+1 if next pfn is in the same region */
> +       if (*last_idx != -1) {
> +               start_pfn = PFN_DOWN(type->regions[*last_idx].base);
> +               end_pfn = PFN_DOWN(type->regions[*last_idx].base +
> +                                  type->regions[*last_idx].size);
> +
> +               if (pfn < end_pfn && pfn > start_pfn)
> +                       return pfn;
> +       }
> +
> +       /* slow path, do the binary searching */
>         do {
>                 mid = (right + left) / 2;
>
> @@ -1149,15 +1162,17 @@ unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
>                                     type->regions[mid].size))
>                         left = mid + 1;
>                 else {
> -                       /* addr is within the region, so pfn is valid */
> +                       *last_idx = mid;
>                         return pfn;
>                 }
>         } while (left < right);
>
>         if (right == type->cnt)
>                 return -1UL;
> -       else
> -               return PHYS_PFN(type->regions[right].base);
> +
> +       *last_idx = right;
> +
> +       return PHYS_PFN(type->regions[*last_idx].base);
>  }
>
>  static phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3899209..f28c62c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5456,6 +5456,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>         unsigned long end_pfn = start_pfn + size;
>         pg_data_t *pgdat = NODE_DATA(nid);
>         unsigned long pfn;
> +       int idx = -1;
>         unsigned long nr_initialised = 0;
>         struct page *page;
>  #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
> @@ -5487,7 +5488,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>          * end_pfn), such that we hit a valid pfn (or end_pfn)
>          * on our next iteration of the loop.
>          */
> -       pfn = memblock_next_valid_pfn(pfn) - 1;
> +       pfn = memblock_next_valid_pfn(pfn, &idx) - 1;
>  #endif
>                 continue;
>         }
> --
> 2.7.4
>