Date: Mon, 1 Feb 2021 16:30:14 +0200
From: Mike Rapoport
To: David Hildenbrand
Cc: Andrew Morton, Andrea Arcangeli, Baoquan He, Borislav Petkov,
	Chris Wilson, "H. Peter Anvin", Ingo Molnar, Linus Torvalds,
	Łukasz Majczak, Mel Gorman, Michal Hocko, Mike Rapoport, Qian Cai,
	"Sarvela, Tomi P", Thomas Gleixner, Vlastimil Babka,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	stable@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH v4 1/2] x86/setup: always add the beginning of RAM as memblock.memory
Message-ID: <20210201143014.GI242749@kernel.org>
In-Reply-To: <56e2c568-b121-8860-a6b0-274ace46d835@redhat.com>

On Mon, Feb 01, 2021 at 10:32:44AM +0100, David Hildenbrand wrote:
> On 30.01.21 23:10, Mike Rapoport wrote:
> > From: Mike Rapoport
> >
> > The physical memory on an x86 system starts at address 0, but this is not
> > always reflected in e820 map. For example, the BIOS can have e820 entries
> > like
> >
> > [    0.000000] BIOS-provided physical RAM map:
> > [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009ffff] usable
> >
> > or
> >
> > [    0.000000] BIOS-provided physical RAM map:
> > [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved
> > [    0.000000] BIOS-e820: [mem 0x0000000000001000-0x0000000000057fff] usable
> >
> > In either case, e820__memblock_setup() won't add the range 0x0000 - 0x1000
> > to memblock.memory and later during memory map initialization this range is
> > left outside any zone.
> >
> > With SPARSEMEM=y there is always a struct page for pfn 0 and this struct
> > page will have it's zone link wrong no matter what value will be set there.
> >
> > To avoid this inconsistency, add the beginning of RAM to memblock.memory.
> > Limit the added chunk size to match the reserved memory to avoid
> > registering memory that may be used by the firmware but never reserved at
> > e820__memblock_setup() time.
> >
> > Fixes: bde9cfa3afe4 ("x86/setup: don't remove E820_TYPE_RAM for pfn 0")
> > Signed-off-by: Mike Rapoport
> > Cc: stable@vger.kernel.org
> > ---
> >  arch/x86/kernel/setup.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 3412c4595efd..67c77ed6eef8 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -727,6 +727,14 @@ static void __init trim_low_memory_range(void)
> >  	 * Kconfig help text for X86_RESERVE_LOW.
> >  	 */
> >  	memblock_reserve(0, ALIGN(reserve_low, PAGE_SIZE));
> > +
> > +	/*
> > +	 * Even if the firmware does not report the memory at address 0 as
> > +	 * usable, inform the generic memory management about its existence
> > +	 * to ensure it is a part of ZONE_DMA and the memory map for it is
> > +	 * properly initialized.
> > +	 */
> > +	memblock_add(0, ALIGN(reserve_low, PAGE_SIZE));
> >  }
> >
> >  /*
> >
>
> I think, to make that code more robust, and to not rely on archs to do the
> right thing, we should do something like
>
> 1) Make sure in free_area_init() that each PFN with a memmap (i.e., falls
> into a partial present section) is spanned by a zone; that would include PFN
> 0 in this case.
>
> 2) In init_zone_unavailable_mem(), similar to round_up(max_pfn,
> PAGES_PER_SECTION) handling, consider range
> 	[round_down(min_pfn, PAGES_PER_SECTION), min_pfn - 1]
> which would handle in the x86-64 case [0..0] and, therefore, initialize PFN
> 0.
>
> Also, I think the special-case of PFN 0 is analogous to the
> round_up(max_pfn, PAGES_PER_SECTION) handling in
> init_zone_unavailable_mem(): who guarantees that these PFN above the highest
> present PFN are actually spanned by a zone?
>
> I'd suggest going through all zone ranges in free_area_init() first, dealing
> with zones that have "not section aligned start/end", clamping them up/down
> if required such that no holes within a section are left uncovered by a
> zone.

I thought about changing the way zone extents are calculated so that zone
start/end will always be on a section boundary, but zone->zone_start_pfn
depends on node->node_start_pfn, which is defined by hardware, and expanding
a node to make its start pfn aligned to the section boundary might violate
the HW addressing scheme.

Maybe this could never happen, or maybe it's not really important as the
pages there will be reserved anyway, but I'm not sure I can estimate all the
implications.

-- 
Sincerely yours,
Mike.
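
For illustration only, the range from David's point 2),
[round_down(min_pfn, PAGES_PER_SECTION), min_pfn - 1], can be worked out in
a small userspace sketch. This is not kernel code: struct pfn_range,
round_down_pfn() and leading_hole() are hypothetical names, and
PAGES_PER_SECTION is assumed to be 32768 (128 MiB sections with 4 KiB pages,
as I believe is the case on x86-64).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 128 MiB sections, 4 KiB pages (x86-64 SPARSEMEM layout). */
#define PAGES_PER_SECTION 32768UL

/* Hypothetical helper type: an inclusive PFN range, or "empty". */
struct pfn_range {
	unsigned long start;
	unsigned long end;	/* inclusive */
	bool empty;
};

/* Round pfn down to a multiple of align (align must be non-zero). */
static unsigned long round_down_pfn(unsigned long pfn, unsigned long align)
{
	assert(align != 0);
	return pfn - (pfn % align);
}

/*
 * PFNs that share a section with min_pfn but lie below it, i.e.
 * [round_down(min_pfn, PAGES_PER_SECTION), min_pfn - 1].
 * The range is empty when min_pfn is already section aligned.
 */
static struct pfn_range leading_hole(unsigned long min_pfn)
{
	struct pfn_range r = { 0, 0, true };
	unsigned long start = round_down_pfn(min_pfn, PAGES_PER_SECTION);

	if (start < min_pfn) {
		r.start = start;
		r.end = min_pfn - 1;
		r.empty = false;
	}
	return r;
}
```

With min_pfn = 1, as in the e820 maps quoted above, leading_hole() gives
[0..0], which is the x86-64 case David mentions. A section-aligned min_pfn
gives an empty range, so nothing extra would need to be initialized.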