From: Ard Biesheuvel
Date: Wed, 14 Mar 2018 14:35:12 +0000
Subject: Re: [PATCH] Revert "mm/page_alloc: fix memmap_init_zone pageblock alignment"
To: Michal Hocko
Cc: linux-arm-kernel, Linux Kernel Mailing List, Mark Rutland,
    Will Deacon, Catalin Marinas, Marc Zyngier, Daniel Vacek, Mel Gorman,
    Paul Burton, Pavel Tatashin, Vlastimil Babka, Andrew Morton,
    Linus Torvalds
In-Reply-To: <20180314141323.GD23100@dhcp22.suse.cz>
References: <20180314134431.13241-1-ard.biesheuvel@linaro.org>
    <20180314141323.GD23100@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 14 March 2018 at 14:13, Michal Hocko wrote:
> Does http://lkml.kernel.org/r/20180313224240.25295-1-neelx@redhat.com
> fix your issue? From the debugging info you provided it should because
> the patch prevents jumping backwards.
>

The patch does fix the boot hang.

But I am concerned that we are papering over a fundamental flaw in
memblock_next_valid_pfn(). If that does not always produce the next
valid PFN, surely we should be fixing *that* rather than dealing with
it here by rounding, aligning and keeping track of whether we are
advancing or not?

So in my opinion, this patch should still be reverted, and the
underlying issue fixed properly instead.

> On Wed 14-03-18 13:44:31, Ard Biesheuvel wrote:
>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae.
>>
>> It breaks the boot on my Socionext SynQuacer based system, because
>> it enters an infinite loop iterating over the pfns.
>>
>> Adding the following debug output to memmap_init_zone()
>>
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5365,6 +5365,11 @@
>>                          * the valid region but still depends on correct page
>>                          * metadata.
>>                          */
>> +                       pr_err("pfn:%lx oldnext:%lx newnext:%lx\n", pfn,
>> +                               memblock_next_valid_pfn(pfn, end_pfn) - 1,
>> +                               (memblock_next_valid_pfn(pfn, end_pfn) &
>> +                                        ~(pageblock_nr_pages-1)) - 1);
>> +
>>                         pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>>                                         ~(pageblock_nr_pages-1)) - 1;
>>  #endif
>>
>> results in
>>
>> Booting Linux on physical CPU 0x0000000000 [0x410fd034]
>> Linux version 4.16.0-rc5-00004-gfc6eabbbf8ef-dirty (ard@dogfood) ...
>> Machine model: Socionext Developer Box
>> earlycon: pl11 at MMIO 0x000000002a400000 (options '')
>> bootconsole [pl11] enabled
>> efi: Getting EFI parameters from FDT:
>> efi: EFI v2.70 by Linaro
>> efi: SMBIOS 3.0=0xff580000 ESRT=0xf9948198 MEMATTR=0xf83b1a98 RNG=0xff7ac898
>> random: fast init done
>> efi: seeding entropy pool
>> esrt: Reserving ESRT space from 0x00000000f9948198 to 0x00000000f99481d0.
>> cma: Reserved 16 MiB at 0x00000000fd800000
>> NUMA: No NUMA configuration found
>> NUMA: Faking a node at [mem 0x0000000000000000-0x0000000fffffffff]
>> NUMA: NODE_DATA [mem 0xffffd8d80-0xffffda87f]
>> Zone ranges:
>>   DMA32    [mem 0x0000000080000000-0x00000000ffffffff]
>>   Normal   [mem 0x0000000100000000-0x0000000fffffffff]
>> Movable zone start for each node
>> Early memory node ranges
>>   node   0: [mem 0x0000000080000000-0x00000000febeffff]
>>   node   0: [mem 0x00000000febf0000-0x00000000fefcffff]
>>   node   0: [mem 0x00000000fefd0000-0x00000000ff43ffff]
>>   node   0: [mem 0x00000000ff440000-0x00000000ff7affff]
>>   node   0: [mem 0x00000000ff7b0000-0x00000000ffffffff]
>>   node   0: [mem 0x0000000880000000-0x0000000fffffffff]
>> Initmem setup node 0 [mem 0x0000000080000000-0x0000000fffffffff]
>> pfn:febf0 oldnext:febf0 newnext:fe9ff
>> pfn:febf0 oldnext:febf0 newnext:fe9ff
>> pfn:febf0 oldnext:febf0 newnext:fe9ff
>> etc etc
>>
>> and the boot never proceeds after this point.
>>
>> So the logic is obviously flawed, and so it is best to revert this at
>> the current -rc stage (unless someone can fix the logic instead)
>>
>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment")
>> Cc: Daniel Vacek
>> Cc: Mel Gorman
>> Cc: Michal Hocko
>> Cc: Paul Burton
>> Cc: Pavel Tatashin
>> Cc: Vlastimil Babka
>> Cc: Andrew Morton
>> Cc: Linus Torvalds
>> Signed-off-by: Ard Biesheuvel
>> ---
>>  mm/page_alloc.c | 9 ++-------
>>  1 file changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3d974cb2a1a1..cb416723538f 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -5359,14 +5359,9 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>>                         /*
>>                          * Skip to the pfn preceding the next valid one (or
>>                          * end_pfn), such that we hit a valid pfn (or end_pfn)
>> -                        * on our next iteration of the loop. Note that it needs
>> -                        * to be pageblock aligned even when the region itself
>> -                        * is not. move_freepages_block() can shift ahead of
>> -                        * the valid region but still depends on correct page
>> -                        * metadata.
>> +                        * on our next iteration of the loop.
>>                          */
>> -                       pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
>> -                                       ~(pageblock_nr_pages-1)) - 1;
>> +                       pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
>>  #endif
>>                         continue;
>>                 }
>> --
>> 2.15.1
>>
>
> --
> Michal Hocko
> SUSE Labs
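
The arithmetic behind the repeated "pfn:febf0 oldnext:febf0 newnext:fe9ff"
line in the quoted log can be reproduced outside the kernel. The fragment
below is only a sketch under stated assumptions, not kernel code: the helper
names are hypothetical, the block size of 0x200 pages is inferred from the
log rather than read from the kernel configuration, and the input 0xfebf1 is
simply oldnext + 1 as printed by the debug patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed stand-in for pageblock_nr_pages. 0x200 is inferred from the
 * quoted log: it is the block size for which 0xfebf1 rounds to 0xfea00. */
#define EXAMPLE_PAGEBLOCK_NR_PAGES 0x200UL

/* Hypothetical helper: the value the reverted code assigns to pfn, given
 * the result of memblock_next_valid_pfn(). */
unsigned long resume_pfn_rounded(unsigned long next_valid,
                                 unsigned long nr_pages)
{
	/* The mask only rounds down correctly for a power-of-two size. */
	assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);
	return (next_valid & ~(nr_pages - 1)) - 1;
}

/* Hypothetical helper: the value assigned once the revert is applied. */
unsigned long resume_pfn_plain(unsigned long next_valid)
{
	return next_valid - 1;
}

/* The loop increments pfn after the assignment, so it only makes forward
 * progress if resume + 1 lies beyond the pfn that was just skipped. */
bool loop_advances(unsigned long pfn, unsigned long resume)
{
	return resume + 1 > pfn;
}
```

With pfn 0xfebf0 and a next valid pfn of 0xfebf1, resume_pfn_rounded()
yields 0xfe9ff, which is below the current pfn, so loop_advances() is false
and the loop comes back to 0xfebf0 again. resume_pfn_plain() yields 0xfebf0,
and the following increment moves on to 0xfebf1.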