From: Daniel Vacek
Date: Thu, 15 Mar 2018 16:34:42 +0100
Subject: Re: [PATCH v2] Revert "mm/page_alloc: fix memmap_init_zone pageblock alignment"
To: Ard Biesheuvel
Cc: linux-arm-kernel, open list, Mark Rutland, Will Deacon, Catalin Marinas,
 Marc Zyngier, Mel Gorman, Michal Hocko, Paul Burton, Pavel Tatashin,
 Vlastimil Babka, Andrew Morton, Linus Torvalds
References: <20180314192937.12888-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 15, 2018 at 4:17 PM, Ard Biesheuvel wrote:
> On 15 March 2018 at 15:12, Daniel Vacek wrote:
>> On Thu, Mar 15, 2018 at 8:45 AM, Ard Biesheuvel wrote:
>>> On 15 March 2018 at 07:44, Daniel Vacek wrote:
>>>> On Thu, Mar 15, 2018 at 8:36 AM, Ard Biesheuvel wrote:
>>>>> On 15 March 2018 at 02:23, Daniel Vacek wrote:
>>>>>> On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel wrote:
>>>>>>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae.
>>>>>>>
>>>>>>> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock
>>>>>>> alignment") modified the logic in memmap_init_zone() to initialize
>>>>>>> struct pages associated with invalid PFNs, to appease a VM_BUG_ON()
>>>>>>> in move_freepages() which is redundant by its own admission, and
>>>>>>> which dereferences struct page fields to obtain the zone without
>>>>>>> checking whether the struct pages in question are valid to begin
>>>>>>> with.
>>>>>>>
>>>>>>> Commit 864b75f9d6b0 only makes it worse, since the rounding it does
>>>>>>> may cause pfn to assume the same value it had in a prior iteration
>>>>>>> of the loop, resulting in an infinite loop and a hang very early in
>>>>>>> the boot. Also, since it doesn't perform the same rounding on
>>>>>>> start_pfn itself but only on intermediate values following an
>>>>>>> invalid PFN, we may still hit the same VM_BUG_ON() as before.
>>>>>>>
>>>>>>> So instead, let's fix this at the core, and ensure that the BUG
>>>>>>> check doesn't dereference struct page fields of invalid pages.
>>>>>>>
>>>>>>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment")
>>>>>>> Cc: Daniel Vacek
>>>>>>> Cc: Mel Gorman
>>>>>>> Cc: Michal Hocko
>>>>>>> Cc: Paul Burton
>>>>>>> Cc: Pavel Tatashin
>>>>>>> Cc: Vlastimil Babka
>>>>>>> Cc: Andrew Morton
>>>>>>> Cc: Linus Torvalds
>>>>>>> Signed-off-by: Ard Biesheuvel
>>>>>>> ---
>>>>>>>  mm/page_alloc.c | 13 +++++--------
>>>>>>>  1 file changed, 5 insertions(+), 8 deletions(-)
>>>>>>>
>>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>>>> index 3d974cb2a1a1..635d7dd29d7f 100644
>>>>>>> --- a/mm/page_alloc.c
>>>>>>> +++ b/mm/page_alloc.c
>>>>>>> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone,
>>>>>>>           * Remove at a later date when no bug reports exist related to
>>>>>>>           * grouping pages by mobility
>>>>>>>           */
>>>>>>> -        VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
>>>>>>> +        VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) &&
>>>>>>> +                  pfn_valid(page_to_pfn(end_page)) &&
>>>>>>> +                  page_zone(start_page) != page_zone(end_page));
>>>>>>
>>>>>> Hi, I am on vacation this week and I haven't had a chance to test
>>>>>> this yet, but I am not sure this is correct. The generic pfn_valid(),
>>>>>> unlike the arm{,64} arch-specific versions, returns true for all pfns
>>>>>> in a section if there is at least some memory mapped in that section.
>>>>>> So I doubt this prevents the crash I was targeting. I believe
>>>>>> pfn_valid() does not change a thing here :(
>>>>>>
>>>>>
>>>>> If this is the case, memblock_next_valid_pfn() is broken, since it
>>>>> skips valid PFNs, and we should be fixing that instead.
>>>>
>>>> How do you define a valid pfn? Maybe the generic version of
>>>> pfn_valid() should be fixed?
>>>>
>>>
>>> memblock_next_valid_pfn() skips PFNs for which pfn_valid() returns
>>> true. That is clearly a bug.
>>
>> So pfn_valid() does not mean this frame is usable memory?
>>
>
> Who cares what it *means*?

Abstractions?

> memblock_next_valid_pfn() has 'valid_pfn' in its name, so if passing
> pfn A returns B, and there exists a C such that A < C < B and
> pfn_valid(C) returns true, memblock_next_valid_pfn() doesn't do what
> it says on the tin and should be fixed.

If you don't like the name, I would argue it should be changed to
something like memblock_pfn_of_next_memory(). Still, the name is not
next_valid_pfn() but memblock_next_..., which distinguishes it as
something different.

> You keep going on about how pfn_valid() does or does not do what you
> think, but that is really irrelevant.

I'm trying to learn. Hence I was asking what the abstract meaning of it
is, as I see two *very different* implementations and I am not sure how
I should understand it.

>> OK, in that case we need to fix or revert memblock_next_valid_pfn().
>> That works for me.
>>
>
> OK. You can add my ack to a patch that reverts it, and we can revisit
> it for the next cycle.

Cool. I'll do that next week. Thank you, sir.

--nX