Received: by 10.213.65.68 with SMTP id h4csp1492214imn; Thu, 15 Mar 2018 00:46:36 -0700 (PDT) X-Google-Smtp-Source: AG47ELscZvZ/jDPeG6T2Yx3Bh9MPDwtAHAf7GLdjJOYtMrKzITHFzIV45DlgGDGSuNhTFGcp2P9l X-Received: by 10.98.75.129 with SMTP id d1mr6933274pfj.19.1521099996662; Thu, 15 Mar 2018 00:46:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1521099996; cv=none; d=google.com; s=arc-20160816; b=Y4OQ32lYMOKb3bNmCQtn1/6cTtW79Iyp1nPsIq6QnTGIszaqMUdonTaaIswusPgdn2 OE/a+sha+631IinDxyA40AV6bQBKC3rJ/AV+5V62Xd5s23Om7g482bF1JfoXjf0Gq/Yi ZxvKZ3kIX0Xi1JquUVAwI3Ln9go96caCnHtc5JaQwG79cmRjUqBb5fnTR3SM6FuM9B/R 0gO3+Ni1jByyPxmLNLYOmitnVycjStpEco2lq9vV/fNu01JlAJH6+OG1ccQsGU1th1sd oNGlDFJltixJLjO3GgHC/dzDCBLEAt50zS9VT7MSoXvtGZUms7/B51BBq4PhEd5ZaU/Q EZ0g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:arc-authentication-results; bh=Nq7dAaknvivFCbBhn130VIng2iseJbMDX1kTiXgGrlk=; b=MTTE86g5xT5p4/r0bVEaobVqChtJTu+xjOAXyMuwilmFHoa2aG2mRyDdZHDZQdJltp MfcgPwqfIbZeUgNRjjUa27sN3qHP92MMRDDMmOyE8MhPujx9M7nxsmPy3JYQ4xkiv0VJ Pfq9n2y1TfxfpQY9Oa7Va2yB43MZglUBKsRNRk9odqDrZXQ4lQ96RuXJoEH8RpqiP396 hgN26awvaU04A7n9mEIpYEbXkbbRWZ3245uks6dJjc+YKFrKPQ11RmOnDvka6HjzAGRs RhH4iiaPR0LcNsku3MQ9YHsAd5gAaaXMm5QU+W4WV7vqdeJQZr+FbXKzDyemZf7YMiR4 TguQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e16-v6si3496659pli.337.2018.03.15.00.46.22; Thu, 15 Mar 2018 00:46:36 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752004AbeCOHo6 (ORCPT + 99 others); Thu, 15 Mar 2018 03:44:58 -0400 Received: from mail-ot0-f193.google.com ([74.125.82.193]:42181 "EHLO mail-ot0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751588AbeCOHoc (ORCPT ); Thu, 15 Mar 2018 03:44:32 -0400 Received: by mail-ot0-f193.google.com with SMTP id l5-v6so5943621otf.9 for ; Thu, 15 Mar 2018 00:44:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=Nq7dAaknvivFCbBhn130VIng2iseJbMDX1kTiXgGrlk=; b=PKwTHn2ZEcEAYpKIx78ryGrHbe0NgN4ZJodpNQ1Yovd5m97L7F/QM+LV87Jk8j01ge Hvio09B2pe0ekZWXnG3Nf5A5pgAkMt7aIoaUEB8LW3SalVKjpLp4b0ziEOEZ9j989ObA ySBCR3WX5KBsG9UUioEbenriQi0eDtk3tgmsl3gHFFyrJzndTguD6goMv0x7vv3LUusn K59uA23cp7xJvqGHSHbPcUlYzbOzjrtihxcLElq37ns6HJ6Dw4INOfn2HvweqoSjzbdT XJYmp5URRTW+zdjDkW6N6h5I3Cm3b8agwn7rT9BkN6X7bgxPA5ynvZbWIn0XvhAcIbKJ K14A== X-Gm-Message-State: AElRT7H3ER+EPJUnJG4GKCWmgRHodCtjIh328pM2DD6lN+MAjJ3X9zyJ jZNbaQJSEc9BtaoqjTlrmXuaB1wxFRV5IvPyuq2yCBA9 X-Received: by 10.157.22.144 with SMTP id c16mr4658669ote.326.1521099871934; Thu, 15 Mar 2018 00:44:31 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a9d:39f6:0:0:0:0:0 with HTTP; Thu, 15 Mar 2018 00:44:31 -0700 (PDT) In-Reply-To: References: <20180314192937.12888-1-ard.biesheuvel@linaro.org> From: Daniel Vacek Date: Thu, 15 Mar 2018 08:44:31 +0100 Message-ID: Subject: Re: [PATCH v2] Revert "mm/page_alloc: fix memmap_init_zone pageblock alignment" To: Ard Biesheuvel Cc: linux-arm-kernel , open list , Mark Rutland , Will Deacon , Catalin Marinas , Marc Zyngier , Mel Gorman , Michal Hocko , Paul Burton , Pavel Tatashin , Vlastimil Babka , Andrew Morton , Linus Torvalds Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 15, 2018 at 8:36 AM, Ard Biesheuvel wrote: > On 15 March 2018 at 02:23, Daniel Vacek wrote: >> On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel >> wrote: >>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. >>> >>> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >>> alignment") modified the logic in memmap_init_zone() to initialize >>> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >>> in move_freepages(), which is redundant by its own admission, and >>> dereferences struct page fields to obtain the zone without checking >>> whether the struct pages in question are valid to begin with. >>> >>> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >>> may cause pfn assume the same value it had in a prior iteration of >>> the loop, resulting in an infinite loop and a hang very early in the >>> boot. Also, since it doesn't perform the same rounding on start_pfn >>> itself but only on intermediate values following an invalid PFN, we >>> may still hit the same VM_BUG_ON() as before. >>> >>> So instead, let's fix this at the core, and ensure that the BUG >>> check doesn't dereference struct page fields of invalid pages. >>> >>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >>> Cc: Daniel Vacek >>> Cc: Mel Gorman >>> Cc: Michal Hocko >>> Cc: Paul Burton >>> Cc: Pavel Tatashin >>> Cc: Vlastimil Babka >>> Cc: Andrew Morton >>> Cc: Linus Torvalds >>> Signed-off-by: Ard Biesheuvel >>> --- >>> mm/page_alloc.c | 13 +++++-------- >>> 1 file changed, 5 insertions(+), 8 deletions(-) >>> >>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>> index 3d974cb2a1a1..635d7dd29d7f 100644 >>> --- a/mm/page_alloc.c >>> +++ b/mm/page_alloc.c >>> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >>> * Remove at a later date when no bug reports exist related to >>> * grouping pages by mobility >>> */ >>> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >>> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >>> + pfn_valid(page_to_pfn(end_page)) && >>> + page_zone(start_page) != page_zone(end_page)); >> >> Hi, I am on vacation this week and I didn't have a chance to test this >> yet but I am not sure this is correct. Generic pfn_valid() unlike the >> arm{,64} arch specific versions returns true for all pfns in a section >> if there is at least some memory mapped in that section. So I doubt >> this prevents the crash I was targeting. I believe pfn_valid() does >> not change a thing here :( >> > > If this is the case, memblock_next_valid_pfn() is broken since it > skips valid PFNs, and we should be fixing that instead. How do you define valid pfn? Maybe the generic version of pfn_valid() should be fixed??? -nX >> ------------------------ >> include/linux/mmzone.h: >> pfn_valid(pfn) >> valid_section(__nr_to_section(pfn_to_section_nr(pfn))) >> return (section && (section->section_mem_map & SECTION_HAS_MEM_MAP)) >> >> arch/arm64/mm/init.c: >> #ifdef CONFIG_HAVE_ARCH_PFN_VALID >> int pfn_valid(unsigned long pfn) >> { >> return memblock_is_map_memory(pfn << PAGE_SHIFT); >> } >> EXPORT_SYMBOL(pfn_valid); >> #endif >> ------------------------ >> >> Also I already sent a fix to Andrew yesterday which was reported to >> fix the loop. >> >> Moreover, you also reported this: >> >>> Early memory node ranges >>> node 0: [mem 0x0000000080000000-0x00000000febeffff] >>> node 0: [mem 0x00000000febf0000-0x00000000fefcffff] >>> node 0: [mem 0x00000000fefd0000-0x00000000ff43ffff] >>> node 0: [mem 0x00000000ff440000-0x00000000ff7affff] >>> node 0: [mem 0x00000000ff7b0000-0x00000000ffffffff] >>> node 0: [mem 0x0000000880000000-0x0000000fffffffff] >>> Initmem setup node 0 [mem 0x0000000080000000-0x0000000fffffffff] >>> pfn:febf0 oldnext:febf0 newnext:fe9ff >>> pfn:febf0 oldnext:febf0 newnext:fe9ff >>> pfn:febf0 oldnext:febf0 newnext:fe9ff >>> etc etc >> >> I am wondering how come pfn_valid(0xfebf0) returns false here. Should >> it be true or do I miss something? >> >> --nX