Received: by 2002:a25:8b12:0:0:0:0:0 with SMTP id i18csp1376998ybl; Tue, 13 Aug 2019 11:30:39 -0700 (PDT) X-Google-Smtp-Source: APXvYqw080wois/refzXACq/WhZjdzTtl8u0BWgtnhVLK8FOR8TFkHZDCYNo8ffZZHQUsTGPMIIp X-Received: by 2002:aa7:96ee:: with SMTP id i14mr21511640pfq.217.1565721038934; Tue, 13 Aug 2019 11:30:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1565721038; cv=none; d=google.com; s=arc-20160816; b=nkO7plWBPPMiA35NxNrz5xa00SSahNSBFfe29WRz8YDyw8xHEPsX0Ca4JPn6bMb09V EoM2FPHgn5CBRG6ijTMCuiOHgSN+WJMmSzkaBlzgyNzDe6aKHWcpG5Ys+/iW4GWSCD2e cRLhOyzBus+DGPXWel+1ZGQmi/BBQDlLaaNHvFR8jWlNkLbyZ4xd7ZQvx/uofYYO15YG GpBEXG3whipdnwtrbjLObq5pCzhRlpiUkGS06YIvtxkJA6Y/CGWgzPim50Bs9GbiVrxw C+HfKGfk8vqZAlUAheysAX563tmRNPSwJht7n2ILaE78uPveF3e6piP5TF4r1kR29KRP PnVw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date:dkim-signature; bh=k8N1+CVxt+RJj8Si9vYbnsuXtAKEpExU7o2UK2KgdhA=; b=KoGO+WvSaU/4BrOxmrLrCXvZQDcKVHqke6YjgrH55MsRclK+it/VuKtM0U5Bd4KESk xwELJXUIiw4mm1eyw19i2cmyIFm1mQk8TdFqp4lcaRZtb5b0mKIWegV657QVfH89pMh4 OY367w3lpCL7JhzTaBgyHqRViCs6gfayaW7Pzb81U4byYxDu14xd7BGvpoY8eBkhqK7F x9SlAS8u9DYnQ+KWvpuUyWtvxGfFb8Osbid0ikqA7b0jrkmA7/XBRzJ/+LYtGsmloPMX xP3IudjSaGpQNUxOrZ0MAKqAvWcvEHdRSLjCkNP49roqKG6z5ehr17vwxI+Fjtk/+4pl tNcA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="Wqulxs/I"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id e2si17798681pgm.99.2019.08.13.11.30.22; Tue, 13 Aug 2019 11:30:38 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="Wqulxs/I"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726637AbfHMRWo (ORCPT + 99 others); Tue, 13 Aug 2019 13:22:44 -0400 Received: from mail-pf1-f196.google.com ([209.85.210.196]:45148 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726094AbfHMRWo (ORCPT ); Tue, 13 Aug 2019 13:22:44 -0400 Received: by mail-pf1-f196.google.com with SMTP id w26so6708239pfq.12 for ; Tue, 13 Aug 2019 10:22:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version; bh=k8N1+CVxt+RJj8Si9vYbnsuXtAKEpExU7o2UK2KgdhA=; b=Wqulxs/IqBFJerqJEhOtZ6R8tJJJ4oo6uKVQRnyyCSK6crKU0MUkbeWvT+IutOzfK7 I6KpqsK28kEq5P5h/CNsz/7Iic03Pu3YnV5l6xv2IOPAEguLIuXxPrp5KtIf3T+Wxkbo HFMJFR8kaC9/kGtWe+PQe3dnMf92OxiYZWZYY0Mqc21LETZr9UX9BYAoeyWOMef9x1+U yl4E6RhhBtjYdDN8tZc6cSaW3viAsk2jgUnmZK1zmbgIGHEQHhED5T4ahqKVm3AI/YEo QJkMfqqGEox5DER+eJJl3xVcge13QA9IJZkoaLR7BhJ2CBT3yH/w0l5fIjhd+6eYkhqA F76Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:in-reply-to:message-id :references:user-agent:mime-version; bh=k8N1+CVxt+RJj8Si9vYbnsuXtAKEpExU7o2UK2KgdhA=; b=FsP3QynjCW/BXzMco9kApQ/ECuNvgBgMTz1cD8Vw1YzUWMynvr+jLl6oFZjAMwyPYB xUUcN94IwbQN8lLorvwdwL97Pd638p0gDhrcnTu6RP7RmSPz6sVXzszBH1WWVPHaJm6Q QX5Hy/4BhUQodoh5prPC8iYgJ8EFDfjEEM0QFItNP7sj7ZD809J1689SL8EPlx1Grjj2 iI9oqtNWEPK9HClNeFV9aE3pMsT/+0TPe+O3QNLtSR1grvXvBhxDO0rcBWepxk9wXEgr BJ+xC9tqbOBfUHKXNUh8H/1PGAwQCxRugxmhVlLYnd48PqVjrC4Nr1hi8HTOLFtsbe1j 1SrQ== X-Gm-Message-State: APjAAAWj7W+sTxPBpDawEpofDbXN+qwlhnZVEQmYPIV6b2TKisZwM6fX 5Ua6Yf8Y465aQ7K3UNzkc8ySLw== X-Received: by 2002:a17:90a:bb0c:: with SMTP id u12mr3272874pjr.132.1565716962792; Tue, 13 Aug 2019 10:22:42 -0700 (PDT) Received: from [2620:15c:17:3:3a5:23a7:5e32:4598] ([2620:15c:17:3:3a5:23a7:5e32:4598]) by smtp.gmail.com with ESMTPSA id z4sm166362957pfg.166.2019.08.13.10.22.41 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Tue, 13 Aug 2019 10:22:41 -0700 (PDT) Date: Tue, 13 Aug 2019 10:22:41 -0700 (PDT) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: Vlastimil Babka cc: Andrew Morton , Naoya Horiguchi , Mel Gorman , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [patch] mm, page_alloc: move_freepages should not examine struct page of reserved memory In-Reply-To: <3aadeed1-3f38-267d-8dae-839e10a2f9d2@suse.cz> Message-ID: References: <3aadeed1-3f38-267d-8dae-839e10a2f9d2@suse.cz> User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 13 Aug 2019, Vlastimil Babka wrote: > > After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"), > > struct page of reserved memory is zeroed. This causes page->flags to be 0 > > and fixes issues related to reading /proc/kpageflags, for example, of > > reserved memory. > > > > The VM_BUG_ON() in move_freepages_block(), however, assumes that > > page_zone() is meaningful even for reserved memory. That assumption is no > > longer true after the aforementioned commit. > > How comes that move_freepages_block() gets called on reserved memory in > the first place? > It's simply math after finding a valid free page from the per-zone free area to use as fallback. We find the beginning and end of the pageblock of the valid page and that can bring us into memory that was reserved per the e820. pfn_valid() is still true (it's backed by a struct page), but since it's zero'd we shouldn't make any inferences here about comparing its node or zone. The current node check just happens to succeed most of the time by luck because reserved memory typically appears on node 0. The fix here is to validate that we actually have buddy pages before testing if there's any type of zone or node strangeness going on. > > There's no reason why move_freepages_block() should be testing the > > legitimacy of page_zone() for reserved memory; its scope is limited only > > to pages on the zone's freelist. > > > > Note that pfn_valid() can be true for reserved memory: there is a backing > > struct page. The check for page_to_nid(page) is also buggy but reserved > > memory normally only appears on node 0 so the zeroing doesn't affect this. > > > > Move the debug checks to after verifying PageBuddy is true. This isolates > > the scope of the checks to only be for buddy pages which are on the zone's > > freelist which move_freepages_block() is operating on. In this case, an > > incorrect node or zone is a bug worthy of being warned about (and the > > examination of struct page is acceptable bcause this memory is not > > reserved). > > > > Signed-off-by: David Rientjes > > --- > > mm/page_alloc.c | 19 ++++--------------- > > 1 file changed, 4 insertions(+), 15 deletions(-) > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > --- a/mm/page_alloc.c > > +++ b/mm/page_alloc.c > > @@ -2238,27 +2238,12 @@ static int move_freepages(struct zone *zone, > > unsigned int order; > > int pages_moved = 0; > > > > -#ifndef CONFIG_HOLES_IN_ZONE > > - /* > > - * page_zone is not safe to call in this context when > > - * CONFIG_HOLES_IN_ZONE is set. This bug check is probably redundant > > - * anyway as we check zone boundaries in move_freepages_block(). > > - * Remove at a later date when no bug reports exist related to > > - * grouping pages by mobility > > - */ > > - VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && > > - pfn_valid(page_to_pfn(end_page)) && > > - page_zone(start_page) != page_zone(end_page)); > > -#endif > > for (page = start_page; page <= end_page;) { > > if (!pfn_valid_within(page_to_pfn(page))) { > > page++; > > continue; > > } > > > > - /* Make sure we are not inadvertently changing nodes */ > > - VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page); > > - > > if (!PageBuddy(page)) { > > /* > > * We assume that pages that could be isolated for > > @@ -2273,6 +2258,10 @@ static int move_freepages(struct zone *zone, > > continue; > > } > > > > + /* Make sure we are not inadvertently changing nodes */ > > + VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page); > > + VM_BUG_ON_PAGE(page_zone(page) != zone, page); > > The later check implies the former check, so if it's to stay, the first > one could be removed and comment adjusted s/nodes/zones/ > Does it? The first is checking for a corrupted page_to_nid the second is checking for a corrupted or unexpected page_zone. What's being tested here is the state of struct page, as it was previous to this patch, not the state of struct zone.