Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp1985552pxu; Tue, 24 Nov 2020 13:58:30 -0800 (PST) X-Google-Smtp-Source: ABdhPJwWL4rTovd/2pGYixYebbCVCu4o3m920IR635j1Bb1E+5dfYVRS7YzhO2ZWCQ+JnhJpUsm1 X-Received: by 2002:a17:906:268c:: with SMTP id t12mr436434ejc.91.1606255110485; Tue, 24 Nov 2020 13:58:30 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1606255110; cv=none; d=google.com; s=arc-20160816; b=EJ3KQrc3z+brt+ig6QxU3l23hUedTKiXSxEdL/jG4j4S2N/lTp107I3lbGb6HVXZDC 9GqiGQmubwsxmh1jniolkS9kDI2+CyXfZ4VbIU5VGgKBYBUNAaVyaGvwpGxJ2DTLP1j0 zdZgAwVFMGyAih3t+6kfz4VR9jJqgd5zW6grVAnIxgRWxBxTlN5AeFLBUsE/VV8XJX9k TiHDeksQ2r4EQNibbT8hpItE5bBb0YUmBzOj8eAMk8DaAniQO+f9N70MietrCTShobkM 04Cb+2W+AIdvX94o73vs+GESy8wnOdC7cE5vkCbOuON0d0zVqOw91MMI14RPbpVgnwUs 0e7A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=ft47GhkhFDP7BIqJsP/NZqsWkyRowlfY49A0LMv5+Kg=; b=BeF4Ml7EL3oXWvOJzFOEAOML/fdw5h0MXAX8LmZFgaoTeekK2pXZBcDF6pMmFYrhzz bC4evXwp8JYskd2Yez4u4wWWu7lqTSFpjT5LhpHVup3rMwd18P1p23+Ckw10WlGzB1/a of/8QhxxkL+mE/KHMFJRLjwkQurkVyY7n7fqgJDgGRtUswDXxt6X0vku5YJNXAuacyVY SlK1W83EwwGLx6kYjbwwrmDTSa0IyPxGXYXrMIjovWfYpiQURsCaaVEJKtdgIfe68qQn 6TjLXa0D1FxFHZv+ZLa9QOToKuCoQblCo6Ox3zqOzfwzYiSn/L+0z9HU2XxMUb+9Vwr1 fdWw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id t7si6702edv.378.2020.11.24.13.58.06; Tue, 24 Nov 2020 13:58:30 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388382AbgKXNcK (ORCPT + 99 others); Tue, 24 Nov 2020 08:32:10 -0500 Received: from mx2.suse.de ([195.135.220.15]:48744 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388349AbgKXNcJ (ORCPT ); Tue, 24 Nov 2020 08:32:09 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id DD8CFAFC9; Tue, 24 Nov 2020 13:32:07 +0000 (UTC) Date: Tue, 24 Nov 2020 13:32:05 +0000 From: Mel Gorman To: Vlastimil Babka Cc: Andrea Arcangeli , Andrew Morton , linux-mm@kvack.org, Qian Cai , Michal Hocko , David Hildenbrand , linux-kernel@vger.kernel.org, Mike Rapoport , Baoquan He Subject: Re: [PATCH 1/1] mm: compaction: avoid fast_isolate_around() to set pageblock_skip on reserved pages Message-ID: <20201124133205.GK3306@suse.de> References: <8C537EB7-85EE-4DCF-943E-3CC0ED0DF56D@lca.pw> <20201121194506.13464-1-aarcange@redhat.com> <20201121194506.13464-2-aarcange@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 23, 2020 at 02:01:16PM +0100, Vlastimil Babka wrote: > > I actually never reproduced it until v5.9, but it's still the same bug > > as it was reported first for v5.7. > > > > See the page is "reserved" in all 3 cases. In the last two crashes > > with the pfn: > > > > pfn 0x39800 -> 0x39800000 min_pfn hit non-RAM: > > > > 39639000-39814fff : Unknown E820 type > > > > pfn 0x7a200 -> 0x7a200000 min_pfn hit non-RAM: > > > > 7a17b000-7a216fff : Unknown E820 type > > It would be nice to also provide a /proc/zoneinfo and how exactly the > "zone_spans_pfn" was violated. I assume we end up below zone's start_pfn, > but is it true? > It must be if zone_spans_pfn is getting triggered. > > This actually seems a false positive bugcheck, the page structures are > > valid and the zones are correct, just it's non-RAM but setting > > pageblockskip should do no harm. However it's possible to solve the > > crash without lifting the bugcheck, by enforcing the invariant that > > the free_pfn cursor doesn't point to reserved pages (which would be > > otherwise implicitly achieved through the PageBuddy check, except in > > the new fast_isolate_around() path). > > > > Fixes: 5a811889de10 ("mm, compaction: use free lists to quickly locate a migration target") > > Signed-off-by: Andrea Arcangeli > > --- > > mm/compaction.c | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/mm/compaction.c b/mm/compaction.c > > index 13cb7a961b31..d17e69549d34 100644 > > --- a/mm/compaction.c > > +++ b/mm/compaction.c > > @@ -1433,7 +1433,10 @@ fast_isolate_freepages(struct compact_control *cc) > > page = pageblock_pfn_to_page(min_pfn, > > pageblock_end_pfn(min_pfn), > > cc->zone); > > - cc->free_pfn = min_pfn; > > + if (likely(!PageReserved(page))) > > PageReserved check seems rather awkward solution to me. Wouldn't it be more > obvious if we made sure we don't end up below zone's start_pfn (if my > assumption is correct) in the first place? > > When I check the code: > > unsigned long distance; > distance = (cc->free_pfn - cc->migrate_pfn); > low_pfn = pageblock_start_pfn(cc->free_pfn - (distance >> 2)); > min_pfn = pageblock_start_pfn(cc->free_pfn - (distance >> 1)); > > I think what can happen is that cc->free_pfn <= cc->migrate_pfn after the > very last isolate_migratepages(). I would hope that is not the case because they are not meant to overlap. However, if the beginning of the pageblock was not the start of a zone then the pages would be valid but the pfn would still be outside the zone boundary. If it was reserved, the struct page is valid but not suitable for set_pfnblock_flags_mask. However, it is a concern in general because the potential is there that pages are isolated from the wrong zone. > Then compact_finished() detects that in > compact_zone(), but only after migrate_pages() and thus > fast_isolate_freepages() is called. > > That would mean distance can be negative, or rather a large unsigned number > and low_pfn and min_pfn end up away from the zone? > > Or maybe the above doesn't happen, but cc->free_pfn gets so close to start > of the zone, that the calculations above make min_pfn go below start_pfn? > I think the last part is more likely, going below start_pfn > In any case I would rather make sure we stay within the expected zone > boundaries, than play tricks with PageReserved. Mel? > It would be preferable because this time it's PageReserved that happened to trip up an assumption in set_pfnblock_flags_mask but if it was a real zone and real page then compaction is migrating cross-zone which would be surprising. Maybe this untested patch? diff --git a/mm/compaction.c b/mm/compaction.c index 13cb7a961b31..ef1b5dacc289 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1330,6 +1330,10 @@ fast_isolate_freepages(struct compact_control *cc) low_pfn = pageblock_start_pfn(cc->free_pfn - (distance >> 2)); min_pfn = pageblock_start_pfn(cc->free_pfn - (distance >> 1)); + /* Ensure the PFN is within the zone */ + low_pfn = max(cc->zone->zone_start_pfn, low_pfn); + min_pfn = max(cc->zone->zone_start_pfn, min_pfn); + if (WARN_ON_ONCE(min_pfn > low_pfn)) low_pfn = min_pfn; -- Mel Gorman SUSE Labs