Date: Mon, 10 Sep 2018 15:17:54 +0200
From: Michal Hocko
To: Mikhail Zaslonko
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Pavel.Tatashin@microsoft.com, osalvador@suse.de, gerald.schaefer@de.ibm.com
Subject: Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary
Message-ID: <20180910131754.GG10951@dhcp22.suse.cz>
References: <20180910123527.71209-1-zaslonko@linux.ibm.com>
In-Reply-To: <20180910123527.71209-1-zaslonko@linux.ibm.com>

[Cc Pavel]

On Mon 10-09-18 14:35:27, Mikhail Zaslonko wrote:
> If memory end is not aligned with the linux memory section boundary, such
> a section is only partly initialized. This may lead to VM_BUG_ON due to
> uninitialized struct pages access from is_mem_section_removable() or
> test_pages_in_a_zone() function.
>
> Here is one of the panic examples:
>  CONFIG_DEBUG_VM_PGFLAGS=y
>  kernel parameter mem=3075M

OK, so the last memory section is not full and we have a partial memory
block, right?

>  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))

OK, this means that the struct page is not fully initialized. Do you
have a specific place which has triggered this assert?

>  ------------[ cut here ]------------
>  Call Trace:
>  ([<000000000039b8a4>] is_mem_section_removable+0xcc/0x1c0)
>   [<00000000009558ba>] show_mem_removable+0xda/0xe0
>   [<00000000009325fc>] dev_attr_show+0x3c/0x80
>   [<000000000047e7ea>] sysfs_kf_seq_show+0xda/0x160
>   [<00000000003fc4e0>] seq_read+0x208/0x4c8
>   [<00000000003cb80e>] __vfs_read+0x46/0x180
>   [<00000000003cb9ce>] vfs_read+0x86/0x148
>   [<00000000003cc06a>] ksys_read+0x62/0xc0
>   [<0000000000c001c0>] system_call+0xdc/0x2d8
>
> This fix checks if the page lies within the zone boundaries before
> accessing the struct page data. The check is added to both functions.
> Actually similar check has already been present in
> is_pageblock_removable_nolock() function but only after the struct page
> is accessed.
>

Well, I am afraid this is not the proper solution. We are relying on a
full pageblock worth of initialized struct pages in many other places.
We used to be able to do that because we initialized the full section,
but this has been changed recently. Pavel, do you have any ideas how to
deal with these partial mem sections now?

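To make the partial section concrete, here is a small user-space sketch of
the arithmetic for the reported mem=3075M case. It assumes 256M memory
sections and 4K pages, neither of which is stated in the report, and the
SKETCH_* macros and sketch_* helpers are made-up names for illustration;
none of this is kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from the report: 4K pages, 256M sections. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_SECTION_SHIFT	28
#define SKETCH_PAGES_PER_SECTION \
	(1UL << (SKETCH_SECTION_SHIFT - SKETCH_PAGE_SHIFT))

/* Hypothetical helper: first pfn past the end of memory for mem=<mbytes>M. */
static unsigned long sketch_end_pfn(unsigned long mbytes)
{
	return (unsigned long)(((uint64_t)mbytes << 20) >> SKETCH_PAGE_SHIFT);
}

/*
 * Hypothetical helper: pages of the last section that lie below end_pfn.
 * Returns 0 when memory ends exactly on a section boundary.
 */
static unsigned long sketch_pages_in_last_section(unsigned long end_pfn)
{
	return end_pfn % SKETCH_PAGES_PER_SECTION;
}

/* Hypothetical helper: pages of the last section past the end of memory. */
static unsigned long sketch_uncovered_pages(unsigned long end_pfn)
{
	unsigned long used = sketch_pages_in_last_section(end_pfn);

	/* A remainder is always smaller than the divisor. */
	assert(used < SKETCH_PAGES_PER_SECTION);
	return used ? SKETCH_PAGES_PER_SECTION - used : 0;
}
```

Under these assumptions mem=3075M ends at pfn 787200, so the last section
has 768 pages (3M) below the end of memory and 64768 pages beyond it. Those
trailing pages are the ones whose struct pages would stay uninitialized if
only the present range gets initialized.
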
> Signed-off-by: Mikhail Zaslonko
> Reviewed-by: Gerald Schaefer
> Cc:
> ---
>  mm/memory_hotplug.c | 20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 9eea6e809a4e..8e20e8fcc3b0 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1229,9 +1229,8 @@ static struct page *next_active_pageblock(struct page *page)
>  	return page + pageblock_nr_pages;
>  }
>
> -static bool is_pageblock_removable_nolock(struct page *page)
> +static bool is_pageblock_removable_nolock(struct page *page, struct zone **zone)
>  {
> -	struct zone *zone;
>  	unsigned long pfn;
>
>  	/*
> @@ -1241,15 +1240,14 @@ static bool is_pageblock_removable_nolock(struct page *page)
>  	 * We have to take care about the node as well. If the node is offline
>  	 * its NODE_DATA will be NULL - see page_zone.
>  	 */
> -	if (!node_online(page_to_nid(page)))
> -		return false;
> -
> -	zone = page_zone(page);
>  	pfn = page_to_pfn(page);
> -	if (!zone_spans_pfn(zone, pfn))
> +	if (*zone && !zone_spans_pfn(*zone, pfn))
>  		return false;
> +	if (!node_online(page_to_nid(page)))
> +		return false;
> +	*zone = page_zone(page);
>
> -	return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, true);
> +	return !has_unmovable_pages(*zone, page, 0, MIGRATE_MOVABLE, true);
>  }
>
>  /* Checks if this range of memory is likely to be hot-removable. */
> @@ -1257,10 +1255,11 @@ bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
>  {
>  	struct page *page = pfn_to_page(start_pfn);
>  	struct page *end_page = page + nr_pages;
> +	struct zone *zone = NULL;
>
>  	/* Check the starting page of each pageblock within the range */
>  	for (; page < end_page; page = next_active_pageblock(page)) {
> -		if (!is_pageblock_removable_nolock(page))
> +		if (!is_pageblock_removable_nolock(page, &zone))
>  			return false;
>  		cond_resched();
>  	}
> @@ -1296,6 +1295,9 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
>  				i++;
>  			if (i == MAX_ORDER_NR_PAGES || pfn + i >= end_pfn)
>  				continue;
> +			/* Check if we got outside of the zone */
> +			if (zone && !zone_spans_pfn(zone, pfn))
> +				return 0;
>  			page = pfn_to_page(pfn + i);
>  			if (zone && page_zone(page) != zone)
>  				return 0;
> --
> 2.16.4

-- 
Michal Hocko
SUSE Labs