From: Mikhail Zaslonko
To: akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, mhocko@kernel.org,
    Pavel.Tatashin@microsoft.com, schwidefsky@de.ibm.com,
    heiko.carstens@de.ibm.com, gerald.schaefer@de.ibm.com,
    zaslonko@linux.ibm.com
Subject: [PATCH v2 1/1] memory_hotplug: fix the panic when memory end is not on the section boundary
Date: Mon, 5 Nov 2018 16:04:01 +0100
Message-Id: <20181105150401.97287-2-zaslonko@linux.ibm.com>
In-Reply-To: <20181105150401.97287-1-zaslonko@linux.ibm.com>
References: <20181105150401.97287-1-zaslonko@linux.ibm.com>
X-Mailer: git-send-email 2.16.4

If the end of memory is not aligned to a sparse memory section boundary,
the memory map of the last section is only partly initialized. This may
lead to a VM_BUG_ON when is_mem_section_removable() or
test_pages_in_a_zone(), triggered by the memory_hotplug sysfs handlers,
access uninitialized struct pages.

Here are the panic examples:
 CONFIG_DEBUG_VM_PGFLAGS=y
 kernel parameter mem=2050M
 --------------------------
 page:000003d082008000 is uninitialized and poisoned
 page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
 Call Trace:
 ([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
  [<00000000008f15c4>] show_valid_zones+0x5c/0x190
  [<00000000008cf9c4>] dev_attr_show+0x34/0x70
  [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
  [<00000000003e4194>] seq_read+0x204/0x480
  [<00000000003b53ea>] __vfs_read+0x32/0x178
  [<00000000003b55b2>] vfs_read+0x82/0x138
  [<00000000003b5be2>] ksys_read+0x5a/0xb0
  [<0000000000b86ba0>] system_call+0xdc/0x2d8
 Last Breaking-Event-Address:
  [<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
 Kernel panic - not syncing: Fatal exception: panic_on_oops

 CONFIG_DEBUG_VM_PGFLAGS=y
 kernel parameter mem=3075M
 --------------------------
 page:000003d08300c000 is uninitialized and poisoned
 page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
 Call Trace:
 ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
  [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
  [<00000000008cf9c4>] dev_attr_show+0x34/0x70
  [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
  [<00000000003e4194>] seq_read+0x204/0x480
  [<00000000003b53ea>] __vfs_read+0x32/0x178
  [<00000000003b55b2>] vfs_read+0x82/0x138
  [<00000000003b5be2>] ksys_read+0x5a/0xb0
  [<0000000000b86ba0>] system_call+0xdc/0x2d8
 Last Breaking-Event-Address:
  [<000000000038596c>] is_mem_section_removable+0xb4/0x190
 Kernel panic - not syncing: Fatal exception: panic_on_oops

This fix checks if the page lies within the zone boundaries before
accessing the struct page data. The check is added to both functions.

Signed-off-by: Mikhail Zaslonko
Reviewed-by: Gerald Schaefer
Cc:
---
 mm/memory_hotplug.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 38d94b703e9d..8402e70f74c2 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1229,9 +1229,8 @@ static struct page *next_active_pageblock(struct page *page)
 	return page + pageblock_nr_pages;
 }
 
-static bool is_pageblock_removable_nolock(struct page *page)
+static bool is_pageblock_removable_nolock(struct page *page, struct zone **zone)
 {
-	struct zone *zone;
 	unsigned long pfn;
 
 	/*
@@ -1241,15 +1240,14 @@ static bool is_pageblock_removable_nolock(struct page *page)
 	 * We have to take care about the node as well. If the node is offline
 	 * its NODE_DATA will be NULL - see page_zone.
 	 */
-	if (!node_online(page_to_nid(page)))
-		return false;
-
-	zone = page_zone(page);
 	pfn = page_to_pfn(page);
-	if (!zone_spans_pfn(zone, pfn))
+	if (*zone && !zone_spans_pfn(*zone, pfn))
 		return false;
+	if (!node_online(page_to_nid(page)))
+		return false;
+	*zone = page_zone(page);
 
-	return !has_unmovable_pages(zone, page, 0, MIGRATE_MOVABLE, true);
+	return !has_unmovable_pages(*zone, page, 0, MIGRATE_MOVABLE, true);
 }
 
 /* Checks if this range of memory is likely to be hot-removable. */
@@ -1257,10 +1255,11 @@ bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 {
 	struct page *page = pfn_to_page(start_pfn);
 	struct page *end_page = page + nr_pages;
+	struct zone *zone = NULL;
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-		if (!is_pageblock_removable_nolock(page))
+		if (!is_pageblock_removable_nolock(page, &zone))
 			return false;
 		cond_resched();
 	}
@@ -1296,6 +1295,9 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
 				i++;
 			if (i == MAX_ORDER_NR_PAGES || pfn + i >= end_pfn)
 				continue;
+			/* Check if we got outside of the zone */
+			if (zone && !zone_spans_pfn(zone, pfn + i))
+				return 0;
 			page = pfn_to_page(pfn + i);
 			if (zone && page_zone(page) != zone)
 				return 0;
-- 
2.16.4