From: Anshuman Khandual
To: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-mm@kvack.org, akpm@linux-foundation.org, will.deacon@arm.com,
	catalin.marinas@arm.com
Cc: mhocko@suse.com, mgorman@techsingularity.net, james.morse@arm.com,
	mark.rutland@arm.com, robin.murphy@arm.com, cpandya@codeaurora.org,
	arunks@codeaurora.org, dan.j.williams@intel.com, osalvador@suse.de,
	logang@deltatee.com, pasha.tatashin@oracle.com, david@redhat.com,
	cai@lca.pw
Subject: [PATCH 4/6] mm/hotplug: Reorder arch_remove_memory() call in __remove_memory()
Date: Wed, 3 Apr 2019 10:00:04 +0530
Message-Id: <1554265806-11501-5-git-send-email-anshuman.khandual@arm.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1554265806-11501-1-git-send-email-anshuman.khandual@arm.com>
References: <1554265806-11501-1-git-send-email-anshuman.khandual@arm.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Memory hot remove uses get_nid_for_pfn() while tearing down the sysfs
links between a memory block and its node. It first checks pfn validity
with pfn_valid_within() before fetching the nid. With CONFIG_HOLES_IN_ZONE
(which arm64 enables), pfn_valid_within() calls pfn_valid().

On arm64, pfn_valid() is an arch implementation
(CONFIG_HAVE_ARCH_PFN_VALID) which scans all mapped memblock regions with
memblock_is_map_memory(). This creates a problem in the memory hot remove
path, which has already removed the given memory range from memblock with
memblock_[remove|free] before arriving at
unregister_mem_sect_under_nodes(). Hence get_nid_for_pfn() returns -1,
the subsequent sysfs_remove_link() calls are skipped, and the
node <-> memory block sysfs entries are left in place. A subsequent
memory add operation then hits BUG_ON() because of the existing sysfs
entries.

[   62.007176] NUMA: Unknown node for memory at 0x680000000, assuming node 0
[   62.052517] ------------[ cut here ]------------
[   62.053211] kernel BUG at mm/memory_hotplug.c:1143!
[   62.053868] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[   62.054589] Modules linked in:
[   62.054999] CPU: 19 PID: 3275 Comm: bash Not tainted 5.1.0-rc2-00004-g28cea40b2683 #41
[   62.056274] Hardware name: linux,dummy-virt (DT)
[   62.057166] pstate: 40400005 (nZcv daif +PAN -UAO)
[   62.058083] pc : add_memory_resource+0x1cc/0x1d8
[   62.058961] lr : add_memory_resource+0x10c/0x1d8
[   62.059842] sp : ffff0000168b3ce0
[   62.060477] x29: ffff0000168b3ce0 x28: ffff8005db546c00
[   62.061501] x27: 0000000000000000 x26: 0000000000000000
[   62.062509] x25: ffff0000111ef000 x24: ffff0000111ef5d0
[   62.063520] x23: 0000000000000000 x22: 00000006bfffffff
[   62.064540] x21: 00000000ffffffef x20: 00000000006c0000
[   62.065558] x19: 0000000000680000 x18: 0000000000000024
[   62.066566] x17: 0000000000000000 x16: 0000000000000000
[   62.067579] x15: ffffffffffffffff x14: ffff8005e412e890
[   62.068588] x13: ffff8005d6b105d8 x12: 0000000000000000
[   62.069610] x11: ffff8005d6b10490 x10: 0000000000000040
[   62.070615] x9 : ffff8005e412e898 x8 : ffff8005e412e890
[   62.071631] x7 : ffff8005d6b105d8 x6 : ffff8005db546c00
[   62.072640] x5 : 0000000000000001 x4 : 0000000000000002
[   62.073654] x3 : ffff8005d7049480 x2 : 0000000000000002
[   62.074666] x1 : 0000000000000003 x0 : 00000000ffffffef
[   62.075685] Process bash (pid: 3275, stack limit = 0x00000000d754280f)
[   62.076930] Call trace:
[   62.077411]  add_memory_resource+0x1cc/0x1d8
[   62.078227]  __add_memory+0x70/0xa8
[   62.078901]  probe_store+0xa4/0xc8
[   62.079561]  dev_attr_store+0x18/0x28
[   62.080270]  sysfs_kf_write+0x40/0x58
[   62.080992]  kernfs_fop_write+0xcc/0x1d8
[   62.081744]  __vfs_write+0x18/0x40
[   62.082400]  vfs_write+0xa4/0x1b0
[   62.083037]  ksys_write+0x5c/0xc0
[   62.083681]  __arm64_sys_write+0x18/0x20
[   62.084432]  el0_svc_handler+0x88/0x100
[   62.085177]  el0_svc+0x8/0xc

Calling arch_remove_memory() before memblock_[free|remove] solves the
problem on arm64: pfn_valid() then behaves correctly and returns true,
because the memblock region for the address range still exists.
arch_remove_memory() removes the applicable memory sections from the zone
with __remove_pages() and tears down the kernel linear mapping. Removing
the memblock regions afterwards is consistent.

Signed-off-by: Anshuman Khandual
---
 mm/memory_hotplug.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0082d69..71d0d79 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1872,11 +1872,10 @@ void __ref __remove_memory(int nid, u64 start, u64 size)
 
 	/* remove memmap entry */
 	firmware_map_remove(start, start + size, "System RAM");
+	arch_remove_memory(nid, start, size, NULL);
 	memblock_free(start, size);
 	memblock_remove(start, size);
 
-	arch_remove_memory(nid, start, size, NULL);
-
 	try_offline_node(nid);
 
 	mem_hotplug_done();
-- 
2.7.4
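
For readers who want to see the ordering dependency in isolation, here is a
minimal userspace sketch. It is not kernel code: every toy_* name is
hypothetical, and the model assumes only what the commit message states,
namely that pfn validity on arm64 depends on memblock and that the sysfs
link teardown is skipped when get_nid_for_pfn() returns -1.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical): one hot-pluggable range and one sysfs link. */
struct toy_state {
	bool in_memblock;	/* range is still registered in memblock */
	bool sysfs_link;	/* node <-> memory block link is present */
};

/* Stand-in for arm64 pfn_valid(): valid only while memblock has the range. */
static bool toy_pfn_valid(const struct toy_state *s)
{
	return s->in_memblock;
}

/* Stand-in for get_nid_for_pfn(): -1 when the pfn is not valid. */
static int toy_get_nid_for_pfn(const struct toy_state *s)
{
	return toy_pfn_valid(s) ? 0 : -1;
}

/* Stand-in for memblock_free() followed by memblock_remove(). */
static void toy_memblock_remove(struct toy_state *s)
{
	s->in_memblock = false;
}

/*
 * Stand-in for the part of arch_remove_memory() that ends up unlinking
 * the sysfs entries: the link is dropped only if a nid can be found.
 */
static void toy_arch_remove_memory(struct toy_state *s)
{
	if (toy_get_nid_for_pfn(s) < 0)
		return;		/* teardown skipped, link stays behind */
	s->sysfs_link = false;
}

/* Runs one removal in the chosen order; returns true if a stale link remains. */
bool toy_remove_leaves_stale_link(bool arch_first)
{
	struct toy_state s = { .in_memblock = true, .sysfs_link = true };

	if (arch_first) {
		toy_arch_remove_memory(&s);	/* order after this patch */
		toy_memblock_remove(&s);
	} else {
		toy_memblock_remove(&s);	/* order before this patch */
		toy_arch_remove_memory(&s);
	}

	assert(!s.in_memblock);	/* memblock is gone in both orders */
	return s.sysfs_link;
}
```

In this model toy_remove_leaves_stale_link(false) reports a leftover link,
which corresponds to the state that trips BUG_ON() on the next memory add,
while toy_remove_leaves_stale_link(true) does not.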