From: David Hildenbrand
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, David Hildenbrand,
 Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
 Greg Kroah-Hartman, Rashmica Gupta, Balbir Singh, Andrew Morton,
 Michal Hocko, Vlastimil Babka, Dan Williams, Joonsoo Kim,
 Pavel Tatashin, Reza Arbab, Thomas Gleixner
Subject: [PATCH v1 09/10] mm/memory_hotplug: teach offline_pages() to not try forever
Date: Wed, 23 May 2018 17:11:50 +0200
Message-Id: <20180523151151.6730-10-david@redhat.com>
In-Reply-To: <20180523151151.6730-1-david@redhat.com>
References:
 <20180523151151.6730-1-david@redhat.com>

It can easily happen that we get stuck forever trying to offline pages -
e.g. on persistent errors.

Let's add a way to change this behavior and fail fast.

This is interesting if offline_pages() is called from a driver and we
just want to find some block to offline.

Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Greg Kroah-Hartman
Cc: Rashmica Gupta
Cc: Balbir Singh
Cc: Andrew Morton
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Dan Williams
Cc: Joonsoo Kim
Cc: Pavel Tatashin
Cc: Reza Arbab
Cc: Thomas Gleixner
Signed-off-by: David Hildenbrand
---
 arch/powerpc/platforms/powernv/memtrace.c |  2 +-
 drivers/base/memory.c                     |  2 +-
 include/linux/memory_hotplug.h            |  8 ++++----
 mm/memory_hotplug.c                       | 14 ++++++++++----
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index fc222a0c2ac4..8ce71f7e1558 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -110,7 +110,7 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
 	walk_memory_range(start_pfn, end_pfn, (void *)MEM_GOING_OFFLINE,
 			  change_memblock_state);
 
-	if (offline_pages(start_pfn, nr_pages)) {
+	if (offline_pages(start_pfn, nr_pages, true)) {
 		walk_memory_range(start_pfn, end_pfn, (void *)MEM_ONLINE,
 				  change_memblock_state);
 		return false;
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 3b8616551561..c785e4c01b23 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -248,7 +248,7 @@ memory_block_action(struct memory_block *mem, unsigned long action)
 		ret = online_pages(start_pfn, nr_pages, mem->online_type);
 		break;
 	case MEM_OFFLINE:
-		ret = offline_pages(start_pfn, nr_pages);
+		ret = offline_pages(start_pfn, nr_pages, true);
 		break;
 	default:
 		WARN(1, KERN_WARNING "%s(%ld, %ld) unknown action: "
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 497e28f5b000..ae53017b54df 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -303,7 +303,8 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {}
 
 extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
 extern void try_offline_node(int nid);
-extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
+extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
+			 bool retry_forever);
 extern void remove_memory(int nid, u64 start, u64 size);
 
 #else
@@ -315,7 +316,8 @@ static inline bool is_mem_section_removable(unsigned long pfn,
 
 static inline void try_offline_node(int nid) {}
 
-static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
+static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
+				bool retry_forever)
 {
 	return -EINVAL;
 }
@@ -333,9 +335,7 @@ extern int arch_add_memory(int nid, u64 start, u64 size,
 		struct vmem_altmap *altmap, bool want_memblock);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap);
-extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern bool is_memblock_offlined(struct memory_block *mem);
-extern void remove_memory(int nid, u64 start, u64 size);
 extern int sparse_add_one_section(struct pglist_data *pgdat,
 		unsigned long start_pfn, struct vmem_altmap *altmap);
 extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1610e214bfc8..3a5845a33910 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1633,8 +1633,8 @@ static void node_states_clear_node(int node, struct memory_notify *arg)
 		node_clear_state(node, N_MEMORY);
 }
 
-static int __ref __offline_pages(unsigned long start_pfn,
-		  unsigned long end_pfn)
+static int __ref __offline_pages(unsigned long start_pfn, unsigned long end_pfn,
+				 bool retry_forever)
 {
 	unsigned long pfn, nr_pages;
 	long offlined_pages;
@@ -1686,6 +1686,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	pfn = scan_movable_pages(start_pfn, end_pfn);
 	if (pfn) { /* We have movable pages */
 		ret = do_migrate_range(pfn, end_pfn);
+		if (ret && !retry_forever) {
+			ret = -EBUSY;
+			goto failed_removal;
+		}
 		goto repeat;
 	}
 
@@ -1752,6 +1756,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
  * offline_pages - offline pages in a given range (that are currently online)
  * @start_pfn: start pfn of the memory range
  * @nr_pages: the number of pages
+ * @retry_forever: whether to retry (possibly) forever
  *
  * This function tries to offline the given pages. The alignment/size that
  * can be used is given by offline_nr_pages.
@@ -1764,9 +1769,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
  *
  * Must be protected by mem_hotplug_begin() or a device_lock
  */
-int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
+int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
+		  bool retry_forever)
 {
-	return __offline_pages(start_pfn, start_pfn + nr_pages);
+	return __offline_pages(start_pfn, start_pfn + nr_pages, retry_forever);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
-- 
2.17.0
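
To illustrate the driver use case from the commit message (scanning for
some block that can be offlined without getting stuck), here is a rough
userspace sketch. It is not kernel code and not part of the patch:
model_offline_pages(), find_offlinable_block(), block_stuck[],
PAGES_PER_BLOCK and DEMO_RETRY_CAP are all hypothetical names made up for
this example. The model only mirrors the retry decision added to
__offline_pages(); the retry cap exists solely so the demo terminates,
whereas the kernel loop has no such bound.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_BLOCKS        4UL
#define PAGES_PER_BLOCK  32768UL   /* hypothetical block size in pages */
#define DEMO_RETRY_CAP   1000UL    /* demo-only bound; the kernel has none */

/* Hypothetical state: true means migrating this block always fails. */
static const bool block_stuck[NR_BLOCKS] = { true, true, false, true };

/* Counts simulated migration attempts so the cost of retrying is visible. */
static unsigned long migrate_attempts;

/*
 * Userspace stand-in using the offline_pages() signature from this patch.
 * It models only the retry decision, nothing else the kernel does.
 */
static int model_offline_pages(unsigned long start_pfn, unsigned long nr_pages,
			       bool retry_forever)
{
	unsigned long block = start_pfn / PAGES_PER_BLOCK;
	unsigned long tries;

	/* Accept exactly one aligned block inside the modelled range. */
	if (start_pfn % PAGES_PER_BLOCK || nr_pages != PAGES_PER_BLOCK ||
	    block >= NR_BLOCKS)
		return -EINVAL;

	for (tries = 0; tries < DEMO_RETRY_CAP; tries++) {
		migrate_attempts++;
		if (!block_stuck[block])
			return 0;	/* migration worked, block is offline */
		if (!retry_forever)
			return -EBUSY;	/* fail fast after the first failure */
	}
	return -EBUSY;			/* demo-only escape from endless retry */
}

/*
 * Driver-style scan: try every block once with retry_forever == false and
 * return the start pfn of the first block that went offline, or -1.
 */
static long find_offlinable_block(void)
{
	unsigned long block;

	for (block = 0; block < NR_BLOCKS; block++) {
		unsigned long start_pfn = block * PAGES_PER_BLOCK;
		int ret = model_offline_pages(start_pfn, PAGES_PER_BLOCK, false);

		/* The scan only passes aligned, in-range blocks. */
		assert(ret == 0 || ret == -EBUSY);
		if (ret == 0)
			return (long)start_pfn;
	}
	return -1;
}
```

With retry_forever == false, the scan spends one migration attempt per
stuck block and moves on. With retry_forever == true, the first stuck
block would hold the caller until the (demo-only) cap is reached.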