From: Michal Hocko
To: Andrew Morton
Cc: Oscar Salvador, Baoquan He, Anshuman Khandual, LKML, Michal Hocko
Subject: [PATCH 4/5] mm, memory_hotplug: print reason for the offlining failure
Date: Fri, 16 Nov 2018 09:30:19 +0100
Message-Id: <20181116083020.20260-5-mhocko@kernel.org>
X-Mailer: git-send-email 2.19.1
In-Reply-To: <20181116083020.20260-1-mhocko@kernel.org>
References: <20181116083020.20260-1-mhocko@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Michal Hocko

The memory offlining failure reporting is inconsistent and insufficient.
Some error paths do not report the failure to the log at all. When a
failure is reported, the message gives no reason for it, and because
there are several possible causes, memory offlining failures are hard
to debug.

Make sure that the
	memory offlining [mem %#010llx-%#010llx] failed
message is printed for all failures, and also provide a short textual
reason for the failure, e.g.

[ 1984.506184] rac1 kernel: memory offlining [mem 0x82600000000-0x8267fffffff] failed due to signal backoff

This tells us that the offlining failed because of a pending signal,
i.e. user intervention.

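For readers decoding the example log line, the following userspace sketch
shows how the printed byte range follows from a pfn range. It is an
illustration only, not kernel code: EXAMPLE_PAGE_SHIFT (4 KiB pages),
example_pfn_to_phys() and format_offline_failure() are hypothetical names
introduced here, and the pfn range is treated as half-open, which is why
1 is subtracted from the end address.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SHIFT is architecture specific. */
#define EXAMPLE_PAGE_SHIFT 12

/* Hypothetical helper: first byte address covered by a page frame number. */
static unsigned long long example_pfn_to_phys(unsigned long long pfn)
{
	return pfn << EXAMPLE_PAGE_SHIFT;
}

/*
 * Hypothetical helper: build the failure message for the half-open pfn
 * range [start_pfn, end_pfn). Returns what snprintf() returns, i.e. the
 * length of the full message excluding the terminating NUL.
 */
static int format_offline_failure(char *buf, size_t len,
				  unsigned long long start_pfn,
				  unsigned long long end_pfn,
				  const char *reason)
{
	assert(start_pfn < end_pfn);
	return snprintf(buf, len,
			"memory offlining [mem %#010llx-%#010llx] failed due to %s",
			example_pfn_to_phys(start_pfn),
			example_pfn_to_phys(end_pfn) - 1,
			reason);
}
```

Under these assumptions, start_pfn 0x82600000 and end_pfn 0x82680000
(a 2 GiB range) with the reason "signal backoff" produce the message
text quoted in the log line above.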
[akpm: tweak messages a bit]
Signed-off-by: Michal Hocko
---
 mm/memory_hotplug.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a92b1b8f6218..88d50e74e3fe 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1553,6 +1553,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	unsigned long valid_start, valid_end;
 	struct zone *zone;
 	struct memory_notify arg;
+	char *reason;

 	mem_hotplug_begin();

@@ -1561,7 +1562,9 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start,
 				  &valid_end)) {
 		mem_hotplug_done();
-		return -EINVAL;
+		ret = -EINVAL;
+		reason = "multizone range";
+		goto failed_removal;
 	}

 	zone = page_zone(pfn_to_page(valid_start));
@@ -1573,7 +1576,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
 				       MIGRATE_MOVABLE, true);
 	if (ret) {
 		mem_hotplug_done();
-		return ret;
+		reason = "failure to isolate range";
+		goto failed_removal;
 	}

 	arg.start_pfn = start_pfn;
@@ -1582,15 +1586,19 @@ static int __ref __offline_pages(unsigned long start_pfn,

 	ret = memory_notify(MEM_GOING_OFFLINE, &arg);
 	ret = notifier_to_errno(ret);
-	if (ret)
-		goto failed_removal;
+	if (ret) {
+		reason = "notifier failure";
+		goto failed_removal_isolated;
+	}

 	pfn = start_pfn;
 repeat:
 	/* start memory hot removal */
 	ret = -EINTR;
-	if (signal_pending(current))
-		goto failed_removal;
+	if (signal_pending(current)) {
+		reason = "signal backoff";
+		goto failed_removal_isolated;
+	}

 	cond_resched();
 	lru_add_drain_all();
@@ -1607,8 +1615,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	 * actually in order to make hugetlbfs's object counting consistent.
 	 */
 	ret = dissolve_free_huge_pages(start_pfn, end_pfn);
-	if (ret)
-		goto failed_removal;
+	if (ret) {
+		reason = "failure to dissolve huge pages";
+		goto failed_removal_isolated;
+	}
 	/* check again */
 	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
 	if (offlined_pages < 0)
@@ -1648,13 +1658,15 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	mem_hotplug_done();
 	return 0;

+failed_removal_isolated:
+	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 failed_removal:
-	pr_debug("memory offlining [mem %#010llx-%#010llx] failed\n",
+	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
 		 (unsigned long long) start_pfn << PAGE_SHIFT,
-		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
+		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
+		 reason);
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
-	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 	mem_hotplug_done();
 	return ret;
 }
-- 
2.19.1
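
The error handling in the diff above relies on two stacked exit labels
plus a reason string. As a reading aid, here is a minimal userspace
sketch of that pattern. It is not the kernel function: struct
range_state, its fail_* knobs and offline_range() are hypothetical
stand-ins, and the error codes are chosen arbitrarily for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the state that offlining manipulates. */
struct range_state {
	int fail_isolate;	/* knob: make the isolation step fail */
	int fail_notify;	/* knob: make the notifier step fail */
	int undo_calls;		/* how often isolation was rolled back */
	const char *last_reason;/* reason recorded by the failure path */
};

static int offline_range(struct range_state *st)
{
	const char *reason;
	int ret;

	assert(st != NULL);

	if (st->fail_isolate) {
		ret = -EBUSY;
		reason = "failure to isolate range";
		goto failed_removal;		/* nothing isolated yet */
	}

	if (st->fail_notify) {
		ret = -ENOMEM;
		reason = "notifier failure";
		goto failed_removal_isolated;	/* isolation must be undone */
	}

	st->last_reason = NULL;
	return 0;

failed_removal_isolated:
	/* Stands in for undo_isolate_page_range(); runs only after isolation. */
	st->undo_calls++;
failed_removal:
	/* Every failure path falls through here, so the reason is always logged. */
	st->last_reason = reason;
	fprintf(stderr, "memory offlining failed due to %s\n", reason);
	return ret;
}
```

Each failure site sets the reason and jumps to the label matching how
far setup had progressed, so the rollback step runs only for paths that
got past isolation, while the message is printed for all of them.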