Subject: Re: [RFC PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined
To: Michal Hocko, Naoya Horiguchi, Oscar Salvador
Cc: Andrew Morton, Dan Williams, Pavel Tatashin, linux-mm@kvack.org, LKML, Michal Hocko, Stable tree
References: <20181203100309.14784-1-mhocko@kernel.org>
From: David Hildenbrand
Organization: Red Hat GmbH
Message-ID: <41c010e7-078c-5d50-e851-143355abbcb0@redhat.com>
Date: Tue, 4 Dec 2018 12:22:48 +0100
In-Reply-To: <20181203100309.14784-1-mhocko@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03.12.18 11:03, Michal Hocko wrote:
> From: Michal Hocko
>
> We have received a bug report that an injected MCE about faulty memory
> prevents memory offline to succeed. The underlying reason is that the
> HWPoison page has an elevated reference count and the migration keeps
> failing. There are two problems with that. First of all it is dubious
> to migrate the poisoned page because we know that accessing that memory
> is possible to fail.
> Secondly it doesn't make any sense to migrate a
> potentially broken content and preserve the memory corruption over to a
> new location.
>
> Oscar has found out that it is the elevated reference count from
> memory_failure that is confusing the offlining path. HWPoisoned pages
> are isolated from the LRU list but __offline_pages might still try to
> migrate them if there is any preceding migrateable pages in the pfn
> range. Such a migration would fail due to the reference count but
> the migration code would put it back on the LRU list. This is quite
> wrong in itself but it would also make scan_movable_pages stumble over
> it again without any way out.
>
> This means that the hotremove with hwpoisoned pages has never really
> worked (without a luck). HWPoisoning really needs a larger surgery
> but an immediate and backportable fix is to skip over these pages during
> offlining. Even if they are still mapped for some reason then
> try_to_unmap should turn those mappings into hwpoison ptes and cause
> SIGBUS on access. Nobody should be really touching the content of the
> page so it should be safe to ignore them even when there is a pending
> reference count.
>
> Debugged-by: Oscar Salvador
> Cc: stable
> Signed-off-by: Michal Hocko
> ---
> Hi,
> I am sending this as an RFC now because I am not fully sure I see all
> the consequences myself yet. This has passed a testing by Oscar but I
> would highly appreciate a review from Naoya about my assumptions about
> hwpoisoning. E.g. it is not entirely clear to me whether there is a
> potential case where the page might be still mapped. I have put
> try_to_unmap just to be sure. It would be really great if I could drop
> that part because then it is not really great which of the TTU flags to
> use to cover all potential cases.
>
> I have marked the patch for stable but I have no idea how far back it
> should go. Probably everything that already has hotremove and hwpoison
> code.
>
> Thanks in advance!

This sounds good to me. We already treat all HWPoison pages as movable
in has_unmovable_pages() when isolating pages in order to migrate them
away (and as !movable when trying to isolate a contiguous range for
allocation).

If this scenario should not be supported (i.e. if a HWPoison page that
is still mapped cannot be offlined), we would have to bail out on such
pages much earlier (e.g. in has_unmovable_pages()). Failing in
do_migrate_range() would be too late.

+1 to "HWPoisoning really needs a larger surgery"

With the comment update

Acked-by: David Hildenbrand

>
>  mm/memory_hotplug.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index c6c42a7425e5..08c576d5a633 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -34,6 +34,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>
> @@ -1366,6 +1367,17 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  			pfn = page_to_pfn(compound_head(page))
>  				+ hpage_nr_pages(page) - 1;
>
> +		/*
> +		 * HWPoison pages have elevated reference counts so the migration would
> +		 * fail on them. It also doesn't make any sense to migrate them in the
> +		 * first place. Still try to unmap such a page in case it is still mapped.
> +		 */
> +		if (PageHWPoison(page)) {
> +			if (page_mapped(page))
> +				try_to_unmap(page, TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS);
> +			continue;
> +		}
> +
>  		if (!get_page_unless_zero(page))
>  			continue;
>  		/*
>

--
Thanks,

David / dhildenb
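
To see the control flow of the quoted hunk in isolation, here is a minimal userspace sketch in C. It is only a model: struct model_page, model_unmap() and model_migrate_range() are hypothetical stand-ins for struct page, try_to_unmap() and the loop in do_migrate_range(), and none of this is kernel code. It assumes a poisoned page is unmapped if still mapped and then skipped, while a page whose reference count is already zero is skipped as in the !get_page_unless_zero() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct model_page {
	bool hwpoison;	/* models PageHWPoison() */
	bool mapped;	/* models page_mapped() */
	int refcount;	/* models the page reference count */
};

/* Models try_to_unmap(): drop the mapping so a later access would fault. */
static void model_unmap(struct model_page *p)
{
	p->mapped = false;
}

/*
 * Models the page loop in do_migrate_range(). Returns how many pages
 * would be queued for migration. Poisoned pages are never queued,
 * regardless of their reference count.
 */
static size_t model_migrate_range(struct model_page *pages, size_t n)
{
	size_t queued = 0;

	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct model_page *p = &pages[i];

		if (p->hwpoison) {
			/* Still unmap, in case the page is mapped somewhere. */
			if (p->mapped)
				model_unmap(p);
			continue;
		}

		/* Models !get_page_unless_zero(): page is already free. */
		if (p->refcount == 0)
			continue;

		queued++;
	}

	return queued;
}
```

In this model the poisoned page is skipped inside the migration loop, so whether such a range may be offlined at all has to be decided earlier. That is the point made above about has_unmovable_pages().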