Date: Tue, 4 Dec 2018 09:48:26 +0100
From: Michal Hocko
To: Naoya Horiguchi
Cc: Oscar Salvador, Andrew Morton, Dan Williams, Pavel Tatashin, "linux-mm@kvack.org", LKML, Stable tree
Subject: Re: [RFC PATCH] hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined
Message-ID: <20181204081801.GA1286@dhcp22.suse.cz>
References: <20181203100309.14784-1-mhocko@kernel.org> <20181204072116.GA24446@hori1.linux.bs1.fc.nec.co.jp>
In-Reply-To: <20181204072116.GA24446@hori1.linux.bs1.fc.nec.co.jp>

On Tue 04-12-18 07:21:16, Naoya Horiguchi wrote:
> On Mon, Dec 03, 2018 at 11:03:09AM +0100, Michal Hocko wrote:
> > From: Michal Hocko
> >
> > We have received a bug report that an injected MCE about faulty memory
> > prevents memory offline to succeed. The underlying reason is that the
> > HWPoison page has an elevated reference count and the migration keeps
> > failing. There are two problems with that. First of all it is dubious
> > to migrate the poisoned page because we know that accessing that memory
> > is possible to fail. Secondly it doesn't make any sense to migrate a
> > potentially broken content and preserve the memory corruption over to a
> > new location.
> >
> > Oscar has found out that it is the elevated reference count from
> > memory_failure that is confusing the offlining path. HWPoisoned pages
> > are isolated from the LRU list but __offline_pages might still try to
> > migrate them if there is any preceding migrateable pages in the pfn
> > range. Such a migration would fail due to the reference count but
> > the migration code would put it back on the LRU list. This is quite
> > wrong in itself but it would also make scan_movable_pages stumble over
> > it again without any way out.
> >
> > This means that the hotremove with hwpoisoned pages has never really
> > worked (without a luck). HWPoisoning really needs a larger surgery
> > but an immediate and backportable fix is to skip over these pages during
> > offlining. Even if they are still mapped for some reason then
> > try_to_unmap should turn those mappings into hwpoison ptes and cause
> > SIGBUS on access. Nobody should be really touching the content of the
> > page so it should be safe to ignore them even when there is a pending
> > reference count.
> >
> > Debugged-by: Oscar Salvador
> > Cc: stable
> > Signed-off-by: Michal Hocko
> > ---
> > Hi,
> > I am sending this as an RFC now because I am not fully sure I see all
> > the consequences myself yet. This has passed a testing by Oscar but I
> > would highly appreciate a review from Naoya about my assumptions about
> > hwpoisoning. E.g. it is not entirely clear to me whether there is a
> > potential case where the page might be still mapped.
>
> One potential case is ksm page, for which we give up unmapping and leave
> it unmapped. Rather than that I don't have any idea, but any new type of
> page would be potentially categorized to this class.

Could you be more specific about why the hwpoison code gives up on ksm
pages while we can safely unmap here?

[...]
>
> I think this looks OK (no better idea.)
>
> Reviewed-by: Naoya Horiguchi

Thanks!

> I wondered why I didn't find this for long, and found that my testing only
> covered the case where PageHWPoison is the first page of memory block.
> scan_movable_pages() considers PageHWPoison as non-movable, so do_migrate_range()
> started with pfn after the PageHWPoison and never tried to migrate it
> (so effectively ignored every PageHWPoison as the above code does.)

Yeah, it seems that hotremove has so far worked only by chance in the
presence of hwpoison pages. The specific use case which triggered this
patch is a system with heavy memory utilization running an in-memory
database, IIRC. So it is quite likely that hwpoison pages are punched
into otherwise used memory.

Thanks for the review, Naoya!
-- 
Michal Hocko
SUSE Labs
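
For readers following the thread, the failure mode and the proposed skip can be modelled in a small userspace C sketch. This is not the kernel code and not the patch itself: `struct toy_page`, `toy_migrate` and `toy_offline_range` are hypothetical names invented for illustration, and the model only assumes what the description above states, namely that a hwpoisoned page keeps an elevated reference count, so migrating it always fails and the offlining loop keeps retrying.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct toy_page {
    bool in_use;    /* still holds data that must be moved before offline */
    bool hwpoison;  /* marked as faulty memory */
    int refcount;   /* extra references; hwpoisoned pages keep one */
};

/* Toy migration: succeeds only when nobody else holds a reference. */
static bool toy_migrate(struct toy_page *p)
{
    if (p->refcount > 0)
        return false;
    p->in_use = false;
    return true;
}

/*
 * Try to empty pages[0..n). Returns the number of passes needed, or -1
 * if some page still could not be migrated after max_passes passes.
 * With skip_hwpoison set, poisoned pages are ignored instead of retried.
 */
int toy_offline_range(struct toy_page *pages, size_t n,
                      bool skip_hwpoison, int max_passes)
{
    for (int pass = 1; pass <= max_passes; pass++) {
        bool pending = false;

        for (size_t i = 0; i < n; i++) {
            struct toy_page *p = &pages[i];

            if (!p->in_use)
                continue;
            /* The content is suspect anyway, so do not try to move it. */
            if (skip_hwpoison && p->hwpoison)
                continue;
            if (!toy_migrate(p))
                pending = true;
        }
        if (!pending)
            return pass;
    }
    return -1;
}
```

In this model a range containing one poisoned page never finishes when `skip_hwpoison` is false, and finishes in a single pass when it is true. The real offlining path additionally has to deal with LRU isolation and with unmapping pages that are still mapped, which the sketch leaves out.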