Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753219AbcDDFi0 (ORCPT ); Mon, 4 Apr 2016 01:38:26 -0400 Received: from LGEAMRELO13.lge.com ([156.147.23.53]:41185 "EHLO lgeamrelo13.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752795AbcDDFiZ (ORCPT ); Mon, 4 Apr 2016 01:38:25 -0400 X-Original-SENDERIP: 156.147.1.151 X-Original-MAILFROM: minchan@kernel.org X-Original-SENDERIP: 10.177.223.161 X-Original-MAILFROM: minchan@kernel.org Date: Mon, 4 Apr 2016 14:38:30 +0900 From: Minchan Kim To: Naoya Horiguchi Cc: Vlastimil Babka , Andrew Morton , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , "jlayton@poochiereds.net" , "bfields@fieldses.org" , Joonsoo Kim , "koct9i@gmail.com" , "aquini@redhat.com" , "virtualization@lists.linux-foundation.org" , Mel Gorman , Hugh Dickins , Sergey Senozhatsky , Rik van Riel , "rknize@motorola.com" , Gioh Kim , Sangseok Lee , Chan Gyun Jeong , Al Viro , YiPing Xu Subject: mm/hwpoison: fix wrong num_poisoned_pages account Message-ID: <20160404053830.GA7042@bbox> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7108 Lines: 171 Forking new thread, Hello Naoya, On Mon, Apr 04, 2016 at 04:45:12AM +0000, Naoya Horiguchi wrote: > On Mon, Apr 04, 2016 at 10:39:17AM +0900, Minchan Kim wrote: > > On Fri, Apr 01, 2016 at 02:58:21PM +0200, Vlastimil Babka wrote: > > > On 03/30/2016 09:12 AM, Minchan Kim wrote: > > > >Procedure of page migration is as follows: > > > > > > > >First of all, it should isolate a page from LRU and try to > > > >migrate the page. If it is successful, it releases the page > > > >for freeing. Otherwise, it should put the page back to LRU > > > >list. > > > > > > > >For LRU pages, we have used putback_lru_page for both freeing > > > >and putback to LRU list. It's okay because put_page is aware of > > > >LRU list so if it releases last refcount of the page, it removes > > > >the page from LRU list. However, It makes unnecessary operations > > > >(e.g., lru_cache_add, pagevec and flags operations. It would be > > > >not significant but no worth to do) and harder to support new > > > >non-lru page migration because put_page isn't aware of non-lru > > > >page's data structure. > > > > > > > >To solve the problem, we can add new hook in put_page with > > > >PageMovable flags check but it can increase overhead in > > > >hot path and needs new locking scheme to stabilize the flag check > > > >with put_page. > > > > > > > >So, this patch cleans it up to divide two semantic(ie, put and putback). > > > >If migration is successful, use put_page instead of putback_lru_page and > > > >use putback_lru_page only on failure. That makes code more readable > > > >and doesn't add overhead in put_page. > > > > > > > >Comment from Vlastimil > > > >"Yeah, and compaction (perhaps also other migration users) has to drain > > > >the lru pvec... Getting rid of this stuff is worth even by itself." > > > > > > > >Cc: Mel Gorman > > > >Cc: Hugh Dickins > > > >Cc: Naoya Horiguchi > > > >Acked-by: Vlastimil Babka > > > >Signed-off-by: Minchan Kim > > > > > > [...] > > > > > > >@@ -974,28 +986,28 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page, > > > > list_del(&page->lru); > > > > dec_zone_page_state(page, NR_ISOLATED_ANON + > > > > page_is_file_cache(page)); > > > >- /* Soft-offlined page shouldn't go through lru cache list */ > > > >+ } > > > >+ > > > >+ /* > > > >+ * If migration is successful, drop the reference grabbed during > > > >+ * isolation. Otherwise, restore the page to LRU list unless we > > > >+ * want to retry. > > > >+ */ > > > >+ if (rc == MIGRATEPAGE_SUCCESS) { > > > >+ put_page(page); > > > > if (reason == MR_MEMORY_FAILURE) { > > > >- put_page(page); > > > > if (!test_set_page_hwpoison(page)) > > > > num_poisoned_pages_inc(); > > > >- } else > > > >+ } > > > > > > Hmm, I didn't notice it previously, or it's due to rebasing, but it > > > seems that you restricted the memory failure handling (i.e. setting > > > hwpoison) to MIGRATE_SUCCESS, while previously it was done for all > > > non-EAGAIN results. I think that goes against the intention of > > > hwpoison, which is IIRC to catch and kill the poor process that > > > still uses the page? > > > > That's why I Cc'ed Naoya Horiguchi to catch things I might make > > mistake. > > > > Thanks for catching it, Vlastimil. > > It was my mistake. But in this chance, I looked over hwpoison code and > > I saw other places which increases num_poisoned_pages are successful > > migration, already freed page and successful invalidated page. > > IOW, they are already successful isolated page so I guess it should > > increase the count when only successful migration is done? > > Yes, that's right. When exiting with migration's failure, we shouldn't call > test_set_page_hwpoison or num_poisoned_pages_inc, so current code checking > (rc != -EAGAIN) is simply incorrect. Your change fixes the bug in memory > error handling. Great! Thanks for confirming, Naoya. I will send it as separate patch with Ccing -stable. > > > And when I read memory_failure, it bails out without killing if it > > encounters HWPoisoned page so I think it's not for catching and > > kill the poor proces. > > > > > > > > Also (but not your fault) the put_page() preceding > > > test_set_page_hwpoison(page)) IMHO deserves a comment saying which > > > pin we are releasing and which one we still have (hopefully? if I > > > read description of da1b13ccfbebe right) otherwise it looks like > > > doing something with a page that we just potentially freed. > > > > Yes, while I read the code, I had same question. I think the releasing > > refcount is for get_any_page. > > As the other callers of page migration do, soft_offline_page expects the > migration source page to be freed at this put_page() (no pin remains.) > The refcount released here is from isolate_lru_page() in __soft_offline_page(). > (the pin by get_any_page is released by put_hwpoison_page just after it.) > > .. yes, doing something just after freeing page looks weird, but that's > how PageHWPoison flag works. IOW, many other page flags are maintained > only during one "allocate-free" life span, but PageHWPoison still does > its job beyond it. Got it. Thanks for the clarification. > > As for commenting, this put_page() is called in any MIGRATEPAGE_SUCCESS > case (regardless of callers), so what we can say here is "we free the > source page here, bypassing LRU list" or something? Naoya, I wrote up the patch but hard to say I write up correct description. Could you review this? Thankks. >From 916b0d5960169e93d1fa3c8b7bdb03fe2b86b455 Mon Sep 17 00:00:00 2001 From: Minchan Kim Date: Mon, 4 Apr 2016 14:26:30 +0900 Subject: [PATCH] mm/hwpoison: fix wrong num_poisoned_pages account Currently, migartion code increses num_poisoned_pages on *failed* migration page as well as successfully migrated one at the trial of memory-failture. It will make the stat wrong. As well, it marks the page as PG_HWPoison even if the migration trial failed. It would make we cannot recover the corrupted page using memory-failure facility. This patches fixes it. Cc: stable@vger.kernel.org Signed-off-by: Minchan Kim --- mm/migrate.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/mm/migrate.c b/mm/migrate.c index 6c822a7b27e0..f9dfb18a4eba 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -975,7 +975,13 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page, dec_zone_page_state(page, NR_ISOLATED_ANON + page_is_file_cache(page)); /* Soft-offlined page shouldn't go through lru cache list */ - if (reason == MR_MEMORY_FAILURE) { + if (reason == MR_MEMORY_FAILURE && rc == MIGRATEPAGE_SUCCESS) { + /* + * With this release, we free successfully migrated + * page and set PG_HWPoison on just freed page + * intentionally. Although it's rather weird, it's how + * HWPoison flag works at the moment. + */ put_page(page); if (!test_set_page_hwpoison(page)) num_poisoned_pages_inc(); -- 1.9.1