Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752717AbcDDFrs (ORCPT ); Mon, 4 Apr 2016 01:47:48 -0400 Received: from TYO201.gate.nec.co.jp ([210.143.35.51]:61246 "EHLO tyo201.gate.nec.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751150AbcDDFrr convert rfc822-to-8bit (ORCPT ); Mon, 4 Apr 2016 01:47:47 -0400 From: Naoya Horiguchi To: Minchan Kim CC: Vlastimil Babka , Andrew Morton , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , "jlayton@poochiereds.net" , "bfields@fieldses.org" , Joonsoo Kim , "koct9i@gmail.com" , "aquini@redhat.com" , "virtualization@lists.linux-foundation.org" , Mel Gorman , Hugh Dickins , Sergey Senozhatsky , Rik van Riel , "rknize@motorola.com" , Gioh Kim , Sangseok Lee , "Chan Gyun Jeong" , Al Viro , YiPing Xu Subject: Re: mm/hwpoison: fix wrong num_poisoned_pages account Thread-Topic: mm/hwpoison: fix wrong num_poisoned_pages account Thread-Index: AQHRjjQ1zWRvlXMv8EiJBupev50F/p94t4UA Date: Mon, 4 Apr 2016 05:46:59 +0000 Message-ID: <20160404054658.GA23155@hori1.linux.bs1.fc.nec.co.jp> References: <20160404053830.GA7042@bbox> In-Reply-To: <20160404053830.GA7042@bbox> Accept-Language: ja-JP, en-US Content-Language: ja-JP X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.128.101.10] Content-Type: text/plain; charset="iso-2022-jp" Content-ID: Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7654 Lines: 184 On Mon, Apr 04, 2016 at 02:38:30PM +0900, Minchan Kim wrote: > > Forking new thread, > > Hello Naoya, > > On Mon, Apr 04, 2016 at 04:45:12AM +0000, Naoya Horiguchi wrote: > > On Mon, Apr 04, 2016 at 10:39:17AM +0900, Minchan Kim wrote: > > > On Fri, Apr 01, 2016 at 02:58:21PM +0200, Vlastimil Babka wrote: > > > > On 03/30/2016 09:12 AM, Minchan Kim wrote: > > > > >Procedure of page migration is as follows: > > > > > > > > > >First of all, it should isolate a page from LRU and try to > > > > >migrate the page. If it is successful, it releases the page > > > > >for freeing. Otherwise, it should put the page back to LRU > > > > >list. > > > > > > > > > >For LRU pages, we have used putback_lru_page for both freeing > > > > >and putback to LRU list. It's okay because put_page is aware of > > > > >LRU list so if it releases last refcount of the page, it removes > > > > >the page from LRU list. However, It makes unnecessary operations > > > > >(e.g., lru_cache_add, pagevec and flags operations. It would be > > > > >not significant but no worth to do) and harder to support new > > > > >non-lru page migration because put_page isn't aware of non-lru > > > > >page's data structure. > > > > > > > > > >To solve the problem, we can add new hook in put_page with > > > > >PageMovable flags check but it can increase overhead in > > > > >hot path and needs new locking scheme to stabilize the flag check > > > > >with put_page. > > > > > > > > > >So, this patch cleans it up to divide two semantic(ie, put and putback). > > > > >If migration is successful, use put_page instead of putback_lru_page and > > > > >use putback_lru_page only on failure. That makes code more readable > > > > >and doesn't add overhead in put_page. > > > > > > > > > >Comment from Vlastimil > > > > >"Yeah, and compaction (perhaps also other migration users) has to drain > > > > >the lru pvec... Getting rid of this stuff is worth even by itself." > > > > > > > > > >Cc: Mel Gorman > > > > >Cc: Hugh Dickins > > > > >Cc: Naoya Horiguchi > > > > >Acked-by: Vlastimil Babka > > > > >Signed-off-by: Minchan Kim > > > > > > > > [...] > > > > > > > > >@@ -974,28 +986,28 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page, > > > > > list_del(&page->lru); > > > > > dec_zone_page_state(page, NR_ISOLATED_ANON + > > > > > page_is_file_cache(page)); > > > > >- /* Soft-offlined page shouldn't go through lru cache list */ > > > > >+ } > > > > >+ > > > > >+ /* > > > > >+ * If migration is successful, drop the reference grabbed during > > > > >+ * isolation. Otherwise, restore the page to LRU list unless we > > > > >+ * want to retry. > > > > >+ */ > > > > >+ if (rc == MIGRATEPAGE_SUCCESS) { > > > > >+ put_page(page); > > > > > if (reason == MR_MEMORY_FAILURE) { > > > > >- put_page(page); > > > > > if (!test_set_page_hwpoison(page)) > > > > > num_poisoned_pages_inc(); > > > > >- } else > > > > >+ } > > > > > > > > Hmm, I didn't notice it previously, or it's due to rebasing, but it > > > > seems that you restricted the memory failure handling (i.e. setting > > > > hwpoison) to MIGRATE_SUCCESS, while previously it was done for all > > > > non-EAGAIN results. I think that goes against the intention of > > > > hwpoison, which is IIRC to catch and kill the poor process that > > > > still uses the page? > > > > > > That's why I Cc'ed Naoya Horiguchi to catch things I might make > > > mistake. > > > > > > Thanks for catching it, Vlastimil. > > > It was my mistake. But in this chance, I looked over hwpoison code and > > > I saw other places which increases num_poisoned_pages are successful > > > migration, already freed page and successful invalidated page. > > > IOW, they are already successful isolated page so I guess it should > > > increase the count when only successful migration is done? > > > > Yes, that's right. When exiting with migration's failure, we shouldn't call > > test_set_page_hwpoison or num_poisoned_pages_inc, so current code checking > > (rc != -EAGAIN) is simply incorrect. Your change fixes the bug in memory > > error handling. Great! > > Thanks for confirming, Naoya. > I will send it as separate patch with Ccing -stable. > > > > > > And when I read memory_failure, it bails out without killing if it > > > encounters HWPoisoned page so I think it's not for catching and > > > kill the poor proces. > > > > > > > > > > > Also (but not your fault) the put_page() preceding > > > > test_set_page_hwpoison(page)) IMHO deserves a comment saying which > > > > pin we are releasing and which one we still have (hopefully? if I > > > > read description of da1b13ccfbebe right) otherwise it looks like > > > > doing something with a page that we just potentially freed. > > > > > > Yes, while I read the code, I had same question. I think the releasing > > > refcount is for get_any_page. > > > > As the other callers of page migration do, soft_offline_page expects the > > migration source page to be freed at this put_page() (no pin remains.) > > The refcount released here is from isolate_lru_page() in __soft_offline_page(). > > (the pin by get_any_page is released by put_hwpoison_page just after it.) > > > > .. yes, doing something just after freeing page looks weird, but that's > > how PageHWPoison flag works. IOW, many other page flags are maintained > > only during one "allocate-free" life span, but PageHWPoison still does > > its job beyond it. > > Got it. Thanks for the clarification. > > > > > As for commenting, this put_page() is called in any MIGRATEPAGE_SUCCESS > > case (regardless of callers), so what we can say here is "we free the > > source page here, bypassing LRU list" or something? > > Naoya, I wrote up the patch but hard to say I write up correct description. > Could you review this? > > Thankks. > > From 916b0d5960169e93d1fa3c8b7bdb03fe2b86b455 Mon Sep 17 00:00:00 2001 > From: Minchan Kim > Date: Mon, 4 Apr 2016 14:26:30 +0900 > Subject: [PATCH] mm/hwpoison: fix wrong num_poisoned_pages account > > Currently, migartion code increses num_poisoned_pages on *failed* migration ... increase > migration page as well as successfully migrated one at the trial > of memory-failture. It will make the stat wrong. failure > As well, it marks the page as PG_HWPoison even if the migration > trial failed. It would make we cannot recover the corrupted page > using memory-failure facility. > > This patches fixes it. > > Cc: stable@vger.kernel.org > Signed-off-by: Minchan Kim Thank you for the fix. Acked-by: Naoya Horiguchi > --- > mm/migrate.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/mm/migrate.c b/mm/migrate.c > index 6c822a7b27e0..f9dfb18a4eba 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -975,7 +975,13 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page, > dec_zone_page_state(page, NR_ISOLATED_ANON + > page_is_file_cache(page)); > /* Soft-offlined page shouldn't go through lru cache list */ > - if (reason == MR_MEMORY_FAILURE) { > + if (reason == MR_MEMORY_FAILURE && rc == MIGRATEPAGE_SUCCESS) { > + /* > + * With this release, we free successfully migrated > + * page and set PG_HWPoison on just freed page > + * intentionally. Although it's rather weird, it's how > + * HWPoison flag works at the moment. > + */ > put_page(page); > if (!test_set_page_hwpoison(page)) > num_poisoned_pages_inc(); > -- > 1.9.1 >