From: Naoya Horiguchi
To: Michal Hocko
Cc: "Kirill A. Shutemov", Zi Yan, "linux-mm@kvack.org", Andrew Morton,
    Vlastimil Babka, "linux-kernel@vger.kernel.org"
Subject: [PATCH] mm: shmem: enable thp migration (Re: [PATCH v1] mm: consider non-anonymous thp as unmovable page)
Date: Fri, 6 Apr 2018 03:07:11 +0000
Message-ID: <20180406030706.GA2434@hori1.linux.bs1.fc.nec.co.jp>
In-Reply-To: <20180405160317.GP6312@dhcp22.suse.cz>

Hi everyone,

On Thu, Apr 05, 2018 at 06:03:17PM +0200, Michal Hocko wrote:
> On Thu 05-04-18 18:55:51, Kirill A. Shutemov wrote:
> > On Thu, Apr 05, 2018 at 05:05:47PM +0200, Michal Hocko wrote:
> > > On Thu 05-04-18 16:40:45, Kirill A. Shutemov wrote:
> > > > On Thu, Apr 05, 2018 at 02:48:30PM +0200, Michal Hocko wrote:
> > > [...]
> > > > > RIght, I confused the two. What is the proper layer to fix that then?
> > > > > rmap_walk_file?
> > > >
> > > > Maybe something like this? Totally untested.
> > >
> > > This looks way too complex. Why cannot we simply split THP page cache
> > > during migration?
> >
> > This way we unify the codepath for archictures that don't support THP
> > migration and shmem THP.
>
> But why? There shouldn't be really nothing to prevent THP (anon or
> shemem) to be migratable. If we cannot migrate it at once we can always
> split it. So why should we add another thp specific handling all over
> the place?

If thp migration works for shmem, we can keep both anon and shmem thp
migratable and we don't need any ad-hoc workaround, so I wrote a patch
to enable it. The patch does not change any shmem-specific code, so I
think it also works for other file thp (not only shmem), but I have not
tested that yet.

Thanks,
Naoya Horiguchi
-----
From e31ec037701d1cc76b26226e4b66d8c783d40889 Mon Sep 17 00:00:00 2001
From: Naoya Horiguchi
Date: Fri, 6 Apr 2018 10:58:35 +0900
Subject: [PATCH] mm: enable thp migration for shmem thp

My testing of the latest kernel with thp migration support showed an
infinite loop when offlining a memory block that is filled with shmem
thps. We can get out of the loop with a signal, but the kernel should
return with failure in this case. What happens in the loop is that
scan_movable_pages() keeps returning the same pfn without making any
progress, because page migration always fails for shmem thps.
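
For reference, the loop in question is the retry loop in
__offline_pages() (mm/memory_hotplug.c). A heavily simplified sketch of
its shape in v4.16-era kernels (draining, locking, and accounting
elided):

	/* __offline_pages(), simplified: retry until the range is empty */
repeat:
	ret = -EINTR;
	if (signal_pending(current))	/* currently the only way out */
		goto failed_removal;

	pfn = scan_movable_pages(start_pfn, end_pfn);
	if (pfn) {	/* a movable page remains in the range */
		ret = do_migrate_range(pfn, end_pfn);
		/*
		 * Migration of a shmem thp always fails and the page
		 * stays in place, so scan_movable_pages() finds the
		 * same pfn again on the next iteration.
		 */
		goto repeat;
	}

Since commit 72b39cfc4d75 removed the bounded retry count, nothing
stops this loop once the range contains a page that can never be
migrated.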

In the memory offlining code, memory blocks containing unmovable pages
are supposed to be excluded as offline targets by has_unmovable_pages()
inside start_isolate_page_range(). So it's possible to change the
migratability of non-anonymous thps to avoid the issue, but that
introduces more complex, thp-specific handling into the migration code,
so it might not be good (a sketch of that alternative follows the patch
for comparison). This patch instead fixes the issue by enabling thp
migration for shmem thp. Both anon and shmem thp become migratable, so
no precheck on the type of thp is needed.

Fixes: 72b39cfc4d75 ("mm, memory_hotplug: do not fail offlining too early")
Signed-off-by: Naoya Horiguchi
Cc: stable@vger.kernel.org # v4.15+
---
 mm/huge_memory.c |  5 ++++-
 mm/migrate.c     | 19 ++++++++++++++++---
 mm/rmap.c        |  3 ---
 3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2aff58624886..933c1bbd3464 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2926,7 +2926,10 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
 		pmde = maybe_pmd_mkwrite(pmde, vma);
 
 	flush_cache_range(vma, mmun_start, mmun_start + HPAGE_PMD_SIZE);
-	page_add_anon_rmap(new, vma, mmun_start, true);
+	if (PageAnon(new))
+		page_add_anon_rmap(new, vma, mmun_start, true);
+	else
+		page_add_file_rmap(new, true);
 	set_pmd_at(mm, mmun_start, pvmw->pmd, pmde);
 	if (vma->vm_flags & VM_LOCKED)
 		mlock_vma_page(new);
diff --git a/mm/migrate.c b/mm/migrate.c
index bdef905b1737..f92dd9f50981 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -472,7 +472,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
 
 	pslot = radix_tree_lookup_slot(&mapping->i_pages, page_index(page));
 
-	expected_count += 1 + page_has_private(page);
+	expected_count += hpage_nr_pages(page) + page_has_private(page);
 	if (page_count(page) != expected_count ||
 		radix_tree_deref_slot_protected(pslot,
 					&mapping->i_pages.xa_lock) != page) {
@@ -505,7 +505,7 @@ int migrate_page_move_mapping(struct address_space *mapping,
 	 */
 	newpage->index = page->index;
 	newpage->mapping = page->mapping;
-	get_page(newpage);	/* add cache reference */
+	page_ref_add(newpage, hpage_nr_pages(page)); /* add cache reference */
 	if (PageSwapBacked(page)) {
 		__SetPageSwapBacked(newpage);
 		if (PageSwapCache(page)) {
@@ -524,13 +524,26 @@
 	}
 
-	radix_tree_replace_slot(&mapping->i_pages, pslot, newpage);
+	if (PageTransHuge(page)) {
+		int i;
+		int index = page_index(page);
+
+		for (i = 0; i < HPAGE_PMD_NR; i++) {
+			pslot = radix_tree_lookup_slot(&mapping->i_pages,
+						       index + i);
+			radix_tree_replace_slot(&mapping->i_pages, pslot,
+						newpage + i);
+		}
+	} else {
+		radix_tree_replace_slot(&mapping->i_pages, pslot, newpage);
+	}
 
 	/*
 	 * Drop cache reference from old page by unfreezing
 	 * to one less reference.
 	 * We know this isn't the last reference.
 	 */
-	page_ref_unfreeze(page, expected_count - 1);
+	page_ref_unfreeze(page, expected_count - hpage_nr_pages(page));
 
 	xa_unlock(&mapping->i_pages);
 	/* Leave irq disabled to prevent preemption while updating stats */
diff --git a/mm/rmap.c b/mm/rmap.c
index f0dd4e4565bc..8d5337fed37b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1374,9 +1374,6 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		if (!pvmw.pte && (flags & TTU_MIGRATION)) {
 			VM_BUG_ON_PAGE(PageHuge(page) ||
 				       !PageTransCompound(page), page);
-			if (!PageAnon(page))
-				continue;
-
 			set_pmd_migration_entry(&pvmw, page);
 			continue;
 		}
-- 
2.7.4
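
For comparison, the alternative rejected above (the direction of the
original "[PATCH v1] mm: consider non-anonymous thp as unmovable page")
would amount to a check along these lines in has_unmovable_pages(), so
that start_isolate_page_range() refuses such ranges (a hypothetical
sketch, not part of this patch):

	/*
	 * Hypothetical: treat non-anonymous thp as unmovable instead
	 * of making it migratable, blocking offline of the range.
	 */
	if (PageTransCompound(page) && !PageAnon(page))
		return true;	/* unmovable page found */

Enabling thp migration makes this thp-specific special case
unnecessary.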