Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753808AbdLGOeG (ORCPT ); Thu, 7 Dec 2017 09:34:06 -0500
Received: from mx2.suse.de ([195.135.220.15]:41315 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753453AbdLGOeE (ORCPT ); Thu, 7 Dec 2017 09:34:04 -0500
Date: Thu, 7 Dec 2017 15:34:01 +0100
From: Michal Hocko
To: Zi Yan
Cc: linux-mm@kvack.org, Naoya Horiguchi, "Kirill A. Shutemov",
	Vlastimil Babka, Andrew Morton, Andrea Reale, LKML
Subject: Re: [RFC PATCH] mm: unclutter THP migration
Message-ID: <20171207143401.GK20234@dhcp22.suse.cz>
References: <20171207124815.12075-1-mhocko@kernel.org>
	<5A294BE7.4010904@cs.rutgers.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5A294BE7.4010904@cs.rutgers.edu>
User-Agent: Mutt/1.9.1 (2017-09-22)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2708
Lines: 61

On Thu 07-12-17 22:10:47, Zi Yan wrote:
> Hi Michal,
>
> Thanks for sending this out.
>
> Michal Hocko wrote:
> > From: Michal Hocko
> >
> > THP migration is hacked into the generic migration with rather
> > surprising semantic. The migration allocation callback is supposed to
> > check whether the THP can be migrated at once and if that is not the
> > case then it allocates a simple page to migrate. unmap_and_move then
> > fixes that up by splitting the THP into small pages while moving the
> > head page to the newly allocated order-0 page. Remaining pages are moved
> > to the LRU list by split_huge_page. The same happens if the THP
> > allocation fails. This is really ugly and error prone [1].
> >
> > I also believe that split_huge_page to the LRU lists is inherently
> > wrong because all tail pages are not migrated. Some callers will just
>
> I agree with you that we should try to migrate all tail pages if the THP
> needs to be split.
> But this might not be compatible with "getting
> migration results" in unmap_and_move(), since a caller of
> migrate_pages() may want to know the status of each page in the
> migration list via int **result in get_new_page() (e.g.
> new_page_node()). The caller has no idea whether a THP in its migration
> list will be split or not, thus, storing migration results might be
> quite tricky if tail pages are added into the migration list.

Ouch. I wasn't aware of this "beauty". I will try to wrap my head around
this code and think about what to do about it. Thanks for pointing me to
it.

> We need to consider this when we clean up migrate_pages().
>
[...]
> > diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> > index a2246cf670ba..ec9503e5f2c2 100644
> > --- a/include/linux/migrate.h
> > +++ b/include/linux/migrate.h
> > @@ -43,9 +43,11 @@ static inline struct page *new_page_nodemask(struct page *page,
> >  		return alloc_huge_page_nodemask(page_hstate(compound_head(page)),
> >  				preferred_nid, nodemask);
> >  
> > -	if (thp_migration_supported() && PageTransHuge(page)) {
> > -		order = HPAGE_PMD_ORDER;
> > +	if (PageTransHuge(page)) {
> > +		if (!thp_migration_supported())
> > +			return NULL;
> We may not need these two lines, since if thp_migration_supported() is
> false, unmap_and_move() returns -ENOMEM in your code below, which has
> the same result of returning NULL here.

Yes, this is a leftover after the rebase. Originally I used to have
thp_migration_supported in the allocation callbacks, but then I moved it to
unmap_and_move to reduce the code duplication; it also makes much more
sense to have this up in the migration layer. I've fixed this up in my
local copy now.
-- 
Michal Hocko
SUSE Labs
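
The redundancy discussed above can be illustrated with a small userspace
model. This is only a sketch and not kernel code: struct model_page,
model_thp_supported, the two model_alloc_*() callbacks and
model_unmap_and_move() are hypothetical stand-ins for struct page,
thp_migration_supported(), the allocation callbacks and unmap_and_move().
It assumes, as described in the thread, that unmap_and_move() returns
-ENOMEM both when the central THP check fails and when the callback
returns NULL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only records whether it is a THP. */
struct model_page {
	bool trans_huge;
};

/* Hypothetical toggle standing in for thp_migration_supported(). */
static bool model_thp_supported;

/* Dummy migration target handed out by the model callbacks. */
static struct model_page model_target;

typedef struct model_page *(*model_new_page_t)(struct model_page *page);

/* Variant A: the allocation callback itself refuses unsupported THPs. */
static struct model_page *model_alloc_checking(struct model_page *page)
{
	if (page->trans_huge && !model_thp_supported)
		return NULL;
	return &model_target;
}

/* Variant B: the callback does no THP support check at all. */
static struct model_page *model_alloc_plain(struct model_page *page)
{
	(void)page;
	return &model_target;
}

/* Model of the migration layer with the support check done centrally. */
static int model_unmap_and_move(model_new_page_t get_new_page,
				struct model_page *page)
{
	struct model_page *newpage;

	/* Central check: an unsupported THP is rejected before allocation. */
	if (page->trans_huge && !model_thp_supported)
		return -ENOMEM;

	/* A NULL from the callback is reported the same way. */
	newpage = get_new_page(page);
	if (!newpage)
		return -ENOMEM;

	return 0;
}
```

In this model both callback variants give the same result for every input
once the check lives in model_unmap_and_move(), which is why the check in
the callback can be dropped.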