Date: Wed, 13 Dec 2017 15:20:35 +0300
From: "Kirill A. Shutemov"
To: Michal Hocko
Cc: linux-mm@kvack.org, Zi Yan, Naoya Horiguchi, Vlastimil Babka, Andrew Morton, Andrea Reale, LKML, Michal Hocko
Subject: Re: [RFC PATCH 3/3] mm: unclutter THP migration
Message-ID: <20171213122035.av4kgn2lkbwk3ovn@node.shutemov.name>
References: <20171207143401.GK20234@dhcp22.suse.cz> <20171208161559.27313-1-mhocko@kernel.org> <20171208161559.27313-4-mhocko@kernel.org>
In-Reply-To: <20171208161559.27313-4-mhocko@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 08, 2017 at 05:15:59PM +0100, Michal Hocko wrote:
> From: Michal Hocko
>
> THP migration is hacked into the generic migration with rather
> surprising semantics. The migration allocation callback is supposed to
> check whether the THP can be migrated at once and if that is not the
> case then it allocates a simple page to migrate. unmap_and_move then
> fixes that up by splitting the THP into small pages while moving the
> head page to the newly allocated order-0 page. Remaining pages are moved
> to the LRU list by split_huge_page. The same happens if the THP
> allocation fails. This is really ugly and error prone [1].
>
> I also believe that split_huge_page to the LRU lists is inherently
> wrong because all tail pages are not migrated.
> Some callers will just
> work around that by retrying (e.g. memory hotplug). There are other
> pfn walkers which are simply broken though, e.g. madvise_inject_error
> will migrate the head and then advance the next pfn by the huge page
> size. do_move_page_to_node_array and queue_pages_range (migrate_pages,
> mbind) will simply split the THP before migration if THP migration is
> not supported and then fall back to single page migration, but they
> don't handle tail pages if the THP migration path is not able to
> allocate a fresh THP, so we end up with ENOMEM and fail the whole
> migration, which is questionable behavior. Page compaction doesn't try
> to migrate large pages so it should be immune.
>
> This patch tries to unclutter the situation by moving the special THP
> handling up to the migrate_pages layer where it actually belongs. We
> simply split the THP page into the existing list if unmap_and_move fails
> with ENOMEM and retry. So we will _always_ migrate all THP subpages and
> specific migrate_pages users do not have to deal with this case in a
> special way.
>
> [1] http://lkml.kernel.org/r/20171121021855.50525-1-zi.yan@sent.com
>
> Signed-off-by: Michal Hocko

Looks good to me.

Acked-by: Kirill A. Shutemov

-- 
 Kirill A. Shutemov
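
For readers skimming the thread, the split-and-retry behaviour described in
the quoted changelog can be sketched as a userspace toy model. This is not the
patch and not kernel code: toy_page, toy_unmap_and_move, toy_migrate_pages and
HPAGE_NR are hypothetical names made up for illustration, and the value 512
assumes 2 MiB THPs built from 4 KiB base pages.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 2 MiB THP made of 4 KiB base pages. */
#define HPAGE_NR 512

/* Hypothetical stand-in for struct page: only what the sketch needs. */
struct toy_page {
	int nr_subpages;	/* 1 for a base page, HPAGE_NR for a THP */
};

/*
 * Hypothetical stand-in for unmap_and_move(): moving a THP fails with
 * -ENOMEM when no huge target page can be allocated. Base pages always
 * succeed in this toy model.
 */
static int toy_unmap_and_move(const struct toy_page *page, bool huge_alloc_ok)
{
	if (page->nr_subpages > 1 && !huge_alloc_ok)
		return -ENOMEM;
	return 0;
}

/*
 * Toy version of the migrate_pages layer: if moving a THP fails with
 * -ENOMEM, split it and retry every subpage as an order-0 page, so no
 * tail page is left behind. Returns the number of base pages migrated.
 */
static long toy_migrate_pages(const struct toy_page *pages, int n,
			      bool huge_alloc_ok)
{
	long migrated = 0;

	assert(pages != NULL || n == 0);

	for (int i = 0; i < n; i++) {
		int rc = toy_unmap_and_move(&pages[i], huge_alloc_ok);

		if (rc == -ENOMEM && pages[i].nr_subpages > 1) {
			/* Split: each subpage is retried on its own. */
			const struct toy_page sub = { .nr_subpages = 1 };

			for (int j = 0; j < pages[i].nr_subpages; j++)
				if (toy_unmap_and_move(&sub, huge_alloc_ok) == 0)
					migrated++;
			continue;
		}
		if (rc == 0)
			migrated += pages[i].nr_subpages;
	}
	return migrated;
}
```

With a list of one base page, one THP and another base page, the model
accounts for HPAGE_NR + 2 base pages whether or not the huge allocation
succeeds, which is the "always migrate all THP subpages" property the
changelog is after.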