Date: Wed, 13 Dec 2017 15:07:33 +0300
From: "Kirill A. Shutemov"
To: Michal Hocko
Cc: linux-mm@kvack.org, Zi Yan, Naoya Horiguchi, Vlastimil Babka,
	Andrew Morton, Andrea Reale, LKML, Michal Hocko
Subject: Re: [RFC PATCH 1/3] mm, numa: rework do_pages_move
Message-ID: <20171213120733.umeb7rylswl7chi5@node.shutemov.name>
References: <20171207143401.GK20234@dhcp22.suse.cz>
	<20171208161559.27313-1-mhocko@kernel.org>
	<20171208161559.27313-2-mhocko@kernel.org>
In-Reply-To: <20171208161559.27313-2-mhocko@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 08, 2017 at 05:15:57PM +0100, Michal Hocko wrote:
> From: Michal Hocko
>
> do_pages_move is supposed to move user-defined memory (an array of
> addresses) to user-defined NUMA nodes (an array of nodes, one for
> each address). The user-provided status array then contains the
> resulting NUMA node for each address, or an error. The semantics of
> this function are a little confusing, because only some errors are
> reported back. Notably, a migrate_pages error is only reported via
> the return value. This patch doesn't try to address these semantic
> nuances, but rather changes the underlying implementation.
>
> Currently we process the user input (which can be really large)
> in batches which are stored in a temporarily allocated page.
> Each address is resolved to its struct page and stored in a
> page_to_node structure along with the requested target NUMA node. The
> array of these structures is then conveyed down the page migration
> path via the private argument. new_page_node then finds the
> corresponding structure and allocates the proper target page.
>
> What is the problem with the current implementation, and why change
> it? Apart from being quite ugly, it also doesn't cope with unexpected
> pages showing up on the migration list inside the migrate_pages path.
> That doesn't happen currently, but a follow-up patch would like to
> make the THP migration code clearer, and that would need to split a
> THP onto the list in some cases.
>
> How does the new implementation work? Instead of batching into a
> fixed-size array, we simply batch all pages that should be migrated
> to the same node and isolate all of them into a linked list, which
> doesn't require any additional storage. This should work reasonably
> well, because page migration usually migrates larger ranges of memory
> to a specific node. So the common case should work as well as the
> current implementation. Even if somebody constructs an input where
> the target NUMA nodes are interleaved, we shouldn't see a large
> performance impact, because page migration alone doesn't really
> benefit from batching. mmap_sem batching for the lookup is quite
> questionable, and isolate_lru_page, which would benefit from
> batching, doesn't use it even in the current implementation.
>
> Signed-off-by: Michal Hocko
> ---
>  mm/internal.h  |   1 +
>  mm/mempolicy.c |   5 +-
>  mm/migrate.c   | 306 +++++++++++++++++++++++++--------------------------
>  3 files changed, 139 insertions(+), 173 deletions(-)

The approach looks fine to me. But the patch is rather large and hard
to review. And the way git mixed the added and removed lines doesn't
help either. Any chance of splitting it up further?

One nitpick: I don't think the 'chunk' terminology should go away with
this patch.
-- 
 Kirill A. Shutemov