Subject: Re: [PATCH v2 00/12] mm: page migration enhancement for thp
To: Zi Yan, Naoya Horiguchi
References: <1478561517-4317-1-git-send-email-n-horiguchi@ah.jp.nec.com>
 <5822FB60.5040905@linux.vnet.ibm.com>
 <20161109235223.GA31285@hori1.linux.bs1.fc.nec.co.jp>
Cc: "linux-mm@kvack.org", "Kirill A. Shutemov", Hugh Dickins,
 Andrew Morton, Dave Hansen, Andrea Arcangeli, Mel Gorman, Michal Hocko,
 Vlastimil Babka, Pavel Emelyanov, Balbir Singh,
 "linux-kernel@vger.kernel.org", Naoya Horiguchi
From: Anshuman Khandual
Date: Fri, 11 Nov 2016 09:18:44 +0530
Message-Id: <58253F9C.6040307@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/10/2016 07:31 PM, Zi Yan wrote:
> On 9 Nov 2016, at 18:52, Naoya Horiguchi wrote:
>
>> Hi Anshuman,
>>
>> On Wed, Nov 09, 2016 at 04:03:04PM +0530, Anshuman Khandual wrote:
>>> On 11/08/2016 05:01 AM, Naoya Horiguchi wrote:
>>>> Hi everyone,
>>>>
>>>> I've updated thp migration patches for v4.9-rc2-mmotm-2016-10-27-18-27
>>>> with feedbacks for ver.1.
>>>>
>>>> General description (no change since ver.1)
>>>> ===========================================
>>>>
>>>> This patchset enhances page migration functionality to handle thp migration
>>>> for various page migration's callers:
>>>> - mbind(2)
>>>> - move_pages(2)
>>>> - migrate_pages(2)
>>>> - cgroup/cpuset migration
>>>> - memory hotremove
>>>> - soft offline
>>>>
>>>> The main benefit is that we can avoid unnecessary thp splits, which helps us
>>>> avoid performance decrease when your applications handles NUMA optimization on
>>>> their own.
>>>>
>>>> The implementation is similar to that of normal page migration, the key point
>>>> is that we modify a pmd to a pmd migration entry in swap-entry like format.
>>>
>>> Will it be better to have new THP_MIGRATE_SUCCESS and THP_MIGRATE_FAIL
>>> VM events to capture how many times the migration worked without first
>>> splitting the huge page and how many time it did not work ?
>>
>> Thank you for the suggestion.
>> I think that's helpful, so will try it in next version.
>>
>>> Also do you
>>> have a test case which demonstrates this THP migration and kind of shows
>>> its better than the present split and move method ?
>>
>> I don't have test cases which compare thp migration and split-then-migration
>> with some numbers. Maybe measuring/comparing the overhead of migration is
>> a good start point, although I think the real benefit of thp migration comes
>> from workload "after migration" by avoiding thp split.
>
> Migrating 4KB pages has much lower (~1/3) throughput than 2MB pages.

I assume the 2MB throughput you mentioned is with this THP migration
feature enabled.

>
> What I get is that on average it takes 1987.38 us to migrate 512 4KB pages and
> 658.54 us to migrate 1 2MB page.
>
> I did the test in a two-socket Intel Xeon E5-2640v4 box. I used migrate_pages()
> system call to migrate pages. MADV_NOHUGEPAGE and MADV_HUGEPAGE are used to
> make 4KB and 2MB pages and each page’s flags are checked to make sure the page
> size is 4KB or 2MB THP.
>
> There is no split page. But the page migration time already tells the story.

Right. Just wondering if we can add a test case which measures just
this migration time improvement by avoiding the split not the TLB
based improvement which the workload will receive as an addition.
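
As a quick sanity check on the "~1/3" figure quoted above, the two timings can be turned into throughput numbers. This is only a minimal arithmetic sketch, not part of the patchset or of any test case discussed in this thread. The two timings are the ones reported above; the helper name and the choice of MiB/s as the unit are assumptions made for illustration.

```python
# Timings quoted in this thread (microseconds); both move 2 MiB in total.
BASE_PAGE_BYTES = 4 * 1024
THP_BYTES = 2 * 1024 * 1024
PAGES_PER_THP = THP_BYTES // BASE_PAGE_BYTES  # number of 4KB pages per 2MB THP

T_BASE_US = 1987.38  # time to migrate 512 x 4KB pages
T_THP_US = 658.54    # time to migrate 1 x 2MB page


def throughput_mib_per_s(nbytes, elapsed_us):
    """Hypothetical helper: MiB/s for nbytes moved in elapsed_us microseconds."""
    return (nbytes / (1024 * 1024)) / (elapsed_us / 1e6)


base_tput = throughput_mib_per_s(THP_BYTES, T_BASE_US)
thp_tput = throughput_mib_per_s(THP_BYTES, T_THP_US)

# Same amount of data in both cases, so this equals T_THP_US / T_BASE_US.
ratio = base_tput / thp_tput

# Average cost of migrating a single 4KB page in the split case.
per_base_page_us = T_BASE_US / PAGES_PER_THP

print(f"4KB pages: {base_tput:.1f} MiB/s, {per_base_page_us:.2f} us per page")
print(f"2MB THP:   {thp_tput:.1f} MiB/s")
print(f"4KB/2MB throughput ratio: {ratio:.3f}")
```

The ratio covers migration time only, which is the quantity the proposed test case would isolate; any TLB benefit to the workload after migration is not reflected in it.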