From: "Aneesh Kumar K.V"
Subject: Re: [PATCH] mm, thp: Fix mlocking THP page with migration enabled
To: "Kirill A. Shutemov", Andrew Morton
Cc: Vegard Nossum, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    stable@vger.kernel.org, Zi Yan, Naoya Horiguchi, Vlastimil Babka,
    Andrea Arcangeli
References: <20180911103403.38086-1-kirill.shutemov@linux.intel.com>
Message-ID: <6fcb5b5b-43fa-f1d0-ce78-37fb51b46a75@linux.ibm.com>
Date: Wed, 12 Sep 2018 15:28:24 +0530
In-Reply-To: <20180911103403.38086-1-kirill.shutemov@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 9/11/18 4:04 PM, Kirill A. Shutemov wrote:
> A transparent huge page is represented by a single entry on an LRU list.
> Therefore, we can only make unevictable an entire compound page, not
> individual subpages.
>
> If a user tries to mlock() part of a huge page, we want the rest of the
> page to be reclaimable.
>
> We handle this by keeping PTE-mapped huge pages on normal LRU lists: the
> PMD on border of VM_LOCKED VMA will be split into PTE table.
>
> Introduction of THP migration breaks the rules around mlocking THP
> pages. If we had a single PMD mapping of the page in mlocked VMA, the
> page will get mlocked, regardless of PTE mappings of the page.
>
> For tmpfs/shmem it's easy to fix by checking PageDoubleMap() in
> remove_migration_pmd().
>
> Anon THP pages can only be shared between processes via fork(). Mlocked
> page can only be shared if parent mlocked it before forking, otherwise
> CoW will be triggered on mlock().
>
> For Anon-THP, we can fix the issue by munlocking the page on removing PTE
> migration entry for the page. PTEs for the page will always come after
> mlocked PMD: rmap walks VMAs from oldest to newest.
>
> Test-case:
>
> 	#include <unistd.h>
> 	#include <sys/mman.h>
> 	#include <sys/wait.h>
> 	#include <linux/mempolicy.h>
> 	#include <numaif.h>
>
> 	int main(void)
> 	{
> 		unsigned long nodemask = 4;
> 		void *addr;
>
> 		addr = mmap((void *)0x20000000UL, 2UL << 20, PROT_READ | PROT_WRITE,
> 			MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
>
> 		if (fork()) {
> 			wait(NULL);
> 			return 0;
> 		}
>
> 		mlock(addr, 4UL << 10);
> 		mbind(addr, 2UL << 20, MPOL_PREFERRED | MPOL_F_RELATIVE_NODES,
> 			&nodemask, 4, MPOL_MF_MOVE | MPOL_MF_MOVE_ALL);
>
> 		return 0;
> 	}
>
> Signed-off-by: Kirill A. Shutemov
> Reported-by: Vegard Nossum
> Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
> Cc: [v4.14+]
> Cc: Zi Yan
> Cc: Naoya Horiguchi
> Cc: Vlastimil Babka
> Cc: Andrea Arcangeli
> ---
>  mm/huge_memory.c | 2 +-
>  mm/migrate.c     | 3 +++
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 533f9b00147d..00704060b7f7 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2931,7 +2931,7 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
>  	else
>  		page_add_file_rmap(new, true);
>  	set_pmd_at(mm, mmun_start, pvmw->pmd, pmde);
> -	if (vma->vm_flags & VM_LOCKED)
> +	if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new))
>  		mlock_vma_page(new);
>  	update_mmu_cache_pmd(vma, address, pvmw->pmd);
>  }
> diff --git a/mm/migrate.c b/mm/migrate.c
> index d6a2e89b086a..01dad96b25b5 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -275,6 +275,9 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
>  		if (vma->vm_flags & VM_LOCKED && !PageTransCompound(new))
>  			mlock_vma_page(new);
>
> +		if (PageTransCompound(new) && PageMlocked(page))
> +			clear_page_mlock(page);
> +

Can you explain this more? I am confused by the usage of 'new' and
'page' there. I guess the idea is that if we are removing the migration
PTE at the level 4 table and we find that the backing page is compound,
we don't mark the page Mlocked?

-aneesh
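
To make the question concrete, here is a small userspace sketch of the two
checks as I read them from the hunks above. It is only a toy model, not
kernel code: struct toy_page and the toy_* functions are hypothetical
stand-ins, and it collapses 'new' and 'page' into one object on the
assumption that both refer to the same compound page (the Mlocked flag
being tracked once for the whole THP).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the page state the two hunks consult. */
struct toy_page {
	bool trans_compound;	/* models PageTransCompound() */
	bool double_map;	/* models PageDoubleMap() */
	bool mlocked;		/* models PageMlocked() */
};

/* Models the patched check in remove_migration_pmd(). */
void toy_restore_pmd(struct toy_page *page, bool vma_locked)
{
	if (vma_locked && !page->double_map)
		page->mlocked = true;		/* mlock_vma_page() */
}

/* Models the two checks in the patched remove_migration_pte(). */
void toy_restore_pte(struct toy_page *page, bool vma_locked)
{
	if (vma_locked && !page->trans_compound)
		page->mlocked = true;		/* mlock_vma_page() */

	if (page->trans_compound && page->mlocked)
		page->mlocked = false;		/* clear_page_mlock() */
}

/*
 * Walk order from the commit message: the PMD mapping in the older,
 * VM_LOCKED VMA is restored first, then a PTE mapping in a newer VMA.
 * Returns the final Mlocked state of the toy page.
 */
bool toy_scenario(void)
{
	struct toy_page thp = { .trans_compound = true };

	toy_restore_pmd(&thp, true);
	assert(thp.mlocked);	/* the PMD mapping alone mlocks the page */

	toy_restore_pte(&thp, true);
	return thp.mlocked;
}
```

In this model the THP ends up not Mlocked once a PTE mapping of it is
restored, while a small page in a VM_LOCKED VMA is still mlocked by the
first check in toy_restore_pte().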