From: "Huang, Ying" <ying.huang@intel.com>
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying,
	"Kirill A. Shutemov", Andrea Arcangeli, Michal Hocko,
	Johannes Weiner, Shaohua Li, Hugh Dickins, Minchan Kim,
	Rik van Riel, Dave Hansen, Naoya Horiguchi, Zi Yan
Subject: [PATCH -mm -V3 15/21] mm, THP, swap: Support to copy PMD swap mapping when fork()
Date: Wed, 23 May 2018 16:26:19 +0800
Message-Id: <20180523082625.6897-16-ying.huang@intel.com>
In-Reply-To: <20180523082625.6897-1-ying.huang@intel.com>
References: <20180523082625.6897-1-ying.huang@intel.com>

From: Huang Ying <ying.huang@intel.com>

During fork(), the page table needs to be copied from the parent to the
child.  A PMD swap mapping needs to be copied too, and the swap
reference count needs to be increased.

When the huge swap cluster has already been split, we need to split the
PMD swap mapping and fall back to PTE copying.

When swap count continuation fails to allocate a page with GFP_ATOMIC,
we need to unlock the spinlock and try again with GFP_KERNEL.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Naoya Horiguchi
Cc: Zi Yan
---
 mm/huge_memory.c | 72 ++++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 57 insertions(+), 15 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c4eb7737b313..01fdd59fe6d4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -941,6 +941,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	if (unlikely(!pgtable))
 		goto out;
 
+retry:
 	dst_ptl = pmd_lock(dst_mm, dst_pmd);
 	src_ptl = pmd_lockptr(src_mm, src_pmd);
 	spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
@@ -948,26 +949,67 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	ret = -EAGAIN;
 	pmd = *src_pmd;
 
-#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 	if (unlikely(is_swap_pmd(pmd))) {
 		swp_entry_t entry = pmd_to_swp_entry(pmd);
 
-		VM_BUG_ON(!is_pmd_migration_entry(pmd));
-		if (is_write_migration_entry(entry)) {
-			make_migration_entry_read(&entry);
-			pmd = swp_entry_to_pmd(entry);
-			if (pmd_swp_soft_dirty(*src_pmd))
-				pmd = pmd_swp_mksoft_dirty(pmd);
-			set_pmd_at(src_mm, addr, src_pmd, pmd);
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+		if (is_migration_entry(entry)) {
+			if (is_write_migration_entry(entry)) {
+				make_migration_entry_read(&entry);
+				pmd = swp_entry_to_pmd(entry);
+				if (pmd_swp_soft_dirty(*src_pmd))
+					pmd = pmd_swp_mksoft_dirty(pmd);
+				set_pmd_at(src_mm, addr, src_pmd, pmd);
+			}
+			add_mm_counter(dst_mm, MM_ANONPAGES, HPAGE_PMD_NR);
+			mm_inc_nr_ptes(dst_mm);
+			pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
+			set_pmd_at(dst_mm, addr, dst_pmd, pmd);
+			ret = 0;
+			goto out_unlock;
 		}
-		add_mm_counter(dst_mm, MM_ANONPAGES, HPAGE_PMD_NR);
-		mm_inc_nr_ptes(dst_mm);
-		pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
-		set_pmd_at(dst_mm, addr, dst_pmd, pmd);
-		ret = 0;
-		goto out_unlock;
-	}
 #endif
+		if (thp_swap_supported() && !non_swap_entry(entry)) {
+			ret = swap_duplicate(&entry, true);
+			if (!ret) {
+				add_mm_counter(dst_mm, MM_SWAPENTS,
+					       HPAGE_PMD_NR);
+				mm_inc_nr_ptes(dst_mm);
+				pgtable_trans_huge_deposit(dst_mm, dst_pmd,
+							   pgtable);
+				set_pmd_at(dst_mm, addr, dst_pmd, pmd);
+				/* make sure dst_mm is on swapoff's mmlist. */
+				if (unlikely(list_empty(&dst_mm->mmlist))) {
+					spin_lock(&mmlist_lock);
+					if (list_empty(&dst_mm->mmlist))
+						list_add(&dst_mm->mmlist,
+							 &src_mm->mmlist);
+					spin_unlock(&mmlist_lock);
+				}
+			} else if (ret == -ENOTDIR) {
+				/*
+				 * The swap cluster has been split, split the
+				 * pmd map now
+				 */
+				__split_huge_swap_pmd(vma, addr, src_pmd);
+				pte_free(dst_mm, pgtable);
+			} else if (ret == -ENOMEM) {
+				spin_unlock(src_ptl);
+				spin_unlock(dst_ptl);
+				ret = add_swap_count_continuation(entry,
+								  GFP_KERNEL);
+				if (ret < 0) {
+					ret = -ENOMEM;
+					pte_free(dst_mm, pgtable);
+					goto out;
+				}
+				goto retry;
+			} else
+				VM_BUG_ON(1);
+			goto out_unlock;
+		}
+		VM_BUG_ON(1);
+	}
 
 	if (unlikely(!pmd_trans_huge(pmd))) {
 		pte_free(dst_mm, pgtable);
-- 
2.16.1
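
Illustrative note (not part of the patch): in the -ENOMEM branch above, the
swap count continuation page could not be allocated with GFP_ATOMIC while
dst_ptl and src_ptl were held, so the patch drops both page table locks,
retries the allocation with GFP_KERNEL (which may sleep), and restarts the
whole copy from the retry: label.  The sketch below shows the same
lock-drop-and-retry shape as a minimal, self-contained userspace C program;
struct pool, try_reserve() and grow_pool() are hypothetical names used only
to illustrate the control flow, not kernel APIs.

#include <pthread.h>
#include <stdio.h>

/* Hypothetical resource pool guarded by a lock (stand-in for the swap
 * count continuation page allocated under the PMD locks). */
struct pool {
	pthread_mutex_t lock;
	int free_slots;
};

/* Non-blocking reservation; may fail when the pool is empty
 * (analogous to the GFP_ATOMIC attempt).  Caller holds p->lock. */
static int try_reserve(struct pool *p)
{
	if (p->free_slots > 0) {
		p->free_slots--;
		return 0;
	}
	return -1;
}

/* Blocking replenishment; must be called without p->lock held
 * (analogous to retrying the allocation with GFP_KERNEL). */
static int grow_pool(struct pool *p)
{
	pthread_mutex_lock(&p->lock);
	p->free_slots += 4;
	pthread_mutex_unlock(&p->lock);
	return 0;
}

/* Reserve one slot; if the fast path under the lock fails, drop the
 * lock, do the blocking work, then retry the whole locked section. */
static int reserve_slot(struct pool *p)
{
retry:
	pthread_mutex_lock(&p->lock);
	if (try_reserve(p)) {
		pthread_mutex_unlock(&p->lock);
		if (grow_pool(p))
			return -1;
		goto retry;
	}
	pthread_mutex_unlock(&p->lock);
	return 0;
}

int main(void)
{
	struct pool p = { PTHREAD_MUTEX_INITIALIZER, 0 };

	if (reserve_slot(&p) == 0)
		printf("slot reserved, %d slots left\n", p.free_slots);
	return 0;
}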