From: Zi Yan
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Dave Hansen, Michal Hocko, "Kirill A . Shutemov", Andrew Morton,
    Vlastimil Babka, Mel Gorman, John Hubbard, Mark Hairgrove,
    Nitin Gupta, David Nellans, Zi Yan
Subject: [RFC PATCH 30/31] mm: mem_defrag: thp: PMD THP and PUD THP in-place promotion support.
Date: Fri, 15 Feb 2019 14:08:55 -0800
Message-Id: <20190215220856.29749-31-zi.yan@sent.com>
In-Reply-To: <20190215220856.29749-1-zi.yan@sent.com>
References: <20190215220856.29749-1-zi.yan@sent.com>
Reply-To: ziy@nvidia.com

From: Zi Yan

PMD THPs will get PMD page table entry promotion as well. PUD THPs only
get PUD page table entry promotion when the toggle is on, which is off
by default, since 1GB THPs do not perform well due to the shortage of
1GB TLB entries.

Signed-off-by: Zi Yan
---
 mm/mem_defrag.c | 79 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 73 insertions(+), 6 deletions(-)

diff --git a/mm/mem_defrag.c b/mm/mem_defrag.c
index 4d458b125c95..d7a579924d12 100644
--- a/mm/mem_defrag.c
+++ b/mm/mem_defrag.c
@@ -56,6 +56,7 @@ struct defrag_result_stats {
 	unsigned long dst_non_lru_failed;
 	unsigned long dst_non_moveable_failed;
 	unsigned long not_defrag_vpn;
+	unsigned int aligned_max_order;
 };
 
 enum {
@@ -689,6 +690,10 @@ int defrag_address_range(struct mm_struct *mm, struct vm_area_struct *vma,
 
 		page_size = get_contig_page_size(scan_page);
 
+		if (compound_order(compound_head(scan_page)) == HPAGE_PUD_ORDER) {
+			defrag_stats->aligned_max_order = HPAGE_PUD_ORDER;
+			goto quit_defrag;
+		}
 		/* PTE-mapped THP not allowed */
 		if ((scan_page == compound_head(scan_page)) &&
 			PageTransHuge(scan_page) && !PageHuge(scan_page))
@@ -714,6 +719,8 @@ int defrag_address_range(struct mm_struct *mm, struct vm_area_struct *vma,
 			/* already in the contiguous pos */
 			if (page_dist == (long long)(scan_page - anchor_page)) {
 				defrag_stats->aligned += (page_size/PAGE_SIZE);
+				defrag_stats->aligned_max_order = max(defrag_stats->aligned_max_order,
+					compound_order(scan_page));
 				continue;
 			} else { /* migrate pages according to the anchor pages in the vma. */
 				struct page *dest_page = anchor_page + page_dist;
@@ -901,6 +908,10 @@ int defrag_address_range(struct mm_struct *mm, struct vm_area_struct *vma,
 			} else {
 				/* exchange */
 				int err = -EBUSY;
+				if (compound_order(compound_head(dest_page)) == HPAGE_PUD_ORDER) {
+					defrag_stats->aligned_max_order = HPAGE_PUD_ORDER;
+					goto quit_defrag;
+				}
 				/* PTE-mapped THP not allowed */
 				if ((dest_page == compound_head(dest_page)) &&
 					PageTransHuge(dest_page) && !PageHuge(dest_page))
@@ -1486,10 +1497,13 @@ static int kmem_defragd_scan_mm(struct defrag_scan_control *sc)
 				up_read(&vma->vm_mm->mmap_sem);
 			} else if (sc->action == MEM_DEFRAG_DO_DEFRAG) {
 				/* go to nearest 1GB aligned address */
+				unsigned long defrag_begin = *scan_address;
 				unsigned long defrag_end = min_t(unsigned long,
 					(*scan_address + HPAGE_PUD_SIZE) & HPAGE_PUD_MASK,
 					vend);
 				int defrag_result;
+				int nr_fails_in_1gb_range = 0;
+				int skip_promotion = 0;
 
 				anchor_node = get_anchor_page_node_from_vma(vma,
 					*scan_address);
@@ -1583,14 +1597,47 @@ static int kmem_defragd_scan_mm(struct defrag_scan_control *sc)
 				 * skip the page which cannot be defragged and restart
 				 * from the next page
 				 */
-				if (defrag_stats.not_defrag_vpn &&
-					defrag_stats.not_defrag_vpn < defrag_sub_chunk_end) {
+				if (defrag_stats.not_defrag_vpn) {
 					VM_BUG_ON(defrag_sub_chunk_end != defrag_end &&
 						defrag_stats.not_defrag_vpn > defrag_sub_chunk_end);
-
-					*scan_address = defrag_stats.not_defrag_vpn;
-					defrag_stats.not_defrag_vpn = 0;
-					goto continue_defrag;
+					find_anchor_pages_in_vma(mm, vma, defrag_stats.not_defrag_vpn);
+
+					nr_fails_in_1gb_range += 1;
+					if (defrag_stats.not_defrag_vpn < defrag_sub_chunk_end) {
+						/* reset and continue */
+						*scan_address = defrag_stats.not_defrag_vpn;
+						defrag_stats.not_defrag_vpn = 0;
+						goto continue_defrag;
+					}
+				} else {
+					/* defrag works for the whole chunk,
+					 * promote to THP in place
+					 */
+					if (!defrag_result &&
+						/* skip existing THPs */
+						defrag_stats.aligned_max_order < HPAGE_PMD_ORDER &&
+						!(*scan_address & (HPAGE_PMD_SIZE-1)) &&
+						!(defrag_sub_chunk_end & (HPAGE_PMD_SIZE-1))) {
+						int ret = 0;
+						/* find a range to promote pmd */
+						down_write(&mm->mmap_sem);
+						ret = promote_huge_page_address(vma, *scan_address);
+						if (!ret) {
+							/*
+							 * promote to 2MB THP successful, but it is
+							 * still PTE pointed
+							 */
+							/* promote PTE-mapped THP to PMD-mapped */
+							promote_huge_pmd_address(vma, *scan_address);
+						}
+						up_write(&mm->mmap_sem);
+					}
+					/* skip PUD pages */
+					if (defrag_stats.aligned_max_order == HPAGE_PUD_ORDER) {
+						*scan_address = defrag_end;
+						skip_promotion = 1;
+						continue;
+					}
 				}
 
 				/* Done with current 2MB chunk */
@@ -1606,6 +1653,26 @@ static int kmem_defragd_scan_mm(struct defrag_scan_control *sc)
 					}
 				}
 
+				/* defrag works for the whole chunk, promote to PUD THP in place */
+				if (!nr_fails_in_1gb_range &&
+					!skip_promotion && /* avoid existing THP */
+					!(defrag_begin & (HPAGE_PUD_SIZE-1)) &&
+					!(defrag_end & (HPAGE_PUD_SIZE-1))) {
+					int ret = 0;
+					/* find a range to promote pud */
+					down_write(&mm->mmap_sem);
+					ret = promote_huge_pud_page_address(vma, defrag_begin);
+					if (!ret) {
+						/*
+						 * promote to 1GB THP successful, but it is
+						 * still PMD pointed
+						 */
+						/* promote PMD-mapped THP to PUD-mapped */
+						if (mem_defrag_promote_1gb_thp)
+							promote_huge_pud_address(vma, defrag_begin);
+					}
+					up_write(&mm->mmap_sem);
+				}
 			}
 		}
 done_one_vma:
-- 
2.20.1