From: "Huang, Ying" <ying.huang@intel.com>
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying,
    David Hildenbrand, Mel Gorman, Vlastimil Babka, Zi Yan,
    Michal Hocko, Peter Zijlstra, Dave Hansen, Minchan Kim,
    Johannes Weiner, Hugh Dickins
Subject: [RFC 3/3] mm: Discard lazily freed pages when migrating
Date: Fri, 28 Feb 2020 11:38:19 +0800
Message-Id: <20200228033819.3857058-4-ying.huang@intel.com>
X-Mailer: git-send-email 2.25.0
In-Reply-To: <20200228033819.3857058-1-ying.huang@intel.com>
References: <20200228033819.3857058-1-ying.huang@intel.com>

From: Huang Ying <ying.huang@intel.com>

MADV_FREE is a lazy free mechanism in Linux.  According to the
madvise(2) manpage, the semantics of MADV_FREE are:

    The application no longer requires the pages in the range
    specified by addr and len.  The kernel can thus free these
    pages, but the freeing could be delayed until memory pressure
    occurs. ...

Originally, pages freed lazily with MADV_FREE are only truly freed
by page reclaim, either when there is memory pressure or when the
address range is unmapped.  In addition to that, there is another
opportunity to truly free these pages: when we try to migrate them.
The main value of doing so is to avoid creating new memory pressure
immediately, where possible.  Even if the pages are required again,
they will be allocated gradually, on demand.  That is, the memory
will be allocated lazily when necessary.  This follows the common
philosophy in the Linux kernel: allocate resources lazily, on demand.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: David Hildenbrand
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: Zi Yan
Cc: Michal Hocko
Cc: Peter Zijlstra
Cc: Dave Hansen
Cc: Minchan Kim
Cc: Johannes Weiner
Cc: Hugh Dickins
---
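As a quick illustration of the intended behavior (a sketch, not part
of the patch): the program below maps an anonymous buffer, marks it
MADV_FREE, then forces the range to migrate with mbind().  With this
series applied, the clean lazily freed pages should be discarded
during migration instead of being copied.  It assumes a kernel with
MADV_FREE support (v4.5+), a machine with at least two NUMA nodes,
and the libnuma development headers; the buffer size and target node
are arbitrary choices for the example.

/*
 * Illustration only: lazily free a buffer, then force it to migrate
 * to NUMA node 1.  Build with: gcc -o lazyfree-migrate demo.c -lnuma
 */
#include <string.h>
#include <sys/mman.h>
#include <numaif.h>			/* mbind(), from libnuma headers */

#define LEN	(16UL << 20)		/* arbitrary 16MB buffer */

int main(void)
{
	unsigned long nodemask = 1UL << 1;	/* assumed target: node 1 */
	char *buf = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return 1;

	memset(buf, 1, LEN);	/* populate: pages become dirty anon */

	/* Contents no longer needed; the kernel may reclaim lazily. */
	if (madvise(buf, LEN, MADV_FREE))
		return 1;

	/*
	 * Force migration of the range.  Without this patch every
	 * resident page is copied to node 1; with it, the clean
	 * lazyfree pages are discarded and refaulted on demand.
	 */
	if (mbind(buf, LEN, MPOL_BIND, &nodemask,
		  8 * sizeof(nodemask), MPOL_MF_MOVE))
		return 1;

	return 0;
}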
 include/linux/migrate.h |  4 ++++
 mm/huge_memory.c        | 20 +++++++++++++++-----
 mm/migrate.c            | 16 +++++++++++++++-
 mm/rmap.c               | 10 ++++++++++
 4 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 72120061b7d4..2c6cf985a8d3 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -14,8 +14,12 @@ typedef void free_page_t(struct page *page, unsigned long private);
  * Return values from addresss_space_operations.migratepage():
  * - negative errno on page migration failure;
  * - zero on page migration success;
+ *
+ * __unmap_and_move() can also return 1 to indicate the page can be
+ * discarded instead of migrated.
  */
 #define MIGRATEPAGE_SUCCESS		0
+#define MIGRATEPAGE_DISCARD		1
 
 enum migrate_reason {
 	MR_COMPACTION,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b1e069e68189..b64f356ab77e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3063,11 +3063,21 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
 	pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
 	if (pmd_dirty(pmdval))
 		set_page_dirty(page);
-	entry = make_migration_entry(page, pmd_write(pmdval));
-	pmdswp = swp_entry_to_pmd(entry);
-	if (pmd_soft_dirty(pmdval))
-		pmdswp = pmd_swp_mksoft_dirty(pmdswp);
-	set_pmd_at(mm, address, pvmw->pmd, pmdswp);
+	/* Clean lazyfree page, discard instead of migrate */
+	if (PageLazyFree(page) && !PageDirty(page)) {
+		pmd_clear(pvmw->pmd);
+		zap_deposited_table(mm, pvmw->pmd);
+		/* Invalidate as we cleared the pmd */
+		mmu_notifier_invalidate_range(mm, address,
+					      address + HPAGE_PMD_SIZE);
+		add_mm_counter(mm, MM_ANONPAGES, -HPAGE_PMD_NR);
+	} else {
+		entry = make_migration_entry(page, pmd_write(pmdval));
+		pmdswp = swp_entry_to_pmd(entry);
+		if (pmd_soft_dirty(pmdval))
+			pmdswp = pmd_swp_mksoft_dirty(pmdswp);
+		set_pmd_at(mm, address, pvmw->pmd, pmdswp);
+	}
 	page_remove_rmap(page, true);
 	put_page(page);
 }
diff --git a/mm/migrate.c b/mm/migrate.c
index 981f8374a6ef..b7e7d18af94c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1122,6 +1122,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 			goto out_unlock_both;
 		}
 		page_was_mapped = 1;
+		/* Clean lazyfree page, discard instead of migrate */
+		if (PageLazyFree(page) && !PageDirty(page)) {
+			rc = MIGRATEPAGE_DISCARD;
+			goto out_unlock_both;
+		}
 	}
 
 	if (!page_mapped(page))
@@ -1242,7 +1247,16 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 			num_poisoned_pages_inc();
 		}
 	} else {
-		if (rc != -EAGAIN) {
+		/*
+		 * If the page is discarded instead of migrated, release
+		 * the reference grabbed during isolation and free the
+		 * new page.  For the caller, this is the same as
+		 * migrating successfully.
+		 */
+		if (rc == MIGRATEPAGE_DISCARD) {
+			put_page(page);
+			rc = MIGRATEPAGE_SUCCESS;
+		} else if (rc != -EAGAIN) {
 			if (likely(!__PageMovable(page))) {
 				putback_lru_page(page);
 				goto put_new;
diff --git a/mm/rmap.c b/mm/rmap.c
index 1dcbb1771dd7..bb52883f7b2d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1569,6 +1569,16 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_entry_t entry;
 			pte_t swp_pte;
 
+			/* Clean lazyfree page, discard instead of migrate */
+			if (PageLazyFree(page) && !PageDirty(page) &&
+			    !(flags & TTU_SPLIT_FREEZE)) {
+				/* Invalidate as we cleared the pte */
+				mmu_notifier_invalidate_range(mm,
+					address, address + PAGE_SIZE);
+				dec_mm_counter(mm, MM_ANONPAGES);
+				goto discard;
+			}
+
 			if (arch_unmap_one(mm, vma, address, pteval) < 0) {
 				set_pte_at(mm, address, pvmw.pte, pteval);
 				ret = false;
-- 
2.25.0