From: Huang Ying
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Huang Ying, Zi Yan,
    Yang Shi, Baolin Wang, Oscar Salvador, Matthew Wilcox
Subject: [RFC 4/6] mm/migrate_pages: batch _unmap and _move
Date: Wed, 21 Sep 2022 14:06:14 +0800
Message-Id: <20220921060616.73086-5-ying.huang@intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220921060616.73086-1-ying.huang@intel.com>
References: <20220921060616.73086-1-ying.huang@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

In this patch, the _unmap and _move stages of page migration are
batched.  That is, previously the flow was:

  for each page
    _unmap()
    _move()

Now it is:

  for each page
    _unmap()
  for each page
    _move()

Based on this, we can batch the TLB flushing and use a hardware
accelerator to copy pages between the batched _unmap and the batched
_move stages.
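To make the control-flow change concrete, here is a minimal userspace
sketch (illustration only, not kernel code: unmap_one(), move_one(),
flush_tlb_batched(), migrate_unbatched() and migrate_batched() are
hypothetical stand-ins, not functions from mm/migrate.c):

#include <stdio.h>

#define NR_PAGES 4

static void unmap_one(int page) { printf("unmap page %d\n", page); }
static void move_one(int page)  { printf("move page %d\n", page); }
static void flush_tlb_batched(void)
{
	printf("one TLB flush for the whole batch\n");
}

/* Previous flow: unmap and move each page individually. */
static void migrate_unbatched(void)
{
	for (int page = 0; page < NR_PAGES; page++) {
		unmap_one(page);
		move_one(page);	/* TLB flush + copy per page */
	}
}

/* New flow: unmap all pages first, then move them as a batch. */
static void migrate_batched(void)
{
	for (int page = 0; page < NR_PAGES; page++)
		unmap_one(page);

	flush_tlb_batched();	/* single flush covers all unmapped pages */

	for (int page = 0; page < NR_PAGES; page++)
		move_one(page);	/* copies could be offloaded to an accelerator */
}

int main(void)
{
	puts("-- unbatched --");
	migrate_unbatched();
	puts("-- batched --");
	migrate_batched();
	return 0;
}

The second form is what opens the window between the batched _unmap and
the batched _move in which one TLB flush and one batched copy can
replace the per-page flush and copy.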
Signed-off-by: "Huang, Ying"
Cc: Zi Yan
Cc: Yang Shi
Cc: Baolin Wang
Cc: Oscar Salvador
Cc: Matthew Wilcox
---
 mm/migrate.c | 155 +++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 139 insertions(+), 16 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 1077af858e36..165cbbc834e2 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -996,6 +996,32 @@ static void __migrate_page_extract(struct page *newpage,
 
 #define MIGRATEPAGE_UNMAP	1
 
+static void migrate_page_undo_page(struct page *page,
+				   int page_was_mapped,
+				   struct anon_vma *anon_vma,
+				   struct list_head *ret)
+{
+	struct folio *folio = page_folio(page);
+
+	if (page_was_mapped)
+		remove_migration_ptes(folio, folio, false);
+	if (anon_vma)
+		put_anon_vma(anon_vma);
+	unlock_page(page);
+	list_move_tail(&page->lru, ret);
+}
+
+static void migrate_page_undo_newpage(struct page *newpage,
+				      free_page_t put_new_page,
+				      unsigned long private)
+{
+	unlock_page(newpage);
+	if (put_new_page)
+		put_new_page(newpage, private);
+	else
+		put_page(newpage);
+}
+
 static int __migrate_page_unmap(struct page *page, struct page *newpage,
 				int force, enum migrate_mode mode)
 {
@@ -1140,6 +1166,8 @@ static int __migrate_page_move(struct page *page, struct page *newpage,
 
 	rc = move_to_new_folio(dst, folio, mode);
 
+	if (rc != -EAGAIN)
+		list_del(&newpage->lru);
 	/*
 	 * When successful, push newpage to LRU immediately: so that if it
 	 * turns out to be an mlocked page, remove_migration_ptes() will
@@ -1155,6 +1183,11 @@ static int __migrate_page_move(struct page *page, struct page *newpage,
 		lru_add_drain();
 	}
 
+	if (rc == -EAGAIN) {
+		__migrate_page_record(newpage, page_was_mapped, anon_vma);
+		return rc;
+	}
+
 	if (page_was_mapped)
 		remove_migration_ptes(folio,
 			rc == MIGRATEPAGE_SUCCESS ? dst : folio, false);
@@ -1220,6 +1253,7 @@ static int migrate_page_unmap(new_page_t get_new_page, free_page_t put_new_page,
 		return -ENOMEM;
 	*newpagep = newpage;
 
+	newpage->private = 0;
 	rc = __migrate_page_unmap(page, newpage, force, mode);
 	if (rc == MIGRATEPAGE_UNMAP)
 		return rc;
@@ -1258,7 +1292,7 @@ static int migrate_page_move(free_page_t put_new_page, unsigned long private,
 		 * removed and will be freed. A page that has not been
 		 * migrated will have kept its references and be restored.
 		 */
-		list_del(&page->lru);
+		list_del_init(&page->lru);
 	}
 
 	/*
@@ -1268,9 +1302,8 @@ static int migrate_page_move(free_page_t put_new_page, unsigned long private,
 	 */
 	if (rc == MIGRATEPAGE_SUCCESS) {
 		migrate_page_done(page, reason);
-	} else {
-		if (rc != -EAGAIN)
-			list_add_tail(&page->lru, ret);
+	} else if (rc != -EAGAIN) {
+		list_add_tail(&page->lru, ret);
 
 		if (put_new_page)
 			put_new_page(newpage, private);
@@ -1455,11 +1488,13 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 	int pass = 0;
 	bool is_thp = false;
 	struct page *page;
-	struct page *newpage = NULL;
+	struct page *newpage = NULL, *newpage2;
 	struct page *page2;
 	int rc, nr_subpages;
 	LIST_HEAD(ret_pages);
 	LIST_HEAD(thp_split_pages);
+	LIST_HEAD(unmap_pages);
+	LIST_HEAD(new_pages);
 	bool nosplit = (reason == MR_NUMA_MISPLACED);
 	bool no_subpage_counting = false;
 
@@ -1541,19 +1576,19 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 			nr_subpages = compound_nr(page);
 			cond_resched();
 
-			if (PageHuge(page))
+			if (PageHuge(page)) {
+				list_move_tail(&page->lru, &ret_pages);
 				continue;
+			}
 
 			rc = migrate_page_unmap(get_new_page, put_new_page, private,
 						page, &newpage, pass > 2, mode,
 						reason, &ret_pages);
-			if (rc == MIGRATEPAGE_UNMAP)
-				rc = migrate_page_move(put_new_page, private,
-						       page, newpage, mode,
-						       reason, &ret_pages);
 			/*
 			 * The rules are:
 			 *	Success: page will be freed
+			 *	Unmap: page will be put on unmap_pages list,
+			 *	       new page put on new_pages list
 			 *	-EAGAIN: stay on the from list
 			 *	-ENOMEM: stay on the from list
 			 *	-ENOSYS: stay on the from list
@@ -1589,7 +1624,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 			case -ENOMEM:
 				/*
 				 * When memory is low, don't bother to try to migrate
-				 * other pages, just exit.
+				 * other pages, move unmapped pages, then exit.
 				 */
 				if (is_thp) {
 					nr_thp_failed++;
@@ -1610,9 +1645,11 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 				 * the caller otherwise the page refcnt will be leaked.
 				 */
 				list_splice_init(&thp_split_pages, from);
-				/* nr_failed isn't updated for not used */
 				nr_thp_failed += thp_retry;
-				goto out;
+				if (list_empty(&unmap_pages))
+					goto out;
+				else
+					goto move;
 			case -EAGAIN:
 				if (is_thp)
 					thp_retry++;
@@ -1625,6 +1662,10 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 				if (is_thp)
 					nr_thp_succeeded++;
 				break;
+			case MIGRATEPAGE_UNMAP:
+				list_move_tail(&page->lru, &unmap_pages);
+				list_add_tail(&newpage->lru, &new_pages);
+				break;
 			default:
 				/*
 				 * Permanent failure (-EBUSY, etc.):
@@ -1645,12 +1686,96 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 	nr_failed += retry;
 	nr_thp_failed += thp_retry;
 	nr_failed_pages += nr_retry_pages;
+move:
+	retry = 1;
+	thp_retry = 1;
+	for (pass = 0; pass < 10 && (retry || thp_retry); pass++) {
+		retry = 0;
+		thp_retry = 0;
+		nr_retry_pages = 0;
+
+		newpage = list_first_entry(&new_pages, struct page, lru);
+		newpage2 = list_next_entry(newpage, lru);
+		list_for_each_entry_safe(page, page2, &unmap_pages, lru) {
+			/*
+			 * THP statistics is based on the source huge page.
+			 * Capture required information that might get lost
+			 * during migration.
+			 */
+			is_thp = PageTransHuge(page) && !PageHuge(page);
+			nr_subpages = compound_nr(page);
+			cond_resched();
+
+			rc = migrate_page_move(put_new_page, private,
+					       page, newpage, mode,
+					       reason, &ret_pages);
+			/*
+			 * The rules are:
+			 *	Success: page will be freed
+			 *	-EAGAIN: stay on the unmap_pages list
+			 *	Other errno: put on ret_pages list then splice to
+			 *		     from list
+			 */
+			switch(rc) {
+			case -EAGAIN:
+				if (is_thp)
+					thp_retry++;
+				else if (!no_subpage_counting)
+					retry++;
+				nr_retry_pages += nr_subpages;
+				break;
+			case MIGRATEPAGE_SUCCESS:
+				nr_succeeded += nr_subpages;
+				if (is_thp)
+					nr_thp_succeeded++;
+				break;
+			default:
+				/*
+				 * Permanent failure (-EBUSY, etc.):
+				 * unlike -EAGAIN case, the failed page is
+				 * removed from migration page list and not
+				 * retried in the next outer loop.
+				 */
+				if (is_thp)
+					nr_thp_failed++;
+				else if (!no_subpage_counting)
+					nr_failed++;
+
+				nr_failed_pages += nr_subpages;
+				break;
+			}
+			newpage = newpage2;
+			newpage2 = list_next_entry(newpage, lru);
+		}
+	}
+	nr_failed += retry;
+	nr_thp_failed += thp_retry;
+	nr_failed_pages += nr_retry_pages;
+
+	rc = nr_failed + nr_thp_failed;
+out:
+	/* Cleanup remaining pages */
+	newpage = list_first_entry(&new_pages, struct page, lru);
+	newpage2 = list_next_entry(newpage, lru);
+	list_for_each_entry_safe(page, page2, &unmap_pages, lru) {
+		int page_was_mapped = 0;
+		struct anon_vma *anon_vma = NULL;
+
+		__migrate_page_extract(newpage, &page_was_mapped, &anon_vma);
+		migrate_page_undo_page(page, page_was_mapped, anon_vma,
+				       &ret_pages);
+		list_del(&newpage->lru);
+		migrate_page_undo_newpage(newpage, put_new_page, private);
+		newpage = newpage2;
+		newpage2 = list_next_entry(newpage, lru);
+	}
+
 	/*
 	 * Try to migrate subpages of fail-to-migrate THPs, no nr_failed
 	 * counting in this round, since all subpages of a THP is counted
 	 * as 1 failure in the first round.
 	 */
-	if (!list_empty(&thp_split_pages)) {
+	if (rc >= 0 && !list_empty(&thp_split_pages)) {
 		/*
 		 * Move non-migrated pages (after 10 retries) to ret_pages
 		 * to avoid migrating them again.
@@ -1662,8 +1787,6 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
 		goto thp_subpage_migration;
 	}
 
-	rc = nr_failed + nr_thp_failed;
-out:
 	/*
 	 * Put the permanent failure page back to migration list, they
 	 * will be put back to the right list by the caller.
-- 
2.35.1