From: Mel Gorman <mgorman@techsingularity.net>
To: Andrew Morton
Cc: NeilBrown, Theodore Ts'o, Andreas Dilger, "Darrick J. Wong",
	Matthew Wilcox, Michal Hocko, Dave Chinner, Rik van Riel,
	Vlastimil Babka, Johannes Weiner, Jonathan Corbet,
	Linux-MM, Linux-fsdevel, LKML, Mel Gorman
Subject: [PATCH 2/8] mm/vmscan: Throttle reclaim and compaction when too many pages are isolated
Date: Fri, 22 Oct 2021 15:46:45 +0100
Message-Id: <20211022144651.19914-3-mgorman@techsingularity.net>
In-Reply-To: <20211022144651.19914-1-mgorman@techsingularity.net>
References: <20211022144651.19914-1-mgorman@techsingularity.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Page reclaim throttles on congestion if too many parallel reclaim
instances have isolated too many pages. This makes no sense: excessive
parallelisation has nothing to do with writeback or congestion.

This patch creates an additional wait queue to sleep on when too many
pages are isolated. The throttled tasks are woken when the number of
isolated pages is reduced or a timeout occurs. There may be some false
positive wakeups for GFP_NOIO/GFP_NOFS callers, but the tasks will
throttle again if necessary.

[shy828301@gmail.com: Wake up from compaction context]
[vbabka@suse.cz: Account number of throttled tasks only for writeback]

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/mmzone.h        |  1 +
 include/trace/events/vmscan.h |  4 +++-
 mm/compaction.c               | 10 ++++++++--
 mm/internal.h                 | 11 +++++++++++
 mm/vmscan.c                   | 22 ++++++++++++++++------
 5 files changed, 39 insertions(+), 9 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 419304093610..9ccd8d95291b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -275,6 +275,7 @@ enum lru_list {
 
 enum vmscan_throttle_state {
 	VMSCAN_THROTTLE_WRITEBACK,
+	VMSCAN_THROTTLE_ISOLATED,
 	NR_VMSCAN_THROTTLE,
 };
 
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index c317f9fe0d17..d4905bd9e9c4 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -28,10 +28,12 @@
 	) : "RECLAIM_WB_NONE"
 
 #define _VMSCAN_THROTTLE_WRITEBACK	(1 << VMSCAN_THROTTLE_WRITEBACK)
+#define _VMSCAN_THROTTLE_ISOLATED	(1 << VMSCAN_THROTTLE_ISOLATED)
 
 #define show_throttle_flags(flags)				\
 	(flags) ? __print_flags(flags, "|",			\
__print_flags(flags, "|", \ - {_VMSCAN_THROTTLE_WRITEBACK, "VMSCAN_THROTTLE_WRITEBACK"} \ + {_VMSCAN_THROTTLE_WRITEBACK, "VMSCAN_THROTTLE_WRITEBACK"}, \ + {_VMSCAN_THROTTLE_ISOLATED, "VMSCAN_THROTTLE_ISOLATED"} \ ) : "VMSCAN_THROTTLE_NONE" diff --git a/mm/compaction.c b/mm/compaction.c index bfc93da1c2c7..7359093d8ac0 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -761,6 +761,8 @@ isolate_freepages_range(struct compact_control *cc, /* Similar to reclaim, but different enough that they don't share logic */ static bool too_many_isolated(pg_data_t *pgdat) { + bool too_many; + unsigned long active, inactive, isolated; inactive = node_page_state(pgdat, NR_INACTIVE_FILE) + @@ -770,7 +772,11 @@ static bool too_many_isolated(pg_data_t *pgdat) isolated = node_page_state(pgdat, NR_ISOLATED_FILE) + node_page_state(pgdat, NR_ISOLATED_ANON); - return isolated > (inactive + active) / 2; + too_many = isolated > (inactive + active) / 2; + if (!too_many) + wake_throttle_isolated(pgdat); + + return too_many; } /** @@ -822,7 +828,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, if (cc->mode == MIGRATE_ASYNC) return -EAGAIN; - congestion_wait(BLK_RW_ASYNC, HZ/10); + reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10); if (fatal_signal_pending(current)) return -EINTR; diff --git a/mm/internal.h b/mm/internal.h index b495e60c955d..c72d3383ef34 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -45,6 +45,15 @@ static inline void acct_reclaim_writeback(struct page *page) __acct_reclaim_writeback(pgdat, page, nr_throttled); } +static inline void wake_throttle_isolated(pg_data_t *pgdat) +{ + wait_queue_head_t *wqh; + + wqh = &pgdat->reclaim_wait[VMSCAN_THROTTLE_ISOLATED]; + if (waitqueue_active(wqh)) + wake_up(wqh); +} + vm_fault_t do_swap_page(struct vm_fault *vmf); void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, @@ -120,6 +129,8 @@ extern unsigned long highest_memmap_pfn; */ extern int isolate_lru_page(struct page *page); extern void putback_lru_page(struct page *page); +extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, + long timeout); /* * in mm/rmap.c: diff --git a/mm/vmscan.c b/mm/vmscan.c index 0c1595065f49..1e54e636b927 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1006,12 +1006,12 @@ static void handle_write_error(struct address_space *mapping, unlock_page(page); } -static void -reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, +void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, long timeout) { wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason]; long ret; + bool acct_writeback = (reason == VMSCAN_THROTTLE_WRITEBACK); DEFINE_WAIT(wait); /* @@ -1023,7 +1023,8 @@ reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, current->flags & (PF_IO_WORKER|PF_KTHREAD)) return; - if (atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) { + if (acct_writeback && + atomic_inc_return(&pgdat->nr_writeback_throttled) == 1) { WRITE_ONCE(pgdat->nr_reclaim_start, node_page_state(pgdat, NR_THROTTLED_WRITTEN)); } @@ -1031,7 +1032,9 @@ reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason, prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE); ret = schedule_timeout(timeout); finish_wait(wqh, &wait); - atomic_dec(&pgdat->nr_writeback_throttled); + + if (acct_writeback) + atomic_dec(&pgdat->nr_writeback_throttled); trace_mm_vmscan_throttled(pgdat->node_id, jiffies_to_usecs(timeout), jiffies_to_usecs(timeout - ret), @@ -2176,6 +2179,7 @@ static int 
 		struct scan_control *sc)
 {
 	unsigned long inactive, isolated;
+	bool too_many;
 
 	if (current_is_kswapd())
 		return 0;
@@ -2199,7 +2203,13 @@ static int too_many_isolated(struct pglist_data *pgdat, int file,
 	if ((sc->gfp_mask & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS))
 		inactive >>= 3;
 
-	return isolated > inactive;
+	too_many = isolated > inactive;
+
+	/* Wake up tasks throttled due to too_many_isolated. */
+	if (!too_many)
+		wake_throttle_isolated(pgdat);
+
+	return too_many;
 }
 
 /*
@@ -2308,8 +2318,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 			return 0;
 
 		/* wait a bit for the reclaimer. */
-		msleep(100);
 		stalled = true;
+		reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10);
 
 		/* We are about to die and free our memory.  Return now. */
 		if (fatal_signal_pending(current))
-- 
2.31.1
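
For reference, the sketch below is a minimal userspace analogue of the
throttle/wake pattern the patch adds: a task that finds too many pages
isolated sleeps with a timeout and is woken early once the count drops
back under the limit. It uses pthreads rather than the kernel waitqueue
API, and every name in it (isolated_pages, ISOLATED_LIMIT,
throttle_isolated(), putback_pages()) is illustrative only, not taken
from the patch.

/*
 * Build: cc -pthread throttle_sketch.c
 *
 * Userspace analogue of reclaim_throttle()/wake_throttle_isolated().
 * All names here are hypothetical; this is not kernel code.
 */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define ISOLATED_LIMIT	128	/* arbitrary stand-in for "too many isolated" */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t reclaim_wait = PTHREAD_COND_INITIALIZER;
static long isolated_pages;

/* Analogue of reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED, HZ/10). */
static void throttle_isolated(long timeout_ms)
{
	struct timespec deadline;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&lock);
	while (isolated_pages > ISOLATED_LIMIT) {
		/* Sleep until an explicit wakeup or the timeout, whichever
		 * comes first; on timeout the caller simply retries. */
		if (pthread_cond_timedwait(&reclaim_wait, &lock, &deadline))
			break;
	}
	pthread_mutex_unlock(&lock);
}

/* Analogue of wake_throttle_isolated(): wake waiters when pages go back. */
static void putback_pages(long nr)
{
	pthread_mutex_lock(&lock);
	isolated_pages -= nr;
	if (isolated_pages <= ISOLATED_LIMIT)
		pthread_cond_broadcast(&reclaim_wait);
	pthread_mutex_unlock(&lock);
}

int main(void)
{
	isolated_pages = 200;	/* pretend reclaim has isolated too many pages */
	putback_pages(100);	/* drop below the limit and wake any waiters */
	throttle_isolated(100);	/* returns immediately: no longer too many */
	printf("isolated_pages=%ld\n", isolated_pages);
	return 0;
}

The timeout plays the same role as the HZ/10 passed to reclaim_throttle():
a throttled task never sleeps indefinitely, it either receives an explicit
wakeup from the putback/too_many_isolated() path or retries after the
timeout, which is also why the occasional false-positive wakeup mentioned
in the changelog is harmless.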