From: Yang Shi <yang.shi@linux.alibaba.com>
To: mhocko@suse.com, mgorman@techsingularity.net, riel@surriel.com,
 hannes@cmpxchg.org, akpm@linux-foundation.org, dave.hansen@intel.com,
 keith.busch@intel.com, dan.j.williams@intel.com, fengguang.wu@intel.com,
 fan.du@intel.com, ying.huang@intel.com, ziy@nvidia.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [v2 PATCH 7/9] mm: vmscan: check if the demote target node is
 contended or not
Date: Thu, 11 Apr 2019 11:56:57 +0800
Message-Id: <1554955019-29472-8-git-send-email-yang.shi@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To:
 <1554955019-29472-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1554955019-29472-1-git-send-email-yang.shi@linux.alibaba.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

When demoting to a PMEM node, the target node may be under memory
pressure, which may cause migrate_pages() to fail.

If the failure is caused by memory pressure (i.e. migrate_pages() returns
-ENOMEM), tag the node with PGDAT_CONTENDED.  The tag is cleared once the
target node is balanced again.

Check whether the target node is PGDAT_CONTENDED; if it is, just skip
demotion.

Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
 include/linux/mmzone.h |  3 +++
 mm/vmscan.c            | 28 ++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fba7741..de534db 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -520,6 +520,9 @@ enum pgdat_flags {
 					 * many pages under writeback
 					 */
 	PGDAT_RECLAIM_LOCKED,		/* prevents concurrent reclaim */
+	PGDAT_CONTENDED,		/* the node has not enough free memory
+					 * available
+					 */
 };
 
 enum zone_flags {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 80cd624..50cde53 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1048,6 +1048,9 @@ static void page_check_dirty_writeback(struct page *page,
 
 static inline bool is_demote_ok(int nid, struct scan_control *sc)
 {
+	int node;
+	nodemask_t used_mask;
+
 	/* It is pointless to do demotion in memcg reclaim */
 	if (!global_reclaim(sc))
 		return false;
@@ -1060,6 +1063,13 @@ static inline bool is_demote_ok(int nid, struct scan_control *sc)
 	if (!has_cpuless_node_online())
 		return false;
 
+	/* Check if the demote target node is contended or not */
+	nodes_clear(used_mask);
+	node = find_next_best_node(nid, &used_mask, true);
+
+	if (test_bit(PGDAT_CONTENDED, &NODE_DATA(node)->flags))
+		return false;
+
 	return true;
 }
 
@@ -1502,6 +1512,10 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		nr_reclaimed += nr_succeeded;
 
 		if (err) {
+			if (err == -ENOMEM)
+				set_bit(PGDAT_CONTENDED,
+					&NODE_DATA(target_nid)->flags);
+
 			putback_movable_pages(&demote_pages);
 
 			list_splice(&ret_pages, &demote_pages);
@@ -2596,6 +2610,19 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 		 * scan target and the percentage scanning already complete
 		 */
 		lru = (lru == LRU_FILE) ? LRU_BASE : LRU_FILE;
+
+		/*
+		 * The shrink_page_list() may find the demote target node is
+		 * contended, if so it doesn't make sense to scan anonymous
+		 * LRU again.
+		 *
+		 * Need check if swap is available or not too since demotion
+		 * may happen on swapless system.
+		 */
+		if (!is_demote_ok(pgdat->node_id, sc) &&
+		    (!sc->may_swap || mem_cgroup_get_nr_swap_pages(memcg) <= 0))
+			lru = LRU_FILE;
+
 		nr_scanned = targets[lru] - nr[lru];
 		nr[lru] = targets[lru] * (100 - percentage) / 100;
 		nr[lru] -= min(nr[lru], nr_scanned);
@@ -3458,6 +3485,7 @@ static void clear_pgdat_congested(pg_data_t *pgdat)
 	clear_bit(PGDAT_CONGESTED, &pgdat->flags);
 	clear_bit(PGDAT_DIRTY, &pgdat->flags);
 	clear_bit(PGDAT_WRITEBACK, &pgdat->flags);
+	clear_bit(PGDAT_CONTENDED, &pgdat->flags);
 }
 
 /*
-- 
1.8.3.1
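
The lifecycle of the PGDAT_CONTENDED tag described in the changelog can be traced with a small user-space sketch. This is not kernel code and is only meant as an illustration: struct node_model, NODE_CONTENDED, demote_ok(), record_demote_result(), node_balanced() and pick_rescan_lru() are hypothetical names that mirror the roles of PGDAT_CONTENDED, the new check in is_demote_ok(), the -ENOMEM test in shrink_page_list(), clear_pgdat_congested() and the LRU choice in shrink_node_memcg(). It assumes the demotion target is already known, so it does not model find_next_best_node(), and it uses plain bit operations instead of the atomic set_bit()/clear_bit().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PGDAT_CONTENDED bit added by the patch. */
#define NODE_CONTENDED (1u << 0)

/* Hypothetical, minimal model of a demotion target node. */
struct node_model {
	unsigned int flags;
};

/* Hypothetical model of the two LRU types that may be rescanned. */
enum lru_model { LRU_MODEL_ANON, LRU_MODEL_FILE };

/* Mirrors the new check in is_demote_ok(): skip a contended target. */
static bool demote_ok(const struct node_model *target)
{
	return !(target->flags & NODE_CONTENDED);
}

/* Mirrors shrink_page_list(): only -ENOMEM tags the target as contended. */
static void record_demote_result(struct node_model *target, int err)
{
	if (err == -ENOMEM)
		target->flags |= NODE_CONTENDED;
}

/* Mirrors clear_pgdat_congested(): a balanced node loses the tag. */
static void node_balanced(struct node_model *node)
{
	node->flags &= ~NODE_CONTENDED;
}

/*
 * Mirrors the shrink_node_memcg() hunk: if demotion is not possible and
 * swap is not available either, rescanning the anonymous LRU cannot make
 * progress, so fall back to the file LRU. Otherwise keep the caller's
 * choice.
 */
static enum lru_model pick_rescan_lru(const struct node_model *target,
				      bool swap_available,
				      enum lru_model chosen)
{
	if (!demote_ok(target) && !swap_available)
		return LRU_MODEL_FILE;
	return chosen;
}
```

In this model a caller would invoke record_demote_result() after each simulated migration attempt, consult demote_ok() before the next one, and call node_balanced() once the target has free memory again. Errors other than -ENOMEM leave the tag untouched, which matches the patch: only memory pressure on the target disables demotion.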