Received: by 2002:a6b:500f:0:0:0:0:0 with SMTP id e15csp753548iob; Wed, 18 May 2022 12:10:12 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyUAjjVb13l98nbbuvm+c75krl8mAuf45wO50Hb6h4xWdEqqlbR/003pe2b8iXAxBiN/mJm X-Received: by 2002:a17:902:ab1e:b0:161:7f5f:7b8f with SMTP id ik30-20020a170902ab1e00b001617f5f7b8fmr1064441plb.64.1652901012527; Wed, 18 May 2022 12:10:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1652901012; cv=none; d=google.com; s=arc-20160816; b=Nm58YhLz40cyLbb1ZODnuBwpj9EwsbBepByolbUNYmnb902rL9bXk7Amk9Ionf6VAq 7LWjECQTIZ3G9POJfIQSkOsMt7sXM8di88qt8DFhZ3tIb0X1TjbLq2/93XFT2M3/1v57 ueyWmpNJPUoqB/tDn0v6QHRozTD5MRMvY8REgI9KgK7LlxzNrLJCqjoSlrXf0SamBjCM 8hLj1aj9uFSZQRPN14CmTU2ALsVpSvJk692xjCmBvmAKRk9cv66vUl2Hsn6IKYykA6Om DTS2MITjVmUxhKtFbfV2/59vS5Sm3cMr1H9gumFs8PEdv7meNPokJMzPufkXWWU0xHnm z+Xg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=RYqL0MLG1DbGBmgYMURzmvbXpQhyxhj8gl5i89V5r90=; b=EkbIy2iV3Ec2V40wdF9pK66yk0FfGHOfOGn9jWRMIKJ8KePSymw3Vnyc7C9N7YYR9X yErC3UlcEZycuxbLqrG+1xjpsvuyxHQuRtE5n3RvUDTjwscd8H3uYbMbvXgnXvtqYPfc 2Maz8n4AATPw1/xyDY11c56cnyOUoxFw1rHi4JTaT0DCljlwxpb1n0gQWOxBfz0SIWFy 8v795N3qUdJnDKUeKLNymXToGEGF/bzhksVjPkoKnMjAqztstOAWevRK91Cmv0xtE2L6 WWhkqdcRkv/yqHQzuwFXVY4JVP/T7IcVBu19N/4V/vuFOWAP5KYgVZJVaZsWlNy8xajU 2kfQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@cmpxchg-org.20210112.gappssmtp.com header.s=20210112 header.b=f4jYk60D; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id d19-20020a170902e15300b001587f099641si3205648pla.387.2022.05.18.12.10.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 May 2022 12:10:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@cmpxchg-org.20210112.gappssmtp.com header.s=20210112 header.b=f4jYk60D; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 447DE880FE; Wed, 18 May 2022 12:09:25 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241768AbiERTJR (ORCPT + 99 others); Wed, 18 May 2022 15:09:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241755AbiERTJQ (ORCPT ); Wed, 18 May 2022 15:09:16 -0400 Received: from mail-qt1-x82f.google.com (mail-qt1-x82f.google.com [IPv6:2607:f8b0:4864:20::82f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3315387A29 for ; Wed, 18 May 2022 12:09:14 -0700 (PDT) Received: by mail-qt1-x82f.google.com with SMTP id 11so69098qtp.9 for ; Wed, 18 May 2022 12:09:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20210112.gappssmtp.com; s=20210112; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=RYqL0MLG1DbGBmgYMURzmvbXpQhyxhj8gl5i89V5r90=; b=f4jYk60DD7udDhMwkKHqiV6/3YrbRjeifcxR7rq36/5EDL0R7jhskIGVLU0Im7kLx8 70gOwl8k95pY8w6wTsYt4l1qya3jYGUjdU4krCPxnEwDJZa+O9K1GV6WOJqvpU0N2hsU jNbS2Xf5epeNioenJq5vogiBiqKhhtztNd332raQBunqiAtMwZXRSZ4vS9FZ2zahG2Xz Tdtr/WGcNNFQ3sVbXiMJDn7gQQlPw+hV9neokhvbfEx5F6p+rgK6OIluXTNfrNY5OHg2 kr/uYhgBp3FfZ8WlbmqCx/n3SI1I2Qukgj7FPLw+jAg39ZLZlyNizMY6oHGkhfwWayAv A6pQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=RYqL0MLG1DbGBmgYMURzmvbXpQhyxhj8gl5i89V5r90=; b=OSLL/WzynLsKY+7sro340WhX7qqkVJtxrQUSkWuxdEVpf3QOsR2J+Ya2xuOtiVdsl/ 6f29YmwiqzFnU8B7fqkPyW4gmwJuNaCsMejxinHfF8XqVpNuB8IFE/adZRvGSEjOAdBl N5yCCOIAv8+pFVtrs6YCuy886r8GtmjUqQAKz9R+9tMwpZ4xhGNZ8wCQC+3TGW/u4/Y9 vz3kRJ6LbJ+Aal/Gg30IJMbVf11nsOSH75LOFI93P3Zujvzo6404Z1HDqo3iX2JjjgOv APb5S9jq2JeSw7MgPbGuoUz4q2WkNAbHkwfmbEispv8ara9Bj7wWRe+LdbgCHkEk6S/x xKzQ== X-Gm-Message-State: AOAM532TCy57d7GSXFnym1reBT86AW1+uceUXsp47oJQaCro0lFlLXRk SvMwlGtHH1dbV5PY0xvNYdGOFg== X-Received: by 2002:ac8:5f4c:0:b0:2f3:c99c:e246 with SMTP id y12-20020ac85f4c000000b002f3c99ce246mr1158818qta.122.1652900953299; Wed, 18 May 2022 12:09:13 -0700 (PDT) Received: from localhost ([2620:10d:c091:480::1:b35b]) by smtp.gmail.com with ESMTPSA id f9-20020a05622a104900b002f39b99f69fsm3011qte.57.2022.05.18.12.09.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 May 2022 12:09:12 -0700 (PDT) From: Johannes Weiner To: Dave Hansen , "Huang, Ying" , Yang Shi , Andrew Morton Cc: linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com, Zi Yan , Michal Hocko , Shakeel Butt , Roman Gushchin Subject: [PATCH] Revert "mm/vmscan: never demote for memcg reclaim" Date: Wed, 18 May 2022 15:09:11 -0400 Message-Id: <20220518190911.82400-1-hannes@cmpxchg.org> X-Mailer: git-send-email 2.36.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RDNS_NONE, SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This reverts commit 3a235693d3930e1276c8d9cc0ca5807ef292cf0a. Its premise was that cgroup reclaim cares about freeing memory inside the cgroup, and demotion just moves them around within the cgroup limit. Hence, pages from toptier nodes should be reclaimed directly. However, with NUMA balancing now doing tier promotions, demotion is part of the page aging process. Global reclaim demotes the coldest toptier pages to secondary memory, where their life continues and from which they have a chance to get promoted back. Essentially, tiered memory systems have an LRU order that spans multiple nodes. When cgroup reclaims pages coming off the toptier directly, there can be colder pages on lower tier nodes that were demoted by global reclaim. This is an aging inversion, not unlike if cgroups were to reclaim directly from the active lists while there are inactive pages. Proactive reclaim is another factor. The goal of that it is to offload colder pages from expensive RAM to cheaper storage. When lower tier memory is available as an intermediate layer, we want offloading to take advantage of it instead of bypassing to storage. Revert the patch so that cgroups respect the LRU order spanning the memory hierarchy. Of note is a specific undercommit scenario, where all cgroup limits in the system add up to <= available toptier memory. In that case, shuffling pages out to lower tiers first to reclaim them from there is inefficient. This is something could be optimized/short-circuited later on (although care must be taken not to accidentally recreate the aging inversion). Let's ensure correctness first. Signed-off-by: Johannes Weiner Cc: Dave Hansen Cc: "Huang, Ying" Cc: Yang Shi Cc: Zi Yan Cc: Michal Hocko Cc: Shakeel Butt Cc: Roman Gushchin --- mm/vmscan.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index c6918fff06e1..7a4090712177 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -528,13 +528,8 @@ static bool can_demote(int nid, struct scan_control *sc) { if (!numa_demotion_enabled) return false; - if (sc) { - if (sc->no_demotion) - return false; - /* It is pointless to do demotion in memcg reclaim */ - if (cgroup_reclaim(sc)) - return false; - } + if (sc && sc->no_demotion) + return false; if (next_demotion_node(nid) == NUMA_NO_NODE) return false; -- 2.36.1