From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, bfoster@redhat.com, willy@infradead.org, kernel-team@meta.com
Subject: [PATCH v3 1/4] workingset: fix confusion around eviction vs refault container
Date: Thu, 8 Dec 2022 12:28:05 -0800
Message-Id: <20221208202808.908690-2-nphamcs@gmail.com>
In-Reply-To: <20221208202808.908690-1-nphamcs@gmail.com>
References: <20221208202808.908690-1-nphamcs@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Johannes Weiner

Refault decisions are made based on the lruvec where the page was
evicted, as that determined its LRU order while it was alive. Stats and
workingset aging must then occur on the lruvec of the new page, as
that's the node and cgroup that experience the refault and that's the
lruvec whose nonresident info ages out by a new resident page. Those
lruvecs could be different when a page is shared between cgroups, or the
refaulting page is allocated on a different node.

There are currently two mix-ups:

1. When swap is available, the resident anon set must be considered
   when comparing the refault distance. The comparison is made against
   the right anon set, but the check for swap is not. When pages get
   evicted from a cgroup with swap, and refault in one without, this
   can incorrectly consider a hot refault as cold - and vice
   versa. Fix that by using the eviction cgroup for the swap check.

2. The stats and workingset age are updated against the wrong lruvec
   altogether: the right cgroup but the wrong NUMA node. When a page
   refaults on a different NUMA node, this will have confusing stats
   and distort the workingset age on a different lruvec - again
   possibly resulting in hot/cold misclassifications down the line.

Fix the swap check and the refault pgdat to address both concerns.

This was found during code review. It hasn't caused notable issues in
production, suggesting that those refault-migrations are relatively
rare in practice.

Signed-off-by: Johannes Weiner
Co-developed-by: Nhat Pham
Signed-off-by: Nhat Pham
---
 mm/workingset.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index ae7e984b23c6..79585d55c45d 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -457,6 +457,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 	 */
 	nr = folio_nr_pages(folio);
 	memcg = folio_memcg(folio);
+	pgdat = folio_pgdat(folio);
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
 	mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
@@ -474,7 +475,7 @@ void workingset_refault(struct folio *folio, void *shadow)
 		workingset_size += lruvec_page_state(eviction_lruvec,
 						     NR_INACTIVE_FILE);
 	}
-	if (mem_cgroup_get_nr_swap_pages(memcg) > 0) {
+	if (mem_cgroup_get_nr_swap_pages(eviction_memcg) > 0) {
 		workingset_size += lruvec_page_state(eviction_lruvec,
 						     NR_ACTIVE_ANON);
 		if (file) {
-- 
2.30.2
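
For readers following the eviction vs refault split, here is a rough
userspace sketch of the rule the changelog describes. It is not the
kernel code: struct toy_lruvec, struct toy_memcg and
toy_refault_is_hot() are hypothetical names made up for illustration,
and the model only covers a refaulting file page. It assumes the hot
test is "refault distance no larger than the workingset size", with the
size read from the eviction lruvec and the swap check done on the
eviction cgroup, while the refault counter is charged to the lruvec of
the newly allocated page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a per-(cgroup, node) lruvec. Hypothetical, not the kernel struct. */
struct toy_lruvec {
	unsigned long active_file;
	unsigned long active_anon;
	unsigned long inactive_anon;
	unsigned long refaults;		/* models the WORKINGSET_REFAULT_* counter */
};

/* Toy stand-in for a memory cgroup. Hypothetical, not the kernel struct. */
struct toy_memcg {
	long nr_swap_pages;		/* models mem_cgroup_get_nr_swap_pages() */
};

/*
 * Classify a refault of a file page as hot (true) or cold (false).
 *
 * eviction_memcg/eviction_lruvec: where the page was evicted from.
 * refault_lruvec: cgroup and node of the newly allocated page.
 */
static bool toy_refault_is_hot(unsigned long refault_distance,
			       const struct toy_memcg *eviction_memcg,
			       const struct toy_lruvec *eviction_lruvec,
			       struct toy_lruvec *refault_lruvec)
{
	unsigned long workingset_size;

	assert(eviction_memcg != NULL);
	assert(eviction_lruvec != NULL);
	assert(refault_lruvec != NULL);

	/* Stats belong to the lruvec that experiences the refault. */
	refault_lruvec->refaults++;

	/* The comparison is made against the eviction lruvec's resident set. */
	workingset_size = eviction_lruvec->active_file;

	/* Swap availability is checked on the eviction cgroup as well. */
	if (eviction_memcg->nr_swap_pages > 0) {
		workingset_size += eviction_lruvec->active_anon;
		workingset_size += eviction_lruvec->inactive_anon;
	}

	return refault_distance <= workingset_size;
}
```

With an eviction lruvec holding 10 active file, 50 active anon and 20
inactive anon pages, a refault distance of 40 is hot when the eviction
cgroup has swap (workingset size 80) and cold when it has none
(workingset size 10). In both cases only the refault lruvec's counter
moves, which is the separation the two hunks above restore.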