From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Muchun Song,
 Johannes Weiner, Shakeel Butt, Michal Hocko, Vladimir Davydov,
 Andrew Morton, Linus Torvalds
Subject: [PATCH 5.10 596/663] mm: memcontrol: fix swap undercounting in cgroup2
Date: Mon, 1 Mar 2021 17:14:04 +0100
Message-Id: <20210301161211.352295920@linuxfoundation.org>
X-Mailer: git-send-email 2.30.1
In-Reply-To: <20210301161141.760350206@linuxfoundation.org>
References: <20210301161141.760350206@linuxfoundation.org>
User-Agent:
 quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Muchun Song

commit cae3af62b33aa931427a0f211e04347b22180b36 upstream.

When pages are swapped in, the VM may retain the swap copy to avoid
repeated writes in the future.  It's also retained if shared pages are
faulted back in some processes, but not in others.  During that time we
have an in-memory copy of the page, as well as an on-swap copy.  Cgroup1
and cgroup2 handle these overlapping lifetimes slightly differently due
to the nature of how they account memory and swap:

Cgroup1 has a unified memory+swap counter that tracks a data page
regardless of whether it's in-core or swapped out.  On swapin, we
transfer the charge from the swap entry to the newly allocated swapcache
page, even though the swap entry might stick around for a while.  That's
why we have a mem_cgroup_uncharge_swap() call inside mem_cgroup_charge().

Cgroup2 tracks memory and swap as separate, independent resources and
thus has split memory and swap counters.  On swapin, we charge the newly
allocated swapcache page as memory, while the swap slot in turn must
remain charged to the swap counter as long as it is allocated too.

The cgroup2 logic was broken by commit 2d1c498072de ("mm: memcontrol:
make swap tracking an integral part of memory control"), because it
accidentally removed the do_memsw_account() check in the branch inside
mem_cgroup_charge() that was supposed to tell the difference between the
charge transfer in cgroup1 and the separate counters in cgroup2.

As a result, cgroup2 currently undercounts retained swap to varying
degrees: swap slots are cached up to 50% of the configured limit or
total available swap space; partially faulted back shared pages are only
limited by physical capacity.  This in turn allows cgroups to
significantly overconsume their allotted swap space.
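For illustration only, the difference between the two schemes can be
sketched with a toy userspace model.  This is not kernel code:
struct toy_counters, toy_run() and the has_check flag are made-up names
for this sketch, and the counters are plain integers in units of pages.
It walks one page through charge, swapout and swapin with the swap slot
retained, and reports the counters at that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-cgroup counters, in pages.  Hypothetical, not kernel structs. */
struct toy_counters {
	long memory;	/* in-core pages (both cgroup versions) */
	long swap;	/* cgroup2 only: allocated swap slots */
	long memsw;	/* cgroup1 only: unified memory+swap */
};

/* Page goes out to swap; a swap slot is allocated for it. */
static void toy_swapout(struct toy_counters *c, bool memsw_account)
{
	c->memory--;
	if (!memsw_account)
		c->swap++;	/* cgroup2: slot is charged separately */
	/* cgroup1: memsw is unchanged, the slot now holds that charge */
}

/*
 * Page is faulted back in while the swap slot is retained.
 * has_check models whether the do_memsw_account() test guards
 * the swap uncharge, i.e. whether the fix is applied.
 */
static void toy_swapin(struct toy_counters *c, bool memsw_account,
		       bool has_check)
{
	/* Charge the new swapcache page. */
	c->memory++;
	if (memsw_account)
		c->memsw++;

	/* Without the check, the swap uncharge runs for cgroup2 as well. */
	if (!has_check || memsw_account) {
		if (memsw_account)
			c->memsw--;	/* cgroup1: finish the transfer */
		else
			c->swap--;	/* cgroup2: slot is no longer counted */
	}
}

/* Run charge -> swapout -> swapin (slot retained) for a single page. */
struct toy_counters toy_run(bool memsw_account, bool has_check)
{
	struct toy_counters c = { .memory = 1 };

	if (memsw_account)
		c.memsw = 1;

	toy_swapout(&c, memsw_account);
	toy_swapin(&c, memsw_account, has_check);

	/* Sanity check for the model itself. */
	assert(c.memory >= 0 && c.swap >= 0 && c.memsw >= 0);
	return c;
}
```

In this model, toy_run(false, false) leaves the swap counter at 0 even
though the slot is still allocated, which is the undercounting described
above.  With the check in place, toy_run(false, true) keeps it at 1.
The cgroup1 result is the same either way, with the page counted once in
the unified counter.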

Add the do_memsw_account() check back to fix this problem.

Link: https://lkml.kernel.org/r/20210217153237.92484-1-songmuchun@bytedance.com
Fixes: 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control")
Signed-off-by: Muchun Song
Acked-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Acked-by: Michal Hocko
Cc: Vladimir Davydov
Cc: [5.8+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
---
 mm/memcontrol.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6808,7 +6808,19 @@ int mem_cgroup_charge(struct page *page,
 	memcg_check_events(memcg, page);
 	local_irq_enable();
 
-	if (PageSwapCache(page)) {
+	/*
+	 * Cgroup1's unified memory+swap counter has been charged with the
+	 * new swapcache page, finish the transfer by uncharging the swap
+	 * slot. The swap slot would also get uncharged when it dies, but
+	 * it can stick around indefinitely and we'd count the page twice
+	 * the entire time.
+	 *
+	 * Cgroup2 has separate resource counters for memory and swap,
+	 * so this is a non-issue here. Memory and swap charge lifetimes
+	 * correspond 1:1 to page and swap slot lifetimes: we charge the
+	 * page to memory here, and uncharge swap when the slot is freed.
+	 */
+	if (do_memsw_account() && PageSwapCache(page)) {
 		swp_entry_t entry = { .val = page_private(page) };
 		/*
 		 * The swap entry might not get freed for a long time,