From: "T.J. Mercier"
Date: Tue, 23 Jan 2024 05:58:05 -0800
Subject: Re: [PATCH] Revert "mm:vmscan: fix inaccurate reclaim during proactive reclaim"
To: Michal Hocko
Cc: Johannes Weiner, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, android-mm@google.com, yuzhao@google.com, yangyifei03@kuaishou.com, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240121214413.833776-1-tjmercier@google.com>

On Tue, Jan 23, 2024 at 1:33 AM Michal Hocko wrote:
>
> On Sun 21-01-24 21:44:12, T.J. Mercier wrote:
> > This reverts commit 0388536ac29104a478c79b3869541524caec28eb.
> >
> > Proactive reclaim on the root cgroup is 10x slower after this patch when
> > MGLRU is enabled, and completion times for proactive reclaim on much
> > smaller non-root cgroups take ~30% longer (with or without MGLRU).
>
> What is the reclaim target in these pro-active reclaim requests?

Two targets:
1) /sys/fs/cgroup/memory.reclaim
2) /sys/fs/cgroup/uid_0/memory.reclaim (a bunch of Android system services)

Note that lru_gen_shrink_node is used for 1, but shrink_node_memcgs is
used for 2.

The 10x comes from the rate of reclaim (~70k pages/sec vs ~6.6k
pages/sec) for 1. After this revert the root reclaim took only about
10 seconds. Before the revert it's still running after about 3 minutes
using a core at 100% the whole time, and I'm too impatient to wait
longer to record times for comparison.

The 30% comes from the average of a few runs for 2:

Before revert:
$ adb wait-for-device && sleep 120 && adb root && adb shell -t 'time echo "" > /sys/fs/cgroup/uid_0/memory.reclaim'
restarting adbd as root
    0m09.69s real     0m00.00s user     0m09.19s system

After revert:
$ adb wait-for-device && sleep 120 && adb root && adb shell -t 'time echo "" > /sys/fs/cgroup/uid_0/memory.reclaim'
    0m07.31s real     0m00.00s user     0m06.44s system

It's actually a bigger difference for smaller reclaim amounts:

Before revert:
$ adb wait-for-device && sleep 120 && adb root && adb shell -t 'time echo "3G" > /sys/fs/cgroup/uid_0/memory.reclaim'
    0m12.04s real     0m00.00s user     0m11.48s system

After revert:
$ adb wait-for-device && sleep 120 && adb root && adb shell -t 'time echo "3G" > /sys/fs/cgroup/uid_0/memory.reclaim'
    0m06.65s real     0m00.00s user     0m05.91s system

> > With
> > root reclaim before the patch, I observe average reclaim rates of
> > ~70k pages/sec before try_to_free_mem_cgroup_pages starts to fail and
> > the nr_retries counter starts to decrement, eventually ending the
> > proactive reclaim attempt.
>
> Do I understand correctly that the reclaim target is over estimated and
> you expect that the reclaim process breaks out early

Yes. I expect memory_reclaim to fail at some point when it becomes
difficult/impossible to reclaim pages where I specify a large amount
to reclaim. The ask here is, "please reclaim as much as possible from
this cgroup, but don't take all day". But it takes minutes to get
there on the root cgroup, working SWAP_CLUSTER_MAX pages at a time.

> > After the patch the reclaim rate is
> > consistently ~6.6k pages/sec due to the reduced nr_pages value causing
> > scan aborts as soon as SWAP_CLUSTER_MAX pages are reclaimed. The
> > proactive reclaim doesn't complete after several minutes because
> > try_to_free_mem_cgroup_pages is still capable of reclaiming pages in
> > tiny SWAP_CLUSTER_MAX page chunks and nr_retries is never decremented.
>
> I do not understand this part. How does a smaller reclaim target manages
> to have reclaimed > 0 while larger one doesn't?

They both are able to make progress. The main difference is that a
single iteration of try_to_free_mem_cgroup_pages with MGLRU ends soon
after it reclaims nr_to_reclaim, and before it touches all memcgs. So
a single iteration really will reclaim only about SWAP_CLUSTER_MAX-ish
pages with MGLRU. Without MGLRU the memcg walk is not aborted
immediately after nr_to_reclaim is reached, so a single call to
try_to_free_mem_cgroup_pages can actually reclaim thousands of pages
even when sc->nr_to_reclaim is 32. (i.e. MGLRU overreclaims less.)
https://lore.kernel.org/lkml/20221201223923.873696-1-yuzhao@google.com/
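
To make the difference concrete, here is a rough toy model of the two
behaviours described above. It is not kernel code: reclaim_once,
calls_until_empty, the flat list of memcgs, and the "take at most
SWAP_CLUSTER_MAX pages per memcg per visit" rule are all assumptions made
up for illustration. It only shows why aborting the walk as soon as
nr_to_reclaim is met makes each call yield about 32 pages, while a walk
that visits every memcg can yield thousands per call.

```python
SWAP_CLUSTER_MAX = 32  # pages; the per-call target discussed in this thread


def reclaim_once(memcgs, nr_to_reclaim, abort_early):
    """Simulate one reclaim call over a list of per-memcg page counts.

    Hypothetical rule: each visited memcg gives up at most
    SWAP_CLUSTER_MAX pages. With abort_early=True the walk stops as
    soon as the target is met (the MGLRU-like case); otherwise every
    memcg is visited (the non-MGLRU-like case).
    """
    reclaimed = 0
    for i, pages in enumerate(memcgs):
        if abort_early and reclaimed >= nr_to_reclaim:
            break
        take = min(pages, SWAP_CLUSTER_MAX)
        memcgs[i] -= take
        reclaimed += take
    return reclaimed


def calls_until_empty(n_memcgs, pages_each, abort_early):
    """Count how many reclaim calls it takes to drain every memcg."""
    memcgs = [pages_each] * n_memcgs
    calls = 0
    while any(memcgs):
        reclaim_once(memcgs, SWAP_CLUSTER_MAX, abort_early)
        calls += 1
    return calls


# 100 memcgs with 320 reclaimable pages each, target of 32 pages per call:
print(reclaim_once([320] * 100, SWAP_CLUSTER_MAX, abort_early=False))  # 3200
print(reclaim_once([320] * 100, SWAP_CLUSTER_MAX, abort_early=True))   # 32
print(calls_until_empty(100, 320, abort_early=False))                  # 10
print(calls_until_empty(100, 320, abort_early=True))                   # 1000
```

In this model the early-abort walk needs 100x as many calls to drain the
same memory. The real ratio depends on the actual memcg hierarchy and LRU
state, so treat the numbers as illustrative only.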