Date: Tue, 23 Jan 2024 11:48:19 -0500
From: Johannes Weiner
To: "T.J. Mercier"
Cc: Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song,
    Andrew Morton, android-mm@google.com, yuzhao@google.com,
    yangyifei03@kuaishou.com, cgroups@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Revert "mm:vmscan: fix inaccurate reclaim during proactive reclaim"
Message-ID: <20240123164819.GB1745986@cmpxchg.org>
References: <20240121214413.833776-1-tjmercier@google.com>

The revert isn't a straightforward solution.

The patch you're reverting fixed conventional reclaim and broke
MGLRU. Your revert fixes MGLRU and breaks conventional reclaim.

On Tue, Jan 23, 2024 at 05:58:05AM -0800, T.J. Mercier wrote:
> They both are able to make progress. The main difference is that a
> single iteration of try_to_free_mem_cgroup_pages with MGLRU ends soon
> after it reclaims nr_to_reclaim, and before it touches all memcgs. So
> a single iteration really will reclaim only about SWAP_CLUSTER_MAX-ish
> pages with MGLRU. Without MGLRU the memcg walk is not aborted
> immediately after nr_to_reclaim is reached, so a single call to
> try_to_free_mem_cgroup_pages can actually reclaim thousands of pages
> even when sc->nr_to_reclaim is 32. (I.e. MGLRU overreclaims less.)
> https://lore.kernel.org/lkml/20221201223923.873696-1-yuzhao@google.com/

Is that a feature or a bug?

 * 1. Memcg LRU only applies to global reclaim, and the round-robin incrementing
 *    of their max_seq counters ensures the eventual fairness to all eligible
 *    memcgs. For memcg reclaim, it still relies on mem_cgroup_iter().

If it bails out exactly after nr_to_reclaim, it'll overreclaim
less. But with steady reclaim in a complex subtree, it will always hit
the first cgroup returned by mem_cgroup_iter() and then bail. This
seems like a fairness issue.
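To illustrate the fairness concern, here is a toy userspace model, not
kernel code. It assumes a walk that always starts at the first child and
stops as soon as nr_to_reclaim pages are freed. The struct toy_memcg type
and reclaim_pass_bail_early() are hypothetical names made up for this
sketch; the real walk goes through mem_cgroup_iter() and is considerably
more involved.

```c
#include <assert.h>
#include <stddef.h>

#define NR_MEMCGS 4

/* Hypothetical stand-in for a memcg: only page counters, nothing else. */
struct toy_memcg {
	unsigned long reclaimable;	/* pages still available to reclaim */
	unsigned long reclaimed;	/* pages taken from this group so far */
};

/*
 * One reclaim pass over n groups. The walk always starts at index 0 and
 * bails as soon as nr_to_reclaim pages have been freed.
 * Returns the number of pages freed in this pass.
 */
static unsigned long reclaim_pass_bail_early(struct toy_memcg *cgs, size_t n,
					     unsigned long nr_to_reclaim)
{
	unsigned long freed = 0;

	assert(cgs != NULL);

	for (size_t i = 0; i < n && freed < nr_to_reclaim; i++) {
		unsigned long want = nr_to_reclaim - freed;
		unsigned long take = cgs[i].reclaimable < want ?
				     cgs[i].reclaimable : want;

		cgs[i].reclaimable -= take;
		cgs[i].reclaimed += take;
		freed += take;
	}
	return freed;
}
```

With four toy groups of 1000 pages each and ten passes of 32 pages, every
page in this model comes from the first group and the other three are
never touched.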
We should figure out what the right method for balancing fairness with
overreclaim is, regardless of reclaim implementation. Because having
two different approaches and reverting dependent things back and forth
doesn't make sense.

Using an LRU to rotate through memcgs over multiple reclaim cycles
seems like a good idea. Why is this specific to MGLRU? Shouldn't this
be a generic piece of memcg infrastructure?

Then there is the question of why there is an LRU for global reclaim,
but not for subtree reclaim. Reclaiming a container with multiple
subtrees would benefit from the fairness provided by a container-level
LRU order just as much; having fairness for root but not for subtrees
would produce different reclaim and pressure behavior, and can cause
regressions when moving a service from bare-metal into a container.

Figuring out these differences and converging on a method for cgroup
fairness would be the better way of fixing this. Because of the
regression risk to the default reclaim implementation, I'm inclined to
NAK this revert.
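As a rough sketch of what rotating through memcgs over multiple reclaim
cycles could look like, here is a toy userspace model, not kernel code and
not a proposed patch. It swaps a full LRU for a plain round-robin cursor
that remembers where the previous pass stopped. struct toy_memcg, struct
toy_walk and reclaim_pass_rotating() are hypothetical names for this
illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg: only page counters, nothing else. */
struct toy_memcg {
	unsigned long reclaimable;	/* pages still available to reclaim */
	unsigned long reclaimed;	/* pages taken from this group so far */
};

/* Walk state that persists across reclaim passes. */
struct toy_walk {
	struct toy_memcg *cgs;
	size_t n;
	size_t next;	/* index where the next pass starts */
};

/*
 * One reclaim pass that still bails once nr_to_reclaim pages are freed,
 * but resumes at the group after the last one visited, so repeated small
 * passes rotate through all groups instead of hammering the first one.
 * Returns the number of pages freed in this pass.
 */
static unsigned long reclaim_pass_rotating(struct toy_walk *w,
					   unsigned long nr_to_reclaim)
{
	unsigned long freed = 0;
	size_t visited = 0;

	assert(w != NULL && w->cgs != NULL && w->n > 0);

	while (visited < w->n && freed < nr_to_reclaim) {
		struct toy_memcg *cg = &w->cgs[w->next];
		unsigned long want = nr_to_reclaim - freed;
		unsigned long take = cg->reclaimable < want ?
				     cg->reclaimable : want;

		cg->reclaimable -= take;
		cg->reclaimed += take;
		freed += take;

		w->next = (w->next + 1) % w->n;
		visited++;
	}
	return freed;
}
```

With four toy groups of 1000 pages each, eight passes of 32 pages take 64
pages from every group in this model, while each pass still stops right
at nr_to_reclaim. Nothing here depends on which reclaim implementation
sits underneath.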