References:
 <20231221-async-free-v1-1-94b277992cb0@kernel.org>
 <20231222115208.ab4d2aeacdafa4158b14e532@linux-foundation.org>
 <0a052cb1-a5c5-4bee-5bd5-fd5569765012@google.com>
From: Yosry Ahmed
Date: Thu, 28 Dec 2023 07:34:44 -0800
Subject: Re: [PATCH] mm: swap: async free swap slot cache entries
To: Chris Li
Cc: David Rientjes, Andrew Morton, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Wei Xu, Yu Zhao, Greg Thelen, Chun-Tse Shao,
 Suren Baghdasaryan, Brain Geffon, Minchan Kim, Michal Hocko, Mel Gorman,
 Huang Ying, Nhat Pham, Johannes Weiner, Kairui Song, Zhongkun He,
 Kemeng Shi, Barry Song, Hugh Dickins

On Sun, Dec 24, 2023 at 2:07 PM Chris Li wrote:
>
> On Sun, Dec 24, 2023 at 1:13 PM David Rientjes wrote:
> >
> > On Sun, 24 Dec 2023, Chris Li wrote:
> >
> > > On Sat, Dec 23, 2023 at 7:01 PM David Rientjes wrote:
> > > >
> > > > On Sat, 23 Dec 2023, Chris Li wrote:
> > > >
> > > > > > How do you quantify the impact of the delayed swap_entry_free()?
> > > > > >
> > > > > > Since the free and memcg uncharge are now delayed, is there not the
> > > > > > possibility that we stay under memory pressure for longer? (Assuming at
> > > > > > least some users are swapping because of memory pressure.)
> > > > > >
> > > > > > I would assume that since the free and uncharge itself is delayed that in
> > > > > > the pathological case we'd actually be swapping *more* until the async
> > > > > > worker can run.
> > > > >
> > > > > Thanks for raising this interesting question.
> > > > >
> > > > > First of all, the swap_entry_free() does not impact "memory.current".
> > > > > It reduces "memory.swap.current". Technically it is the swap pressure
> > > > > not memory pressure that suffers the extra delay.
> > > > >
> > > > > Secondly, we are talking about delaying up to 64 swap entries for a
> > > > > few microseconds.
> > > >
> > > > What guarantees that the async freeing happens within a few microseconds?
> > >
> > > Linux kernel typically doesn't provide RT scheduling guarantees. You
> > > can change microseconds to milliseconds, my following reasoning still
> > > holds.
> > >
> >
> > What guarantees that the async freeing happens even within 10s? Your
> > responses are implying that there is some deadline by which this freeing
> > absolutely must happen (us or ms), but I don't know of any strong
> > guarantees.
>
> I think we are in agreement there, there are no such strong guarantees
> in linux scheduling. However, when there are free CPU resources, the
> job will get scheduled to execute in a reasonable table time frame. If
> it does not, I consider that a bug if the CPU has idle resources and
> the pending jobs are not able to run for a long time.
> The existing code doesn't have such a guarantee either, see my point
> follows. I don't know why you want to ask for such a guarantee.
>
> > If there are no absolute guarantees about when the freeing may now occur,
> > I'm asking how the impact of the delayed swap_entry_free() can be
> > quantified.
>
> Presumably each application has their own SLO metrics for monitoring
> their application behavior. I am happy to take a look if any app has
> new SLO violations caused by this change.
> If you have one metric in mind, please name it so we can look at it
> together. During my current experiment and the chromebook benchmark, I
> haven't noticed such ill effects show up in the other metrics drops in
> a statistically significant manner. That is not the same as saying
> such drops don't exist at all. Just I haven't noticed or the SLO
> watching system hasn't caught it.
>
> > The benefit to the current implementation is that there *are* strong
> > guarantees about when the freeing will occur and cannot grow exponentially
> > before the async worker can do the freeing.
>
> I don't understand your point. Please help me.
> In the current code,
> for the previous swapin fault that releases the swap slots into the
> swap slot caches. Let's say the swap slot remains in the cache for X
> seconds until Nth (N < 64) swapin page fault later, the cache is full
> and finally all 64 swap slot caches are free. Are you suggesting there
> is some kind of guarantee X is less than some fixed bound seconds?
> What is that bound then? 10 second? 1 minutes?
>
> BTW, there will be no exponential growth, that is guaranteed. Until
> the 64 entries cache were freed. The swapin code will take the direct
> free path for the current swap slot in hand. The direct free path
> existed before my change.

FWIW, it's 64 * the number of CPUs.
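
A toy model of the bound discussed above, written in Python rather than kernel C. It assumes a per-CPU cache of 64 entries, as stated in the thread, and that once a cache is full any further free takes the direct path until the async worker drains it. `SlotCache`, `free_slot`, `run_async_worker`, and `max_delayed_slots` are hypothetical names for illustration only, not kernel symbols.

```python
SLOT_CACHE_SIZE = 64  # per-CPU cache size mentioned in the thread


class SlotCache:
    """Toy model of one CPU's swap slot free cache (not kernel code)."""

    def __init__(self, capacity=SLOT_CACHE_SIZE):
        self.capacity = capacity
        self.pending = []      # slots whose free is deferred to the worker
        self.direct_frees = 0  # slots freed immediately on the direct path

    def free_slot(self, slot):
        if len(self.pending) < self.capacity:
            # Room in the cache: defer the free to the async worker.
            self.pending.append(slot)
        else:
            # Cache is full and not yet drained: free this slot directly,
            # so the number of deferred slots never exceeds the capacity.
            self.direct_frees += 1

    def run_async_worker(self):
        """Drain the cache; return how many deferred slots were freed."""
        freed = len(self.pending)
        self.pending.clear()
        return freed


def max_delayed_slots(num_cpus, capacity=SLOT_CACHE_SIZE):
    """Upper bound on deferred slots across all CPUs in this model."""
    return capacity * num_cpus
```

In this model, 200 frees on one CPU before the worker runs leave 64 slots deferred and send the other 136 down the direct path, and an 8-CPU machine never has more than 512 deferred slots in total.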