From: Chris Li
Date: Wed, 31 Jan 2024 16:57:44 -0800
Subject: Re: [PATCH] mm: swap: async free swap slot cache entries
To: Yosry Ahmed
Cc: Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Wei Xu, Yu Zhao, Greg Thelen, Chun-Tse Shao, Suren Baghdasaryan,
    Brain Geffon, Minchan Kim, Michal Hocko, Mel Gorman, Huang Ying,
    Nhat Pham, Johannes Weiner, Kairui Song, Zhongkun He, Kemeng Shi,
    Barry Song
References: <20231221-async-free-v1-1-94b277992cb0@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Yosry,

On Thu, Dec 28, 2023 at 7:34 AM Yosry Ahmed wrote:
>
> On Thu, Dec 21, 2023 at 10:25 PM Chris Li wrote:
> >
> > We discovered that 1% swap page fault is 100us+ while 50% of
> > the swap fault is under 20us.
> >
> > Further investigation show that a large portion of the time
> > spent in the free_swap_slots() function for the long tail case.
> >
> > The percpu cache of swap slots is freed in a batch of 64 entries
> > inside free_swap_slots(). These cache entries are accumulated
> > from previous page faults, which may not be related to the current
> > process.
> >
> > Doing the batch free in the page fault handler causes longer
> > tail latencies and penalizes the current process.
> >
> > Move free_swap_slots() outside of the swapin page fault handler into an
> > async work queue to avoid such long tail latencies.
> >
> > Testing:
> >
> > Chun-Tse did some benchmark in chromebook, showing that
> > zram_wait_metrics improve about 15% with 80% and 95% confidence.
> >
> > I recently ran some experiments on about 1000 Google production
> > machines. It shows swapin latency drops in the long tail
> > 100us - 500us bucket dramatically.
> >
> > platform        (100-500us)         (0-100us)
> > A               1.12% -> 0.36%      98.47% -> 99.22%
> > B               0.65% -> 0.15%      98.96% -> 99.46%
> > C               0.61% -> 0.23%      98.96% -> 99.38%
>
> I recall you mentioning that mem_cgroup_uncharge_swap() is the most
> expensive part of the batched freeing. If that's the case, I am
> curious what happens if we move that call outside of the batching
> (i.e. once the swap entry is no longer used and will be returned to
> the cache). This should amortize the cost of memcg uncharging and
> reduce the tail fault latency without extra work. Arguably, it could
> increase the average fault latency, but not necessarily in a
> significant way.
>
> Ying pointed out something similar if I understand correctly (and
> other operations that can be moved).

If the goal is to let the swap fault return as soon as possible,
then the current approach is better. mem_cgroup_uncharge_swap() is
only part of the cost, not close to all of it.

>
> Also, if we choose to follow this route, I think there we should flush
> the async worker in drain_slots_cache_cpu(), right?

Not sure I understand this part. drain_slots_cache_cpu() will free
the entries already. The existing locking on cache->free_lock should
protect the entries from concurrent access by the async worker.
What do you mean by flushing?

Chris
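
For illustration only, here is a minimal userspace model of the locking
argument above. It is not the kernel code: SlotCache, free_slot(),
drain(), flush() and the executor-based worker are all hypothetical
stand-ins, and only the batch size of 64 comes from the patch
description. It sketches how a single lock (playing the role of
cache->free_lock) can let a deferred batch free and a synchronous drain
coexist without freeing any entry twice, and it shows one possible
reading of "flushing" as waiting for already queued work to finish.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

BATCH = 64  # batch size mentioned in the patch description


class SlotCache:
    """Hypothetical userspace stand-in for a per-CPU swap slot cache."""

    def __init__(self, executor):
        self.free_lock = threading.Lock()  # plays the role of cache->free_lock
        self.slots_ret = []                # entries waiting to be freed
        self.released = []                 # entries actually handed back
        self.executor = executor           # stands in for the async work queue
        self.pending = []                  # futures for queued batch frees

    def _release_batch(self):
        # Used by both the async worker and drain(). Everything happens
        # under the lock, so each entry is released exactly once no
        # matter which caller gets there first.
        with self.free_lock:
            self.released.extend(self.slots_ret)
            self.slots_ret.clear()

    def free_slot(self, entry):
        # Fault-path side: only append under the lock, then hand the
        # expensive batch free to the worker instead of doing it inline.
        with self.free_lock:
            self.slots_ret.append(entry)
            full = len(self.slots_ret) >= BATCH
        if full:
            self.pending.append(self.executor.submit(self._release_batch))

    def drain(self):
        # Synchronous drain: frees whatever is cached right now.
        self._release_batch()

    def flush(self):
        # One possible meaning of "flush": wait for queued work to finish.
        for future in self.pending:
            future.result()
        self.pending.clear()


def demo(n_entries=200):
    with ThreadPoolExecutor(max_workers=1) as executor:
        cache = SlotCache(executor)
        for entry in range(n_entries):
            cache.free_slot(entry)
        cache.drain()   # may race with a queued worker; the lock makes it safe
        cache.flush()   # a worker that runs after drain() finds nothing to free
        return cache
```

In this model a worker that runs after drain() simply finds an empty
list, so skipping the flush does not lead to a double free. Whether the
real code also needs to wait for the worker (for example before the
cache memory itself goes away) is a separate question that this sketch
does not answer.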