From: Zhongkun He
Date: Mon, 20 Nov 2023 11:16:11 +0800
Subject: Re: [External] Re: [PATCH] mm:zswap: fix zswap entry reclamation failure in two scenarios
To: Chris Li
Cc: Yosry Ahmed, Andrew Morton, Johannes Weiner, Nhat Pham, Seth Jennings, Dan Streetman, Vitaly Wool, linux-mm, LKML, Ying
References: <20231113130601.3350915-1-hezhongkun.hzk@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

>
> Thanks for the clarification. Keep in mind that freeing memory from
> a zswap entry and zpool does not directly translate into a freed page.
> If the page holds other non-freed zswap entries or zsmalloc usage, those
> pages will not be freed to the system. That is the fragmentation cost I
> was talking about.

Yes, it may need to be compacted.

>
> With this consideration, do you know how many extra pages it can release
> back to the system by this patch in your usage case? If the difference
> is very small, it might not be worth the extra complexity to release
> those.
>

The original intention of this patch is to make shrink work properly,
not to release cache and related memory.

> > The original intention of this patch is to solve the problem that
> > shrink_work() fails to reclaim memory in two situations.
> >
> > For case (1), zswap_writeback_entry() will fail because
> > __read_swap_cache_async returns NULL: the swap has been
> > freed but is still cached in swap_slots_cache, so the memory comes from
> > the zswap entry struct and compressed page.
> In those cases, if we drop the swap_slots_cache, it will also free
> those zswap entries and compressed pages (zpool), right?
>
> > Count = SWAP_BATCH * ncpu.
>
> That is the upper limit. Not all CPUs have swap batches fully loaded.

Yes.

> > Solution: move the zswap_invalidate() out of batches, free it once the swap
> > count is equal to 0.
> Per previous discussion, this will have an impact on the
> swap_slot_cache behavior.
> We need some data points for cost benefit analysis.
>
> > For case (2), zswap_writeback_entry() will fail on !page_was_allocated
> > because zswap_load will have two copies of the same page in memory
> > (compressed and uncompressed) after faulting in a page from zswap when
> > zswap_exclusive_loads is disabled. The amount of memory is greater but depends
> > on the usage.
>
> That basically disables the future swap-out page IO write
> optimization that skips the write if the page hasn't changed. If the
> system is low on memory, that is justifiable. Again, it seems we can
> have a pass to drop the compressed memory if the swap count is zero
> (and mark the page dirty).
>

OK.

> >
> > Why do we need to release them?
> > Consider this scenario: there is a lot of data cached in memory and zswap,
> > the limit is hit, and shrink_worker will fail. The newly arriving data will be written
>
> Yes, the shrink_worker will need to allocate a page to store
> uncompressed data for write back.
>
> > directly to swap due to zswap_store failure. Should we free the last one
> > to store the latest one in zswap?
>
> By the "last one", do you mean the most recent zswap entry written into zswap?
> Well, you need to allocate a page to write it out, and that is an async process.
> Shrinking the zpool now is kind of too late already.
>

The last zswap_entry in zswap_pool->lru.

> > According to the previous discussion, the writeback is inevitable.
> > So I want to make zswap_exclusive_loads_enabled the default behavior
> > or make it the only way to do zswap loads. It only makes sense when
>
> We need some data points for how often we swap it out to zswap again,
> where the zswap out write can be saved by using the existing compressed data.
>
> It is totally possible this page IO write-out optimization is not
> worthwhile for zswap.
> We need some data to support that claim.
>

Got it. I will find it.

> > the page is read and no longer dirty. If the page is read frequently, it
> > should stay in cache rather than zswap. The benefit of doing this is
> > very small, i.e. two copies of the same page in memory.
>
> If the benefit of doing this is very small, that seems to be the
> argument against this patch?
> Again we need some data points for cost and benefit analysis.
>

Yes, it is a new idea: make zswap_exclusive_loads_enabled the default
behavior, or make it the only way to do zswap loads.

> Chris