Date: Mon, 11 Dec 2023 14:55:43 -0800
From: Minchan Kim
To: Johannes Weiner
Cc: Chris Li, Nhat Pham, akpm@linux-foundation.org, tj@kernel.org,
    lizefan.x@bytedance.com, cerasuolodomenico@gmail.com,
    yosryahmed@google.com, sjenning@redhat.com, ddstreet@ieee.org,
    vitaly.wool@konsulko.com, mhocko@kernel.org,
    roman.gushchin@linux.dev, shakeelb@google.com,
    muchun.song@linux.dev, hughd@google.com, corbet@lwn.net,
    konrad.wilk@oracle.com, senozhatsky@chromium.org, rppt@kernel.org,
    linux-mm@kvack.org, kernel-team@meta.com,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    david@ixit.cz, Kairui Song, Zhongkun He
Subject: Re: [PATCH v6] zswap: memcontrol: implement zswap writeback disabling
References: <20231207192406.3809579-1-nphamcs@gmail.com>
    <20231209034229.GA1001962@cmpxchg.org>
In-Reply-To: <20231209034229.GA1001962@cmpxchg.org>

On Fri, Dec 08, 2023 at 10:42:29PM -0500, Johannes Weiner wrote:
> On Fri, Dec 08, 2023 at 03:55:59PM -0800, Chris Li wrote:
> > I can give you three usage cases right now:
> > 1) Google producting kernel uses SSD only swap, it is currently on
> > pilot. This is not expressible by the memory.zswap.writeback. You can
> > set the memory.zswap.max = 0 and memory.zswap.writeback = 1, then SSD
> > backed swapfile. But the whole thing feels very clunky, especially
> > what you really want is SSD only swap, you need to do all this zswap
> > config dance. Google has an internal memory.swapfile feature
> > implemented per cgroup swap file type by "zswap only", "real swap file
> > only", "both", "none" (the exact keyword might be different). running
> > in the production for almost 10 years. The need for more than zswap
> > type of per cgroup control is really there.
>
> We use regular swap on SSD without zswap just fine. Of course it's
> expressible.
>
> On dedicated systems, zswap is disabled in sysfs. On shared hosts
> where it's determined based on which workload is scheduled, zswap is
> generally enabled through sysfs, and individual cgroup access is
> controlled via memory.zswap.max - which is what this knob is for.
>
> This is analogous to enabling swap globally, and then opting
> individual cgroups in and out with memory.swap.max.
>
> So this usecase is very much already supported, and it's expressed in
> a way that's pretty natural for how cgroups express access and lack of
> access to certain resources.
>
> I don't see how memory.swap.type or memory.swap.tiers would improve
> this in any way. On the contrary, it would overlap and conflict with
> existing controls to manage swap and zswap on a per-cgroup basis.
>
> > 2) As indicated by this discussion, Tencent has a usage case for SSD
> > and hard disk swap as overflow.
> > https://lore.kernel.org/linux-mm/20231119194740.94101-9-ryncsn@gmail.com/
> > +Kairui
>
> Multiple swap devices for round robin or with different priorities
> aren't new, they have been supported for a very, very long time. So
> far nobody has proposed to control the exact behavior on a per-cgroup
> basis, and I didn't see anybody in this thread asking for it either.
>
> So I don't see how this counts as an obvious and automatic usecase for
> memory.swap.tiers.
>
> > 3) Android has some fancy swap ideas led by those patches.
> > https://lore.kernel.org/linux-mm/20230710221659.2473460-1-minchan@kernel.org/
> > It got shot down due to removal of frontswap. But the usage case and
> > product requirement is there.
> > +Minchan
>
> This looks like an optimization for zram to bypass the block layer and
> hook directly into the swap code. Correct me if I'm wrong, but this
> doesn't appear to have anything to do with per-cgroup backend control.

Hi Johannes,

I haven't been following the thread closely, but I noticed the
discussion about potential use cases for zram with memcg.

One interesting idea I have is to implement a swap controller per
cgroup. This would allow us to tailor the zram swap behavior to the
specific needs of different groups.

For example, Group A, which is sensitive to swap latency, could use
zram swap with a fast compression setting, even if it sacrifices some
compression ratio. This would prioritize quick access to swapped data,
even if it takes up more space.

On the other hand, Group B, which can tolerate higher swap latency,
could benefit from a slower compression setting that achieves a higher
compression ratio. This would maximize memory efficiency at the cost of
slightly slower data access.

This approach could provide a more nuanced and flexible way to manage
swap usage within different cgroups.
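
To make the Group A / Group B trade-off concrete, here is a rough
userspace sketch. It is only an illustration under stated assumptions,
not a kernel interface: zlib levels 1 and 9 stand in for a "fast" and a
"dense" zram compressor setting, and the POLICY table, the group names,
and the swap_out()/swap_in() helpers are all hypothetical. The pages
are synthetic text-like data, so the numbers it prints say nothing
about real workloads.

```python
import random
import time
import zlib

PAGE_SIZE = 4096

# Hypothetical policy table: cgroup name -> zlib level (assumption,
# not an existing kernel knob). zlib stands in for zram compressors.
POLICY = {
    "group_a": 1,  # latency-sensitive: fast setting, lower ratio
    "group_b": 9,  # latency-tolerant: slow setting, higher ratio
}

WORDS = [b"alloc", b"free", b"page", b"cgroup", b"swap", b"zram",
         b"reclaim", b"fault", b"dirty", b"clean", b"lru", b"node"]


def make_pages(count, seed=0):
    """Return `count` synthetic 4 KiB pages of text-like data."""
    rng = random.Random(seed)
    pages = []
    for _ in range(count):
        buf = bytearray()
        while len(buf) < PAGE_SIZE:
            buf += rng.choice(WORDS) + b" "
        pages.append(bytes(buf[:PAGE_SIZE]))
    return pages


def swap_out(cgroup, pages):
    """Compress each page at the cgroup's level; return (blobs, seconds)."""
    level = POLICY[cgroup]
    start = time.perf_counter()
    blobs = [zlib.compress(page, level) for page in pages]
    return blobs, time.perf_counter() - start


def swap_in(blobs):
    """Decompress previously stored pages."""
    return [zlib.decompress(blob) for blob in blobs]


if __name__ == "__main__":
    pages = make_pages(256)
    for cgroup, level in POLICY.items():
        blobs, secs = swap_out(cgroup, pages)
        stored = sum(len(blob) for blob in blobs)
        ratio = len(pages) * PAGE_SIZE / stored
        print(f"{cgroup}: level={level} ratio={ratio:.2f} "
              f"compress_time={secs * 1e3:.1f} ms")
```

Running it prints, per group, the compression ratio and the time spent
compressing, which is the pair of quantities a per-cgroup setting would
be trading against each other.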