Date: Fri, 2 Jun 2023 12:49:42 -0400
From: Johannes Weiner
To: Yosry Ahmed
Cc: Konrad Rzeszutek Wilk, Andrew Morton, Seth Jennings, Dan Streetman,
	Vitaly Wool, Nhat Pham, Domenico Cerasuolo, Yu Zhao,
	linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: zswap: multiple zpool support
Message-ID: <20230602164942.GA215355@cmpxchg.org>
References: <20230531022911.1168524-1-yosryahmed@google.com>
 <20230601155825.GF102494@cmpxchg.org>

On Thu, Jun 01, 2023 at 10:51:39AM -0700, Yosry Ahmed wrote:
> On Thu, Jun 1, 2023 at 8:58 AM Johannes Weiner wrote:
> > On Wed, May 31, 2023 at 02:29:11AM +0000, Yosry Ahmed wrote:
> > > +config ZSWAP_NR_ZPOOLS_ORDER
> > > +	int "Number of zpools in zswap, as power of 2"
> > > +	default 0
> > > +	depends on ZSWAP
> > > +	help
> > > +	  This option determines the number of zpools to use for zswap; it
> > > +	  will be 1 << CONFIG_ZSWAP_NR_ZPOOLS_ORDER.
> > > +
> > > +	  Having multiple zpools helps with concurrency and lock contention
> > > +	  on the swap in and swap out paths, but uses a little bit of extra
> > > +	  space.
> >
> > This is nearly impossible for a user, let alone a distribution, to
> > answer adequately.
> >
> > The optimal value needs to be found empirically. And it varies heavily
> > not just by workload but by implementation changes. If we make changes
> > to the lock holding time or redo the data structures, a previously
> > chosen value might no longer be a net positive, and could even be
> > harmful.
>
> Yeah, I agree that this can only be tuned empirically, but there is a
> real benefit to tuning it, at least in our experience. I imagined
> having the config option with a default of 0 gives those who want to
> tune it the option, while not messing with users that do not care.

Realistically, how many users besides you will be able to do this
tuning and benefit from it?

> > Architecturally, the pool scoping and the interaction with zswap_tree
> > is currently a mess. We're aware of lifetime bugs, where swapoff kills
> > the tree but the pool still exists with entries making dead
> > references, e.g. We likely need to rearchitect this in the near
> > future - maybe tie pools to trees to begin with.
> >
> > (I'm assuming you're already using multiple swap files to avoid tree
> > lock contention, because it has the same scope as the pool otherwise.)
> >
> > Such changes quickly invalidate any settings the user or distro might
> > make here. Exposing the implementation detail of the pools might even
> > get in the way of fixing bugs and cleaning up the architecture.
>
> I was under the impression that config options are not very stable.
> IOW, if we fix the lock contention on an architectural level, and
> there is no more benefit to tuning the number of zpools per zswap
> pool, we can remove the config option. Is this incorrect?

The problem is that it complicates the code for everybody, for the
benefit of a likely very small set of users - who have now opted out
of any need for making the code better. And we have to keep this, and
work around it, until things work for those users who have no
incentive to address the underlying issues. That's difficult to
justify.
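To spell out the complication: every path that touches a zpool now has
to route through a per-entry shard lookup, roughly like this. This is
my paraphrase of the patch, not its actual code - the names are
hypothetical and it's untested:

#include <linux/hash.h>
#include <linux/log2.h>

#define ZSWAP_NR_ZPOOLS (1 << CONFIG_ZSWAP_NR_ZPOOLS_ORDER)

struct zswap_pool {
	struct zpool *zpools[ZSWAP_NR_ZPOOLS];
	/* ... */
};

/* Pick a shard for an entry so concurrent stores/loads spread
 * their lock contention across ZSWAP_NR_ZPOOLS zpools. */
static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
{
	int i = 0;

	if (ZSWAP_NR_ZPOOLS > 1)
		i = hash_ptr(entry, ilog2(ZSWAP_NR_ZPOOLS));

	return entry->pool->zpools[i];
}

Every alloc, map, and free then goes through such a lookup, with the
shard count baked in at build time by the config knob.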
> > I don't think this patch is ready for primetime. A user build time
> > setting is not appropriate for an optimization that is heavily tied
> > to implementation details and workload dynamics.
>
> What would you suggest instead? We do find value in having multiple
> zpools, at least for the current architecture.

I think it makes sense to look closer at the lock contention, and
whether this can be improved more generically.

zbud and z3fold don't have a lock in the ->map callback that shows up
heavily in the profile from your changelog. Is this zsmalloc specific?
If so, looking more closely at zsmalloc locking makes more sense. Is a
lockless implementation feasible? Can we do rw-locking against
zs_compact() to allow concurrency in zs_map_object() (rough sketch at
the end of this mail)? This would benefit everybody, even zram users.

Again, what about the zswap_tree.lock and swap_info_struct.lock?
They're the same scope unless you use multiple swapfiles. Would it
make sense to tie pools to trees, so that using multiple swapfiles
for concurrency purposes also implies this optimization?
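For illustration, the kind of rw-locking I mean - an untested sketch,
not a real patch; the migrate_lock field is my invention, and
zsmalloc's actual locking is more fine-grained than this:

#include <linux/spinlock.h>

struct zs_pool {
	/* ... existing fields ... */
	rwlock_t migrate_lock;	/* mappers shared, compaction exclusive */
};

void *zs_map_object(struct zs_pool *pool, unsigned long handle,
		    enum zs_mapmode mm)
{
	/* mappers only exclude compaction, not each other */
	read_lock(&pool->migrate_lock);
	/* ... existing handle lookup and mapping of the object ... */
}

void zs_unmap_object(struct zs_pool *pool, unsigned long handle)
{
	/* ... existing unmap ... */
	read_unlock(&pool->migrate_lock);
}

unsigned long zs_compact(struct zs_pool *pool)
{
	unsigned long pages_freed;

	/* exclusive: wait out all mappers before moving objects */
	write_lock(&pool->migrate_lock);
	/* ... existing compaction, sets pages_freed ... */
	write_unlock(&pool->migrate_lock);

	return pages_freed;
}

Compaction is comparatively rare and slow, so penalizing it to let
mappers run concurrently seems like the right trade-off - but somebody
would need to measure it.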