From: Yosry Ahmed
Date: Fri, 7 Jun 2024 10:24:23 -0700
Subject: Re: [PATCH] mm: zsmalloc: share slab caches for all zsmalloc zpools
To: Minchan Kim
Cc: Andrew Morton, Sergey Senozhatsky, Vlastimil Babka, David Rientjes,
	Christoph Lameter, Erhard Furtner, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Yu Zhao
References: <20240604175340.218175-1-yosryahmed@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Fri, Jun 7, 2024 at 9:14 AM Minchan Kim wrote:
>
> On Thu, Jun 06, 2024 at 04:03:55PM -0700, Yosry Ahmed wrote:
> > On Thu, Jun 6, 2024 at 3:36 PM Minchan Kim wrote:
> > >
> > > On Tue, Jun 04, 2024 at 05:53:40PM +0000, Yosry Ahmed wrote:
> > > > Zswap creates multiple zpools to improve concurrency. Each zsmalloc
> > > > zpool creates its own 'zs_handle' and 'zspage' slab caches, so we
> > > > currently end up with 32 slab caches of each type.
> > > >
> > > > Since each slab cache holds some free objects, we end up with a lot
> > > > of free objects distributed among the separate zpool caches. Slab
> > > > caches are designed to handle concurrent allocations by using percpu
> > > > structures, so a single instance of each cache should be enough, and
> > > > it avoids wasting more memory than needed due to fragmentation.
> > > >
> > > > Additionally, having more slab caches than needed unnecessarily
> > > > slows down code paths that iterate slab_caches.
> > > >
> > > > In the results reported by Eric in [1], the amount of unused slab
> > > > memory in these caches goes down from 242808 bytes to 29216 bytes
> > > > (-88%), calculated as (num_objs - active_objs) * objsize for each
> > > > 'zs_handle' and 'zspage' cache. Although this patch did not help
> > > > with the allocation failure Eric reported with zswap + zsmalloc, I
> > > > think it is still worth merging on its own.
> > > >
> > > > [1] https://lore.kernel.org/lkml/20240604134458.3ae4396a@yea/
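
To make the quoted description concrete: the consolidation amounts to
roughly the following shape, i.e. one module-wide pair of caches,
refcounted across pools, instead of one pair per zs_create_pool(). This
is a simplified sketch with made-up helper names, not the literal diff:

/*
 * Sketch only: the shared caches are created with the first pool and
 * destroyed with the last one.
 */
static struct kmem_cache *zs_handle_cache;	/* shared by all pools */
static struct kmem_cache *zspage_cache;
static unsigned int zs_cache_users;	/* protected by zs_cache_lock */
static DEFINE_MUTEX(zs_cache_lock);

static int zs_get_caches(void)
{
	int ret = 0;

	mutex_lock(&zs_cache_lock);
	if (zs_cache_users++ == 0) {
		zs_handle_cache = kmem_cache_create("zs_handle",
					ZS_HANDLE_SIZE, 0, 0, NULL);
		zspage_cache = kmem_cache_create("zspage",
					sizeof(struct zspage), 0, 0, NULL);
		if (!zs_handle_cache || !zspage_cache) {
			kmem_cache_destroy(zs_handle_cache); /* NULL-safe */
			kmem_cache_destroy(zspage_cache);
			zs_handle_cache = NULL;
			zspage_cache = NULL;
			zs_cache_users = 0;
			ret = -ENOMEM;
		}
	}
	mutex_unlock(&zs_cache_lock);
	return ret;
}

static void zs_put_caches(void)
{
	mutex_lock(&zs_cache_lock);
	if (--zs_cache_users == 0) {
		kmem_cache_destroy(zs_handle_cache);
		kmem_cache_destroy(zspage_cache);
		zs_handle_cache = NULL;
		zspage_cache = NULL;
	}
	mutex_unlock(&zs_cache_lock);
}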
> > >
> > > I doubt this is the right direction.
> > >
> > > Zsmalloc is used for various purposes, each with different object
> > > lifecycles. For example, swap tends to involve relatively short-lived
> > > objects, while filesystem use cases might have longer-lived ones.
> > > This mix of lifecycles could lead to fragmentation with this approach.
> >
> > Even in a swapfile, some objects can be short-lived and some can be
> > long-lived, and the line between swap and file systems becomes blurry
> > with shmem/tmpfs. I don't think having separate caches
>
> Many allocators differentiate object lifecycles to minimize
> fragmentation. While this isn't a new concept, you argue it's
> irrelevant without a clear-cut use case.
>
> > here is vital, but I am not generally familiar with the file system
> > use cases and I don't have data to prove/disprove it.
>
> The use case I had in mind was build output directories (e.g., Android).
> These consume object files in zram until the next build.
>
> Other potential scenarios involve separate zrams: one for foreground
> apps (short-term) and another for cached apps (long-term). Even zswap
> and zram could have different object lifecycles, as zswap might write
> back more aggressively.
>
> While you see no clear use cases, I disagree with dismissing this
> concept without strong justification.

I was just unaware of these use cases, as I mentioned. I didn't really
know how zram was used with file systems. Thanks for the examples :)

> > >
> > > I believe the original problem arose when zsmalloc reduced its lock
> > > granularity from the class level to a global level. Zswap then went
> > > on to mitigate the issue with multiple zpools, but that's essentially
> > > another bandaid on top of the existing problem, IMO.
> >
> > IIRC we reduced the granularity when we added writeback support to
> > zsmalloc, which was relatively recent, but I think we had seen lock
> > contention with zsmalloc long before that. We have had a similar patch
> > internally to use multiple zpools in zswap for many years now.
> >
> > +Yu Zhao
> >
> > Yu has more historical context about this; I am hoping he will shed
> > more light on it.
> > >
> > > The correct approach would be to further reduce the zsmalloc lock
> > > granularity.
> >
> > I definitely agree that the correct approach is to fix the lock
> > contention at the source and drop zswap's usage of multiple zpools.
> > Nonetheless, I think this patch provides value in the meantime. The
> > fragmentation within the slab caches is real with zswap's use case.
> > OTOH, sharing a cache between swap and file system use cases, and the
> > fragmentation that can cause within the same slab cache, is a less
> > severe problem in my opinion.
> >
> > That being said, I don't feel strongly. If you really don't like this
> > patch, I am fine with dropping it.
>
> How about introducing a flag like "bool slab_merge" in zs_create_pool?
> This would allow zswap to unify its slabs while other users keep
> theirs separate.

Yeah, this should work.
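Something along these lines is what I would imagine, reusing the
refcounted helpers from the sketch above (rough and untested; the flag
plumbing and field names are illustrative only):

struct zs_pool *zs_create_pool(const char *name, bool slab_merge)
{
	struct zs_pool *pool = kzalloc(sizeof(*pool), GFP_KERNEL);

	if (!pool)
		return NULL;

	if (slab_merge) {
		/* zswap: take a reference on the shared caches. */
		if (zs_get_caches())
			goto err;
		pool->handle_cachep = zs_handle_cache;
		pool->zspage_cachep = zspage_cache;
	} else {
		/*
		 * zram et al.: keep private caches to isolate object
		 * lifecycles (real code would append the pool name to
		 * keep the cache names unique).
		 */
		pool->handle_cachep = kmem_cache_create("zs_handle",
					ZS_HANDLE_SIZE, 0, 0, NULL);
		pool->zspage_cachep = kmem_cache_create("zspage",
					sizeof(struct zspage), 0, 0, NULL);
		if (!pool->handle_cachep || !pool->zspage_cachep)
			goto err;
	}
	/* ... rest of pool initialization as today ... */
	return pool;

err:
	if (!slab_merge) {
		kmem_cache_destroy(pool->handle_cachep); /* NULL-safe */
		kmem_cache_destroy(pool->zspage_cachep);
	}
	kfree(pool);
	return NULL;
}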
But I'll wait until we have more data and we know whether we need to
keep using multiple zpools for zswap. I sent this patch because I
thought it would be generally useful to share caches (e.g. if we have
zram and zswap on the same system), but given that you said separate
caches are actually preferable, it may not be. I'll wait for that data
before sending any more patches to address this.

Andrew, could you please take this out of mm-unstable?

Thanks!
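
P.S. For anyone following along without context: the multiple-zpool
scheme in zswap that motivated this patch is roughly the following
shape (simplified, from memory):

#define ZSWAP_NR_ZPOOLS 32

struct zswap_pool {
	struct zpool *zpools[ZSWAP_NR_ZPOOLS];
	/* ... */
};

/* Spread entries across zpools to dilute zsmalloc's pool-level lock. */
static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
{
	return entry->pool->zpools[hash_ptr(entry, ilog2(ZSWAP_NR_ZPOOLS))];
}

Each of those 32 zpools is a separate zs_pool, hence the 32 'zs_handle'
and 32 'zspage' slab caches this patch was consolidating.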