Date: Thu, 28 Mar 2024 15:31:49 -0400
From: Johannes Weiner
To: Yosry Ahmed
Cc:
 Andrew Morton, Nhat Pham, Chengming Zhou, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 6/9] mm: zswap: drop support for non-zero same-filled pages handling
Message-ID: <20240328193149.GF7597@cmpxchg.org>
References: <20240325235018.2028408-1-yosryahmed@google.com>
 <20240325235018.2028408-7-yosryahmed@google.com>
In-Reply-To: <20240325235018.2028408-7-yosryahmed@google.com>

On Mon, Mar 25, 2024 at 11:50:14PM +0000, Yosry Ahmed wrote:
> The current same-filled pages handling supports pages filled with any
> repeated word-sized pattern. However, in practice, most of these should
> be zero pages anyway. Other patterns should be nearly as common.
>
> Drop the support for non-zero same-filled pages, but keep the names of
> knobs exposed to userspace as "same_filled", which isn't entirely
> inaccurate.
>
> This yields some nice code simplification and enables a following patch
> that eliminates the need to allocate struct zswap_entry for those pages
> completely.
>
> There is also a very small performance improvement observed over 50 runs
> of kernel build test (kernbench) comparing the mean build time on a
> skylake machine when building the kernel in a cgroup v1 container with a
> 3G limit:
>
>              base     patched    % diff
> real       70.167      69.915   -0.359%
> user     2953.068    2956.147   +0.104%
> sys      2612.811    2594.718   -0.692%
>
> This probably comes from more optimized operations like memchr_inv() and
> clear_highpage(). Note that the percentage of zero-filled pages during
> this test was only around 1.5% on average, and was not affected by this
> patch. Practical workloads could have a larger proportion of such pages
> (e.g. Johannes observed around 10% [1]), so the performance improvement
> should be larger.
>
> [1]https://lore.kernel.org/linux-mm/20240320210716.GH294822@cmpxchg.org/
>
> Signed-off-by: Yosry Ahmed

This is an interesting direction to pursue, but I actually think it
doesn't go far enough. Either way, I think it needs more data.

1) How frequent are non-zero-same-filled pages? Difficult to
   generalize, but if you could gather some data from your fleet, that
   would be useful. If you can devise a portable strategy, I'd also be
   more than happy to gather this on ours (although I think you have
   more widespread zswap use, whereas we have more disk swap.)

2) The fact that we're doing any of this pattern analysis in zswap at
   all strikes me as a bit misguided. Being efficient about repetitive
   patterns is squarely in the domain of a compression algorithm. Do
   we not trust e.g. zstd to handle this properly?

   I'm guessing this goes back to inefficient packing from something
   like zbud, which would waste half a page on one repeating byte.

   But zsmalloc can do 32 byte objects. It's also a batching slab
   allocator, where storing a series of small, same-sized objects is
   quite fast.

   Add to that the additional branches, the additional kmap, the extra
   scanning of every single page for patterns - all in the fast path
   of zswap, when we already know that the vast majority of incoming
   pages will need to be properly compressed anyway.

   Maybe it's time to get rid of the special handling entirely?
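
For readers following along, here is a rough userspace sketch of the two
checks being compared in this thread: the general "same-filled" test
(any repeated word-sized pattern) and the narrower zero-filled test the
patch keeps. This is not the zswap code. The names page_same_filled()
and page_zero_filled(), and the fixed 4096-byte PAGE_SIZE, are
assumptions made for illustration only, and the in-kernel version
reportedly uses memchr_inv() rather than the memcmp() trick below.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096 /* assumed page size, for illustration only */

/*
 * General check: true if every word in the page equals the first word.
 * On success the repeated pattern is stored in *value.
 * The buffer is assumed to be suitably aligned for unsigned long.
 */
static bool page_same_filled(const void *page, unsigned long *value)
{
	const unsigned long *words = page;
	size_t nr_words = PAGE_SIZE / sizeof(*words);

	for (size_t i = 1; i < nr_words; i++) {
		if (words[i] != words[0])
			return false;
	}
	*value = words[0];
	return true;
}

/*
 * Zero-only check: no pattern needs to be remembered.
 * If the first byte is zero and each byte equals its successor,
 * the whole page is zero.
 */
static bool page_zero_filled(const void *page)
{
	const unsigned char *bytes = page;

	return bytes[0] == 0 &&
	       memcmp(bytes, bytes + 1, PAGE_SIZE - 1) == 0;
}
```

Either way, every incoming page still has to be scanned before it is
known whether it needs real compression, which is the fast-path cost
raised in point 2) above.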