Date: Thu, 22 Feb 2024 05:56:04 +0000
Subject: Re: [RFC] Analyzing zpool allocators / Removing zbud and z3fold
From: Yosry Ahmed
To: Chengming Zhou
Cc: Andrew Morton, Vitaly Wool, Miaohe Lin, Johannes Weiner, Nhat Pham,
    Linux-MM, Linux Kernel Mailing List, Christoph Hellwig,
    Sergey Senozhatsky, Minchan Kim, Chris Down, Seth Jennings,
    Dan Streetman, Chris Li

On Thu, Feb 22, 2024 at 11:54:44AM +0800, Chengming Zhou wrote:
> On 2024/2/9 11:27, Yosry Ahmed wrote:
> > Hey folks,
> >
> > This is a follow up on my previously sent RFC patch to deprecate
> > z3fold [1]. This is an RFC without code, I thought I could get some
> > discussion going before writing (or rather deleting) more code. I went
> > back to do some analysis on the 3 zpool allocators: zbud, zsmalloc,
> > and z3fold.
>
> This is a great analysis! Sorry for being late to see it.
>
> I want to vote for this direction, zram has been using zsmalloc directly,
> zswap can also do this, which is simpler and we can just maintain and optimize
> only one allocator. The only evident downside is dependence on MMU, right?

AFAICT, yes. I saw a lot of positive responses when I sent an RFC to
mark z3fold as deprecated, but there were some opposing opinions as
well, which is why I did this simple analysis. I was hoping we can make
forward progress with that, but was disappointed it didn't get as much
attention as the deprecation RFC :)

>
> And I'm trying to optimize the scalability performance for zsmalloc now,
> which is bad so zswap has to use 32 pools to workaround it. (zram only use
> one pool, should also have the scalability problem on big server, maybe
> have to use many zram block devices to workaround it too.)

That's slightly orthogonal. Zsmalloc is not really showing worse
performance than other allocators, so this should be a separate effort.

>
> But too many pools would cause more memory waste and more fragmentation,
> so the resulted compression ratio is not good enough.
>
> As for the MMU dependence, we can actually avoid it? Maybe I missed something,
> we can get object's memory vecs from zsmalloc, then send it to decompress,
> which should support length(memory vecs) > 1?

IIUC the dependency on MMU is due to the use of kmap() APIs and the fact
that we may be using highmem pages. I think we may be able to work
around that dependency but I didn't look closely. Hopefully Minchan or
Sergey could shed more light on this.

> >
> > [1]https://lore.kernel.org/linux-mm/20240112193103.3798287-1-yosryahmed@google.com/
> >
> > In this analysis, for each of the allocators I ran a kernel build test
> > on tmpfs in a limited cgroup 5 times and captured:
> > (a) The build times.
> > (b) zswap_load() and zswap_store() latencies using bpftrace.
> > (c) The maximum size of the zswap pool from /proc/meminfo::Zswapped.
>
> Here should use /proc/meminfo::Zswap, right?
> Zswap is the sum of pool pages size, Zswapped is the swapped/compressed pages.

Oh yes, it is /proc/meminfo::Zswap actually. I miswrote it in my email.
Thanks!
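
For anyone following along, here is a rough sketch of how the two
/proc/meminfo fields discussed above could be read side by side. It is
not the script used for the analysis. The helper names (parse_meminfo,
zswap_summary) and the SAMPLE numbers are made up for illustration; only
the Zswap and Zswapped field names come from the thread.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def zswap_summary(fields):
    """Return (pool_size, stored_size, ratio) from parsed meminfo fields.

    Zswap is the memory consumed by the compressed pool, Zswapped is the
    original (uncompressed) size of the pages stored in it.
    """
    zswap = fields.get("Zswap", 0)
    zswapped = fields.get("Zswapped", 0)
    ratio = zswapped / zswap if zswap else None
    return zswap, zswapped, ratio


# Hypothetical sample text, NOT data from the analysis in this thread.
SAMPLE = """MemTotal:       16000000 kB
Zswap:              1024 kB
Zswapped:           4096 kB
"""

if __name__ == "__main__":
    pool, stored, ratio = zswap_summary(parse_meminfo(SAMPLE))
    print(pool, stored, ratio)
```

On a live system the text would come from reading /proc/meminfo, and
getting the maximum pool size would mean sampling it repeatedly during
the build and keeping the largest Zswap value seen.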