Date: Tue, 12 Mar 2024 04:04:33 +0000
In-Reply-To: <20240312023411.GA22705@cmpxchg.org>
References: <20240311161214.1145168-1-hannes@cmpxchg.org> <20240312023411.GA22705@cmpxchg.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: zswap: optimize zswap pool size tracking
From: Yosry Ahmed
To: Johannes Weiner
Cc: Andrew Morton, Nhat Pham, Chengming Zhou, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Mon, Mar 11, 2024 at 10:34:11PM -0400, Johannes Weiner wrote:
> On Mon, Mar 11, 2024 at 10:09:35PM +0000, Yosry Ahmed wrote:
> > On Mon, Mar 11, 2024 at 12:12:13PM -0400, Johannes Weiner wrote:
> > > Profiling the munmap() of a zswapped memory region shows 50%(!) of the
> > > total cycles currently going into updating the zswap_pool_total_size.
> >
> > Yikes. I have always hated that size update scheme FWIW.
> >
> > I have also wondered whether it makes sense to just maintain the number
> > of pages in zswap as an atomic, like zswap_stored_pages. I guess your
> > proposed scheme is even cheaper for the load/invalidate paths because we
> > do nothing at all. It could be an option if the aggregation in other
> > paths ever becomes a problem, but we would need to make sure it
> > doesn't regress the load/invalidate paths. Just sharing some thoughts.
>
> Agree with you there. I actually tried doing it that way at first, but
> noticed zram uses zs_get_total_pages() and actually wants a per-pool
> count. I didn't want the backend to have to update two atomics, so I
> settled for this version.

Could be useful to document this context if you send a v2. This version
is a big improvement anyway, so hopefully we don't need to revisit.

>
> > > There are three consumers of this counter:
> > > - store, to enforce the globally configured pool limit
> > > - meminfo & debugfs, to report the size to the user
> > > - shrink, to determine the batch size for each cycle
> > >
> > > Instead of aggregating everytime an entry enters or exits the zswap
> > > pool, aggregate the value from the zpools on-demand:
> > >
> > > - Stores aggregate the counter anyway upon success. Aggregating to
> > >   check the limit instead is the same amount of work.
> > >
> > > - Meminfo & debugfs might benefit somewhat from a pre-aggregated
> > >   counter, but aren't exactly hotpaths.
> > >
> > > - Shrinking can aggregate once for every cycle instead of doing it for
> > >   every freed entry. As the shrinker might work on tens or hundreds of
> > >   objects per scan cycle, this is a large reduction in aggregations.
> > >
> > > The paths that benefit dramatically are swapin, swapoff, and
> > > unmaps. There could be millions of pages being processed until
> > > somebody asks for the pool size again. This eliminates the pool size
> > > updates from those paths entirely.
> >
> > This looks like a big win, thanks! I wonder if you have any numbers of
> > perf profiles to share. That would be nice to have, but I think the
> > benefit is clear regardless.
>
> I deleted the perf files already, but can re-run it tomorrow.

Thanks!

>
> > I also like the implicit cleanup when we switch to maintaining the
> > number of pages rather than bytes. The code looks much better with all
> > the shifts and divisions gone :)
> >
> > I have a couple of comments below. With them addressed, feel free to
> > add:
> > Acked-by: Yosry Ahmed
>
> Thanks!
>
> > > @@ -1385,6 +1365,10 @@ static void shrink_worker(struct work_struct *w)
> > >  {
> > >  	struct mem_cgroup *memcg;
> > >  	int ret, failures = 0;
> > > +	unsigned long thr;
> > > +
> > > +	/* Reclaim down to the accept threshold */
> > > +	thr = zswap_max_pages() * zswap_accept_thr_percent / 100;
> >
> > This calculation is repeated twice, so I'd rather keep a helper for it
> > as an alternative to zswap_can_accept(). Perhaps zswap_threshold_page()
> > or zswap_acceptance_pages()?
>
> Sounds good. I went with zswap_accept_thr_pages().

Even better.

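For illustration only (this is not code from the patch): a minimal
userspace sketch of what such a helper could look like. It assumes the
threshold is simply zswap_accept_thr_percent of zswap_max_pages(), as in
the quoted hunk. The body of zswap_max_pages() and all the values below
are assumptions made up for the example.

```c
/*
 * Userspace stand-ins for kernel state. The names mirror the thread;
 * the values are hypothetical and chosen only to make the math visible.
 */
static unsigned long total_ram_pages = 1000000;    /* assumed RAM size in pages */
static unsigned int zswap_max_pool_percent = 20;   /* assumed pool limit, % of RAM */
static unsigned int zswap_accept_thr_percent = 90; /* assumed threshold, % of limit */

/* Assumed definition: the pool limit expressed in pages. */
static unsigned long zswap_max_pages(void)
{
	return total_ram_pages * zswap_max_pool_percent / 100;
}

/* The calculation from the quoted hunk, wrapped in one helper. */
static unsigned long zswap_accept_thr_pages(void)
{
	return zswap_max_pages() * zswap_accept_thr_percent / 100;
}
```

With a helper like this, both call sites would compare the current pool
size in pages against zswap_accept_thr_pages() instead of repeating the
multiplication and division.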
>
> > > @@ -1711,6 +1700,13 @@ void zswap_swapoff(int type)
> > >
> > >  static struct dentry *zswap_debugfs_root;
> > >
> > > +static int debugfs_get_total_size(void *data, u64 *val)
> > > +{
> > > +	*val = zswap_total_pages() * PAGE_SIZE;
> > > +	return 0;
> > > +}
> > > +DEFINE_DEBUGFS_ATTRIBUTE(total_size_fops, debugfs_get_total_size, NULL, "%llu");
> >
> > I think we are missing a newline here to maintain the current format
> > (i.e "%llu\n").
>
> Oops, good catch! I had verified the debugfs file (along with the
> others) with 'grep . *', which hides that this is missing. Fixed up.
>
> Thanks for taking a look. The incremental diff is below. I'll run the
> tests and recapture the numbers tomorrow, then send v2.

LGTM. Feel free to carry the Ack forward.
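
As a side note on the newline issue above, a small userspace sketch of
the difference between the two format strings. format_attr() and
ends_with_newline() are hypothetical helpers written for this example.
They only imitate how an attribute value gets formatted and are not
debugfs APIs.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a counter using the given format string. */
static int format_attr(char *buf, size_t len, const char *fmt,
		       unsigned long long val)
{
	return snprintf(buf, len, fmt, val);
}

/* Returns 1 if the formatted text ends with a newline, 0 otherwise. */
static int ends_with_newline(const char *s)
{
	size_t n = strlen(s);

	return n > 0 && s[n - 1] == '\n';
}
```

Formatting with "%llu" yields text with no trailing newline, while
"%llu\n" terminates the line. grep prints each matching line followed by
a newline of its own, which is why 'grep . *' output looks the same in
both cases.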