Date: Wed, 11 Oct 2023 21:36:59 -0700 (PDT)
From: Hugh Dickins
To: Dave Chinner
cc: Hugh Dickins, Andrew Morton, Tim Chen, Dave Chinner, "Darrick J.
Wong", Christian Brauner, Carlos Maiolino, Chuck Lever, Jan Kara,
    Matthew Wilcox, Johannes Weiner, Axel Rasmussen, Dennis Zhou,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [PATCH 8/8] shmem,percpu_counter: add _limited_add(fbc, limit, amount)
References: <2451f678-38b3-46c7-82fe-8eaf4d50a3a6@google.com>

On Mon, 9 Oct 2023, Dave Chinner wrote:
> On Thu, Oct 05, 2023 at 10:35:33PM -0700, Hugh Dickins wrote:
> > On Thu, 5 Oct 2023, Dave Chinner wrote:
> > >
> > > Hmmmm. IIUC, this only works for addition that approaches the limit
> > > from below?
> >
> > That's certainly how I was thinking about it, and what I need for tmpfs.
> > Precisely what its limitations (haha) are, I'll have to take care to
> > spell out.
> >
> > (IIRC - it's a while since I wrote it - it can be used for subtraction,
> > but goes the very slow way when it could go the fast way - uncompared
> > percpu_counter_sub() much better for that. You might be proposing that
> > a tweak could adjust it to going the fast way when coming down from the
> > "limit", but going the slow way as it approaches 0 - that would be neat,
> > but I've not yet looked into whether it's feasily done.)

Easily done once I'd looked at it from the right angle.

> >
> > >
> > > So if we are approaching the limit from above (i.e. add of a
> > > negative amount, limit is zero) then this code doesn't work the same
> > > as the open-coded compare+add operation would?
> >
> > To it and to me, a limit of 0 means nothing positive can be added
> > (and it immediately returns false for that case); and adding anything
> > negative would be an error since the positive would not have been allowed.
> >
> > Would a negative limit have any use?

There was no reason to exclude it, once I was thinking clearly
about the comparisons.

>
> I don't have any use for it, but the XFS case is decrementing free
> space to determine if ENOSPC has been hit. It's the opposite
> implementation to shmem, which increments used space to determine if
> ENOSPC is hit.

Right.

>
> > It's definitely not allowing all the possibilities that you could arrange
> > with a separate compare and add; whether it's ruling out some useful
> > possibilities to which it can easily be generalized, I'm not sure.
> >
> > Well worth a look - but it'll be easier for me to break it than get
> > it right, so I might just stick to adding some comments.
> >
> > I might find that actually I prefer your way round: getting slower
> > as approaching 0, without any need for specifying a limit?? That the
> > tmpfs case pushed it in this direction, when it's better reversed? Or
> > that might be an embarrassing delusion which I'll regret having mentioned.
>
> I think there's cases for both approaching an upper limit from
> below and a lower limit from above. Both are the same "compare and
> add" algorithm, just with minor logic differences...

Good, thanks, you've saved me: I was getting a bit fundamentalist there,
thinking to offer one simplest primitive from which anything could be built.
But when it came down to it, I had no enthusiasm for rewriting tmpfs's
used_blocks as free_blocks, just to avoid that limit argument.

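To make the "same compare and add, minor logic differences" point concrete,
here is a minimal single-threaded userspace sketch of the semantics being
discussed. It is not the kernel implementation: there is no percpu state and
no locking, and struct sketch_counter, sketch_limited_add() and sketch_demo()
are hypothetical names used only for illustration. It assumes the add is
allowed when the result stays at or below the limit for a positive amount,
and at or above the limit for a negative amount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a counter; a real percpu_counter is split per CPU. */
struct sketch_counter {
	int64_t count;
};

/*
 * Add amount only if the result stays on the permitted side of limit:
 *   amount > 0: result must not rise above limit (tmpfs-style used blocks)
 *   amount < 0: result must not fall below limit (XFS-style free blocks)
 * Returns true if the amount was added, false if the counter was left alone.
 * Values are assumed sane, i.e. far from INT64_MIN and INT64_MAX.
 */
static bool sketch_limited_add(struct sketch_counter *c,
			       int64_t limit, int64_t amount)
{
	int64_t result;

	if (amount == 0)
		return true;

	result = c->count + amount;
	if (amount > 0 ? result > limit : result < limit)
		return false;

	c->count = result;
	return true;
}

/* Both directions, with the same primitive. */
static void sketch_demo(void)
{
	struct sketch_counter used = { .count = 90 };
	struct sketch_counter avail = { .count = 5 };

	assert(sketch_limited_add(&used, 100, 10));	/* reaches the upper limit */
	assert(!sketch_limited_add(&used, 100, 1));	/* would exceed it */

	assert(sketch_limited_add(&avail, 0, -5));	/* reaches the lower limit */
	assert(!sketch_limited_add(&avail, 0, -1));	/* would go below it */
}
```

The only difference between the two directions is which comparison is applied
to the result, which is why a single limit argument can serve both.
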
>
> > > Hence I think this looks like an "add if result is less than"
> > > operation, which is distinct from the "add if result is greater
> > > than" operation that we use this same pattern for in XFS and ext4.
> > > Perhaps a better name is in order?
> >
> > The name still seems good to me, but a comment above it on its
> > assumptions/limitations well worth adding.
> >
> > I didn't find a percpu_counter_compare() in ext4, and haven't got
>
> Go search for EXT4_FREECLUSTERS_WATERMARK....

Ah, not a percpu_counter_compare() user, but doing its own thing.

>
> > far yet with understanding the XFS ones: tomorrow...
>
> XFS detects being near ENOSPC to change the batch update size so
> that when near ENOSPC the percpu counter always aggregates to the
> global sum on every modification. i.e. it becomes more accurate (but
> slower) near the ENOSPC threshold. Then if the result of the
> subtraction ends up being less than zero, it takes a lock (i.e. goes
> even slower!), undoes the subtraction that took it below zero, and
> determines if it can dip into the reserve pool or ENOSPC should be
> reported.
>
> Some of that could be optimised, but we need that external "lock and
> undo" mechanism to manage the reserve pool space atomically at
> ENOSPC...

Thanks for going above and beyond with the description; but I'll be
honest and admit that I only looked quickly, and did not reach any
conclusion as to whether such usage could or should be converted
to percpu_counter_limited_add() - which would never take any XFS
locks, of course, so might just end up doubling the slow work.

But absolutely I agree with you, and thank you for pointing out,
how stupidly useless percpu_counter_limited_add() was for decrementing -
it was nothing more than a slow way of doing percpu_counter_sub().

I'm about to send in a 9/8, extending it to be more useful: thanks.

Hugh
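
For readers following the thread, the XFS scheme quoted above (shrink the
batch near ENOSPC, subtract, then undo and consult the reserve pool if the
result went negative) can be modelled roughly as below. This is a
single-threaded userspace sketch under stated assumptions, not XFS code:
NCPU, BIG_BATCH, NEAR_ENOSPC, struct model_counter and the model_*()
functions are all hypothetical, and the place where real code would take a
lock is only marked by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPU		4
#define BIG_BATCH	32			/* hypothetical batch when far from ENOSPC */
#define NEAR_ENOSPC	(NCPU * BIG_BATCH)	/* hypothetical "nearly full" threshold */

struct model_counter {
	int64_t count;		/* aggregated global value */
	int64_t delta[NCPU];	/* per-CPU deltas not yet folded into count */
};

/* Precise (slow) sum: global value plus every per-CPU delta. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;

	for (int i = 0; i < NCPU; i++)
		sum += c->delta[i];
	return sum;
}

/* Fold the per-CPU delta into the global count once it reaches batch. */
static void model_add_batch(struct model_counter *c, int cpu,
			    int64_t amount, int64_t batch)
{
	int64_t d = c->delta[cpu] + amount;

	if (d >= batch || d <= -batch) {
		c->count += d;
		c->delta[cpu] = 0;
	} else {
		c->delta[cpu] = d;
	}
}

/* Take nblocks of free space; returns false where ENOSPC would be reported. */
static bool model_alloc(struct model_counter *free_blocks, int64_t *reserve,
			int cpu, int64_t nblocks)
{
	/* Near the threshold, use batch 1 so every change hits the global count. */
	int64_t batch = free_blocks->count < NEAR_ENOSPC ? 1 : BIG_BATCH;

	model_add_batch(free_blocks, cpu, -nblocks, batch);
	if (model_sum(free_blocks) >= 0)
		return true;

	/* Went below zero: real code would take a lock here before undoing. */
	model_add_batch(free_blocks, cpu, nblocks, 1);
	if (*reserve >= nblocks) {
		*reserve -= nblocks;	/* dip into the reserve pool */
		return true;
	}
	return false;
}
```

In this model the undo step and the reserve-pool decision sit outside the
counter itself, which is the external "lock and undo" mechanism that a
limited-add primitive alone would not provide.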