Date: Sun, 28 Nov 2021 19:03:41 +0100
From: mirq-test@rere.qmqm.pl
To: Yury Norov
Cc: linux-kernel@vger.kernel.org, "James E.J. Bottomley",
	"Martin K. Petersen", "Paul E. McKenney", "Rafael J. Wysocki",
	Alexander Shishkin, Alexey Klimov, Amitkumar Karwar, Andi Kleen,
	Andrew Lunn, Andrew Morton, Andy Gross, Andy Lutomirski,
	Andy Shevchenko, Anup Patel, Ard Biesheuvel,
	Arnaldo Carvalho de Melo, Arnd Bergmann, Borislav Petkov,
	Catalin Marinas, Christoph Hellwig, Christoph Lameter,
	Daniel Vetter, Dave Hansen, David Airlie, David Laight,
	Dennis Zhou, Dinh Nguyen, Geetha sowjanya, Geert Uytterhoeven,
	Greg Kroah-Hartman, Guo Ren, Hans de Goede, Heiko Carstens,
	Ian Rogers, Ingo Molnar, Jakub Kicinski, Jason Wessel, Jens Axboe,
	Jiri Olsa, Jonathan Cameron, Juri Lelli, Kalle Valo, Kees Cook,
	Krzysztof Kozlowski, Lee Jones, Marc Zyngier, Marcin Wojtas,
	Mark Gross, Mark Rutland, Matti Vaittinen, Mauro Carvalho Chehab,
	Mel Gorman, Michael Ellerman, Mike Marciniszyn, Nicholas Piggin,
	Palmer Dabbelt, Peter Zijlstra, Petr Mladek, Randy Dunlap,
	Rasmus Villemoes, Roy Pledge, Russell King, Saeed Mahameed,
	Sagi Grimberg, Sergey Senozhatsky, Solomon Peachy, Stephen Boyd,
	Stephen Rothwell, Steven Rostedt, Subbaraya Sundeep, Sudeep Holla,
	Sunil Goutham, Tariq Toukan, Tejun Heo, Thomas Bogendoerfer,
	Thomas Gleixner, Ulf Hansson, Vincent Guittot, Vineet Gupta,
	Viresh Kumar, Vivien Didelot, Vlastimil Babka, Will Deacon,
	bcm-kernel-feedback-list@broadcom.com, kvm@vger.kernel.org,
	linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-crypto@vger.kernel.org, linux-csky@vger.kernel.org,
	linux-ia64@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-mm@kvack.org, linux-perf-users@vger.kernel.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org,
	linux-snps-arc@lists.infradead.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
References: <20211128035704.270739-1-yury.norov@gmail.com>
In-Reply-To: <20211128035704.270739-1-yury.norov@gmail.com>

On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote:
> In many cases people use bitmap_weight()-based functions like this:
>
> 	if (num_present_cpus() > 1)
> 		do_something();
>
> This may take considerable amount of time on many-cpus machines because
> num_present_cpus() will traverse every word of underlying cpumask
> unconditionally.
>
> We can significantly improve on it for many real cases if stop traversing
> the mask as soon as we count present cpus to any number greater than 1:
>
> 	if (num_present_cpus_gt(1))
> 		do_something();
>
> To implement this idea, the series adds bitmap_weight_{eq,gt,le}
> functions together with corresponding wrappers in cpumask and nodemask.

Having slept on it, I have more structured thoughts:

First, I like substituting bitmap_empty/full where possible. I think that
change stands on its own, so it could be split out and sent as is.

I don't like the proposed API very much. One problem is that it hides
the comparison operator and makes call sites less readable:

	bitmap_weight(...) > N

becomes:

	bitmap_weight_gt(..., N)

and:

	bitmap_weight(...) <= N

becomes:

	bitmap_weight_lt(..., N+1)

or:

	!bitmap_weight_gt(..., N)

I'd rather see something resembling the memcmp() API, which is well known
enough to be easier to grasp. For the above examples:

	bitmap_weight_cmp(..., N) > 0
	bitmap_weight_cmp(..., N) <= 0
	...

This would also make the implementation easier, since the code would not
have to be copied and pasted three times. It could also use a simple
optimization to reduce code size:

#include <linux/bitmap.h>
#include <linux/overflow.h>

/* Returns 1 if weight > cmp, 0 if weight == cmp, -1 if weight < cmp. */
int bitmap_weight_cmp(const unsigned long *bits, size_t nbits, size_t cmp)
{
	/* Subtract each full word's weight; underflow means weight > cmp. */
	for (size_t i = 0; i < nbits / BITS_PER_LONG; ++i, ++bits)
		if (check_sub_overflow(cmp, (size_t)hweight_long(*bits), &cmp))
			return 1;

	/* Count only the valid low bits of a trailing partial word. */
	nbits %= BITS_PER_LONG;
	if (nbits && check_sub_overflow(cmp,
			(size_t)hweight_long(*bits & BITMAP_LAST_WORD_MASK(nbits)),
			&cmp))
		return 1;

	/* Whatever is left of cmp is the shortfall of the weight. */
	return cmp ? -1 : 0;
}

Best Regards
Michał Mirosław
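
For anyone who wants to try the memcmp()-style comparison outside the kernel tree, here is a minimal userspace sketch. It is only an illustration under stated assumptions: it relies on the GCC/Clang builtins __builtin_popcountl() and __builtin_sub_overflow() in place of the kernel helpers, and the names bitmap_weight_cmp_sketch and LONG_BITS are hypothetical, not part of the series.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define LONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Compare the number of set bits in the first nbits of a bitmap with cmp.
 * Returns 1 if the weight is greater, 0 if equal, -1 if smaller.
 * Stops early as soon as the running count exceeds cmp.
 */
static int bitmap_weight_cmp_sketch(const unsigned long *bits,
				    size_t nbits, size_t cmp)
{
	size_t i;

	/* Full words: unsigned underflow of cmp means weight > cmp. */
	for (i = 0; i < nbits / LONG_BITS; ++i, ++bits)
		if (__builtin_sub_overflow(cmp,
				(size_t)__builtin_popcountl(*bits), &cmp))
			return 1;

	/* Trailing partial word: nbits is now in 0..LONG_BITS-1. */
	nbits %= LONG_BITS;
	if (nbits) {
		unsigned long mask = (1UL << nbits) - 1;

		if (__builtin_sub_overflow(cmp,
				(size_t)__builtin_popcountl(*bits & mask), &cmp))
			return 1;
	}

	/* Remaining cmp is how far the weight fell short. */
	return cmp ? -1 : 0;
}
```

With such a helper, a call site keeps its comparison operator visible: "weight > 1" is written as bitmap_weight_cmp_sketch(mask, nbits, 1) > 0, and "weight <= N" as bitmap_weight_cmp_sketch(mask, nbits, N) <= 0.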