Received: by 2002:a05:6358:45e:b0:b5:b6eb:e1f9 with SMTP id 30csp1077256rwe; Wed, 24 Aug 2022 14:35:44 -0700 (PDT) X-Google-Smtp-Source: AA6agR6az01YQlqai1ErYT3tadAFAe4iUdMnRJQ+HTPJCKaQbA40F21scy1cmu6C0PRbg4U/inEB X-Received: by 2002:a63:6116:0:b0:42b:1631:215f with SMTP id v22-20020a636116000000b0042b1631215fmr646894pgb.273.1661376944182; Wed, 24 Aug 2022 14:35:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1661376944; cv=none; d=google.com; s=arc-20160816; b=sUHElRgEwLo0k5wpd6U+t0jGAImJ2X351glg7zoFbhDtfKpMNKnKfPxA7SGBVpU/Ui CYGGtZZ1vjipYjTsPAms2eufUziGDsWJT2c005AFoMq5q6x9VqzapBFBCpaaKH2utGf+ 9pGpnI4DJ4GHajTy7hB+6ssqBOBBb3zSeRTwPUWlGWw7igZdLtSXzeCUkVKblnCoPgc8 WG+w3K/DjhpLmRUmMe2KNyPRzjDDZyOXmcTZLAwoP/ZAq1cCGcceM0Niu3DdHEmTEEkV +/1Y9Zdy5ljLGgjTGBhOnpr4P+jEP1FdZ31Re19LS5tVhuX85IeB+X32cFBq/r/lwW0K VCEA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=NrHh48Gf19GlQMqAuAe/NxXBbY/CiWw/bN0ZZpZssVk=; b=hZ0QOoinpkfGwOdMFtmsagTodZNqqtKabyfujZD22ZsWOPVqszZhQEG9HG0uxpKBnC /fEn6kBhu05A6G/KfCzkfC8HsQdKBWcECG70I20Z0kCAntBCKx5Bp9qL0uyhFnxuCghZ Eroz98IDcAst8tTRcmi+112n9PEFPe4nn+5sIGDad2s04o2Vm5/TQOv7kwha6Dm66Ziy IgpcgbzleQk+JPd5i6wSqNqmVYzUxktbEDkj0BnkNUT1YS2eJcsdt1/Vd2SaO0SrGdkI oYQEPPwxzf9M/VRLCIsAQhIUGbmwqI2utVlc+MVttPiYgBHYNg73O8rRvg23INdtkClO yf1g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b="Hn+k/ujB"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 21-20020a630015000000b0042a93b62540si11794917pga.68.2022.08.24.14.35.24; Wed, 24 Aug 2022 14:35:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b="Hn+k/ujB"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240737AbiHXV1b (ORCPT + 99 others); Wed, 24 Aug 2022 17:27:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37298 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238348AbiHXV13 (ORCPT ); Wed, 24 Aug 2022 17:27:29 -0400 Received: from mail-qt1-x831.google.com (mail-qt1-x831.google.com [IPv6:2607:f8b0:4864:20::831]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 34A627C76A for ; Wed, 24 Aug 2022 14:27:28 -0700 (PDT) Received: by mail-qt1-x831.google.com with SMTP id l5so13812100qtv.4 for ; Wed, 24 Aug 2022 14:27:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc; bh=NrHh48Gf19GlQMqAuAe/NxXBbY/CiWw/bN0ZZpZssVk=; b=Hn+k/ujBweFlPuihiZhFdJsivHBvjwnCG0HCpC4oENGJoEFa0bCxazOwVm5o5IRsTv usIGaWcI0y1qatxjb7sQVOOENpdPO2pxTSf8VW4hrcPmWo8qBA5aHCyKCO3LIuQ3jy3R GAfUD9ZlLtWLS4aYw+tuzZArUx3FGSakUD3bJ8LrlXt0nsRAaqQdQGAowHbn+uu/IqZL Qe0iUTJVyQTZINW0S4TbDaQxNR/lfjDk/uj/INvsgkTfotxDGPf9uVlFfQXM1nX3LHnV Dt+pwqcQHZNl0z5Wz0CIs54oqU7fZDdaXRs/Iuzh7yRhV6M/QmEv16vuIcNhXXcQPCld AzfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc; bh=NrHh48Gf19GlQMqAuAe/NxXBbY/CiWw/bN0ZZpZssVk=; b=b48EyeRKRtXp8nCkvu5bclLy1iwYG3odEFpvPZLY7GJQBhgaIaSbsy4xDVvlKq6RcY 4Lh6qJHmvHzHJofz5NsuZYXAp/bOWQw0LuOch3QInEWmXZ2nFSyEtQAscJ1n7c8PBiTd 1De8WQVSa2KPbl7aLaLnH0lM5mFH4h1u9Jocwj+xkSa+/mwNdTYQGbBTAwQBZkGsdgnV LHPbcxgk9+aEOq7OsGUtMmjnzhvXIyZ+NT8EhmF7bsm04zU2RUs0pZrN7a39lfg3+Yhf hAwECayoJcH04YHiBPRbpXRfT9niPGk5PhpMQhw85jtBjeJnNMX2Z0fxbPB8DAFHYSzA HJHg== X-Gm-Message-State: ACgBeo2gDdEvRmGQ2pjyf4l2AWcD+nS2zfpUlxJOFrmFM8aO/+tDzDqm 5weBTaK1b+nz+oY9QUCdeqk= X-Received: by 2002:ac8:7dc4:0:b0:343:622d:5fda with SMTP id c4-20020ac87dc4000000b00343622d5fdamr1115818qte.197.1661376447268; Wed, 24 Aug 2022 14:27:27 -0700 (PDT) Received: from localhost ([2601:4c1:c100:2270:5a54:d9d9:c2a4:527e]) by smtp.gmail.com with ESMTPSA id r17-20020ac85e91000000b00344c29bc045sm6078438qtx.25.2022.08.24.14.27.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 24 Aug 2022 14:27:26 -0700 (PDT) Date: Wed, 24 Aug 2022 14:27:26 -0700 From: Yury Norov To: Andy Shevchenko Cc: Linus Torvalds , Linux Kernel Mailing List , Guenter Roeck , Dennis Zhou , Russell King , Catalin Marinas , Andy Shevchenko , Rasmus Villemoes , Alexey Klimov , Kees Cook , Andy Whitcroft Subject: Re: [PATCH v2 3/3] lib/find_bit: optimize find_next_bit() functions Message-ID: References: <20220824012624.2826445-1-yury.norov@gmail.com> <20220824012624.2826445-4-yury.norov@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Aug 24, 2022 at 08:56:02PM +0300, Andy Shevchenko wrote: > On Wed, Aug 24, 2022 at 8:54 PM Andy Shevchenko > wrote: > > On Wed, Aug 24, 2022 at 4:53 PM Yury Norov wrote: > > > On Wed, Aug 24, 2022 at 12:19:05PM +0300, Andy Shevchenko wrote: > > > > On Wed, Aug 24, 2022 at 4:56 AM Yury Norov wrote: > > ... > > > > > > +#define FIND_NEXT_BIT(EXPRESSION, size, start) \ > > > > > +({ \ > > > > > + unsigned long mask, idx, tmp, sz = (size), __start = (start); \ > > > > > + \ > > > > > + if (unlikely(__start >= sz)) \ > > > > > + goto out; \ > > > > > + \ > > > > > + mask = word_op(BITMAP_FIRST_WORD_MASK(__start)); \ > > > > > + idx = __start / BITS_PER_LONG; \ > > > > > + \ > > > > > + for (tmp = (EXPRESSION) & mask; !tmp; tmp = (EXPRESSION)) { \ > > > > > > > > for (unsigned long tmp ...; > > > > But hey, why not loop over idx (which probably should be named as > > > > offset) > > > > > > Offset in structure, index in array, isn't? > > > > > > > as I proposed in the first patch? You will drop a lot of > > > > divisions / multiplications, no? > > > > > > Those divisions and multiplications are optimized away, and > > > what you suggested blows up the EXPRESSION. > > > > > > I tried like this: > > > mask = word_op(BITMAP_FIRST_WORD_MASK(__start)); > > > idx = __start / BITS_PER_LONG; > > > tmp = (EXPRESSION); > > > > > > while (1) { > > > if (tmp) { > > > sz = min(idx * BITS_PER_LONG + __ffs(word_op(tmp)), sz); > > > break; > > > } > > > > > > if (++idx > sz) > > > break; > > > > > > tmp = (EXPRESSION); > > > } > > > > > > And it generated the same code, but looks less expressive to me. > > > If you have some elegant approach in mind - can you please share > > > it, and how the generated code looks? > > > > for (unsigned long idx = 0; idx < sz; idx++) { > > Of source 0 should be changed to whatever start you have there. > > > unsigned long tmp; > > > > tmp = (EXPRESSION); > > if (tmp) { > > ... > > } > > } > > > > No? No. For the first iteration, the tmp can't be calculated inside the loop (my example above is wrong) because we need to clear first bits: mask = BITMAP_FIRST_WORD_MASK(__start); idx = __start / BITS_PER_LONG; tmp = (EXPRESSION) & mask; // First fetch is here while (1) { if (tmp) { // Evaluate here sz = min(idx * BITS_PER_LONG + __ffs(tmp), sz); break; } if (++idx > sz) // Increment here break; tmp = (EXPRESSION); // Other fetches here } Trying to move iterator increment inside the for-loop, like you suggested would break the sequence - common-case word fetch will happen before the idx++.