Date: Mon, 24 Jul 2023 22:04:34 -0700
From: Yury Norov
To: Andy Shevchenko
Cc: Alexander Potapenko, catalin.marinas@arm.com, will@kernel.org, pcc@google.com, andreyknvl@gmail.com, linux@rasmusvillemoes.dk, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org, Arnd Bergmann
Subject: Re: [PATCH v4 1/5] lib/bitmap: add bitmap_{set,get}_value()
References: <20230720173956.3674987-1-glider@google.com> <20230720173956.3674987-2-glider@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 24, 2023 at 11:36:36AM +0300, Andy Shevchenko wrote:
> On Sat, Jul 22, 2023 at 06:57:23PM -0700, Yury Norov wrote:
> > On Thu, Jul 20, 2023 at 07:39:52PM +0200, Alexander Potapenko wrote:
> >
> > > +	map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> >
> > 'GENMASK(nbits - 1, 0) << offset' looks really silly.
>
> But you followed the thread to get a clue why it's written in this form, right?

Yes, I did. But I don't expect everyone looking at kernel code to spend
time digging up the discussions that explain why it ended up this way. So
it would at least be good to add a comment there.

> ...
>
> > With all that I think the implementation should look something like
> > this:
>
> I would go this way if and only if the code generation on main architectures
> with both GCC and clang is better.
>
> And maybe even some performance tests need to be provided.

For the following implementation:

void my_bitmap_write(unsigned long *map, unsigned long value,
		     unsigned long start, unsigned long nbits)
{
	unsigned long w, end;

	if (unlikely(nbits == 0))
		return;

	value &= GENMASK(nbits - 1, 0);

	map += BIT_WORD(start);
	start %= BITS_PER_LONG;
	end = start + nbits - 1;

	/* First word: clear [start, end], or everything from start up if we spill over. */
	w = *map & (end < BITS_PER_LONG ? ~GENMASK(end, start)
					: BITMAP_LAST_WORD_MASK(start));
	*map = w | (value << start);

	if (end < BITS_PER_LONG)
		return;

	/*
	 * Second word: keep the bits above the written range. This needs
	 * BITMAP_FIRST_WORD_MASK(); BITMAP_LAST_WORD_MASK() would keep the
	 * low bits being overwritten and clear the ones to preserve.
	 */
	w = *++map & BITMAP_FIRST_WORD_MASK(end + 1 - BITS_PER_LONG);
	*map = w | (value >> (BITS_PER_LONG - start));
}

This is the bloat-o-meter output:

$ scripts/bloat-o-meter lib/test_bitmap.o.orig lib/test_bitmap.o
add/remove: 8/0 grow/shrink: 1/0 up/down: 2851/0 (2851)
Function                                     old     new   delta
test_bitmap_init                            3846    5484   +1638
test_bitmap_write_perf                         -     401    +401
bitmap_write                                   -     271    +271
my_bitmap_write                                -     248    +248
bitmap_read                                    -     229    +229
__pfx_test_bitmap_write_perf                   -      16     +16
__pfx_my_bitmap_write                          -      16     +16
__pfx_bitmap_write                             -      16     +16
__pfx_bitmap_read                              -      16     +16
Total: Before=36964, After=39815, chg +7.71%

And this is the performance test:

	for (cnt = 0; cnt < 5; cnt++) {
		time = ktime_get();
		for (nbits = 1; nbits <= BITS_PER_LONG; nbits++) {
			for (i = 0; i < 1000; i++) {
				if (i + nbits > 1000)
					break;
				bitmap_write(bmap, val, i, nbits);
			}
		}
		time = ktime_get() - time;
		pr_err("bitmap_write:\t%llu\t", time);

		time = ktime_get();
		for (nbits = 1; nbits <= BITS_PER_LONG; nbits++) {
			for (i = 0; i < 1000; i++) {
				if (i + nbits > 1000)
					break;
				my_bitmap_write(bmap, val, i, nbits);
			}
		}
		time = ktime_get() - time;
		pr_cont("%llu\n", time);
	}

Which on x86_64/kvm with GCC gives:

                                           Orig    My
[    1.630731] test_bitmap: bitmap_write:  299092  252764
[    1.631584] test_bitmap: bitmap_write:  299522  252554
[    1.632429] test_bitmap: bitmap_write:  299171  258665
[    1.633280] test_bitmap: bitmap_write:  299241  252794
[    1.634133] test_bitmap: bitmap_write:  306716  252934

So, it's a ~15% difference in performance and ~8% in size (248 vs. 271
bytes for the function itself).

I don't insist on my implementation, but I think we should experiment more
with code generation.

Thanks,
Yury
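
Since the timing loop above only measures speed, a correctness check is a useful companion. What follows is a minimal userspace sketch, not kernel code: BITS_PER_LONG, BIT_WORD and LOW_MASK are local stand-ins for the kernel macros, and sketch_bitmap_write(), ref_write() and self_check() are hypothetical names introduced only for this illustration. It assumes nbits is in 1..BITS_PER_LONG and compares a two-word masked write, in which the second word keeps every bit above the written range, against a bit-by-bit reference.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel macros (assumptions, not kernel headers). */
#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))
#define BIT_WORD(nr)  ((nr) / BITS_PER_LONG)
#define LOW_MASK(n)   (~0UL >> (BITS_PER_LONG - (n)))	/* n in 1..BITS_PER_LONG */
#define NWORDS        4

/* Hypothetical name: write the low nbits of value at bit offset start. */
static void sketch_bitmap_write(unsigned long *map, unsigned long value,
				unsigned long start, unsigned long nbits)
{
	unsigned long off, space;

	if (nbits == 0)
		return;

	value &= LOW_MASK(nbits);
	map += BIT_WORD(start);
	off = start % BITS_PER_LONG;
	space = BITS_PER_LONG - off;	/* bits left in the first word */

	if (nbits <= space) {
		/* Fits in one word: clear the target range, then set it. */
		*map = (*map & ~(LOW_MASK(nbits) << off)) | (value << off);
		return;
	}

	/* First word: keep the low off bits (off >= 1 here). */
	*map = (*map & LOW_MASK(off)) | (value << off);
	/* Second word: keep everything above the spilled bits. */
	map++;
	*map = (*map & ~LOW_MASK(nbits - space)) | (value >> space);
}

/* Bit-by-bit reference: slow, but easy to trust. */
static void ref_write(unsigned long *map, unsigned long value,
		      unsigned long start, unsigned long nbits)
{
	for (unsigned long i = 0; i < nbits; i++) {
		unsigned long bit = start + i;
		unsigned long m = 1UL << (bit % BITS_PER_LONG);

		if ((value >> i) & 1)
			map[BIT_WORD(bit)] |= m;
		else
			map[BIT_WORD(bit)] &= ~m;
	}
}

/* Returns the number of mismatching words over all tested combinations. */
static int self_check(void)
{
	const unsigned long vals[] = { 0UL, ~0UL, ~0UL / 3, ~0UL / 5 };
	const unsigned long fills[] = { 0UL, ~0UL };
	int bad = 0;

	for (size_t f = 0; f < 2; f++)
	for (size_t v = 0; v < 4; v++)
	for (unsigned long nbits = 1; nbits <= BITS_PER_LONG; nbits++)
	for (unsigned long start = 0; start + nbits <= NWORDS * BITS_PER_LONG; start++) {
		unsigned long a[NWORDS], b[NWORDS];

		for (size_t k = 0; k < NWORDS; k++)
			a[k] = b[k] = fills[f];
		sketch_bitmap_write(a, vals[v], start, nbits);
		ref_write(b, vals[v], start, nbits);
		for (size_t k = 0; k < NWORDS; k++)
			bad += (a[k] != b[k]);
	}
	return bad;
}
```

Calling self_check() from a small main() and checking that it returns 0 exercises every start offset and width on both all-zero and all-one backgrounds. The all-one background is the case that exposes a wrong mask on the second word, because bits that should be preserved get cleared.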