Date: Sat, 24 Aug 2019 23:11:58 -0700
From: Matthew Wilcox
To: Denis Efremov
Cc: akpm@linux-foundation.org, Akinobu Mita, Jan Kara, linux-kernel@vger.kernel.org, Matthew Wilcox, dm-devel@redhat.com, linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org, Erdem Tumurov, Vladimir Shelekhov
Subject: Re: [PATCH v2] lib/memweight.c: open codes bitmap_weight()
Message-ID: <20190825061158.GC28002@bombadil.infradead.org>
References: <20190821074200.2203-1-efremov@ispras.ru> <20190824100102.1167-1-efremov@ispras.ru>
In-Reply-To: <20190824100102.1167-1-efremov@ispras.ru>

On Sat, Aug 24, 2019 at 01:01:02PM +0300, Denis Efremov wrote:
> This patch open codes the bitmap_weight() call. The direct
> invocation of hweight_long() allows to remove the BUG_ON and
> excessive "longs to bits, bits to longs" conversion.

Honestly, that's not the problem with this function.  Take a look
at https://danluu.com/assembly-intrinsics/ for a _benchmarked_
set of problems with popcnt.

> BUG_ON was required to check that bitmap_weight() will return
> a correct value, i.e. the computed weight will fit the int type
> of the return value.

What?  No.  Look at the _arguments_ of bitmap_weight():

static __always_inline int bitmap_weight(const unsigned long *src, unsigned int nbits)

> With this patch memweight() controls the
> computation directly with size_t type everywhere. Thus, the BUG_ON
> becomes unnecessary.

Why are you bothering?  How are you allocating half a gigabyte of memory?
Why are you calling memweight() on half a gigabyte of memory?
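To put rough numbers on that, here is a small user-space sketch of where the two limits sit. It is only an illustration: the helper names are made up for this mail and are not kernel API, and it assumes a 32-bit int and 8-bit bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Largest buffer, in bytes, whose bit count still fits in the
 * "unsigned int nbits" argument of bitmap_weight().
 * Hypothetical helper, not part of the kernel.
 */
static unsigned long long nbits_limit_bytes(void)
{
	return (unsigned long long)UINT_MAX / CHAR_BIT;
}

/*
 * Smallest whole-longs buffer size, in bytes, at which
 * BUG_ON(longs >= INT_MAX / BITS_PER_LONG) fires.
 * long_bytes stands in for sizeof(long) so both 32-bit and 64-bit
 * layouts can be checked on one machine (hypothetical parameter).
 */
static unsigned long long bug_on_limit_bytes(size_t long_bytes)
{
	assert(long_bytes > 0);

	unsigned long long bits_per_long =
		(unsigned long long)long_bytes * CHAR_BIT;

	return ((unsigned long long)INT_MAX / bits_per_long) * long_bytes;
}
```

Under those assumptions the nbits argument runs out just short of 512 MiB, and the BUG_ON trips just short of 256 MiB for either long width.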

>  	if (longs) {
> -		BUG_ON(longs >= INT_MAX / BITS_PER_LONG);
> -		ret += bitmap_weight((unsigned long *)bitmap,
> -				longs * BITS_PER_LONG);
> +		const unsigned long *bitmap_long =
> +			(const unsigned long *)bitmap;
> +
>  		bytes -= longs * sizeof(long);
> -		bitmap += longs * sizeof(long);
> +		for (; longs > 0; longs--, bitmap_long++)
> +			ret += hweight_long(*bitmap_long);
> +		bitmap = (const unsigned char *)bitmap_long;
>  	}

If you really must change anything, I'd rather see this turned into a loop:

	while (longs) {
		unsigned int nbits;

		if (longs >= INT_MAX / BITS_PER_LONG)
			nbits = INT_MAX + 1;
		else
			nbits = longs * BITS_PER_LONG;

		ret += bitmap_weight((unsigned long *)bitmap, nbits);
		bytes -= nbits / 8;
		bitmap += nbits / 8;
		longs -= nbits / BITS_PER_LONG;
	}

then we only have to use Dan Luu's optimisation in bitmap_weight()
and not in memweight() as well.

Also, why does the trailer do this:

	for (; bytes > 0; bytes--, bitmap++)
		ret += hweight8(*bitmap);

instead of calling hweight_long on *bitmap & mask?
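
For anyone who wants to play with the idea outside the kernel, the chunked loop and a one-shot trailer can be sketched in user space roughly as follows. This is not the lib/memweight.c code: every *_sketch name is a stand-in invented here, __builtin_popcountl assumes GCC or Clang, and the buffer is assumed to be long-aligned (the real function deals with an unaligned head first). The sketch caps each chunk at INT_MAX / BITS_PER_LONG longs so the per-call bit count stays within int, and it handles the trailer by copying the leftover bytes into a zeroed word instead of loading a full long and masking, since a full-width load could run past the end of a user-space buffer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's hweight_long(); relies on a GCC/Clang builtin. */
static unsigned int hweight_long_sketch(unsigned long w)
{
	return (unsigned int)__builtin_popcountl(w);
}

/* Stand-in for bitmap_weight(); nbits must be a multiple of the long width. */
static unsigned int bitmap_weight_sketch(const unsigned long *src,
					 unsigned int nbits)
{
	unsigned int bits_per_long = sizeof(long) * CHAR_BIT;
	unsigned int i, w = 0;

	for (i = 0; i < nbits / bits_per_long; i++)
		w += hweight_long_sketch(src[i]);
	return w;
}

/*
 * Count set bits in a long-aligned buffer, handing at most max_longs
 * words to bitmap_weight_sketch() per call. max_longs is a hypothetical
 * knob so the chunking can be exercised with small buffers.
 */
static size_t memweight_capped_sketch(const void *ptr, size_t bytes,
				      size_t max_longs)
{
	const size_t bits_per_long = sizeof(long) * CHAR_BIT;
	const unsigned char *bitmap = ptr;
	size_t longs = bytes / sizeof(long);
	size_t ret = 0;
	unsigned long last = 0;

	assert(max_longs > 0 && max_longs <= (size_t)INT_MAX / bits_per_long);

	while (longs) {
		size_t chunk = longs < max_longs ? longs : max_longs;

		ret += bitmap_weight_sketch((const unsigned long *)bitmap,
					    (unsigned int)(chunk * bits_per_long));
		bitmap += chunk * sizeof(long);
		bytes -= chunk * sizeof(long);
		longs -= chunk;
	}

	/* Trailer: fewer than sizeof(long) bytes remain; zero-pad and count once. */
	if (bytes)
		memcpy(&last, bitmap, bytes);
	return ret + hweight_long_sketch(last);
}

/* Same thing with the largest chunk that keeps the bit count within int. */
static size_t memweight_sketch(const void *ptr, size_t bytes)
{
	return memweight_capped_sketch(ptr, bytes,
				       (size_t)INT_MAX / (sizeof(long) * CHAR_BIT));
}
```

Because a population count does not care where the bits sit in the word, the zero-padded copy gives the same answer on little-endian and big-endian machines.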