Received: by 2002:ab2:60d1:0:b0:1f7:5705:b850 with SMTP id i17csp1468960lqm; Thu, 2 May 2024 16:32:52 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCW1f2sWtdzluhZvnfGgptV41rY+WSmk7U972HFS2GzXYYU94AHHZ91OcMt8m2rSD0JIfrxFGXD7vYFWnj2hhpAVgMYeL/T3t1g01Rd2wQ== X-Google-Smtp-Source: AGHT+IHhYQMneXF6s3HJZvaX+BXjGn5jXGMRjl68vnUJz/7Mj8D7nLq55PnXJPXIq/ddjdjKtX06 X-Received: by 2002:a17:906:358f:b0:a59:871:8f9c with SMTP id o15-20020a170906358f00b00a5908718f9cmr399180ejb.61.1714692772719; Thu, 02 May 2024 16:32:52 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1714692772; cv=pass; d=google.com; s=arc-20160816; b=O/KRA6Z3DiYKbu7v4hgTon4312CkbCgwXNOOagh5jDkswlUwNHnz980QpXi4blPzfO H6oIPTgkb6D+BpnGqXi/Oao5MeBcBaR4tQsv/4t5Ln+x3vCuL45EQUKX5HB7bUwx9es5 8Cw+koMNk0s/iZvf2MsPE+MzaMksrr4HX155x54DqXvK7ncs+vtEGi2gErmmoQNy0hd7 CfuqdZbWZUa3DT/dZCySXX0r+tSzuUfD80mCRVj00RkWSlOoK7hquY91Zfz3tBb1moNb 3tH1IQQqQ2m3IncMn1qq5Ij2c6RoN4Qdlnu4/kDWmgzAqg7kUT8/IaPozhIXGsso/0jY knww== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=xwV1HZVleW01thm7cKmWmN1euQJwSchKow0lguZfgFs=; fh=lWUVuc8MgAq3n2AjrT103xvWFOsCrEZumKbTHi7N/v4=; b=egpx6VdI3Ldrga2vHZVg5YOhCll6cuyqmjIBCtmsMj+wZhE4qaK+TLmU+0OotdgoWD VoQATHLCEvK7oV2d1IJ5ndEFgWnJIqDDuKsMPbgSgAsv3HbGqH6/6/VaSKhwHTRlSEft D2xQwIxtkj2mhhezVn4TClNPEZFD0Tu5eC5Ivh6apdT+H3O4vBtyBSMK1RPrBoXaSlYy fly/p+XnrqOYiG/StLXicjh29eH/G1H6dcLfUnqdhXGb/95zZ5qh81glre16zipCsGlk 8neqKtm5uzjPa+Mni5zqn77MOnEZ4dsNQC0oGNQEfdSIT6M4Y4pkgKeEf4Zmsi0nvt/d JV9g==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=FovkIkSu; arc=pass (i=1 spf=pass spfdomain=gmail.com dkim=pass dkdomain=gmail.com dmarc=pass fromdomain=gmail.com); spf=pass (google.com: domain of linux-kernel+bounces-167066-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-167066-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [147.75.80.249]) by mx.google.com with ESMTPS id w21-20020a1709064a1500b00a5890bcb1d6si229126eju.801.2024.05.02.16.32.52 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 02 May 2024 16:32:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-167066-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) client-ip=147.75.80.249; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=FovkIkSu; arc=pass (i=1 spf=pass spfdomain=gmail.com dkim=pass dkdomain=gmail.com dmarc=pass fromdomain=gmail.com); spf=pass (google.com: domain of linux-kernel+bounces-167066-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-167066-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 753A11F2385C for ; Thu, 2 May 2024 23:32:52 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 33D1920322; Thu, 2 May 2024 23:32:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="FovkIkSu" Received: from mail-yw1-f177.google.com (mail-yw1-f177.google.com [209.85.128.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC62046B80 for ; Thu, 2 May 2024 23:32:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.177 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714692737; cv=none; b=O7crce+CHYmPJoOT2Z+z4h/lVYR+LeDqzBk6WzDEPr3A+W+DVJdoaULGKEwFTEcFoUC1WWGh+yNd/K0/oilZM6PR4qHMfuosV9MiVEHmE3WgAWq6lVm2EdScXh4fN0BsI7fUsD+cgyYyst8jP7v3HMF/apXUH4St0XV3GUmAY/I= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714692737; c=relaxed/simple; bh=JjG9zTh3YcEGevq2mGu1lapQZDi6ixXR4kkaGkiUikM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=TlPE4er0/5wEXKcn1+SlldVeWWW4alcfFhx5IcKv18tFFMLUcN9aYshR6hGQUSN8L8goTanGXw6ZP3SyY7v5FQ/05M6Qc6IOEHzonLRTvi5TdfnM6fKMAET4dYY7BrZ60bUC+2/jKvBR6t1S2ABJM2itdOSpxY+jV1Esy3kvQh0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=FovkIkSu; arc=none smtp.client-ip=209.85.128.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-yw1-f177.google.com with SMTP id 00721157ae682-61bee45d035so38695817b3.1 for ; Thu, 02 May 2024 16:32:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1714692735; x=1715297535; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=xwV1HZVleW01thm7cKmWmN1euQJwSchKow0lguZfgFs=; b=FovkIkSuwpRyRsoZTvFnqnVfc2SiM8x8HEbO/pnTc5uEHyyiAdeMFzSFrLt/e3XgtI 38L4AAT09UtcF9PvA24dzbopMKwSr4UDi7vACrxmU/IJJpdDp0BDmkOWpzEBJspY9I69 5ZvoqCkIN/yfPK953BJwShB1LdLZJvTf8rFBPv1YFVQerkPuY4naIyO5Wh8nI8SnF7dy UuzufOS0/uUPefcA8J584XzfuQxkUBZy7kVWxCHYdxRPNz56ZyAFV+TOvyFsTkjm/z7H ehtFSWM0no8MkMQkC8Q7oEHvTlj2gZpI0a2hg3QGfmNhBasUt0pmCCoCjcf701P/UfyL OotQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714692735; x=1715297535; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=xwV1HZVleW01thm7cKmWmN1euQJwSchKow0lguZfgFs=; b=iTxf9fHuRK2iBKcnCZdsco1t6+2iC/9gxkynqGDk+i9Rb/NR6zSOKh7s+u1ocDrAjz tCYcJaBJjx/228MaI70SkDSkVZ9TzQ8bsTGz9xrAg3kpJi1+epMlXsaMGoSiEVrgteAz h+SzfQPEnKspawXD2G35TPHWcSAJRYOEZe0GOdpmXom810eUILTgI40gderHyXULwQvY VNqfYWNINzntRMw7xxum+3NYv6FZWyrxafReKHE/CxzbctdghCMdHcCjKu2HW2PzMXmk PTnm+Faud2/9Kg21vUkO1k3vJLNxhk7+UhusB8JYcHpfNTIZZb5tZ91o5LL4WKSVFd3f u/7g== X-Gm-Message-State: AOJu0Yw3Izi/iKpnjdoAxxyc0oYFd3LWHiW1O1rPtyH8mpGTIVUvIGXd yhGqYTSZ4wYmaVOZzCD082H3ug5eYnRcQ1Ai3S5fF8zuguWLSZJV/OeK0g== X-Received: by 2002:a81:4882:0:b0:61b:1f0d:838b with SMTP id v124-20020a814882000000b0061b1f0d838bmr1103567ywa.14.1714692734769; Thu, 02 May 2024 16:32:14 -0700 (PDT) Received: from localhost ([69.73.66.55]) by smtp.gmail.com with ESMTPSA id s20-20020a819f14000000b006152af6131dsm420052ywn.119.2024.05.02.16.32.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 02 May 2024 16:32:14 -0700 (PDT) From: Yury Norov To: linux-kernel@vger.kernel.org Cc: Yury Norov , Rasmus Villemoes , Andrew Morton , Kuan-Wei Chiu Subject: [PATCH 3/4] bitops: squeeze even more out of fns() Date: Thu, 2 May 2024 16:32:03 -0700 Message-Id: <20240502233204.2255158-4-yury.norov@gmail.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20240502233204.2255158-1-yury.norov@gmail.com> References: <20240502233204.2255158-1-yury.norov@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit The function clears N-1 first set bits to find the N'th one with: while (word && n--) word &= word - 1; In the worst case, it would take 63 iterations. Instead of linear walk through the set bits, we can do a binary search by using hweight(). This would work even better on platforms supporting hardware-assisted hweight() - pretty much every modern arch. On my Ryzen 9 5900X, the test_fns() benchmark runs binary fns() twice faster, comparing to linear one. The fns8() returns 64 to make sure that in case of no bit found, the return value will be greater than the bit capacity of arguments of all fnsXX() functions up to fns64(). Signed-off-by: Yury Norov --- include/linux/bitops.h | 42 +++++++++++++++++++++++++++++++++++++----- 1 file changed, 37 insertions(+), 5 deletions(-) diff --git a/include/linux/bitops.h b/include/linux/bitops.h index 57ecef354f47..1c4773db56e0 100644 --- a/include/linux/bitops.h +++ b/include/linux/bitops.h @@ -247,17 +247,49 @@ static inline unsigned long __ffs64(u64 word) return __ffs((unsigned long)word); } +static inline unsigned long fns8(u8 word, unsigned int n) +{ + while (word && n--) + word &= word - 1; + + return word ? __ffs(word) : 64; +} + +static inline unsigned long fns16(u16 word, unsigned int n) +{ + unsigned int w = hweight8((u8)word); + + return n < w ? fns8((u8)word, n) : 8 + fns8((u8)(word >> 8), n - w); +} + +static inline unsigned long fns32(u32 word, unsigned int n) +{ + unsigned int w = hweight16((u16)word); + + return n < w ? fns16((u16)word, n) : 16 + fns16((u16)(word >> 16), n - w); +} + +static inline unsigned long fns64(u64 word, unsigned int n) +{ + unsigned int w = hweight32((u32)word); + + return n < w ? fns32((u32)word, n) : 32 + fns32((u32)(word >> 32), n - w); +} + /** * fns - find N'th set bit in a word * @word: The word to search - * @n: Bit to find + * @n: Bit to find, counting from 0 + * + * Returns N'th set bit. If no such bit found, returns >= BITS_PER_LONG */ static inline unsigned long fns(unsigned long word, unsigned int n) { - while (word && n--) - word &= word - 1; - - return word ? __ffs(word) : BITS_PER_LONG; +#if BITS_PER_LONG == 64 + return fns64(word, n); +#else + return fns32(word, n); +#endif } /** -- 2.40.1