Date: Thu, 28 Apr 2016 13:58:44 -0400
From: Rich Felker
To: Geert Uytterhoeven
Cc: George Spelvin, Andrew Morton, Peter Zijlstra, zengzhaoxiu@163.com,
	"David S. Miller", Helge Deller, Ivan Kokshaysky, James Hogan,
	"James E.J. Bottomley", Jonas Bonn, Lennox Wu, Ley Foon Tan, alpha,
	"linux-arm-kernel@lists.infradead.org",
	"linux-kernel@vger.kernel.org", linux-m68k,
	"open list:METAG ARCHITECTURE", Linux MIPS Mailing List,
	Parisc List, Linux-sh list, Russell King, linux, Chen Liqin,
	Matt Turner, Michal Simek, nios2-dev@lists.rocketboards.org,
	Ralf Baechle, Richard Henderson, sparclinux,
	uclinux-h8-devel@lists.sourceforge.jp, Yoshinori Sato,
	zhaoxiu.zeng@gmail.com
Subject: Re: [patch V3] lib: GCD: add binary GCD algorithm
Message-ID: <20160428175843.GZ21636@brightrain.aerifal.cx>
References: <1461843824-19853-1-git-send-email-zengzhaoxiu@163.com>
	<20160428164856.10120.qmail@ns.horizon.com>

On Thu, Apr 28, 2016 at 07:51:06PM +0200, Geert Uytterhoeven wrote:
> On Thu, Apr 28, 2016 at 6:48 PM, George Spelvin wrote:
> > Another few comments:
> >
> > 1. Would ARCH_HAS_FAST_FFS involve fewer changes than CPU_NO_EFFICIENT_FFS?
>
> No, as you want to _disable_ ARCH_HAS_FAST_FFS / _enable_
> CPU_NO_EFFICIENT_FFS as soon as you're enabling support for a
> CPU that doesn't support it.
>
> Logical OR is easier in both the Kconfig and C preprocessor languages
> than logical NAND.
>
> E.g. in Kconfig, a CPU core not supporting it can just select
> CPU_NO_EFFICIENT_FFS.

How does a CPU lack an efficient ffs/ctz anyway? There are all sorts
of ways to implement it without a native insn, some of which are
almost or just as fast as the native insn on cpus that have the
latter. On anything with a fast multiply, the de Bruijn sequence
approach is near-optimal, and otherwise one of the binary-search type
approaches (possibly branchless) can be used. If the compiler doesn't
generate an appropriate one for __builtin_ctz, that's arguably a
compiler bug.

Rich
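
For illustration, the two software fallbacks mentioned above can be
sketched in C roughly as follows. This is only a minimal sketch, not
the kernel's or any compiler's actual implementation: the names
ctz32_debruijn and ctz32_binsearch are hypothetical, the multiplier
0x077CB531 and its lookup table are the widely published 32-bit
de Bruijn constants, and both functions assume a nonzero argument
(the result for 0 is not meaningful).

```c
#include <stdint.h>

/*
 * Lookup table for the 32-bit de Bruijn sequence 0x077CB531.
 * Index: top 5 bits of (isolated lowest set bit * 0x077CB531).
 * Value: position of that bit, i.e. the number of trailing zeros.
 */
static const unsigned char debruijn_ctz32_table[32] = {
     0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
    31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
};

/* Hypothetical name. One AND, one multiply, one shift, one load. */
static unsigned int ctz32_debruijn(uint32_t x)
{
    /* (~x + 1) is the two's complement negation; the AND keeps only
     * the lowest set bit of x. Assumes x != 0. */
    uint32_t lowest = x & (uint32_t)(~x + 1u);

    return debruijn_ctz32_table[(uint32_t)(lowest * 0x077CB531u) >> 27];
}

/* Hypothetical name. Branchless binary search, no multiply needed.
 * Each step tests whether the low half of the remaining window is
 * empty and, if so, shifts it out and adds its width to the count.
 * Assumes x != 0. */
static unsigned int ctz32_binsearch(uint32_t x)
{
    unsigned int n = 0, s;

    s = (unsigned int)((x & 0x0000FFFFu) == 0) << 4; x >>= s; n += s;
    s = (unsigned int)((x & 0x000000FFu) == 0) << 3; x >>= s; n += s;
    s = (unsigned int)((x & 0x0000000Fu) == 0) << 2; x >>= s; n += s;
    s = (unsigned int)((x & 0x00000003u) == 0) << 1; x >>= s; n += s;
    n += (unsigned int)((x & 1u) == 0);

    return n;
}
```

Which of the two is preferable depends on the target: the first wants
a fast multiplier and a cheap table load, while the second needs only
shifts, masks and compares.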