Message-Id: <348c2d3f19ffcff8abe50d52513f989c4581d000.1603375524.git.christophe.leroy@csgroup.eu>
From: Christophe Leroy
Subject: [PATCH] powerpc/bitops: Fix possible undefined behaviour with fls() and fls64()
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, jakub@redhat.com, segher@kernel.crashing.org
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Date: Thu, 22 Oct 2020 14:05:46 +0000 (UTC)
X-Mailing-List: linux-kernel@vger.kernel.org

fls() and fls64() are using __builtin_clz() and __builtin_clzll().
On powerpc, those builtins trivially use the cntlzw and cntlzd
instructions.

Although those instructions provide the expected result with
input argument 0, __builtin_clz() and __builtin_clzll() are
documented as undefined for value 0.

The easiest fix would be to use the fls() and fls64() functions
defined in include/asm-generic/bitops/builtin-fls.h and
include/asm-generic/bitops/fls64.h, but the GCC output is not optimal:

00000388 :
 388:   2c 03 00 00     cmpwi   r3,0
 38c:   41 82 00 10     beq     39c
 390:   7c 63 00 34     cntlzw  r3,r3
 394:   20 63 00 20     subfic  r3,r3,32
 398:   4e 80 00 20     blr
 39c:   38 60 00 00     li      r3,0
 3a0:   4e 80 00 20     blr

000003b0 :
 3b0:   2c 03 00 00     cmpwi   r3,0
 3b4:   40 82 00 1c     bne     3d0
 3b8:   2f 84 00 00     cmpwi   cr7,r4,0
 3bc:   38 60 00 00     li      r3,0
 3c0:   4d 9e 00 20     beqlr   cr7
 3c4:   7c 83 00 34     cntlzw  r3,r4
 3c8:   20 63 00 20     subfic  r3,r3,32
 3cc:   4e 80 00 20     blr
 3d0:   7c 63 00 34     cntlzw  r3,r3
 3d4:   20 63 00 40     subfic  r3,r3,64
 3d8:   4e 80 00 20     blr

When the input of fls(x) is a constant, just check x for nullity and
return either 0 or 32 - __builtin_clz(x). Otherwise, use the cntlzw
instruction directly.

For fls64() on PPC64, do the same but with __builtin_clzll() and the
cntlzd instruction. On PPC32, let's take the generic fls64() which
will use our fls().
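For readers without a powerpc toolchain at hand, the zero handling described above can be sketched in portable C. This is only an illustration, not the code in this patch: the names sketch_fls() and sketch_fls64() are hypothetical, a GCC or Clang compiler is assumed for __builtin_clz(), and unsigned int is assumed to be 32 bits wide. It guards the builtin explicitly instead of using inline asm, and builds the 64-bit variant from two 32-bit calls in the way the generic fls64() is described above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption of this sketch: unsigned int is exactly 32 bits wide. */
static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/*
 * Hypothetical helper, not the kernel's fls().
 * __builtin_clz(0) is documented as undefined, so zero is handled
 * before the builtin is ever called.
 */
static inline int sketch_fls(unsigned int x)
{
	return x ? 32 - __builtin_clz(x) : 0;
}

/*
 * Hypothetical helper, not the kernel's fls64().
 * Splits the value into two 32-bit halves and reuses sketch_fls(),
 * so the undefined zero case never reaches the builtin.
 */
static inline int sketch_fls64(uint64_t x)
{
	uint32_t hi = (uint32_t)(x >> 32);

	if (hi)
		return sketch_fls(hi) + 32;
	return sketch_fls((uint32_t)x);
}
```

With these definitions, sketch_fls(0) and sketch_fls64(0) both return 0, and sketch_fls64(1ULL << 32) returns 33. The explicit zero test is what costs the extra compare and branch in the first listing. The patch avoids it for non-constant input by relying on the cntlzw/cntlzd result for 0.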

The result is as expected:

00000388 :
 388:   7c 63 00 34     cntlzw  r3,r3
 38c:   20 63 00 20     subfic  r3,r3,32
 390:   4e 80 00 20     blr

000003a0 :
 3a0:   2c 03 00 00     cmpwi   r3,0
 3a4:   40 82 00 10     bne     3b4
 3a8:   7c 83 00 34     cntlzw  r3,r4
 3ac:   20 63 00 20     subfic  r3,r3,32
 3b0:   4e 80 00 20     blr
 3b4:   7c 63 00 34     cntlzw  r3,r3
 3b8:   20 63 00 40     subfic  r3,r3,64
 3bc:   4e 80 00 20     blr

Fixes: 2fcff790dcb4 ("powerpc: Use builtin functions for fls()/__fls()/fls64()")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy
---
 arch/powerpc/include/asm/bitops.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/bitops.h b/arch/powerpc/include/asm/bitops.h
index 4a4d3afd5340..299ab33505a6 100644
--- a/arch/powerpc/include/asm/bitops.h
+++ b/arch/powerpc/include/asm/bitops.h
@@ -216,15 +216,34 @@ static inline void arch___clear_bit_unlock(int nr, volatile unsigned long *addr)
  */
 static inline int fls(unsigned int x)
 {
-	return 32 - __builtin_clz(x);
+	int lz;
+
+	if (__builtin_constant_p(x))
+		return x ? 32 - __builtin_clz(x) : 0;
+	asm("cntlzw %0,%1" : "=r" (lz) : "r" (x));
+	return 32 - lz;
 }
 
 #include
 
+/*
+ * 64-bit can do this using one cntlzd (count leading zeroes doubleword)
+ * instruction; for 32-bit we use the generic version, which does two
+ * 32-bit fls calls.
+ */
+#ifdef CONFIG_PPC64
 static inline int fls64(__u64 x)
 {
-	return 64 - __builtin_clzll(x);
+	int lz;
+
+	if (__builtin_constant_p(x))
+		return x ? 64 - __builtin_clzll(x) : 0;
+	asm("cntlzd %0,%1" : "=r" (lz) : "r" (x));
+	return 64 - lz;
 }
+#else
+#include
+#endif
 
 #ifdef CONFIG_PPC64
 unsigned int __arch_hweight8(unsigned int w);
-- 
2.25.0