From: Nick Desaulniers
Date: Thu, 10 Nov 2022 11:06:10 -0800
Subject: Re: [PATCH v1 1/2] x86/asm/bitops: Replace __fls() by its generic builtin implementation
To: Vincent Mailhol
Cc: x86@kernel.org, Ingo Molnar, Borislav Petkov, Thomas Gleixner,
    linux-kernel@vger.kernel.org, Yury Norov, llvm@lists.linux.dev,
    Borislav Petkov
In-Reply-To: <20221106095106.849154-2-mailhol.vincent@wanadoo.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Nov 6, 2022 at 1:51 AM Vincent Mailhol wrote:
>
> Below snippet:
>
>   #include <linux/bitops.h>
>
>   unsigned int foo(unsigned long word)
>   {
>           return __fls(word);
>   }
>
> produces this on GCC 12.1.0:
>
>   0000000000000000 <foo>:
>      0:   f3 0f 1e fa             endbr64
>      4:   e8 00 00 00 00          call   9
>      9:   53                      push   %rbx
>      a:   48 89 fb                mov    %rdi,%rbx
>      d:   e8 00 00 00 00          call   12
>     12:   48 0f bd c3             bsr    %rbx,%rax
>     16:   5b                      pop    %rbx
>     17:   31 ff                   xor    %edi,%edi
>     19:   e9 00 00 00 00          jmp    1e
>
> and that on clang 14.0.6:
>
>   0000000000000000 <foo>:
>      0:   f3 0f 1e fa             endbr64
>      4:   e8 00 00 00 00          call   9
>      9:   53                      push   %rbx
>      a:   50                      push   %rax
>      b:   48 89 fb                mov    %rdi,%rbx
>      e:   e8 00 00 00 00          call   13
>     13:   48 89 1c 24             mov    %rbx,(%rsp)
>     17:   48 0f bd 04 24          bsr    (%rsp),%rax
>     1c:   48 83 c4 08             add    $0x8,%rsp
>     20:   5b                      pop    %rbx
>     21:   c3                      ret
>
> The implementation from [1] produces the exact same code on GCC and
> below code on clang:
>
>   0000000000000000 <foo>:
>      0:   f3 0f 1e fa             endbr64
>      4:   e8 00 00 00 00          call   9
>      9:   53                      push   %rbx
>      a:   48 89 fb                mov    %rdi,%rbx
>      d:   e8 00 00 00 00          call   12
>     12:   48 0f bd c3             bsr    %rbx,%rax
>     16:   5b                      pop    %rbx
>     17:   c3                      ret
>
> The builtin implementation is better for two reasons:
>
>   1/ it saves two instructions on clang (a push and a stack pointer
>      decrement) because of a useless tentative to save rax.
>
>   2/ when used on constant expressions, the compiler is only able to
>      fold the builtin version (c.f. [2]).
>
> For those two reasons, replace the assembly implementation by its
> builtin counterpart.
>
> [1] https://elixir.bootlin.com/linux/v6.0/source/include/asm-generic/bitops/builtin-__fls.h
>
> [2] commit 146034fed6ee ("x86/asm/bitops: Use __builtin_ffs() to evaluate constant expressions")
>
> CC: Borislav Petkov
> CC: Nick Desaulniers
> CC: Yury Norov
> Signed-off-by: Vincent Mailhol

LGTM; thanks for the patch!
Reviewed-by: Nick Desaulniers

> ---
>  arch/x86/include/asm/bitops.h | 14 +-------------
>  1 file changed, 1 insertion(+), 13 deletions(-)
>
> diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
> index 2edf68475fec..a31453d7686d 100644
> --- a/arch/x86/include/asm/bitops.h
> +++ b/arch/x86/include/asm/bitops.h
> @@ -285,19 +285,7 @@ static __always_inline unsigned long variable_ffz(unsigned long word)
>           (unsigned long)__builtin_ctzl(~word) :        \
>           variable_ffz(word))
>
> -/*
> - * __fls: find last set bit in word
> - * @word: The word to search
> - *
> - * Undefined if no set bit exists, so code should check against 0 first.
> - */
> -static __always_inline unsigned long __fls(unsigned long word)
> -{
> -        asm("bsr %1,%0"
> -            : "=r" (word)
> -            : "rm" (word));
> -        return word;
> -}
> +#include <asm-generic/bitops/builtin-__fls.h>
>
>  #undef ADDR
>
> --
> 2.37.4
>

--
Thanks,
~Nick Desaulniers
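
For readers following along outside the kernel tree, here is a minimal
user-space sketch of the builtin approach the patch switches to. It assumes
GCC or clang (for __builtin_clzl()). The names sketch_fls(),
SKETCH_BITS_PER_LONG and sketch_fls_selfcheck() are hypothetical and chosen
only for illustration; this is not a copy of the header referenced as [1].

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical user-space stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Find the index of the most significant set bit in word.
 * Undefined for word == 0, because __builtin_clzl(0) is undefined,
 * so callers should check against 0 first.
 */
static inline unsigned long sketch_fls(unsigned long word)
{
	/* Leading-zero count, mirrored to give a bit index from the LSB. */
	return SKETCH_BITS_PER_LONG - 1 - __builtin_clzl(word);
}

/* Small self-check: returns 1 if a few known values come out right. */
static inline int sketch_fls_selfcheck(void)
{
	assert(sketch_fls(1UL) == 0);    /* only bit 0 set */
	assert(sketch_fls(0x80UL) == 7); /* only bit 7 set */
	assert(sketch_fls(0xffUL) == 7); /* highest set bit is bit 7 */
	return 1;
}
```

Because the body is an ordinary builtin call rather than an opaque asm
statement, the compiler is free to evaluate something like
sketch_fls(0x80UL) at build time once the call is inlined, which is the
folding behaviour point 2/ of the commit message refers to.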