From: Vincent Mailhol
To: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov
Cc: Dave Hansen, x86@kernel.org, "H. Peter Anvin", Nathan Chancellor,
    Tom Rix, linux-kernel@vger.kernel.org, llvm@lists.linux.dev,
    David Howells, Jan Beulich, Christophe JAILLET, Vincent Mailhol
Subject: [PATCH v4 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions
Date: Thu, 12 May 2022 10:18:53 +0900
Message-Id: <20220512011855.1189653-1-mailhol.vincent@wanadoo.fr>
In-Reply-To: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

Compilers provide builtin functions equivalent to the kernel's ffs(),
__ffs() and ffz() functions. The kernel uses optimized assembly instead,
which produces better code than the builtin functions. However, the
compiler cannot evaluate such assembly code at build time, so calls on
constant expressions are not folded into a constant.

This series relies on __builtin_constant_p to select the optimal
solution:

  * use the kernel assembly for non-constant expressions

  * use the compiler's __builtin functions for constant expressions.

I also think that fls() and fls64() can be optimized in a similar way,
using __builtin_clz() and __builtin_clzll(), but it is a bit less
trivial, so I want to focus on this series first. If it gets accepted, I
will then work on those two additional functions.


** Statistics **

Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
of __ffs() and ffz() calls (details of the calculation are in each
patch).

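For readers who do not have the patches at hand, the selection described above can be sketched roughly as follows. This is only an illustration, not the code from the series: the names sketch_ffs() and sketch_variable_ffs() are hypothetical, and a plain C loop stands in for the kernel's assembly so that the example builds on any architecture. It assumes GCC or clang, since it relies on __builtin_constant_p() and __builtin_ffs().

```c
/* Sketch only: needs GCC or clang for the __builtin_* functions. */

/*
 * Hypothetical stand-in for the kernel's hand-written assembly.
 * Returns the 1-based index of the least significant set bit,
 * or 0 if no bit is set (same convention as ffs()).
 */
static inline int sketch_variable_ffs(int x)
{
	unsigned int v = (unsigned int)x;
	int pos = 1;

	if (!v)
		return 0;
	while (!(v & 1u)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Constant argument: use __builtin_ffs() so the compiler can fold the
 * whole expression at build time.
 * Any other argument: fall back to the hand-written implementation.
 */
#define sketch_ffs(x) \
	(__builtin_constant_p(x) ? __builtin_ffs(x) : sketch_variable_ffs(x))
```

With this shape, a call such as sketch_ffs(0x10) can be reduced to a constant by the compiler, while a call on a runtime value still goes through the hand-written path. Both paths must of course return the same result for every input.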

** Changelog **

v3 -> v4:

  * No changes to the code; only the commit messages were modified.

  * Remove the note and the link to Nick's message in patch 1/2, cf.:
    https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@mail.gmail.com/

  * Add the Reviewed-by: Nick Desaulniers tag in patch 2/2.

v2 -> v3:

  * Redact the instructions after ret and before the next function in
    the assembly output.

  * Add a note and a link to Nick's message on the constant
    propagation missed-optimization in clang:
    https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@mail.gmail.com/

  * Fix a copy/paste typo in the statistics of patch 1/2. The number
    of occurrences before the patches is 1081, not 3607 (the
    percentage reduction of 26.7% remains correct).

  * Rename the functions as follows:
    - __variable_ffs() -> variable___ffs()
    - __variable_ffz() -> variable_ffz()

  * Add the Reviewed-by: Nick Desaulniers tag in patch 1/2.

Vincent Mailhol (2):
  x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant expressions
  x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant expressions

 arch/x86/include/asm/bitops.h | 64 +++++++++++++++++++++--------------
 1 file changed, 38 insertions(+), 26 deletions(-)

-- 
2.35.1