From: Vincent Mailhol
To: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org, Peter Zijlstra
Cc: Dave Hansen, "H. Peter Anvin", Nathan Chancellor, Tom Rix, linux-kernel@vger.kernel.org, llvm@lists.linux.dev, David Howells, Jan Beulich, Christophe JAILLET, Joe Perches, Josh Poimboeuf, Vincent Mailhol
Subject: [RESEND PATCH v4 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions
Date: Sat, 25 Jun 2022 16:26:43 +0900
Message-Id: <20220625072645.251828-1-mailhol.vincent@wanadoo.fr>
In-Reply-To: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr>
References: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

The compilers provide builtin expressions equivalent to the ffs(),
__ffs() and ffz() functions of the kernel. The kernel uses optimized
assembly which produces better code than the builtin
functions. However, such assembly code cannot be optimized when used
on constant expressions.

This series relies on __builtin_constant_p to select the optimal solution:

  * use kernel assembly for non constant expressions

  * use compiler's __builtin function for constant expressions.


** Statistics **

Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
of __ffs() and ffz() calls (details of the calculation in each patch).


** Changelog **

v3 -> v4:

  * (no changes on code, only commit comment was modified)

  * Remove note and link to Nick's message in patch 1/2, cf.:
    https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@mail.gmail.com/

  * Add Reviewed-by: Nick Desaulniers tag in patch 2/2.

v2 -> v3:

  * Redacted out the instructions after ret and before next function
    in the assembly output.

  * Added a note and a link to Nick's message on the constant
    propagation missed-optimization in clang:
    https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@mail.gmail.com/

  * Fix copy/paste typo in statistics of patch 1/2. Number of
    occurrences before patches is 1081 and not 3607 (percentage
    reduction of 26.7% remains correct)

  * Rename the functions as follows:
    - __variable_ffs() -> variable___ffs()
    - __variable_ffz() -> variable_ffz()

  * Add Reviewed-by: Nick Desaulniers tag in patch 1/2.

Vincent Mailhol (2):
  x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant expressions
  x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant expressions

 arch/x86/include/asm/bitops.h | 64 +++++++++++++++++++++--------------
 1 file changed, 38 insertions(+), 26 deletions(-)

-- 
2.35.1
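
For readers who want to see the selection described in this cover letter
in isolation, here is a minimal user-space sketch. It is not the code of
the patches. The sketch_*() macro names are made up for illustration, and
the variable_*() helpers are portable loops standing in for the kernel's
x86 assembly (only the names variable___ffs() and variable_ffz() come
from the changelog above; variable_ffs() and all bodies are assumptions).
It also assumes a compiler providing __builtin_constant_p, __builtin_ffs
and __builtin_ctzl, such as GCC or clang.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-ins for the kernel's hand-written x86 assembly.
 * Portable loops are used here only so the sketch builds anywhere.
 */
static inline int variable_ffs(int x)
{
	unsigned int v = (unsigned int)x;
	int pos = 1;

	if (!v)
		return 0;		/* ffs(0) is defined as 0 */
	while (!(v & 1u)) {
		v >>= 1;
		pos++;
	}
	return pos;			/* 1-based index of lowest set bit */
}

static inline unsigned long variable___ffs(unsigned long word)
{
	unsigned long pos = 0;

	/*
	 * __ffs(0) is undefined in the kernel; this stand-in stops at the
	 * word width instead of looping forever.
	 */
	while (pos < sizeof(word) * CHAR_BIT && !(word & 1ul)) {
		word >>= 1;
		pos++;
	}
	return pos;			/* 0-based index of lowest set bit */
}

static inline unsigned long variable_ffz(unsigned long word)
{
	/* The first zero bit of word is the first set bit of ~word. */
	return variable___ffs(~word);
}

/* Constant argument: compiler builtin. Otherwise: variable_*() helper. */
#define sketch_ffs(x) \
	(__builtin_constant_p(x) ? __builtin_ffs(x) : variable_ffs(x))
#define sketch___ffs(word) \
	(__builtin_constant_p(word) ? \
	 (unsigned long)__builtin_ctzl(word) : variable___ffs(word))
#define sketch_ffz(word) \
	(__builtin_constant_p(word) ? \
	 (unsigned long)__builtin_ctzl(~(unsigned long)(word)) : \
	 variable_ffz(word))

/* Small usage example exercising the constant and the variable paths. */
static inline void sketch_selfcheck(int runtime_val)
{
	assert(sketch_ffs(0x10) == 5);
	assert(sketch_ffs(runtime_val) == variable_ffs(runtime_val));
	assert(sketch___ffs(0x8ul) == 3);
	assert(sketch_ffz(0xful) == 4);
}
```

With a literal argument, the compiler can fold the builtin call into an
immediate value at build time. With an argument it cannot prove
constant, only the variable_*() branch is evaluated, which in the kernel
would be the assembly version.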