From: Vincent Mailhol
To: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov
Cc: Dave Hansen, x86@kernel.org, "H .
Peter Anvin", Nathan Chancellor, Tom Rix, linux-kernel@vger.kernel.org, llvm@lists.linux.dev, David Howells, Jan Beulich, Vincent Mailhol
Subject: [PATCH v2 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions
Date: Thu, 12 May 2022 01:03:17 +0900
Message-Id: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

The compilers provide builtin functions equivalent to the kernel's
ffs(), __ffs() and ffz() functions. The kernel uses optimized assembly
which produces better code than the builtin functions. However, such
assembly code cannot be folded by the compiler when the argument is a
constant expression.

This series relies on __builtin_constant_p to select the optimal
solution:

  * use kernel assembly for non-constant expressions

  * use the compiler's __builtin function for constant expressions.

I also think that fls() and fls64() can be optimized in a similar way,
using __builtin_clz() and __builtin_clzll(), but it is a bit less
trivial so I want to focus on this series first. If it gets accepted, I
will then work on those two additional functions.


** Statistics **

Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
of __ffs() and ffz() calls (details of the calculation in each patch).


** Changelog **

v1 -> v2:

  * Use the ORC unwinder for the produced assembly code in patch 1.

  * Rename the functions as follows:
    - __ffs_asm() -> variable_ffs()
    - __ffs_asm_not_zero() -> __variable_ffs()
    - ffz_asm() -> variable_ffz()

  * Fit #define ffs(x) in a single line.

  * Correct the statistics for ffs() in patch 1 and add the statistics
    for __ffs() and ffz() in patch 2.


Vincent Mailhol (2):
  x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant expressions
  x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant expressions

 arch/x86/include/asm/bitops.h | 64 +++++++++++++++++++++--------------
 1 file changed, 38 insertions(+), 26 deletions(-)

-- 
2.35.1