From: Vincent Mailhol
To: Borislav Petkov
Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, x86@kernel.org, Peter Zijlstra, Dave Hansen, "H .
    Peter Anvin", Nathan Chancellor, Tom Rix, linux-kernel@vger.kernel.org, llvm@lists.linux.dev, David Howells, Jan Beulich, Christophe Jaillet, Joe Perches, Josh Poimboeuf, Yury Norov, Vincent Mailhol
Subject: [PATCH v8 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions
Date: Wed, 7 Sep 2022 18:09:33 +0900
Message-Id: <20220907090935.919-1-mailhol.vincent@wanadoo.fr>
In-Reply-To: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr>
References: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

The compilers provide builtin expressions equivalent to the ffs(),
__ffs() and ffz() functions of the kernel. The kernel uses optimized
assembly which produces better code than the builtin
functions. However, such assembly code cannot be folded when used
with constant expressions.

This series relies on __builtin_constant_p to select the optimal
solution:

  * use kernel assembly for non-constant expressions

  * use the compiler's __builtin function for constant expressions.


** Statistics **

Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
of __ffs() and ffz() calls (details of the calculation in each patch).


** Changelog **

v7 -> v8:

  * (no changes on code, only commit comment was modified)

  * Rewrite introduction of patch 2/2 to add nuances on the
    defined/undefined behaviors of __builtin_ctzl(0), __ffs(0) and
    ffz(~0UL).

v6 -> v7:

  * (no changes on code, only commit tag was modified)

  * Add Reviewed-by: Yury Norov in both patches

v5 -> v6:

  * Rename variable___ffs() into variable__ffs() (two underscores
    instead of three)

v4 -> v5:

  * (no changes on code, only commit comment was modified)

  * Rewrite the commit log:
    - Use two spaces instead of `| ' to indent code snippets.
    - Do not use `we'.
    - Do not use `this patch' in the commit description. Instead,
      use imperative tone.
    Link: https://lore.kernel.org/all/YvUZVYxbOMcZtR5G@zn.tnic/

v3 -> v4:

  * (no changes on code, only commit comment was modified)

  * Remove note and link to Nick's message in patch 1/2, cf.:
    Link: https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@mail.gmail.com/

  * Add Reviewed-by: Nick Desaulniers tag in patch 2/2.

v2 -> v3:

  * Redacted out the instructions after ret and before next function
    in the assembly output.

  * Added a note and a link to Nick's message on the constant
    propagation missed-optimization in clang:
    Link: https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@mail.gmail.com/

  * Fix copy/paste typo in statistics of patch 1/2. The number of
    occurrences before the patches is 1081 and not 3607 (the
    percentage reduction of 26.7% remains correct)

  * Rename the functions as follows:
    - __variable_ffs() -> variable___ffs()
    - __variable_ffz() -> variable_ffz()

  * Add Reviewed-by: Nick Desaulniers tag in patch 1/2.

Vincent Mailhol (2):
  x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant expressions
  x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant expressions

 arch/x86/include/asm/bitops.h | 64 +++++++++++++++++++++--------------
 1 file changed, 38 insertions(+), 26 deletions(-)

-- 
2.35.1
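
For readers who want to see the shape of the __builtin_constant_p
selection described in the introduction, here is a minimal userspace
sketch. It is not the code from the patches: my_ffs() is a hypothetical
name, and variable_ffs() below is a plain C loop standing in for the
kernel's x86 assembly so that the snippet builds on any architecture.
It assumes GCC or clang, since both builtins are compiler extensions.

```c
/*
 * Stand-in for the hand-written assembly path (assumption: the real
 * kernel helper uses x86 asm; a portable loop is used here instead).
 * Returns the 1-based index of the least significant set bit, or 0
 * when no bit is set, matching the ffs() convention.
 */
static inline int variable_ffs(int x)
{
	unsigned int v = (unsigned int)x;
	int pos = 1;

	if (!v)
		return 0;

	while (!(v & 1u)) {
		v >>= 1;
		pos++;
	}

	return pos;
}

/*
 * Hypothetical wrapper: when the argument is known at compile time,
 * use __builtin_ffs() so the compiler can fold the call into a
 * constant; otherwise fall back to variable_ffs().
 */
#define my_ffs(x) \
	(__builtin_constant_p(x) ? __builtin_ffs(x) : variable_ffs(x))
```

With this wrapper, my_ffs(0x10) can be folded to the constant 5 at
compile time, while a call on a value only known at run time goes
through variable_ffs(). Both paths have to agree on every input,
including 0, for the selection to be transparent to callers.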