From: Vincent Mailhol
To: Borislav Petkov
Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, x86@kernel.org,
    Peter Zijlstra, Dave Hansen, "H . Peter Anvin", Nathan Chancellor,
    Tom Rix, linux-kernel@vger.kernel.org, llvm@lists.linux.dev,
    David Howells, Jan Beulich, Christophe Jaillet, Joe Perches,
    Josh Poimboeuf, Yury Norov, Vincent Mailhol
Subject: [PATCH v7 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions
Date: Mon, 5 Sep 2022 09:37:30 +0900
Message-Id: <20220905003732.752-1-mailhol.vincent@wanadoo.fr>
In-Reply-To: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr>
References: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

The compilers provide builtin functions equivalent to the ffs(),
__ffs() and ffz() functions of the kernel. The kernel uses optimized
assembly which produces better code than the builtin functions.
However, such assembly code cannot be folded when used with constant
expressions.

This series relies on __builtin_constant_p to select the optimal
solution:

  * use kernel assembly for non-constant expressions

  * use the compiler's __builtin function for constant expressions.


** Statistics **

Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
of __ffs() and ffz() calls (details of the calculation in each patch).


** Changelog **

v6 -> v7:

  * (no changes on code, only commit tag was modified)

  * Add Reviewed-by: Yury Norov in both patches

v5 -> v6:

  * Rename variable___ffs() into variable__ffs() (two underscores
    instead of three)

v4 -> v5:

  * (no changes on code, only commit comment was modified)

  * Rewrite the commit log:
    - Use two spaces instead of `| ' to indent code snippets.
    - Do not use `we'.
    - Do not use `this patch' in the commit description. Instead,
      use imperative tone.
    Link: https://lore.kernel.org/all/YvUZVYxbOMcZtR5G@zn.tnic/

v3 -> v4:

  * (no changes on code, only commit comment was modified)

  * Remove note and link to Nick's message in patch 1/2, cf.:
    Link: https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@mail.gmail.com/

  * Add Reviewed-by: Nick Desaulniers tag in patch 2/2.

v2 -> v3:

  * Redacted out the instructions after ret and before next function
    in the assembly output.

  * Added a note and a link to Nick's message on the constant
    propagation missed-optimization in clang:
    Link: https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@mail.gmail.com/

  * Fix copy/paste typo in statistics of patch 1/2. The number of
    occurrences before the patches is 1081, not 3607 (the percentage
    reduction of 26.7% remains correct)

  * Rename the functions as follows:
    - __variable_ffs() -> variable___ffs()
    - __variable_ffz() -> variable_ffz()

  * Add Reviewed-by: Nick Desaulniers tag in patch 1/2.

Vincent Mailhol (2):
  x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant expressions
  x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant expressions

 arch/x86/include/asm/bitops.h | 64 +++++++++++++++++++++--------------
 1 file changed, 38 insertions(+), 26 deletions(-)

-- 
2.35.1
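
For readers who want to see the selection idea from the cover letter in
isolation, here is a minimal userspace sketch. It is not the code from
the patches: the names sketch_ffs() and sketch_variable_ffs() are
hypothetical, and a portable loop stands in for the kernel's x86
assembly so that the example builds with GCC or clang on any
architecture.

```c
#include <assert.h>

/*
 * Stand-in for the hand-written assembly path. ASSUMPTION: the kernel
 * uses x86 assembly here; this portable loop is only a placeholder.
 * Returns the 1-based index of the least significant set bit, or 0
 * when no bit is set (same convention as ffs()).
 */
static inline int sketch_variable_ffs(int x)
{
	unsigned int v = (unsigned int)x;
	int pos = 1;

	if (!v)
		return 0;
	while (!(v & 1u)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Dispatch on __builtin_constant_p(): constant arguments go to the
 * compiler builtin, which can be folded at compile time; everything
 * else goes to the "assembly" path. __builtin_constant_p() does not
 * evaluate its argument, so x is evaluated exactly once.
 */
#define sketch_ffs(x)						\
	(__builtin_constant_p(x) ? __builtin_ffs(x)		\
				 : sketch_variable_ffs(x))

/* Small self-check covering both paths. */
static inline void sketch_ffs_selfcheck(void)
{
	volatile int runtime_value = 0x60;	/* not a constant expression */

	assert(sketch_ffs(0x10) == 5);		/* builtin path */
	assert(sketch_ffs(runtime_value) == 6);	/* fallback path */
}
```

With optimization enabled, a call such as sketch_ffs(0x10) is expected
to compile down to the literal result, which is the folding that an
assembly-only implementation prevents.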