From: Vincent MAILHOL
Date: Mon, 23 May 2022 18:22:54 +0900
Subject: Re: [PATCH v4 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86@kernel.org
Cc: Dave Hansen, "H . Peter Anvin", Nathan Chancellor, Tom Rix, linux-kernel@vger.kernel.org, llvm@lists.linux.dev, David Howells, Jan Beulich, Christophe JAILLET, Nick Desaulniers
References: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr> <20220512011855.1189653-1-mailhol.vincent@wanadoo.fr>
In-Reply-To: <20220512011855.1189653-1-mailhol.vincent@wanadoo.fr>

On Thu. 12 May 2022 at 10:20, Vincent Mailhol wrote:
> The compilers provide some builtin expressions equivalent to the ffs(),
> __ffs() and ffz() functions of the kernel. The kernel uses optimized
> assembly which produces better code than the builtin
> functions. However, such assembly code cannot be optimized when used
> on constant expressions.
>
> This series relies on __builtin_constant_p to select the optimal solution:
>
>   * use kernel assembly for non-constant expressions
>
>   * use compiler's __builtin function for constant expressions.
>
> I also think that fls() and fls64() can be optimized in a similar
> way, using __builtin_clz() and __builtin_clzll(), but it is a bit less
> trivial so I want to focus on this series first. If it gets accepted, I
> will then work on those two additional functions.
>
>
> ** Statistics **
>
> Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
> of __ffs() and ffz() calls (details of the calculation in each patch).
>
>
> ** Changelog **
>
> v3 -> v4:
>
>   * (no changes on code, only commit comment was modified)
>
>   * Remove note and link to Nick's message in patch 1/2, cf.:
>     https://lore.kernel.org/all/CAKwvOdnnDaiJcV1gr9vV+ya-jWxx7+2KJNTDThyFctVDOgt9zQ@mail.gmail.com/
>
>   * Add Reviewed-by: Nick Desaulniers tag in patch 2/2.
>
>
> v2 -> v3:
>
>   * Redacted out the instructions after ret and before next function
>     in the assembly output.
>
>   * Added a note and a link to Nick's message on the constant
>     propagation missed-optimization in clang:
>     https://lore.kernel.org/all/CAKwvOdnH_gYv4qRN9pKY7jNTQK95xNeH1w1KZJJmvCkh8xJLBg@mail.gmail.com/
>
>   * Fix copy/paste typo in statistics of patch 1/2. The number of
>     occurrences before the patches is 1081 and not 3607 (percentage
>     reduction of 26.7% remains correct)
>
>   * Rename the functions as follows:
>     - __varible_ffs() -> variable___ffs()
>     - __variable_ffz() -> variable_ffz()
>
>   * Add Reviewed-by: Nick Desaulniers tag in patch 1/2.
>
> Vincent Mailhol (2):
>   x86/asm/bitops: ffs: use __builtin_ffs to evaluate constant
>     expressions
>   x86/asm/bitops: __ffs,ffz: use __builtin_ctzl to evaluate constant
>     expressions

Hi Thomas, Ingo and Borislav,

Is there any chance you could pick up these two patches during this
week's merge window?

https://lore.kernel.org/all/20220512011855.1189653-2-mailhol.vincent@wanadoo.fr/
https://lore.kernel.org/all/20220512011855.1189653-3-mailhol.vincent@wanadoo.fr/

Thank you!


Yours sincerely,
Vincent Mailhol
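
P.S. For anyone joining the thread here, the selection described in the
cover letter can be sketched roughly as below. This is only a user-space
illustration under stated assumptions, not the code from the patches:
sketch_ffs() and sketch_ffs_selfcheck() are hypothetical names, and the
body of variable_ffs() is a plain C loop standing in for the kernel's
inline assembly so that the example builds anywhere GCC or clang is
available.

```c
#include <assert.h>

/*
 * Stand-in for the non-constant path. The kernel uses hand-written
 * x86 assembly here; this portable loop is an assumption made only
 * so that the sketch compiles outside the kernel tree.
 * Returns the 1-based index of the least significant set bit, or 0
 * when no bit is set (same convention as ffs()).
 */
static inline int variable_ffs(int x)
{
	unsigned int v = (unsigned int)x;
	int pos = 1;

	if (!v)
		return 0;
	while (!(v & 1u)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Hypothetical wrapper: when the compiler can prove that x is a
 * constant expression, use __builtin_ffs() so the result can be
 * folded at compile time; otherwise take the variable path.
 */
#define sketch_ffs(x) \
	(__builtin_constant_p(x) ? __builtin_ffs(x) : variable_ffs(x))

/* Small self-check showing that both paths agree. */
static inline void sketch_ffs_selfcheck(void)
{
	volatile int runtime_value = 0x50;	/* not a constant expression */

	assert(sketch_ffs(0x10) == 5);		/* constant path */
	assert(sketch_ffs(runtime_value) == 5);	/* variable path */
	assert(sketch_ffs(0) == 0);
}
```

The point of the split is that the builtin can be evaluated by the
compiler when its argument is constant, whereas an assembly block is
opaque to the optimizer and is always emitted as is.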