From: Vincent MAILHOL
Date: Thu, 1 Sep 2022 19:30:10 +0900
Subject: Re: [PATCH v6 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions
To: Yury Norov
Cc: Borislav Petkov, Nick Desaulniers, Thomas Gleixner, Ingo Molnar,
    x86@kernel.org, Peter Zijlstra, Dave Hansen, "H . Peter Anvin",
    Nathan Chancellor, Tom Rix, linux-kernel@vger.kernel.org,
    llvm@lists.linux.dev, David Howells, Jan Beulich, Christophe Jaillet,
    Joe Perches, Josh Poimboeuf
References: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr>
    <20220831075742.295-1-mailhol.vincent@wanadoo.fr>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu. 1 Sep. 2022 at 12:49, Yury Norov wrote:
> On Wed, Aug 31, 2022 at 01:54:01AM -0700, Yury Norov wrote:
> > On Wed, Aug 31, 2022 at 04:57:40PM +0900, Vincent Mailhol wrote:
> > > The compilers provide some builtin expression equivalent to the ffs(),
> > > __ffs() and ffz() functions of the kernel. The kernel uses optimized
> > > assembly which produces better code than the builtin
> > > functions. However, such assembly code can not be folded when used
> > > with constant expressions.
> > >
> > > This series relies on __builtin_constant_p to select the optimal solution:
> > >
> > >   * use kernel assembly for non constant expressions
> > >
> > >   * use compiler's __builtin function for constant expressions.
> > >
> > >
> > > ** Statistics **
> > >
> > > Patch 1/2 optimizes 26.7% of ffs() calls and patch 2/2 optimizes 27.9%
> > > of __ffs() and ffz() calls (details of the calculation in each patch).
> >
> > Hi Vincent,
> >
> > Can you please add a test for this? We've recently added a very similar
> > test_bitmap_const_eval() in lib/test_bitmap.c.
> >
> > dc34d5036692c ("lib: test_bitmap: add compile-time optimization/evaluations
> > assertions")
> >
> > Would be nice to have something like this for ffs() and ffz() in
> > lib/test_bitops.c.
> >
> > Please keep me in loop in case of new versions.

Hi Yury,

My patch only takes care of the x86 architecture. Assuming some other
architectures are not optimized yet, adding such a test might break
some builds. I am fine with adding the test; however, I will not write
patches for the other architectures because I do not have the
environment to compile and test them.

Does it still make sense to add the test before fixing all the
architectures?

> Also, what about fls? Is there any difference with ffs/ffz wrt compile
> time optimizations? If not, would be great if the series will take
> care of it too.

Agree. fls() and fls64() can use __builtin_clz() and
__builtin_clzll(). However, those two functions are a bit less
trivial. I wanted to have this first series approved before working
on *fls*().

Yours sincerely,
Vincent Mailhol
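
For illustration only, here is a minimal user-space sketch of the selection described in the quoted cover letter, plus one possible shape for fls(). It is not the code from the series: the names variable_ffs(), sketch_ffs() and sketch_fls() are hypothetical, and the loop in variable_ffs() is only a portable stand-in for the kernel's x86 assembly path. The fls() sketch assumes a leading-zero-count builtin (__builtin_clz()), which is undefined for a zero argument and so needs an explicit check.

```c
#include <assert.h>	/* only needed for the checks in the usage note */

/*
 * Hypothetical stand-in for the kernel's hand-written x86 assembly path.
 * Returns the 1-based index of the least significant set bit,
 * or 0 if x == 0.
 */
static inline int variable_ffs(int x)
{
	unsigned int v = (unsigned int)x;
	int pos = 1;

	if (!v)
		return 0;
	while (!(v & 1u)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Constant arguments go to the compiler builtin so the call can be
 * folded at compile time; everything else goes to the stand-in
 * "optimized" path. __builtin_constant_p() does not evaluate x.
 */
#define sketch_ffs(x) \
	(__builtin_constant_p(x) ? __builtin_ffs(x) : variable_ffs(x))

/*
 * fls(): 1-based index of the most significant set bit, 0 if x == 0.
 * __builtin_clz() is undefined for 0, hence the explicit test.
 */
static inline int sketch_fls(unsigned int x)
{
	return x ? (int)(sizeof(x) * 8) - __builtin_clz(x) : 0;
}
```

With these definitions, assert(sketch_ffs(8) == 4) exercises the constant branch, while passing a value the compiler cannot see through (for example a volatile int holding 40) exercises the stand-in branch and also yields 4. The zero check in sketch_fls() is one reason the fls() family is less trivial than ffs().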