From: Vincent MAILHOL
Date: Wed, 7 Sep 2022 16:49:55 +0900
Subject: Re: [PATCH v7 0/2] x86/asm/bitops: optimize ff{s,z} functions for constant expressions
To: Nick Desaulniers
Cc: Borislav Petkov, Thomas Gleixner, Ingo Molnar, x86@kernel.org, Peter Zijlstra, Dave Hansen, "H . Peter Anvin", Nathan Chancellor, Tom Rix, linux-kernel@vger.kernel.org, llvm@lists.linux.dev, David Howells, Jan Beulich, Christophe Jaillet, Joe Perches, Josh Poimboeuf, Yury Norov
References: <20220511160319.1045812-1-mailhol.vincent@wanadoo.fr> <20220905003732.752-1-mailhol.vincent@wanadoo.fr>

On Wed. 7 Sep. 2022 at 16:04, Nick Desaulniers wrote:
> On Tue, Sep 6, 2022 at 11:26 AM Nick Desaulniers
> wrote:
> >
> > On Sun, Sep 4, 2022 at 5:38 PM Vincent Mailhol
> > wrote:
> > >
> > > The compilers provide some builtin expression equivalent to the ffs(),
> > > __ffs() and ffz() functions of the kernel. The kernel uses optimized
> > > assembly which produces better code than the builtin
> > > functions. However, such assembly code can not be folded when used
> > > with constant expressions.
> >
> > Another tact which may help additional sources other than just the
> > Linux kernel; it seems that compilers should be able to fold this.

Initially, I thought that you were suggesting folding the asm code
(which doesn't seem trivial at all).

> > Vincent, if you're interested in making such an optimization in LLVM,
> > we'd welcome the contribution, and I'd be happy to show you where to
> > make such changes within LLVM; please let me know off thread.
>
> Oh right, it already does.
> https://github.com/llvm/llvm-project/blob/ea953b9d9a65c202985a79f1f95da115829baef6/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp#L2635
> I see what's happening. Constant propagation sinks constants into a
> specialized version of ffs when there's only 1 callsite in a given
> translation unit (or multiple call sites with the same constant).
> Then dead argument elimination removes the argument, so libcall
> optimization thinks this isn't the ffs(int) you're looking for, and
> skips it.

Isn't it a wise decision to skip it? How should the optimization be
able to decide that the redefined ffs() is equivalent to
__builtin_ffs()?

More generally, if I write my own foo() which shadows a
__builtin_foo() function, the two functions might do something totally
different, and I would be pissed off if the compiler decided to
constant-fold my foo().

Dummy example:

===================
#include <string.h>

char *s;

/*
 * ffs: fast forward string
 * @i: how many bytes to move forward
 *
 * Move the global s pointer forward by @i or strlen(s) (whichever is
 * smaller).
 *
 * Return: how many bytes we moved forward.
 */
int ffs(int i)
{
	int len = strlen(s);
	int forward = i < len ? i : len;

	s += forward;
	return forward;
}
===================

How would you instruct the compiler to constant-fold the kernel's
ffs() but not the dummy ffs() above?

> Nice.
> https://github.com/llvm/llvm-project/issues/57599
> I guess ffs() is usually forward declared in strings.h, so we don't
> have such a static inline definition available to constant
> prop/specialize in normal C code.
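
For context, here is a minimal sketch of the idea described in the cover
letter quoted at the top: dispatch constant expressions to the builtin and
everything else to the hand-written version. This is not the actual patch.
The names ffs_sketch() and variable_ffs_sketch() are hypothetical, and the
plain C loop is only a portable stand-in for the kernel's asm. It assumes a
GCC-compatible compiler that provides __builtin_constant_p() and
__builtin_ffs().

```c
/*
 * Hypothetical stand-in for the kernel's asm-based ffs(): a plain C loop,
 * used here only so that the sketch is portable.
 *
 * Return: 1-based index of the least significant set bit, 0 if x is 0.
 */
static inline int variable_ffs_sketch(int x)
{
	unsigned int v = (unsigned int)x;
	int pos = 1;

	if (!v)
		return 0;

	/* Shift right until the lowest set bit reaches bit 0. */
	while (!(v & 1U)) {
		v >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Hypothetical dispatch macro: constant expressions go to the builtin
 * (which the compiler can fold), everything else goes to the variable
 * version above.
 */
#define ffs_sketch(x) \
	(__builtin_constant_p(x) ? __builtin_ffs(x) : variable_ffs_sketch(x))
```

With this shape, the choice is made explicitly in the source through
__builtin_constant_p(), so the compiler never has to guess whether a
user-defined ffs() is equivalent to __builtin_ffs().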