From: Nick Desaulniers
Date: Mon, 1 Aug 2022 10:58:46 -0700
Subject: Re: [PATCH 0/5] lib/find: optimize find_bit() functions
To: Linus Torvalds
Cc: Nathan Chancellor, Linux Kernel Mailing List, clang-built-linux
References: <20220728161208.865420-1-yury.norov@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 28, 2022 at 2:49 PM Linus Torvalds wrote:
>
> On Thu, Jul 28, 2022 at 11:49 AM Linus Torvalds wrote:
> >
> > It builds for me and seems to generate reasonable code, although I
> > notice that clang messes up the "__ffs()" inline asm and forces the
> > source into memory.
>
> I have created a llvm issue for this at
>
>     https://github.com/llvm/llvm-project/issues/56789

Thanks for the report.
I left a response there (in case you have notification emails
from github filtered; following up here).
https://github.com/llvm/llvm-project/issues/56789#issuecomment-1201525395

So it looks like at least 3 things we can clean up:
1. https://github.com/llvm/llvm-project/issues/20571
2. https://github.com/llvm/llvm-project/issues/34191
3. https://github.com/llvm/llvm-project/issues/33216

>
> and while I noticed this while looking at the rather odd code
> generation for the bit finding functions, it seems to be a general
> issue with clang inline asm.
>
> It looks like any instruction that takes a mod/rm input (so a register
> or memory) will always force the thing to be in memory. Which is very
> pointless in itself, but it actually causes some functions to have a
> stack frame that they wouldn't otherwise need or want. So it actually
> has secondary downsides too.
>
> And yes, that particular case could be solved with __builtin_ctzl(),
> which seems to DTRT. But that uses plain bsf, and we seem to really
> want tzcnt ("rep bsf") here, although I didn't check why (the comment
> explicitly says "Undefined if no bit exists", which is the main
> difference between bsf and tzcnt).
>
> I _think_ it's because tzcnt is faster when it exists exactly because
> it always writes the destination, so 'bsf' is actually the inferior
> op, and clang shouldn't generate it.
>
> But the "rm" thing exists elsewhere too, and I just checked - this
> same issue seems to happen with "g" too (ie "any general integer
> input").
>
>              Linus

--
Thanks,
~Nick Desaulniers