Date: Sun, 9 Oct 2022 19:59:20 +0200
From: Willy Tarreau
To: Alexey Dobriyan
Cc: lkp@intel.com, linux-kselftest@vger.kernel.org,
    linux-kernel@vger.kernel.org, "Paul E. McKenney"
Subject: Re: tools/nolibc: fix missing strlen() definition and infinite loop with gcc-12
Message-ID: <20221009175920.GA28685@1wt.eu>

Hi Alexey,

On Sun, Oct 09, 2022 at 06:45:49PM +0300, Alexey Dobriyan wrote:
> Willy Tarreau wrote:
> > +#if defined(__GNUC__) && (__GNUC__ >= 12)
> > +__attribute__((optimize("no-tree-loop-distribute-patterns")))
> > +#endif
> >  static __attribute__((unused))
> > -size_t nolibc_strlen(const char *str
>
> I'd suggest to use asm("") in the loop body. It worked in the past
> to prevent folding division loop back into division instruction.

Ah excellent idea! I initially thought about using asm() to hide a
variable's provenance but didn't like it much because it undermines
code optimization. But you're right, with an empty asm() statement
alone, the loop will not look like an strlen() anymore. Just tried
and it works like a charm, I'll resend a patch so that we can get rid
of the ugly ifdef.

> Or switch to
>
> size_t f(const char *s)
> {
> 	const char *s0 = s;
> 	while (*s++)
> 		;
> 	return s - s0 - 1;
> }
>
> which compiles to 1 branch, not 2.

In fact it depends. In the original code that approach was one of
those I had considered, but it doesn't always result in better code
due to the prologue and epilogue being larger.
It's only better at -O1 and -O2, but not -Os, and once you add asm()
into it, only -O1 remains better:

  $ nm --size len.o|grep O|rev|sort|rev
  000000000000001a T len_while_O1
  0000000000000022 T len_while_asm_O1
  0000000000000026 T len_for_O1
  000000000000001a T len_while_O2
  000000000000002b T len_while_asm_O2
  0000000000000021 T len_for_O2
  0000000000000013 T len_while_Os
  0000000000000015 T len_while_asm_Os
  000000000000000e T len_for_Os

This observation seems consistent for me on x86_64, i386, arm and arm64.

> But of course they could recognise this pattern too.

Yes definitely, hence the need for asm() there as well to complete the
comparison.

Thanks for the suggestion, I'll send a v2 shortly.

Willy
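
For reference, a minimal sketch of the two loop shapes compared above,
each with an empty asm statement in the loop body as suggested. This is
only an illustration, not the code from the actual patch: the names
len_for_asm and len_while_asm are assumptions modelled on the symbol
names in the nm output, and __asm__ is the GNU C spelling of asm(),
used here so the sketch also builds with strict -std= options.

```c
#include <stddef.h>

/* Hypothetical name: index-based loop. The empty asm statement is opaque
 * to the optimizer; per the discussion above, this is meant to keep gcc
 * from recognising the loop as strlen() and calling strlen() itself. */
size_t len_for_asm(const char *str)
{
	size_t len;

	for (len = 0; str[len]; len++)
		__asm__("");
	return len;
}

/* Hypothetical name: pointer-based loop from the quoted suggestion,
 * with the same empty asm statement added to the body. */
size_t len_while_asm(const char *s)
{
	const char *s0 = s;

	while (*s++)
		__asm__("");
	/* s was advanced one past the terminating NUL, hence the -1 */
	return (size_t)(s - s0 - 1);
}
```

Building such a file at -O1, -O2 and -Os and running nm --size on the
objects, as above, would be one way to repeat the size comparison; the
numbers will depend on the compiler version and target.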