From: David Laight
To: 'Willy Tarreau', Alexey Dobriyan
CC: "lkp@intel.com", "linux-kselftest@vger.kernel.org", "linux-kernel@vger.kernel.org", "Paul E.
 McKenney"
Subject: RE: tools/nolibc: fix missing strlen() definition and infinite loop with gcc-12
Date: Mon, 10 Oct 2022 10:03:53 +0000
Message-ID: <9e16965f1d494084981eaa90d73ca80e@AcuMS.aculab.com>
References: <20221009175920.GA28685@1wt.eu> <20221009183604.GA29069@1wt.eu>
In-Reply-To: <20221009183604.GA29069@1wt.eu>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Willy Tarreau
> Sent: 09 October 2022 19:36
...
> By the way, just for the sake of completeness, the one that consistently
> gives me a better output is this one:
>
>   size_t strlen(const char *str)
>   {
>           const char *s0 = str--;
>
>           while (*++str)
>                   ;
>           return str - s0;
>   }
>
> Which gives me this:
>
>
>   0000000000000000 :
>      0:   48 8d 47 ff             lea    -0x1(%rdi),%rax
>      4:   48 ff c0                inc    %rax
>      7:   80 38 00                cmpb   $0x0,(%rax)
>      a:   75 f8                   jne    4
>      c:   48 29 f8                sub    %rdi,%rax
>      f:   c3                      ret
>
> But this is totally ruined by the addition of asm() in the loop. However
> I suspect that the construct is difficult to match against a real strlen()
> since it starts on an extra character, thus placing the asm() statement
> before the loop could durably preserve it. It does work here (the code
> remains the exact same one), but for how long, that's the question.
> Maybe we can revisit the various loop-based functions in the future with this in
> mind.

clang wilfully and persistently generates:

strlen:                                 # @strlen
        movq    $-1, %rax
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        cmpb    $0, 1(%rdi,%rax)
        leaq    1(%rax), %rax
        jne     .LBB0_1
        retq

But feed the C for that into gcc and it generates a 'jmp strlen'
at everything above -O1.
I suspect that might run with fewer clocks/byte than the code above.

Sometimes I hate some compiler pessimisations.
Substituting a call to strlen() is typical.
strlen() is almost certainly optimised for long strings.
If the string is short the coded loop will be faster.
The same is true (and probably more so) for memcpy.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
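
For reference, a rough C sketch of the two loop shapes discussed above.
The function names are hypothetical and are not taken from nolibc.
strlen_idx() is one plausible reading of "the C for that": an index that
starts at -1 and is bumped before each load, as in the clang listing.
strlen_barrier() assumes the asm() mentioned in the quoted mail is an empty
asm statement inside the loop body (a GCC/Clang extension). Whether a given
compiler version still replaces either loop with a call to strlen() has to
be checked in the generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name. The index starts at (size_t)-1 and wraps to 0 on the
 * first pre-increment, mirroring "movq $-1, %rax" followed by a load from
 * 1(%rdi,%rax). Unsigned wrap-around is well defined in C. */
size_t strlen_idx(const char *str)
{
	size_t i = (size_t)-1;

	while (str[++i])
		;
	return i;
}

/* Hypothetical name. The empty asm statement in the loop body is intended
 * to stop the compiler from recognising the loop as strlen(); this is an
 * assumption about how the asm() in the thread is used. */
size_t strlen_barrier(const char *str)
{
	size_t len;

	for (len = 0; str[len]; len++)
		__asm__ __volatile__("");
	return len;
}

/* Small self-check: both variants must agree on short strings. */
int strlen_sketch_selfcheck(void)
{
	assert(strlen_idx("") == 0);
	assert(strlen_idx("abc") == 3);
	assert(strlen_barrier("") == 0);
	assert(strlen_barrier("abc") == 3);
	return 0;
}
```

Compiling this at -O2 and looking at the disassembly (for example with
objdump -d) should show whether the loop survived or was turned into a jump
to the library strlen().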