From: David Laight
To: 'Ivan Orlov', "paul.walmsley@sifive.com", "palmer@dabbelt.com", "aou@eecs.berkeley.edu"
CC: "conor.dooley@microchip.com", "ajones@ventanamicro.com", "samuel@sholland.org", "alexghiti@rivosinc.com", "linux-riscv@lists.infradead.org", "linux-kernel@vger.kernel.org", "skhan@linuxfoundation.org"
Subject: RE: [PATCH] riscv: lib: Optimize 'strlen' function
Date: Sun, 17 Dec 2023 17:00:48 +0000
Message-ID: <86d3947bce1f49c395224998e7d65dc2@AcuMS.aculab.com>
References: <20231213154530.1970216-1-ivan.orlov0322@gmail.com>
In-Reply-To: <20231213154530.1970216-1-ivan.orlov0322@gmail.com>

From: Ivan Orlov
> Sent: 13 December 2023 15:46
>
> The current non-ZBB implementation of 'strlen' function iterates the
> memory bytewise, looking for a zero byte. It could be optimized to use
> the wordwise iteration instead, so we will process 4/8 bytes of memory
> at a time.
...
> 1. If the address is unaligned, iterate SZREG - (address % SZREG) bytes
> to align it.

An alternative is to mask the address and 'or' in non-zero bytes
into the first word - might be faster.

...
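For illustration, the masking alternative above might look roughly like the following in portable C. This is only a sketch and not the code from the patch: the names strlen_masked() and zero_mask() are made up here, little-endian byte order is assumed, and it relies on an aligned word load that may touch bytes outside the string (the load never crosses a word boundary, but strict ISO C does not bless it).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  (~0UL / 0xff)	/* 0x01 in every byte */
#define HIGHS (ONES * 0x80)	/* 0x80 in every byte */

/*
 * Non-zero iff some byte of w is zero. Bits above the first zero byte
 * may be spurious, but the lowest set bit always marks the first zero.
 */
static unsigned long zero_mask(unsigned long w)
{
	return (w - ONES) & ~w & HIGHS;
}

/* Hypothetical helper: word-at-a-time strlen without a byte-wise prologue. */
static size_t strlen_masked(const char *s)
{
	size_t off = (uintptr_t)s % sizeof(unsigned long);
	const char *p = s - off;	/* mask the address down to a word boundary */
	unsigned long w, m;
	size_t i = 0;

	memcpy(&w, p, sizeof(w));
	/* Little-endian assumption: the bytes before 's' are the low-order ones. */
	w |= (1UL << (8 * off)) - 1;	/* 'or' in non-zero bytes */

	while (!(m = zero_mask(w))) {
		p += sizeof(w);
		memcpy(&w, p, sizeof(w));
	}

	/* Walk up to the lowest flagged byte: that is the terminating NUL. */
	while (!(m & 0x80)) {
		m >>= 8;
		i++;
	}
	return (size_t)((p + i) - s);
}
```

The point of the 'or' is that the main loop then starts on an aligned word straight away, so there is no separate loop for the first SZREG - (address % SZREG) bytes.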
> Here you can find the benchmarking results for the VisionFive2 board
> comparing the old and new implementations of the strlen function.
>
> Size: 1 (+-0), mean_old: 673, mean_new: 666
> Size: 2 (+-0), mean_old: 672, mean_new: 676
> Size: 4 (+-0), mean_old: 685, mean_new: 659
> Size: 8 (+-0), mean_old: 682, mean_new: 673
> Size: 16 (+-0), mean_old: 718, mean_new: 694
...

Is that 32bit or 64bit?

The word-at-a-time strlen() is typically not worth it for 32bit.

I'd also guess that pretty much all the calls in-kernel are short.
You might try counting as:
	histogram[ilog2(strlen_result)]++
and seeing what it shows for some workload.
I bet you (a beer if I see you!) that you won't see many over 1k.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
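A userspace sketch of the counting suggested above, in case it helps. Everything here is illustrative: counted_strlen() and ilog2_sz() are invented names (the kernel has its own ilog2()), and because ilog2(0) is undefined this version simply puts empty strings into bucket 0 together with length 1.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define NBUCKETS (sizeof(size_t) * 8)

/* histogram[k] counts results with floor(log2(len)) == k. */
static unsigned long histogram[NBUCKETS];

/* floor(log2(n)) for n > 0; stand-in for the kernel's ilog2(). */
static unsigned int ilog2_sz(size_t n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/* Hypothetical wrapper: strlen() that also records the length bucket. */
static size_t counted_strlen(const char *s)
{
	size_t len = strlen(s);

	/* Assumption: length 0 shares bucket 0 with length 1. */
	histogram[len ? ilog2_sz(len) : 0]++;
	return len;
}

/* Print the non-empty buckets, one per line. */
static void dump_histogram(void)
{
	size_t k;

	for (k = 0; k < NBUCKETS; k++)
		if (histogram[k])
			printf("2^%zu: %lu\n", k, histogram[k]);
}
```

Bucket 10 and above then correspond to strings of 1k or longer, which is the range the bet is about.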