From: Andy Shevchenko
Date: Tue, 25 Oct 2022 22:19:22 +0300
Subject: Re: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if chars match
To: Nathan Moinvaziri
Cc: Andy Shevchenko, "linux-kernel@vger.kernel.org"
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 25, 2022 at 8:53 PM Nathan Moinvaziri wrote:
>
> Hi Andy,
>
> I appreciate your quick feedback!
>
> I have done as you suggested and published my results this time using Google benchmark:
> https://github.com/nmoinvaz/strcasecmp

Thank you for sharing! Looks promising, but may I suggest a few things:

1) have you considered the word-at-a-time use (like strscpy() does)?
2) instead of using tolower() on both sides, have you considered (with
the above in mind) XORing the words and, if the result is not 0,
checking whether it is one of the possible combinations of 0x20? Then,
by excluding the non-letters from the range, you may find the
difference.

So, I think it's a good exercise in bit twiddling.

> After you review it, and if you still think the patch is worthwhile then I can fix the other problems you mentioned for the original patch. If you think it is not worth it, then I understand.

P.S. Avoid top-posting in the Linux kernel mailing lists!

> -----Original Message-----
> From: Andy Shevchenko
> Sent: Tuesday, October 25, 2022 2:04 AM
> To: Nathan Moinvaziri
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if chars match
>
> On Tue, Oct 25, 2022 at 11:00:36AM +0300, Andy Shevchenko wrote:
> > On Tue, Oct 25, 2022 at 4:46 AM Nathan Moinvaziri wrote:
>
> ...
>
> > > When running tests using Quick Benchmark with two matching 256
> > > character strings these changes result in anywhere between ~6-9x speed improvement.
> > >
> > > * We use unsigned char instead of int similar to strncasecmp.
> > > * We only subtract c1 - c2 when they are not equal.
>
> ...
>
> > You tell us that this is more performant, but have not provided the
> > numbers. Can we see those, please?
>
> So, I have read carefully and see the reference to some QuickBenchmark
> I have no idea about. What I meant here is to have numbers provided by
> an (open source) tool (maybe even an in-kernel test case) that anybody
> can run on their machines. You also missed details about how you ran
> it, what data set was used, etc.
>
> > Note, that you basically trash CPU cache lines when characters are not
> > equal, and before doing that you have a branching. I'm unsure that
> > your way is more performant than the original one.

-- 
With Best Regards,
Andy Shevchenko
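
Suggestion 2) above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the patch under discussion and not kernel code: the names sketch_memcasecmp(), byte_casecmp() and is_ascii_alpha() are hypothetical, only ASCII letters are folded, and the function takes an explicit length so that the 8-byte loads never read past the buffers (in-kernel code would presumably rely on the word-at-a-time helpers instead).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if c is an ASCII letter of either case (hypothetical helper). */
static int is_ascii_alpha(unsigned char c)
{
	return (unsigned char)((c | 0x20) - 'a') < 26;
}

/* Compare two bytes ignoring ASCII case; 0 means "same". */
static int byte_casecmp(unsigned char c1, unsigned char c2)
{
	if (c1 == c2)
		return 0;
	/*
	 * Upper and lower case of one letter differ only in bit 0x20.
	 * The letter check excludes pairs such as '@' and '`', which
	 * also XOR to 0x20 but are not letters.
	 */
	if ((c1 ^ c2) == 0x20 && is_ascii_alpha(c1))
		return 0;
	/* Real difference: fold ASCII upper case and subtract. */
	if (c1 >= 'A' && c1 <= 'Z')
		c1 |= 0x20;
	if (c2 >= 'A' && c2 <= 'Z')
		c2 |= 0x20;
	return (int)c1 - (int)c2;
}

/* Case-insensitive compare of two buffers of n bytes (ASCII only). */
int sketch_memcasecmp(const void *p1, const void *p2, size_t n)
{
	const unsigned char *s1 = p1, *s2 = p2;
	size_t i;
	int r;

	while (n >= 8) {
		uint64_t w1, w2;

		/* memcpy() keeps the unaligned loads well defined. */
		memcpy(&w1, s1, 8);
		memcpy(&w2, s2, 8);
		/* XOR == 0: all eight bytes match, no folding needed. */
		if (w1 ^ w2) {
			/* Some byte differs: inspect this word byte by byte. */
			for (i = 0; i < 8; i++) {
				r = byte_casecmp(s1[i], s2[i]);
				if (r)
					return r;
			}
		}
		s1 += 8;
		s2 += 8;
		n -= 8;
	}
	/* Tail of fewer than eight bytes. */
	for (i = 0; i < n; i++) {
		r = byte_casecmp(s1[i], s2[i]);
		if (r)
			return r;
	}
	return 0;
}
```

For example, sketch_memcasecmp("Hello", "hELLO", 5) treats the buffers as equal, while "@" and "`" compare as different even though they XOR to 0x20. Whether this is actually faster than the tolower() loop would still have to be shown with numbers from a reproducible benchmark.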