From: Andy Shevchenko
Date: Thu, 27 Oct 2022 09:31:13 +0300
Subject: Re: [PATCH] lib/string.c: Improve strcasecmp speed by not lowering if chars match
To: Nathan Moinvaziri
Cc: Andy Shevchenko, linux-kernel@vger.kernel.org

On Thu, Oct 27, 2022 at 6:29 AM Nathan Moinvaziri wrote:
> On 10/25/2022 12:19 PM, Andy Shevchenko wrote:
> > Looks promising, but may I suggest a few things:
> > 1) have you considered the word-at-a-time use (like strscpy() does)?
>
> Only briefly at the beginning of the function to check for an identical
> comparison and the added check hurt performance for strings that were
> not identical.
>
> On 10/25/2022 12:19 PM, Andy Shevchenko wrote:
>
> > 2) instead of using tolower() on both sides, have you considered
> > (with the above in mind) to use XOR over words and if they are not 0,
> > check if the result is one of possible combinations of 0x20 and then
> > by excluding the non-letters from the range you may find the
> > difference?
>
> I'm not sure what you mean about the possible combinations of the space
> character. I have not investigated this method.

'a' xor 'A' == 0x20 (the same holds for all the letters). That's why we
have a specific _tolower() in vsprintf.c.

> According to my previous findings the check for c1 != c2 does perform
> better for strings that are at least 25% or more the same. I was able to
> get even more performance out of it by changing tolower() to use a
> different hash table than the one used for the is*() functions. By using
> a pre-generated hash table for both islower() and isupper() it is
> possible to remove the branch wherever those functions are used,
> including in strcasecmp. This method I've seen employed in the Android
> code base and also in cURL. Using it would add additional 2x256 bytes to
> the code size for the tables.

Rasmus raised a good question: where do we actually need a performant
strcasecmp()?

-- 
With Best Regards,
Andy Shevchenko
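
To illustrate the 'a' xor 'A' == 0x20 remark above, here is a rough sketch of how the XOR check could look. It is byte-at-a-time only (not the word-at-a-time variant suggested in the thread), it assumes plain ASCII, and it is not the kernel's implementation. The names is_ascii_alpha(), ascii_lower(), ascii_eq_nocase() and sketch_strcasecmp() are hypothetical and exist only for this example.

```c
/* Hypothetical helper: nonzero if c is an ASCII letter. */
static int is_ascii_alpha(unsigned char c)
{
	/* Setting bit 0x20 folds 'A'..'Z' onto 'a'..'z'. */
	unsigned char l = c | 0x20;

	return l >= 'a' && l <= 'z';
}

/* Hypothetical helper: lowercase an ASCII letter, leave other bytes alone. */
static unsigned char ascii_lower(unsigned char c)
{
	return is_ascii_alpha(c) ? (unsigned char)(c | 0x20) : c;
}

/* Hypothetical helper: case-insensitive equality of two ASCII bytes. */
static int ascii_eq_nocase(unsigned char a, unsigned char b)
{
	unsigned char x = a ^ b;

	if (x == 0)
		return 1;
	/*
	 * The bytes differ only in the case bit. That is a match only
	 * if a is a letter (b is then the same letter in the other case).
	 * This excludes pairs such as '@' and '`', or '[' and '{'.
	 */
	return x == 0x20 && is_ascii_alpha(a);
}

/* Sketch of a strcasecmp() that lowers only when the bytes really differ. */
static int sketch_strcasecmp(const char *s1, const char *s2)
{
	unsigned char c1, c2;

	do {
		c1 = (unsigned char)*s1++;
		c2 = (unsigned char)*s2++;
		if (c1 != c2 && !ascii_eq_nocase(c1, c2))
			return (int)ascii_lower(c1) - (int)ascii_lower(c2);
	} while (c1);

	return 0;
}
```

The range check in is_ascii_alpha() is the "excluding the non-letters" step: without it, any two bytes that differ by 0x20 would be treated as equal.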