Message-ID: <999057a9-d209-323b-90eb-5756b7c0e91e@gmail.com>
Date: Wed, 13 Jul 2022 13:02:47 +0300
Subject: Re: [PATCH 2/2] lib/string.c: Optimize memchr()
From: Andrey Semashev
To: David Laight, 'Yu-Jen Chang'
Cc: "andy@kernel.org", "akinobu.mita@gmail.com", Ching-Chun Huang, "linux-kernel@vger.kernel.org"
References: <20220710142822.52539-1-arthurchang09@gmail.com>
 <20220710142822.52539-3-arthurchang09@gmail.com>
 <3a1b50d2-a7aa-3e89-56fe-5d14ef9da22f@gmail.com>
 <48db247e-f6fd-cb4b-7cc5-455bf26bb153@gmail.com>
 <49a8be9269ee47de9fc2d0d7f09eb0b1@AcuMS.aculab.com>
 <5d14cf64-46b7-dc37-bbb8-dd6be82d06af@gmail.com>
In-Reply-To: <5d14cf64-46b7-dc37-bbb8-dd6be82d06af@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/13/22 12:49, Andrey Semashev wrote:
> On 7/13/22 12:39, David Laight wrote:
>> From: Yu-Jen Chang
>>> Sent: 12 July 2022 15:59
>> ...
>>>> I think you're missing the point. Loads at unaligned addresses may not
>>>> be allowed by hardware using conventional load instructions or may be
>>>> inefficient. Given that this memchr implementation is used as a fallback
>>>> when no hardware-specific version is available, you should be
>>>> conservative wrt. hardware capabilities and behavior. You should
>>>> probably have a pre-alignment loop.
>>>
>>> Got it. I add pre-alignment loop. It aligns the address to 8 or 4bytes.
>>
>> That should be predicated on !HAS_EFFICIENT_UNALIGNED_ACCESS.
>>
>> ...
>>> for (; p <= end - 8; p += 8) {
>>>     val = *(u64*)p ^ mask;
>>>     if ((val + 0xfefefefefefefeffull)
>>>         & (~val & 0x8080808080808080ull))
>>>         break;
>>
>> I would add a couple of comments, like:
>> // Convert to check for zero byte.
>> // Standard check for a zero byte in a word.
>> (But not the big 4 line explanation you had.
>>
>> It is also worth looking at how that code compiles
>> on 32bit arch that don't have a carry flag.
>> That is everything based on MIPS, including riscv.
>
> It may be worth looking at how glibc does it:
>
> https://sourceware.org/git/?p=glibc.git;a=blob;f=string/memchr.c;h=422bcd0cd646ea46711a57fa3cbdb8a3329fc302;hb=refs/heads/release/2.35/master#l46
>
> They do use 32-bit words on 32-bit targets and 64-bit on 64-bit ones. I
> think memchr in the kernel should follow this.

Also, if by chance this optimization is aimed at x86-64, it may be worth
adding an arch-specific version that uses ERMS.
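
To make the points discussed in this thread concrete, here is a rough
user-space sketch of a word-at-a-time memchr with a pre-alignment loop and
a native-width word. It is only an illustration under my own assumptions,
not the patch under review and not the glibc code: the names
memchr_wordwise, word_t, ONES and HIGHS are made up, and I assume that
unsigned long matches the native word size (true on the usual Linux ILP32
and LP64 targets).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; a sketch, not kernel or glibc code. */
typedef unsigned long word_t;        /* assumed: 32-bit on ILP32, 64-bit on LP64 */

#define ONES  ((word_t)-1 / 0xff)    /* 0x01 in every byte lane */
#define HIGHS (ONES * 0x80)          /* 0x80 in every byte lane */

void *memchr_wordwise(const void *s, int c, size_t n)
{
	const unsigned char *p = s;
	unsigned char ch = (unsigned char)c;
	word_t mask = ONES * ch;     /* target byte replicated into every lane */

	/* Pre-alignment loop: go bytewise until p is word-aligned. */
	while (n && ((uintptr_t)p % sizeof(word_t))) {
		if (*p == ch)
			return (void *)p;
		p++;
		n--;
	}

	/* Word loop: only aligned loads happen here. */
	while (n >= sizeof(word_t)) {
		word_t val;

		memcpy(&val, p, sizeof(val)); /* avoids the aliasing cast */
		val ^= mask;                  /* Convert to check for zero byte. */
		/* Standard check for a zero byte in a word. */
		if ((val - ONES) & ~val & HIGHS)
			break;
		p += sizeof(word_t);
		n -= sizeof(word_t);
	}

	/* Tail loop: locates the match inside the flagged word, or scans
	 * the last few bytes that do not fill a whole word. */
	while (n) {
		if (*p == ch)
			return (void *)p;
		p++;
		n--;
	}
	return NULL;
}
```

Note that val - ONES is the same value modulo the word size as the
val + 0xfefefefefefefeffull in the quoted loop, so the test is unchanged.
Finishing with a byte loop instead of bit-scanning the flagged word keeps
the sketch independent of endianness. A call such as
memchr_wordwise(buf, 'x', len) should return the same pointer as
memchr(buf, 'x', len).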