Message-ID: <0126bfcf-fb5c-6243-a2f3-2aab01b38279@loongson.cn>
Date: Tue, 1 Aug 2023 12:05:13 +0800
Subject: Re: [PATCH V2] asm-generic: ticket-lock: Optimize arch_spin_value_unlocked
To: Waiman Long, guoren@kernel.org, David.Laight@ACULAB.COM, will@kernel.org,
 peterz@infradead.org, mingo@redhat.com
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-riscv@lists.infradead.org, Guo Ren
References: <20230731023308.3748432-1-guoren@kernel.org>
 <2437ac29-29f0-34f9-b7cb-f0e294db7dc6@loongson.cn>
From: bibo mao
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023/8/1 09:59, Waiman Long wrote:
> On 7/31/23 21:37, bibo mao wrote:
>>
>> On 2023/7/31 23:16, Waiman Long wrote:
>>> On 7/30/23 22:33, guoren@kernel.org wrote:
>>>> From: Guo Ren
>>>>
>>>> The arch_spin_value_unlocked() function would cause an unnecessary
>>>> memory access to the contended value. Although it won't cause a
>>>> significant performance gap in most architectures, the
>>>> arch_spin_value_unlocked() argument contains enough information.
>>>> Thus, remove the unnecessary atomic_read() in
>>>> arch_spin_value_unlocked().
>>>>
>>>> The caller of arch_spin_value_unlocked() could benefit from this
>>>> change. Currently, the only caller is lockref.
>>>>
>>>> Signed-off-by: Guo Ren
>>>> Cc: Waiman Long
>>>> Cc: David Laight
>>>> Cc: Peter Zijlstra
>>>> Signed-off-by: Guo Ren
>>>> ---
>>>> Changelog
>>>> V2:
>>>>    - Fixup commit log with Waiman's advice.
>>>>    - Add Waiman's comment in the commit msg.
>>>> ---
>>>>  include/asm-generic/spinlock.h | 16 +++++++++-------
>>>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
>>>> index fdfebcb050f4..90803a826ba0 100644
>>>> --- a/include/asm-generic/spinlock.h
>>>> +++ b/include/asm-generic/spinlock.h
>>>> @@ -68,11 +68,18 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
>>>>  	smp_store_release(ptr, (u16)val + 1);
>>>>  }
>>>>
>>>> +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>>>> +{
>>>> +	u32 val = lock.counter;
>>>> +
>>>> +	return ((val >> 16) == (val & 0xffff));
>>>> +}
>>>> +
>>>>  static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
>>>>  {
>>>> -	u32 val = atomic_read(lock);
>>>> +	arch_spinlock_t val = READ_ONCE(*lock);
>>>>
>>>> -	return ((val >> 16) != (val & 0xffff));
>>>> +	return !arch_spin_value_unlocked(val);
>>>>  }
>>>>
>>>>  static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
>>>> @@ -82,11 +89,6 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
>>>>  	return (s16)((val >> 16) - (val & 0xffff)) > 1;
>>>>  }
>>>>
>>>> -static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>>>> -{
>>>> -	return !arch_spin_is_locked(&lock);
>>>> -}
>>>> -
>>>>  #include
>>>>
>>>>  #endif /* __ASM_GENERIC_SPINLOCK_H */
>>> I am fine with the current change. However, a modern optimizing
>>> compiler should be able to avoid the redundant memory read anyway,
>>> so this patch may not have an impact from a performance point of view.
>> arch_spin_value_unlocked() is called from lockref like this:
>>
>> #define CMPXCHG_LOOP(CODE, SUCCESS) do {                                    \
>>         int retry = 100;                                                    \
>>         struct lockref old;                                                 \
>>         BUILD_BUG_ON(sizeof(old) != 8);                                     \
>>         old.lock_count = READ_ONCE(lockref->lock_count);                    \
>>         while (likely(arch_spin_value_unlocked(old.lock.rlock.raw_lock))) { \
>>
>> With a modern optimizing compiler, is it possible that the old value of
>> old.lock.rlock.raw_lock is cached in a register, even though
>> try_cmpxchg64_relaxed() modifies the memory of old.lock_count with a
>> new value?
>
> What I meant is that the call to arch_spin_value_unlocked() as it is
> today will not generate two memory reads of the same location, with or
> without the patch. Of course, a new memory read will be needed after a
> failed cmpxchg().
Yes, it solves the issue: a new memory read happens after a failed
cmpxchg().

Regards
Bibo Mao

> Cheers,
> Longman