Subject: Re: [PATCH V2] asm-generic: ticket-lock: Optimize arch_spin_value_unlocked
From: Waiman Long
Date: Mon, 31 Jul 2023 21:59:33 -0400
To: bibo mao, guoren@kernel.org, David.Laight@ACULAB.COM, will@kernel.org,
    peterz@infradead.org, mingo@redhat.com
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-riscv@lists.infradead.org, Guo Ren
In-Reply-To: <2437ac29-29f0-34f9-b7cb-f0e294db7dc6@loongson.cn>
References: <20230731023308.3748432-1-guoren@kernel.org>
 <2437ac29-29f0-34f9-b7cb-f0e294db7dc6@loongson.cn>

On 7/31/23 21:37, bibo mao wrote:
>
> On 2023/7/31 23:16, Waiman Long wrote:
>> On 7/30/23 22:33, guoren@kernel.org wrote:
>>> From: Guo Ren
>>>
>>> arch_spin_value_unlocked() would cause an unnecessary memory
>>> access to the contended value. Although it won't cause a significant
>>> performance gap on most architectures, the arch_spin_value_unlocked()
>>> argument already contains enough information. Thus, remove the
>>> unnecessary atomic_read() in arch_spin_value_unlocked().
>>>
>>> The caller of arch_spin_value_unlocked() could benefit from this
>>> change. Currently, the only caller is lockref.
>>>
>>> Signed-off-by: Guo Ren
>>> Cc: Waiman Long
>>> Cc: David Laight
>>> Cc: Peter Zijlstra
>>> Signed-off-by: Guo Ren
>>> ---
>>> Changelog
>>> V2:
>>>  - Fixup commit log with Waiman's advice.
>>>  - Add Waiman's comment in the commit msg.
>>> ---
>>>  include/asm-generic/spinlock.h | 16 +++++++++-------
>>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
>>> index fdfebcb050f4..90803a826ba0 100644
>>> --- a/include/asm-generic/spinlock.h
>>> +++ b/include/asm-generic/spinlock.h
>>> @@ -68,11 +68,18 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
>>>  	smp_store_release(ptr, (u16)val + 1);
>>>  }
>>>
>>> +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>>> +{
>>> +	u32 val = lock.counter;
>>> +
>>> +	return ((val >> 16) == (val & 0xffff));
>>> +}
>>> +
>>>  static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
>>>  {
>>> -	u32 val = atomic_read(lock);
>>> +	arch_spinlock_t val = READ_ONCE(*lock);
>>>
>>> -	return ((val >> 16) != (val & 0xffff));
>>> +	return !arch_spin_value_unlocked(val);
>>>  }
>>>
>>>  static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
>>> @@ -82,11 +89,6 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
>>>  	return (s16)((val >> 16) - (val & 0xffff)) > 1;
>>>  }
>>>
>>> -static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>>> -{
>>> -	return !arch_spin_is_locked(&lock);
>>> -}
>>> -
>>>  #include <asm/qrwlock.h>
>>>
>>>  #endif /* __ASM_GENERIC_SPINLOCK_H */
>> I am fine with the current change. However, a modern optimizing compiler
>> should be able to avoid the redundant memory read anyway, so this patch
>> may not have a measurable performance impact.
> arch_spin_value_unlocked() is called from lockref like this:
>
> #define CMPXCHG_LOOP(CODE, SUCCESS) do {					\
> 	int retry = 100;							\
> 	struct lockref old;							\
> 	BUILD_BUG_ON(sizeof(old) != 8);						\
> 	old.lock_count = READ_ONCE(lockref->lock_count);			\
> 	while (likely(arch_spin_value_unlocked(old.lock.rlock.raw_lock))) {	\
>
> With a modern optimizing compiler, is it possible that the old value of
> old.lock.rlock.raw_lock is cached in a register, even though
> try_cmpxchg64_relaxed() modifies old.lock_count with a new value?

What I meant is that the call to arch_spin_value_unlocked() as it is
today will not generate two memory reads of the same location, with or
without the patch. Of course, a new memory read will be needed after a
failed cmpxchg().

Cheers,
Longman
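
---

As an illustration of the check at the heart of the patch, here is a
minimal user-space sketch; it is not kernel code. value_unlocked() stands
in for arch_spin_value_unlocked(), and it assumes the layout implied by
the diff: the "next" ticket in the upper 16 bits and the "owner" ticket in
the lower 16 bits, so the lock is free exactly when the two halves match.

#include <stdint.h>
#include <stdio.h>

/*
 * Stand-in for arch_spin_value_unlocked(): pure computation on a value
 * passed by copy. No memory access to the contended lock word happens
 * here, which is the property the patch exploits.
 */
static int value_unlocked(uint32_t val)
{
	return (val >> 16) == (val & 0xffff);
}

int main(void)
{
	uint32_t unlocked = (5u << 16) | 5u;	/* next == owner: free */
	uint32_t locked   = (6u << 16) | 5u;	/* next != owner: held */

	printf("%d %d\n", value_unlocked(unlocked), value_unlocked(locked));
	return 0;
}

Compiled with any C99 compiler this prints "1 0": equal halves read as
unlocked, next != owner as held.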
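Longman's closing point, that a failed cmpxchg implies a fresh read, can
be sketched the same way. The snippet below uses C11 atomics as a
stand-in for the kernel's try_cmpxchg64_relaxed() and lockref; the 64-bit
layout (lock word in the low 32 bits, reference count in the high 32
bits) and the helper names are assumptions for illustration only. The key
property: on failure, compare-exchange writes the freshly observed memory
value back into 'old', so the next unlocked check never runs on a stale
register copy.

#include <stdatomic.h>
#include <stdint.h>

/* Same stand-in for arch_spin_value_unlocked() as above. */
static int value_unlocked(uint32_t val)
{
	return (val >> 16) == (val & 0xffff);
}

/* lockref-style retry loop; the lock word is assumed to be the low half. */
void retry_loop(_Atomic uint64_t *lock_count)
{
	uint64_t old = atomic_load_explicit(lock_count, memory_order_relaxed);
	int retry = 100;

	while (value_unlocked((uint32_t)old)) {
		uint64_t new = old + (1ull << 32);	/* bump the count half */

		if (atomic_compare_exchange_weak_explicit(lock_count, &old, new,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return;	/* success: count updated locklessly */
		/*
		 * Failure: the compare-exchange has stored the value it
		 * actually found in memory into 'old', so the loop condition
		 * re-evaluates value_unlocked() on fresh data rather than on
		 * a register-cached copy.
		 */
		if (!--retry)
			return;	/* give up; caller would take the lock */
	}
}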