From: Waiman Long
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
Cc: linux-kernel@vger.kernel.org, john.p.donnelly@oracle.com, Hillf Danton, Waiman Long
Subject: [PATCH] locking/rwsem: Allow slowpath writer to ignore handoff bit if not set by first waiter
Date: Wed, 27 Apr 2022 13:31:24 -0400
Message-Id: <20220427173124.1428050-1-longman@redhat.com>

With commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling
more consistent"), the writer that sets the handoff bit can be
interrupted out without clearing the bit if the wait queue isn't empty.
This disables reader and writer optimistic lock spinning and stealing.

Now if a non-first writer in the queue is somehow woken up, or is
entering the waiting loop for the first time, it can't acquire the lock.
This was not the case before that commit, as the writer that set the
handoff bit would clear it when exiting via the out_nolock path. The
new behavior is less efficient because the busy rwsem stays in an
unlocked state for a longer time.

This patch allows a non-first writer to ignore the handoff bit if it
was not originally set or initiated by the first waiter.

Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Signed-off-by: Waiman Long
---
 kernel/locking/rwsem.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 9d1db4a54d34..65f0262f635e 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -335,8 +335,6 @@ struct rwsem_waiter {
 	struct task_struct *task;
 	enum rwsem_waiter_type type;
 	unsigned long timeout;
-
-	/* Writer only, not initialized in reader */
 	bool handoff_set;
 };
 #define rwsem_first_waiter(sem) \
@@ -459,10 +457,12 @@ static void rwsem_mark_wake(struct rw_semaphore *sem,
 			 * to give up the lock), request a HANDOFF to
 			 * force the issue.
 			 */
-			if (!(oldcount & RWSEM_FLAG_HANDOFF) &&
-			    time_after(jiffies, waiter->timeout)) {
-				adjustment -= RWSEM_FLAG_HANDOFF;
-				lockevent_inc(rwsem_rlock_handoff);
+			if (time_after(jiffies, waiter->timeout)) {
+				if (!(oldcount & RWSEM_FLAG_HANDOFF)) {
+					adjustment -= RWSEM_FLAG_HANDOFF;
+					lockevent_inc(rwsem_rlock_handoff);
+				}
+				waiter->handoff_set = true;
 			}
 
 			atomic_long_add(-adjustment, &sem->count);
@@ -599,7 +599,7 @@ rwsem_del_wake_waiter(struct rw_semaphore *sem, struct rwsem_waiter *waiter,
 static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 					struct rwsem_waiter *waiter)
 {
-	bool first = rwsem_first_waiter(sem) == waiter;
+	struct rwsem_waiter *first = rwsem_first_waiter(sem);
 	long count, new;
 
 	lockdep_assert_held(&sem->wait_lock);
@@ -609,11 +609,20 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 		bool has_handoff = !!(count & RWSEM_FLAG_HANDOFF);
 
 		if (has_handoff) {
-			if (!first)
+			/*
+			 * Honor handoff bit and yield only when the first
+			 * waiter is the one that set it. Otherwise, we
+			 * still try to acquire the rwsem.
+			 */
+			if (first->handoff_set && (waiter != first))
 				return false;
 
-			/* First waiter inherits a previously set handoff bit */
-			waiter->handoff_set = true;
+			/*
+			 * First waiter can inherit a previously set handoff
+			 * bit and spin on rwsem if lock acquisition fails.
+			 */
+			if (waiter == first)
+				waiter->handoff_set = true;
 		}
 
 		new = count;
@@ -1027,6 +1036,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
 	waiter.task = current;
 	waiter.type = RWSEM_WAITING_FOR_READ;
 	waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
+	waiter.handoff_set = false;
 
 	raw_spin_lock_irq(&sem->wait_lock);
 	if (list_empty(&sem->wait_list)) {
-- 
2.27.0
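
For readers who want to reason about the yield rule outside the kernel,
the decision made in rwsem_try_write_lock() can be modelled in user
space. The following is only a rough sketch: struct waiter_model,
must_yield_old() and must_yield_new() are hypothetical names invented
for illustration and are not part of kernel/locking/rwsem.c. It models
only the "should this waiter back off?" test, not the atomic count
update or the wait_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct rwsem_waiter. */
struct waiter_model {
	bool handoff_set;	/* did this waiter set or inherit the handoff bit? */
};

/*
 * Rule before the patch: once the handoff bit is set in the count,
 * every waiter other than the first one backs off.
 */
static bool must_yield_old(bool has_handoff,
			   const struct waiter_model *first,
			   const struct waiter_model *waiter)
{
	assert(first != NULL && waiter != NULL);
	return has_handoff && waiter != first;
}

/*
 * Rule after the patch: a non-first waiter backs off only when the
 * current first waiter is the one that set (or inherited) the bit.
 */
static bool must_yield_new(bool has_handoff,
			   const struct waiter_model *first,
			   const struct waiter_model *waiter)
{
	assert(first != NULL && waiter != NULL);
	return has_handoff && first->handoff_set && waiter != first;
}
```

With a stale handoff bit (has_handoff true, first->handoff_set false),
must_yield_old() tells a second waiter to back off while
must_yield_new() lets it try for the lock. Once the first waiter has
handoff_set true, both rules make the second waiter yield, and neither
rule ever makes the first waiter yield to itself.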