Date: Mon,
17 Jun 2019 07:31:19 -0700
X-Authentication-Warning: terminus.zytor.com: tipbot set sender to tipbot@zytor.com using -f
From: tip-bot for Waiman Long
Message-ID:
Cc: tglx@linutronix.de, linux-kernel@vger.kernel.org, hpa@zytor.com, dave@stgolabs.net, peterz@infradead.org, will.deacon@arm.com, longman@redhat.com, torvalds@linux-foundation.org, mingo@kernel.org, huang.ying.caritas@gmail.com, bp@alien8.de, tim.c.chen@linux.intel.com
Reply-To: hpa@zytor.com, linux-kernel@vger.kernel.org, tglx@linutronix.de, peterz@infradead.org, dave@stgolabs.net, will.deacon@arm.com, longman@redhat.com, tim.c.chen@linux.intel.com, torvalds@linux-foundation.org, bp@alien8.de, huang.ying.caritas@gmail.com, mingo@kernel.org
In-Reply-To: <20190520205918.22251-13-longman@redhat.com>
References: <20190520205918.22251-13-longman@redhat.com>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:locking/core] locking/rwsem: Enable readers spinning on writer
Git-Commit-ID: cf69482d62d996d3ce840eeead8e160de281ac6c
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
X-Spam-Status: No, score=0.9 required=5.0 tests=ALL_TRUSTED,BAYES_00,
	DATE_IN_FUTURE_06_12,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,
	DKIM_VALID_EF,FREEMAIL_FORGED_REPLYTO autolearn=no autolearn_force=no
	version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on terminus.zytor.com
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  cf69482d62d996d3ce840eeead8e160de281ac6c
Gitweb:     https://git.kernel.org/tip/cf69482d62d996d3ce840eeead8e160de281ac6c
Author:     Waiman Long
AuthorDate: Mon, 20 May 2019 16:59:11 -0400
Committer:  Ingo Molnar
CommitDate: Mon, 17 Jun 2019 12:28:05 +0200

locking/rwsem: Enable readers spinning on writer

This patch enables readers to optimistically spin on a rwsem when it
is owned by a writer instead of going to sleep directly.  The
rwsem_can_spin_on_owner() function is extracted out of
rwsem_optimistic_spin() and is called directly by
rwsem_down_read_slowpath() and rwsem_down_write_slowpath().

With a locking microbenchmark running on a 5.1-based kernel, the total
locking rates (in kops/s) on an 8-socket IvyBridge-EX system with equal
numbers of readers and writers before and after the patch were as
follows:

   # of Threads  Pre-patch    Post-patch
   ------------  ---------    ----------
        4          1,674        1,684
        8          1,062        1,074
       16            924          900
       32            300          458
       64            195          208
      128            164          168
      240            149          143

The performance change wasn't significant in this case, but this change
is required by a follow-on patch.

Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Cc: Borislav Petkov
Cc: Davidlohr Bueso
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Tim Chen
Cc: Will Deacon
Cc: huang ying
Link: https://lkml.kernel.org/r/20190520205918.22251-13-longman@redhat.com
Signed-off-by: Ingo Molnar
---
 kernel/locking/lock_events_list.h |  1 +
 kernel/locking/rwsem.c            | 86 +++++++++++++++++++++++++++++++++------
 2 files changed, 75 insertions(+), 12 deletions(-)

diff --git a/kernel/locking/lock_events_list.h b/kernel/locking/lock_events_list.h
index 634b47fd8b5e..ca954e4e00e4 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -56,6 +56,7 @@ LOCK_EVENT(rwsem_sleep_reader)	/* # of reader sleeps			*/
 LOCK_EVENT(rwsem_sleep_writer)	/* # of writer sleeps			*/
 LOCK_EVENT(rwsem_wake_reader)	/* # of reader wakeups			*/
 LOCK_EVENT(rwsem_wake_writer)	/* # of writer wakeups			*/
+LOCK_EVENT(rwsem_opt_rlock)	/* # of read locks opt-spin acquired	*/
 LOCK_EVENT(rwsem_opt_wlock)	/* # of write locks opt-spin acquired	*/
 LOCK_EVENT(rwsem_opt_fail)	/* # of failed opt-spinnings		*/
 LOCK_EVENT(rwsem_rlock)		/* # of read locks acquired		*/
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 180455b6b0d4..985a03ad3f8c 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -457,6 +457,30 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 }
 
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
+/*
+ * Try to acquire read lock before the reader is put on wait queue.
+ * Lock acquisition isn't allowed if the rwsem is locked or a writer handoff
+ * is ongoing.
+ */
+static inline bool rwsem_try_read_lock_unqueued(struct rw_semaphore *sem)
+{
+	long count = atomic_long_read(&sem->count);
+
+	if (count & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))
+		return false;
+
+	count = atomic_long_fetch_add_acquire(RWSEM_READER_BIAS, &sem->count);
+	if (!(count & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))) {
+		rwsem_set_reader_owned(sem);
+		lockevent_inc(rwsem_opt_rlock);
+		return true;
+	}
+
+	/* Back out the change */
+	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
+	return false;
+}
+
 /*
  * Try to acquire write lock before the writer has been put on wait queue.
  */
@@ -491,9 +515,12 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 
 	BUILD_BUG_ON(is_rwsem_owner_spinnable(RWSEM_OWNER_UNKNOWN));
 
-	if (need_resched())
+	if (need_resched()) {
+		lockevent_inc(rwsem_opt_fail);
 		return false;
+	}
 
+	preempt_disable();
 	rcu_read_lock();
 	owner = READ_ONCE(sem->owner);
 	if (owner) {
@@ -501,6 +528,9 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 		      owner_on_cpu(owner);
 	}
 	rcu_read_unlock();
+	preempt_enable();
+
+	lockevent_cond_inc(rwsem_opt_fail, !ret);
 	return ret;
 }
 
@@ -578,7 +608,7 @@ static noinline enum owner_state rwsem_spin_on_owner(struct rw_semaphore *sem)
 	return state;
 }
 
-static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
+static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
 {
 	bool taken = false;
 	int prev_owner_state = OWNER_NULL;
@@ -586,9 +616,6 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 	preempt_disable();
 
 	/* sem->wait_lock should not be held when doing optimistic spinning */
-	if (!rwsem_can_spin_on_owner(sem))
-		goto done;
-
 	if (!osq_lock(&sem->osq))
 		goto done;
 
@@ -608,10 +635,11 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 		/*
 		 * Try to acquire the lock
 		 */
-		if (rwsem_try_write_lock_unqueued(sem)) {
-			taken = true;
+		taken = wlock ? rwsem_try_write_lock_unqueued(sem)
+			      : rwsem_try_read_lock_unqueued(sem);
+
+		if (taken)
 			break;
-		}
 
 		/*
 		 * An RT task cannot do optimistic spinning if it cannot
@@ -668,7 +696,12 @@ done:
 	return taken;
 }
 #else
-static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
+static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
+{
+	return false;
+}
+
+static inline bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
 {
 	return false;
 }
@@ -684,6 +717,31 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, int state)
 	struct rwsem_waiter waiter;
 	DEFINE_WAKE_Q(wake_q);
 
+	if (!rwsem_can_spin_on_owner(sem))
+		goto queue;
+
+	/*
+	 * Undo read bias from down_read() and do optimistic spinning.
+	 */
+	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
+	adjustment = 0;
+	if (rwsem_optimistic_spin(sem, false)) {
+		/*
+		 * Wake up other readers in the wait list if the front
+		 * waiter is a reader.
+		 */
+		if ((atomic_long_read(&sem->count) & RWSEM_FLAG_WAITERS)) {
+			raw_spin_lock_irq(&sem->wait_lock);
+			if (!list_empty(&sem->wait_list))
+				rwsem_mark_wake(sem, RWSEM_WAKE_READ_OWNED,
+						&wake_q);
+			raw_spin_unlock_irq(&sem->wait_lock);
+			wake_up_q(&wake_q);
+		}
+		return sem;
+	}
+
+queue:
 	waiter.task = current;
 	waiter.type = RWSEM_WAITING_FOR_READ;
 	waiter.timeout = jiffies + RWSEM_WAIT_TIMEOUT;
@@ -696,7 +754,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, int state)
 		 * exit the slowpath and return immediately as its
 		 * RWSEM_READER_BIAS has already been set in the count.
*/ - if (!(atomic_long_read(&sem->count) & + if (adjustment && !(atomic_long_read(&sem->count) & (RWSEM_WRITER_MASK | RWSEM_FLAG_HANDOFF))) { raw_spin_unlock_irq(&sem->wait_lock); rwsem_set_reader_owned(sem); @@ -708,7 +766,10 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, int state) list_add_tail(&waiter.list, &sem->wait_list); /* we're now waiting on the lock, but no longer actively locking */ - count = atomic_long_add_return(adjustment, &sem->count); + if (adjustment) + count = atomic_long_add_return(adjustment, &sem->count); + else + count = atomic_long_read(&sem->count); /* * If there are no active locks, wake the front queued process(es). @@ -767,7 +828,8 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state) DEFINE_WAKE_Q(wake_q); /* do optimistic spinning and steal lock if possible */ - if (rwsem_optimistic_spin(sem)) + if (rwsem_can_spin_on_owner(sem) && + rwsem_optimistic_spin(sem, true)) return sem; /*