Date: Mon, 10 Sep 2018 03:10:01 -0700
From: tip-bot for Waiman Long
Message-ID:
Cc: will.deacon@arm.com, dbueso@suse.de, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, hpa@zytor.com, tglx@linutronix.de, mingo@kernel.org, longman@redhat.com, jmario@redhat.com, peterz@infradead.org
Reply-To: jmario@redhat.com, peterz@infradead.org, longman@redhat.com, tglx@linutronix.de, mingo@kernel.org, linux-kernel@vger.kernel.org, hpa@zytor.com, dbueso@suse.de, torvalds@linux-foundation.org, will.deacon@arm.com
In-Reply-To: <1532459425-19204-1-git-send-email-longman@redhat.com>
References: <1532459425-19204-1-git-send-email-longman@redhat.com>
To:
linux-tip-commits@vger.kernel.org
Subject: [tip:locking/core] locking/rwsem: Exit read lock slowpath if queue empty & no writer
Git-Commit-ID: 4b486b535c33ef354ecf02a2650919004fd7d2b0
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00, T_DATE_IN_FUTURE_96_Q autolearn=ham autolearn_force=no version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on terminus.zytor.com
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  4b486b535c33ef354ecf02a2650919004fd7d2b0
Gitweb:     https://git.kernel.org/tip/4b486b535c33ef354ecf02a2650919004fd7d2b0
Author:     Waiman Long
AuthorDate: Tue, 24 Jul 2018 15:10:25 -0400
Committer:  Ingo Molnar
CommitDate: Mon, 10 Sep 2018 10:16:39 +0200

locking/rwsem: Exit read lock slowpath if queue empty & no writer

It was discovered that a constant stream of readers with occasional
writers pounding on a rwsem may cause many of the readers to enter the
slowpath unnecessarily, thus increasing latency and lowering performance.

In the current code, a reader entering the slowpath critical section
will unconditionally set the WAITING_BIAS, if not set yet, and clear
its active count even if no one is in the wait queue and no writer
is present. This causes some incoming readers to observe the presence
of waiters in the wait queue and hence have to go into the slowpath
themselves.

With sufficient numbers of readers and a relatively short lock hold time,
the WAITING_BIAS may be repeatedly turned on and off and a substantial
portion of the readers will go into the slowpath, sustaining a rather
long queue on the wait queue spinlock and a repeated WAITING_BIAS on/off
cycle until the logjam is broken opportunistically.

To prevent this situation, an additional check is added to detect the
special case where the reader in the critical section is the only one
in the wait queue and no writer is present. When that happens, it can
just exit the slowpath and return immediately as its active count has
already been set in the lock. Other incoming readers won't observe
the presence of waiters and so will not be forced into the slowpath.

The issue was found at a customer site where they had an application
that pounded on the pread64 syscalls heavily on an XFS filesystem. The
application was run on recent 4-socket boxes with a lot of CPUs. They
saw significant spinlock contention in the rwsem_down_read_failed() call.
With this patch applied, the system CPU usage went down from 85% to 57%,
and the spinlock contention in the pread64 syscalls was gone.

Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Davidlohr Bueso
Acked-by: Will Deacon
Cc: Joe Mario
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1532459425-19204-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar
---
 kernel/locking/rwsem-xadd.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 3064c50e181e..01fcb807598c 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -233,8 +233,19 @@ __rwsem_down_read_failed_common(struct rw_semaphore *sem, int state)
 	waiter.type = RWSEM_WAITING_FOR_READ;
 
 	raw_spin_lock_irq(&sem->wait_lock);
-	if (list_empty(&sem->wait_list))
+	if (list_empty(&sem->wait_list)) {
+		/*
+		 * In case the wait queue is empty and the lock isn't owned
+		 * by a writer, this reader can exit the slowpath and return
+		 * immediately as its RWSEM_ACTIVE_READ_BIAS has already
+		 * been set in the count.
+		 */
+		if (atomic_long_read(&sem->count) >= 0) {
+			raw_spin_unlock_irq(&sem->wait_lock);
+			return sem;
+		}
 		adjustment += RWSEM_WAITING_BIAS;
+	}
 	list_add_tail(&waiter.list, &sem->wait_list);
 
 	/* we're now waiting on the lock, but no longer actively locking */
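
For readers who want to experiment with the count arithmetic outside the
kernel, the following is a rough single-threaded user-space sketch of the
check added above. It is not the kernel code: the model_* names, the
nr_waiters counter standing in for sem->wait_list, the "patched" switch
and the bias values (assumed to mirror the 64-bit count layout) are all
illustrative assumptions, and the wait_lock is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bias values, modeled on the 64-bit rwsem count layout. */
#define ACTIVE_BIAS   INT64_C(1)
#define ACTIVE_MASK   INT64_C(0xffffffff)
#define WAITING_BIAS  (-ACTIVE_MASK - 1)
#define READ_BIAS     ACTIVE_BIAS
#define WRITE_BIAS    (WAITING_BIAS + ACTIVE_BIAS)

/* Hypothetical stand-in for struct rw_semaphore. */
struct model_rwsem {
	_Atomic int64_t count;
	int nr_waiters;		/* stands in for sem->wait_list */
};

enum slow_result { SLOW_EARLY_EXIT, SLOW_QUEUED };

static void model_init(struct model_rwsem *sem, int64_t count, int nr_waiters)
{
	atomic_init(&sem->count, count);
	sem->nr_waiters = nr_waiters;
}

/* Reader fast path: add the read bias; a positive result means success. */
static bool model_read_fast(struct model_rwsem *sem)
{
	return atomic_fetch_add(&sem->count, READ_BIAS) + READ_BIAS > 0;
}

/*
 * Reader slow path, entered after the fast path failed (the read bias is
 * already in the count). With patched == true, an empty queue plus a
 * non-negative count lets the reader keep its bias and return at once.
 */
static enum slow_result model_read_slow(struct model_rwsem *sem, bool patched)
{
	int64_t adjustment = -READ_BIAS;

	assert(sem->nr_waiters >= 0);
	if (sem->nr_waiters == 0) {
		if (patched && atomic_load(&sem->count) >= 0)
			return SLOW_EARLY_EXIT;
		adjustment += WAITING_BIAS;
	}
	sem->nr_waiters++;	/* models list_add_tail() */
	atomic_fetch_add(&sem->count, adjustment);
	return SLOW_QUEUED;
}
```

Starting from model_init(&s, READ_BIAS, 0), which models a reader that
reached the slow path just after the WAITING_BIAS was cleared,
model_read_slow(&s, false) leaves the count negative so the next
model_read_fast() fails, whereas model_read_slow(&s, true) leaves the
count untouched and later readers stay on the fast path.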