From: Waiman Long <longman@redhat.com>
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Thomas Gleixner
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Davidlohr Bueso,
    Linus Torvalds, Tim Chen, Waiman Long
Subject: [PATCH-tip v2 06/12] locking/rwsem: Wake up almost all readers in wait queue
Date: Fri, 5 Apr 2019 15:21:09 -0400
Message-Id: <20190405192115.17416-7-longman@redhat.com>
In-Reply-To: <20190405192115.17416-1-longman@redhat.com>
References: <20190405192115.17416-1-longman@redhat.com>
When the front of the wait queue is a reader, other readers immediately
following the first reader will also be woken up at the same time.
However, if there is a writer in between, those readers behind the
writer will not be woken up.

Because of optimistic spinning, the lock acquisition order is not FIFO
anyway. The lock handoff mechanism ensures that lock starvation will
not happen.

Assuming that the lock hold times of the other readers still in the
queue are about the same as those of the readers being woken up, there
is not much additional cost beyond the extra latency the waker incurs
in waking up more tasks. Therefore, when the first waiter is a reader,
all the readers in the queue, up to a maximum of 256, are woken up to
improve reader throughput.

With a locking microbenchmark running on a 5.1-based kernel, the total
locking rates (in kops/s) on an 8-socket IvyBridge-EX system with equal
numbers of readers and writers before and after this patch were as
follows:

  # of Threads  Pre-Patch  Post-patch
  ------------  ---------  ----------
        4         1,641      1,674
        8           731      1,062
       16           564        924
       32            78        300
       64            38        195
      240            50        149

There is no performance gain at low contention levels. At high
contention levels, however, this patch gives a decent performance
boost.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/rwsem-xadd.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 11d7eb61799a..51858554ff0e 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -88,6 +88,13 @@ enum rwsem_wake_type {
  */
 #define RWSEM_WAIT_TIMEOUT	((HZ - 1)/200 + 1)
 
+/*
+ * We limit the maximum number of readers that can be woken up for a
+ * wake-up call to avoid penalizing the waking thread for spending too
+ * much time doing it.
+ */
+#define MAX_READERS_WAKEUP	0x100
+
 /*
  * handle the lock release when processes blocked on it that can now run
  * - if we come here from up_xxxx(), then the RWSEM_FLAG_WAITERS bit must
@@ -158,16 +165,16 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
 	}
 
 	/*
-	 * Grant an infinite number of read locks to the readers at the front
-	 * of the queue. We know that woken will be at least 1 as we accounted
-	 * for above. Note we increment the 'active part' of the count by the
+	 * Grant up to MAX_READERS_WAKEUP read locks to all the readers in the
+	 * queue. We know that woken will be at least 1 as we accounted for
+	 * above. Note we increment the 'active part' of the count by the
 	 * number of readers before waking any processes up.
 	 */
 	list_for_each_entry_safe(waiter, tmp, &sem->wait_list, list) {
 		struct task_struct *tsk;
 
 		if (waiter->type == RWSEM_WAITING_FOR_WRITE)
-			break;
+			continue;
 
 		woken++;
 		tsk = waiter->task;
@@ -186,6 +193,12 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
 	 * after setting the reader waiter to nil.
 	 */
 	wake_q_add_safe(wake_q, tsk);
+
+	/*
+	 * Limit # of readers that can be woken up per wakeup call.
+	 */
+	if (woken >= MAX_READERS_WAKEUP)
+		break;
 	}
 
 	adjustment = woken * RWSEM_READER_BIAS - adjustment;
-- 
2.18.1
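
[Editor's note: the policy change above is small but easy to misread:
the writer check switches from break to continue, and a new bound
stops the scan. Below is a minimal, self-contained user-space C sketch
of that bounded-wakeup scan. It is illustrative only, not kernel code;
struct waiter, count_woken_readers() and the sample queue are
hypothetical names invented for this example.]

#include <stdio.h>

#define MAX_READERS_WAKEUP	0x100	/* same bound as the patch: 256 */

enum waiter_type { WAITING_FOR_READ, WAITING_FOR_WRITE };

struct waiter {
	enum waiter_type type;
	struct waiter *next;
};

/*
 * Count the readers that one wakeup call would wake under the patched
 * policy: a writer in the queue is skipped (continue) instead of
 * ending the scan (break), and the scan stops once MAX_READERS_WAKEUP
 * readers have been collected.
 */
static int count_woken_readers(const struct waiter *head)
{
	int woken = 0;
	const struct waiter *w;

	for (w = head; w; w = w->next) {
		if (w->type == WAITING_FOR_WRITE)
			continue;	/* pre-patch behavior was 'break' */
		woken++;
		if (woken >= MAX_READERS_WAKEUP)
			break;		/* bound the waker's work */
	}
	return woken;
}

int main(void)
{
	/* Queue R, R, W, R: old policy wakes 2 readers, new wakes 3. */
	struct waiter r3 = { WAITING_FOR_READ,  NULL };
	struct waiter w1 = { WAITING_FOR_WRITE, &r3 };
	struct waiter r2 = { WAITING_FOR_READ,  &w1 };
	struct waiter r1 = { WAITING_FOR_READ,  &r2 };

	printf("readers woken: %d\n", count_woken_readers(&r1));
	return 0;
}

[With the pre-patch break, only r1 and r2 would be woken; with
continue plus the bound, r3 is woken as well, which is where the
throughput gain at high contention comes from.]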