From: Waiman Long
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Thomas Gleixner
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Davidlohr Bueso, Linus Torvalds, Tim Chen, huang ying, Waiman Long
Subject: [PATCH v4 03/16] locking/rwsem: Remove rwsem_wake() wakeup optimization
Date: Sat, 13 Apr 2019 13:22:46 -0400
Message-Id: <20190413172259.2740-4-longman@redhat.com>
In-Reply-To: <20190413172259.2740-1-longman@redhat.com>
References: <20190413172259.2740-1-longman@redhat.com>
With commit 59aabfc7e959 ("locking/rwsem: Reduce spinlock contention in wakeup after up_read()/up_write()"), rwsem_wake() forgoes doing a wakeup if the wait_lock cannot be directly acquired and an optimistic spinning locker is present. This can help performance by avoiding spinning on the wait_lock when it is contended.

With the later commit 133e89ef5ef3 ("locking/rwsem: Enable lockless waiter wakeup(s)"), the performance advantage of the above optimization diminishes as the average wait_lock hold time becomes much shorter.

With a later patch that supports rwsem lock handoff, we can no longer rely on the fact that the presence of an optimistic spinning locker will ensure that the lock will be acquired by a task soon and that rwsem_wake() will be called later on to wake up the waiters. This can lead to missed wakeups and application hangs.

So commit 59aabfc7e959 ("locking/rwsem: Reduce spinlock contention in wakeup after up_read()/up_write()") will have to be reverted.
Signed-off-by: Waiman Long
---
 kernel/locking/rwsem-xadd.c | 72 -------------------------------------
 1 file changed, 72 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 7fd4f1de794a..98de7f0cfedd 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -395,25 +395,11 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 	lockevent_cond_inc(rwsem_opt_fail, !taken);
 	return taken;
 }
-
-/*
- * Return true if the rwsem has active spinner
- */
-static inline bool rwsem_has_spinner(struct rw_semaphore *sem)
-{
-	return osq_is_locked(&sem->osq);
-}
-
 #else
 static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 {
 	return false;
 }
-
-static inline bool rwsem_has_spinner(struct rw_semaphore *sem)
-{
-	return false;
-}
 #endif
 
 /*
@@ -635,65 +621,7 @@ struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem)
 	unsigned long flags;
 	DEFINE_WAKE_Q(wake_q);
 
-	/*
-	 * __rwsem_down_write_failed_common(sem)
-	 *   rwsem_optimistic_spin(sem)
-	 *     osq_unlock(sem->osq)
-	 *   ...
-	 *   atomic_long_add_return(&sem->count)
-	 *
-	 *      - VS -
-	 *
-	 *              __up_write()
-	 *                if (atomic_long_sub_return_release(&sem->count) < 0)
-	 *                  rwsem_wake(sem)
-	 *                    osq_is_locked(&sem->osq)
-	 *
-	 * And __up_write() must observe !osq_is_locked() when it observes the
-	 * atomic_long_add_return() in order to not miss a wakeup.
-	 *
-	 * This boils down to:
-	 *
-	 * [S.rel] X = 1                [RmW] r0 = (Y += 0)
-	 *         MB                         RMB
-	 * [RmW]   Y += 1               [L]   r1 = X
-	 *
-	 * exists (r0=1 /\ r1=0)
-	 */
-	smp_rmb();
-
-	/*
-	 * If a spinner is present, it is not necessary to do the wakeup.
-	 * Try to do wakeup only if the trylock succeeds to minimize
-	 * spinlock contention which may introduce too much delay in the
-	 * unlock operation.
-	 *
-	 *    spinning writer           up_write/up_read caller
-	 *    ---------------           -----------------------
-	 * [S]   osq_unlock()           [L]   osq
-	 *       MB                           RMB
-	 * [RmW] rwsem_try_write_lock() [RmW] spin_trylock(wait_lock)
-	 *
-	 * Here, it is important to make sure that there won't be a missed
-	 * wakeup while the rwsem is free and the only spinning writer goes
-	 * to sleep without taking the rwsem. Even when the spinning writer
-	 * is just going to break out of the waiting loop, it will still do
-	 * a trylock in rwsem_down_write_failed() before sleeping. IOW, if
-	 * rwsem_has_spinner() is true, it will guarantee at least one
-	 * trylock attempt on the rwsem later on.
-	 */
-	if (rwsem_has_spinner(sem)) {
-		/*
-		 * The smp_rmb() here is to make sure that the spinner
-		 * state is consulted before reading the wait_lock.
-		 */
-		smp_rmb();
-		if (!raw_spin_trylock_irqsave(&sem->wait_lock, flags))
-			return sem;
-		goto locked;
-	}
 	raw_spin_lock_irqsave(&sem->wait_lock, flags);
-locked:
 
 	if (!list_empty(&sem->wait_list))
 		__rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q);
-- 
2.18.1