From: Waiman Long
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Alexander Viro, Mike Kravetz
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Davidlohr Bueso, Waiman Long
Subject: [PATCH 2/5] locking/rwsem: Enable timeout check when spinning on owner
Date: Wed, 11 Sep 2019 16:05:34 +0100
Message-Id: <20190911150537.19527-3-longman@redhat.com>
In-Reply-To: <20190911150537.19527-1-longman@redhat.com>
References: <20190911150537.19527-1-longman@redhat.com>
When a task is optimistically spinning on the owner, it may do so for a
long time if there is no other runnable task in the run queue. That can
be long past the given timeout value. To prevent that from happening,
rwsem_optimistic_spin() is now modified to check the timeout value, if
specified, to see if it should abort early.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/rwsem.c | 67 ++++++++++++++++++++++++++++--------------
 1 file changed, 45 insertions(+), 22 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index c0285749c338..49f052d68404 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -716,11 +716,13 @@ rwsem_owner_state(struct task_struct *owner, unsigned long flags, unsigned long
 }
 
 static noinline enum owner_state
-rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
+rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable,
+		    ktime_t timeout)
 {
 	struct task_struct *new, *owner;
 	unsigned long flags, new_flags;
 	enum owner_state state;
+	int loopcnt = 0;
 
 	owner = rwsem_owner_flags(sem, &flags);
 	state = rwsem_owner_state(owner, flags, nonspinnable);
@@ -749,16 +751,22 @@ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
 		 */
 		barrier();
 
-		if (need_resched() || !owner_on_cpu(owner)) {
-			state = OWNER_NONSPINNABLE;
-			break;
-		}
+		if (need_resched() || !owner_on_cpu(owner))
+			goto stop_optspin;
+
+		if (timeout && !(++loopcnt & 0xf) &&
+		    (sched_clock() >= ktime_to_ns(timeout)))
+			goto stop_optspin;
 
 		cpu_relax();
 	}
 	rcu_read_unlock();
 
 	return state;
+
+stop_optspin:
+	rcu_read_unlock();
+	return OWNER_NONSPINNABLE;
 }
 
 /*
@@ -786,12 +794,13 @@ static inline u64 rwsem_rspin_threshold(struct rw_semaphore *sem)
 	return sched_clock() + delta;
 }
 
-static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
+static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock,
+				  ktime_t timeout)
 {
 	bool taken = false;
 	int prev_owner_state = OWNER_NULL;
 	int loop = 0;
-	u64 rspin_threshold = 0;
+	u64 rspin_threshold = 0, curtime;
 	unsigned long nonspinnable = wlock ? RWSEM_WR_NONSPINNABLE
 					   : RWSEM_RD_NONSPINNABLE;
 
@@ -801,6 +810,8 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
 	if (!osq_lock(&sem->osq))
 		goto done;
 
+	curtime = timeout ? sched_clock() : 0;
+
 	/*
 	 * Optimistically spin on the owner field and attempt to acquire the
 	 * lock whenever the owner changes. Spinning will be stopped when:
@@ -810,7 +821,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
 	for (;;) {
 		enum owner_state owner_state;
 
-		owner_state = rwsem_spin_on_owner(sem, nonspinnable);
+		owner_state = rwsem_spin_on_owner(sem, nonspinnable, timeout);
 		if (!(owner_state & OWNER_SPINNABLE))
 			break;
 
@@ -823,6 +834,21 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
 		if (taken)
 			break;
 
+		/*
+		 * Check current time once every 16 iterations when
+		 * 1) spinning on reader-owned rwsem; or
+		 * 2) a timeout value is specified.
+		 *
+		 * This is to avoid calling sched_clock() too frequently
+		 * so as to reduce the average latency between the times
+		 * when the lock becomes free and when the spinner is
+		 * ready to do a trylock.
+		 */
+		if ((wlock && (owner_state == OWNER_READER)) || timeout) {
+			if (!(++loop & 0xf))
+				curtime = sched_clock();
+		}
+
 		/*
 		 * Time-based reader-owned rwsem optimistic spinning
 		 */
@@ -838,23 +864,18 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
 				if (rwsem_test_oflags(sem, nonspinnable))
 					break;
 				rspin_threshold = rwsem_rspin_threshold(sem);
-				loop = 0;
 			}
 
-			/*
-			 * Check time threshold once every 16 iterations to
-			 * avoid calling sched_clock() too frequently so
-			 * as to reduce the average latency between the times
-			 * when the lock becomes free and when the spinner
-			 * is ready to do a trylock.
-			 */
-			else if (!(++loop & 0xf) && (sched_clock() > rspin_threshold)) {
+			else if (curtime > rspin_threshold) {
 				rwsem_set_nonspinnable(sem);
 				lockevent_inc(rwsem_opt_nospin);
 				break;
 			}
 		}
 
+		if (timeout && (ns_to_ktime(curtime) >= timeout))
+			break;
+
 		/*
 		 * An RT task cannot do optimistic spinning if it cannot
 		 * be sure the lock holder is running or live-lock may
@@ -968,7 +989,8 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem,
 	return false;
 }
 
-static inline bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock)
+static inline bool rwsem_optimistic_spin(struct rw_semaphore *sem, bool wlock,
+					 ktime_t timeout)
 {
 	return false;
 }
@@ -982,7 +1004,8 @@ static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
 }
 
 static inline int
-rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
+rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable,
+		    ktime_t timeout)
 {
 	return 0;
 }
@@ -1036,7 +1059,7 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, int state)
 	 */
 	atomic_long_add(-RWSEM_READER_BIAS, &sem->count);
 	adjustment = 0;
-	if (rwsem_optimistic_spin(sem, false)) {
+	if (rwsem_optimistic_spin(sem, false, 0)) {
 		/* rwsem_optimistic_spin() implies ACQUIRE on success */
 		/*
 		 * Wake up other readers in the wait list if the front
@@ -1175,7 +1198,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state, ktime_t timeout)
 
 	/* do optimistic spinning and steal lock if possible */
 	if (rwsem_can_spin_on_owner(sem, RWSEM_WR_NONSPINNABLE) &&
-	    rwsem_optimistic_spin(sem, true)) {
+	    rwsem_optimistic_spin(sem, true, timeout)) {
 		/* rwsem_optimistic_spin() implies ACQUIRE on success */
 		return sem;
 	}
@@ -1255,7 +1278,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state, ktime_t timeout)
 	 * without sleeping.
 	 */
 	if ((wstate == WRITER_HANDOFF) &&
-	    (rwsem_spin_on_owner(sem, 0) == OWNER_NULL))
+	    (rwsem_spin_on_owner(sem, 0, 0) == OWNER_NULL))
 		goto trylock_again;
 
 	/* Block until there are no active lockers. */
-- 
2.18.1