From: Waiman Long
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Thomas Gleixner
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Davidlohr Bueso,
    Linus Torvalds, Tim Chen, Waiman Long
Subject: [PATCH-tip v2 05/12] locking/rwsem: Ensure an RT task will not spin on reader
Date: Fri, 5 Apr 2019 15:21:08 -0400
Message-Id: <20190405192115.17416-6-longman@redhat.com>
In-Reply-To: <20190405192115.17416-1-longman@redhat.com>
References: <20190405192115.17416-1-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

An RT task can do optimistic spinning only if the lock holder is
actually running. If the state of the lock holder isn't known, the
high priority of the RT task may block forward progress of the lock
holder if both happen to reside on the same CPU. This will lead to
deadlock. So we have to make sure that an RT task will not spin on a
reader-owned rwsem.

When the owner is temporarily set to NULL, it is trickier to decide
whether an RT task should stop spinning, because this may be a
transient state in which another writer has just stolen the lock and
thereby caused the task's trylock attempt to fail. So one more retry
is allowed to make sure that the lock is not spinnable by an RT task.

When testing on an 8-socket IvyBridge-EX system, the one additional
retry seems to improve locking performance of RT write-locking threads
under heavy contention. The table below shows the locking rates (in
kops/s) with various numbers of write-locking threads before and after
the patch.

    Locking threads     Pre-patch     Post-patch
    ---------------     ---------     -----------
           4              2,753          2,608
           8              2,529          2,520
          16              1,727          1,918
          32              1,263          1,956
          64                889          1,343

Signed-off-by: Waiman Long
---
 kernel/locking/rwsem-xadd.c | 36 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 35891c53338b..11d7eb61799a 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -349,6 +349,8 @@ static noinline enum owner_state rwsem_spin_on_owner(struct rw_semaphore *sem)
 static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 {
 	bool taken = false;
+	bool is_rt_task = rt_task(current);
+	int prev_owner_state = OWNER_NULL;
 
 	preempt_disable();
 
@@ -366,7 +368,12 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 	 *  2) readers own the lock as we can't determine if they are
 	 *     actively running or not.
 	 */
-	while (rwsem_spin_on_owner(sem) & OWNER_SPINNABLE) {
+	for (;;) {
+		enum owner_state owner_state = rwsem_spin_on_owner(sem);
+
+		if (!(owner_state & OWNER_SPINNABLE))
+			break;
+
 		/*
 		 * Try to acquire the lock
 		 */
@@ -376,13 +383,28 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 		}
 
 		/*
-		 * When there's no owner, we might have preempted between the
-		 * owner acquiring the lock and setting the owner field. If
-		 * we're an RT task that will live-lock because we won't let
-		 * the owner complete.
+		 * An RT task cannot do optimistic spinning if it cannot
+		 * be sure the lock holder is running or live-lock may
+		 * happen if the current task and the lock holder happen
+		 * to run on the same CPU.
+		 *
+		 * When there's no owner or is reader-owned, an RT task
+		 * will stop spinning if the owner state is not a writer
+		 * at the previous iteration of the loop. This allows the
+		 * RT task to recheck if the task that steals the lock is
+		 * a spinnable writer. If so, it can keep on spinning.
+		 *
+		 * If the owner is a writer, the need_resched() check is
+		 * done inside rwsem_spin_on_owner(). If the owner is not
+		 * a writer, need_resched() check needs to be done here.
 		 */
-		if (!sem->owner && (need_resched() || rt_task(current)))
-			break;
+		if (owner_state != OWNER_WRITER) {
+			if (need_resched())
+				break;
+			if (is_rt_task && (prev_owner_state != OWNER_WRITER))
+				break;
+		}
+		prev_owner_state = owner_state;
 
 		/*
 		 * The cpu_relax() call is a compiler barrier which forces
-- 
2.18.1
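
For anyone who wants to reason about the new stop condition outside the
kernel, the loop above can be modelled in user space. The sketch below is
not kernel code and makes several assumptions: the enum values and the
OWNER_SPINNABLE mask are illustrative stand-ins, should_stop_spinning()
and spin_iterations() are hypothetical helper names, need_resched() is
replaced by a plain flag, and the trylock is assumed to fail on every
iteration.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's owner states (values assumed). */
enum owner_state {
	OWNER_NULL         = 1 << 0,
	OWNER_WRITER       = 1 << 1,
	OWNER_READER       = 1 << 2,
	OWNER_NONSPINNABLE = 1 << 3,
};
/* Assumed mask: only "no owner" and "writer-owned" are spinnable. */
#define OWNER_SPINNABLE	(OWNER_NULL | OWNER_WRITER)

/*
 * Model of the check the patch adds after a failed trylock.
 * resched_needed stands in for need_resched().
 */
static bool should_stop_spinning(enum owner_state owner_state,
				 enum owner_state prev_owner_state,
				 bool is_rt_task, bool resched_needed)
{
	if (owner_state != OWNER_WRITER) {
		if (resched_needed)
			return true;
		/* An RT task gets one retry right after a writer was seen. */
		if (is_rt_task && prev_owner_state != OWNER_WRITER)
			return true;
	}
	return false;
}

/*
 * Replay a sequence of owner states, as rwsem_spin_on_owner() might
 * report them, and return the index at which the spinner gives up
 * (n if it never does). The trylock is assumed to fail every time.
 */
static int spin_iterations(const enum owner_state *seq, int n,
			   bool is_rt_task, bool resched_needed)
{
	enum owner_state prev = OWNER_NULL;
	int i;

	for (i = 0; i < n; i++) {
		if (!(seq[i] & OWNER_SPINNABLE))
			break;
		if (should_stop_spinning(seq[i], prev, is_rt_task,
					 resched_needed))
			break;
		prev = seq[i];
	}
	return i;
}
```

With the sequence writer, NULL, NULL, an RT task in this model keeps
spinning through the first NULL (the one extra retry) and stops at the
second, so spin_iterations() returns 2. A non-RT task with no reschedule
pending runs through all three entries.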