Subject: Re: [PATCH v3] locking/rwsem: Exit read lock slowpath if queue empty & no writer
To: Peter Zijlstra, Ingo Molnar, Will Deacon
Cc: linux-kernel@vger.kernel.org, Joe Mario, Davidlohr Bueso
References: <1532459425-19204-1-git-send-email-longman@redhat.com>
From: Waiman Long
Organization: Red Hat
Message-ID: <55866c24-ff40-22cb-1390-6c9055bd2cb7@redhat.com>
Date: Tue, 7 Aug 2018 19:29:49 -0400
In-Reply-To: <1532459425-19204-1-git-send-email-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/24/2018 03:10 PM, Waiman Long wrote:
> It was discovered that a constant stream of readers with occasional
> writers pounding on a rwsem may cause many of the readers to enter the
> slowpath unnecessarily, thus increasing latency and lowering performance.
>
> In the current code, a reader entering the slowpath critical section
> will unconditionally set the WAITING_BIAS, if not set yet, and clear
> its active count even if no one is in the wait queue and no writer
> is present. This causes some incoming readers to observe the presence
> of waiters in the wait queue and hence have to go into the slowpath
> themselves.
>
> With sufficient numbers of readers and a relatively short lock hold time,
> the WAITING_BIAS may be repeatedly turned on and off and a substantial
> portion of the readers will go into the slowpath, sustaining a rather
> long queue on the wait queue spinlock and a repeated WAITING_BIAS on/off
> cycle until the logjam is broken opportunistically.
>
> To prevent this situation from happening, an additional check is added to
> detect the special case that the reader in the critical section is the
> only one in the wait queue and no writer is present. When that happens,
> it can just exit the slowpath and return immediately as its active count
> has already been set in the lock. Other incoming readers won't observe
> the presence of waiters and so will not be forced into the slowpath.
>
> The issue was found at a customer site where they had an application
> that pounded on the pread64 syscalls heavily on an XFS filesystem. The
> application was run on recent 4-socket boxes with a lot of CPUs. They
> saw significant spinlock contention in the rwsem_down_read_failed() call.
> With this patch applied, the system CPU usage went down from 85% to 57%,
> and the spinlock contention in the pread64 syscalls was gone.
>
> v3: Revise the commit log and comment again.
> v2: Add customer testing results and remove wording that may cause
>     confusion.
>
> Signed-off-by: Waiman Long
> ---
>  kernel/locking/rwsem-xadd.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> index 3064c50..01fcb80 100644
> --- a/kernel/locking/rwsem-xadd.c
> +++ b/kernel/locking/rwsem-xadd.c
> @@ -233,8 +233,19 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
>  	waiter.type = RWSEM_WAITING_FOR_READ;
>
>  	raw_spin_lock_irq(&sem->wait_lock);
> -	if (list_empty(&sem->wait_list))
> +	if (list_empty(&sem->wait_list)) {
> +		/*
> +		 * In case the wait queue is empty and the lock isn't owned
> +		 * by a writer, this reader can exit the slowpath and return
> +		 * immediately as its RWSEM_ACTIVE_READ_BIAS has already
> +		 * been set in the count.
> +		 */
> +		if (atomic_long_read(&sem->count) >= 0) {
> +			raw_spin_unlock_irq(&sem->wait_lock);
> +			return sem;
> +		}
>  		adjustment += RWSEM_WAITING_BIAS;
> +	}
>  	list_add_tail(&waiter.list, &sem->wait_list);
>
>  	/* we're now waiting on the lock, but no longer actively locking */

Will this patch be eligible to go into 4.19 or 4.20?

Thanks,
Longman
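
For readers following the thread, the early-exit condition in the quoted hunk can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the kernel code: it assumes the 64-bit rwsem-xadd count layout (low 32 bits hold the active count, WAITING_BIAS makes the count negative), and the model_* helpers are hypothetical names introduced purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed 64-bit count layout. The constant names mirror the ones used
 * in the patch; the values are an assumption for this illustration.
 */
#define RWSEM_ACTIVE_MASK       INT64_C(0xffffffff)
#define RWSEM_ACTIVE_READ_BIAS  INT64_C(1)
#define RWSEM_WAITING_BIAS      (-RWSEM_ACTIVE_MASK - 1)
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_READ_BIAS)

/*
 * Hypothetical helper: mirrors the check added by the patch. The reader
 * has already added its RWSEM_ACTIVE_READ_BIAS to the count before it
 * reaches the slowpath, so a non-negative count with an empty wait list
 * means no writer holds the lock and nobody is queued.
 */
static bool model_reader_can_exit_slowpath(int64_t count, bool wait_list_empty)
{
	return wait_list_empty && count >= 0;
}

/* Hypothetical self-check covering the three cases discussed in the log. */
static void model_self_check(void)
{
	/* Three active readers, empty queue: exit immediately. */
	assert(model_reader_can_exit_slowpath(3 * RWSEM_ACTIVE_READ_BIAS, true));

	/* A writer owns the lock: the count is negative, so queue up. */
	assert(!model_reader_can_exit_slowpath(
		RWSEM_ACTIVE_WRITE_BIAS + RWSEM_ACTIVE_READ_BIAS, true));

	/* Waiters already queued: WAITING_BIAS is set, take the normal path. */
	assert(!model_reader_can_exit_slowpath(
		RWSEM_WAITING_BIAS + 2 * RWSEM_ACTIVE_READ_BIAS, false));
}
```

Calling model_self_check() from a test program exercises the all-readers, writer-held and waiters-queued cases. The real slowpath performs this test under sem->wait_lock, which the model leaves out.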