Date: Fri, 7 Oct 2016 08:47:51 +1100
From: Dave Chinner
To: Davidlohr Bueso
Cc: Waiman Long, Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org,
	x86@kernel.org, linux-alpha@vger.kernel.org, linux-ia64@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-doc@vger.kernel.org, Jason Low, Jonathan Corbet,
	Scott J Norton, Douglas Hatch
Subject: Re: [RFC PATCH-tip v4 02/10] locking/rwsem: Stop active read lock ASAP
Message-ID: <20161006214751.GU27872@dastard>
References: <1471554672-38662-1-git-send-email-Waiman.Long@hpe.com>
	<1471554672-38662-3-git-send-email-Waiman.Long@hpe.com>
	<20161006181718.GA14967@linux-80c1.suse>
In-Reply-To: <20161006181718.GA14967@linux-80c1.suse>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 06, 2016 at 11:17:18AM -0700, Davidlohr Bueso wrote:
> On Thu, 18 Aug 2016, Waiman Long wrote:
>
> > Currently, when down_read() fails, the active read locking isn't undone
> > until the rwsem_down_read_failed() function grabs the wait_lock. If the
> > wait_lock is contended, it may take a while to get the lock. During
> > that period, writer lock stealing will be disabled because of the
> > active read lock.
> >
> > This patch releases the active read lock ASAP so that writer lock
> > stealing can happen sooner. The only downside is when the reader is
> > the first one in the wait queue, as it then has to issue another
> > atomic operation to update the count.
> >
> > On a 4-socket Haswell machine running a 4.7-rc1 tip-based kernel,
> > fio tests with multithreaded randrw and randwrite workloads against
> > the same file on an XFS partition on top of an NVDIMM with DAX were
> > run. The aggregated bandwidths before and after the patch were as
> > follows:
> >
> >   Test       BW before patch   BW after patch   % change
> >   ----       ---------------   --------------   --------
> >   randrw     1210 MB/s         1352 MB/s        +12%
> >   randwrite  1622 MB/s         1710 MB/s        +5.4%
>
> Yeah, this is really a bad workload to make decisions on locking
> heuristics, imo - if I'm thinking of the same workload. Mainly because
> concurrent buffered io to the same file isn't very realistic, and you
> end up pathologically pounding on i_rwsem (which until recently was
> i_mutex, before Al's parallel lookup/readdir work). Obviously write
> lock stealing wins in this case.

Except that it's DAX, and in 4.7-rc1 that used shared locking at the
XFS level and never took exclusive locks.

*However*, the DAX IO path locking in XFS has changed in 4.9-rc1 to
match the buffered IO single-writer POSIX semantics - the test is a
bad test because it exercised a path that is under heavy development,
and so it can't be used as a regression test across multiple kernels.
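
To make the "single writer" semantics concrete: the buffered write path
takes the inode's i_rwsem exclusively for the whole write, which is why
concurrent writers to one file serialize on it. A rough sketch of that
shape, loosely modelled on the VFS's generic_file_write_iter() - an
illustrative composite, not the actual XFS or DAX code paths under
discussion:

#include <linux/fs.h>
#include <linux/uio.h>

/*
 * Concurrent writers to one file serialize here: i_rwsem is held
 * exclusively for the whole write, whereas (per the above) the XFS DAX
 * path in 4.7-rc1 only took its IO lock shared.
 */
static ssize_t sketch_buffered_write(struct kiocb *iocb, struct iov_iter *from)
{
	struct inode *inode = iocb->ki_filp->f_mapping->host;
	ssize_t ret;

	inode_lock(inode);		/* down_write(&inode->i_rwsem) */
	ret = generic_write_checks(iocb, from);
	if (ret > 0)
		ret = __generic_file_write_iter(iocb, from);
	inode_unlock(inode);		/* up_write(&inode->i_rwsem) */

	return ret;
}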
If you want to stress concurrent access to a single file, please use
direct IO, not DAX or buffered IO.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
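
To illustrate that suggestion, a minimal userspace sketch of stressing
concurrent access to a single file with direct IO: several threads
issuing O_DIRECT pwrite()s against one file. The file name, thread
count, IO size and offsets below are made-up illustration values, not
the fio workload from the patch. Build with: gcc -O2 -o dio_stress
dio_stress.c -pthread

#define _GNU_SOURCE		/* for O_DIRECT */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NR_THREADS	8
#define IO_SIZE		4096		/* O_DIRECT needs aligned size/offset */
#define IOS_PER_THREAD	10000

static int fd;

static void *writer(void *arg)
{
	long id = (long)arg;
	void *buf;
	long i;

	/* O_DIRECT also requires an aligned user buffer */
	if (posix_memalign(&buf, IO_SIZE, IO_SIZE))
		return NULL;
	memset(buf, 0xaa, IO_SIZE);

	for (i = 0; i < IOS_PER_THREAD; i++) {
		off_t off = (off_t)((id * IOS_PER_THREAD + i) % 65536) * IO_SIZE;

		if (pwrite(fd, buf, IO_SIZE, off) != IO_SIZE) {
			perror("pwrite");
			break;
		}
	}

	free(buf);
	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t tids[NR_THREADS];
	long i;

	fd = open(argc > 1 ? argv[1] : "testfile",
		  O_RDWR | O_CREAT | O_DIRECT, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	for (i = 0; i < NR_THREADS; i++)
		pthread_create(&tids[i], NULL, writer, (void *)i);
	for (i = 0; i < NR_THREADS; i++)
		pthread_join(tids[i], NULL);

	close(fd);
	return 0;
}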