Date: Thu, 16 Apr 2015 08:57:45 +1000
From: Dave Chinner
To: Anna Schumaker
Cc: "J. Bruce Fields", "linux-nfs@vger.kernel.org", Trond Myklebust,
	Marc Eshel, xfs@oss.sgi.com, Christoph Hellwig,
	linux-nfs-owner@vger.kernel.org
Subject: Re: [PATCH v3 3/3] NFSD: Add support for encoding multiple segments
Message-ID: <20150415225745.GW13731@dastard>
References: <55142FB4.2070408@Netapp.com> <5515A9C8.6090400@Netapp.com>
	<5515C1BF.8000907@Netapp.com> <20150327205414.GD27889@fieldses.org>
	<5515C3BE.3040807@Netapp.com> <20150327210839.GE27889@fieldses.org>
	<552EBCB2.1040609@Netapp.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <552EBCB2.1040609@Netapp.com>

On Wed, Apr 15, 2015 at 03:32:02PM -0400, Anna Schumaker wrote:
> I just ran some more tests comparing the directio case across
> different filesystem types.  These tests used three 1G files:
> 100% data, 100% hole, and a mixed file with alternating 4k data
> and hole segments.  The mixed case seems to be consistently slower
> compared to NFS v4.1, and I'm at a loss for anything I could do to
> make it faster.  Here are my numbers:
>
> ###########
> #         #
> #   XFS   #
> #         #
> ###########
>
>
> NFS v4.1:
>                                  Trial
> |---------|---------|---------|---------|---------|---------|---------|
> |         |    1    |    2    |    3    |    4    |    5    | Average |
> |---------|---------|---------|---------|---------|---------|---------|
> |    Data |  1.883s |  1.808s |  1.781s |  1.685s |  1.591s |  1.746s |
> |    Hole |  1.815s |  1.635s |  1.682s |  1.698s |  1.653s |  1.697s |
> |   Mixed |  2.089s |  2.024s |  1.970s |  1.925s |  2.049s |  2.011s |
> |---------|---------|---------|---------|---------|---------|---------|
>
>
> NFS v4.2:
>                                  Trial
> |---------|---------|---------|---------|---------|---------|---------|
> |         |    1    |    2    |    3    |    4    |    5    | Average |
> |---------|---------|---------|---------|---------|---------|---------|
> |    Data |  1.849s |  1.879s |  1.852s |  1.799s |  1.781s |  1.832s |
> |    Hole |  0.668s |  0.600s |  0.611s |  0.619s |  0.617s |  0.623s |
> |   Mixed |  5.913s |  5.811s |  5.952s |  5.962s |  5.806s |  5.889s |
> |---------|---------|---------|---------|---------|---------|---------|

What that says to me is that READ_PLUS, when there are (worst case)
mixed holes, is either burning a lot more CPU than we expected or it
is serialising somewhere (I'm not sure where - everything in XFS
should be taking shared locks on read/seek).

Can you run a perf profile (even just a snapshot from perf top) on
the server so we can see a bit about what is happening on the CPU
for the different workloads?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
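
[For context: the mixed-file slowdown discussed above is easiest to reason
about in terms of how many segments the server has to discover and encode.
The sketch below is a minimal userspace illustration, not the nfsd code
path; it assumes the server finds segments with a SEEK_DATA/SEEK_HOLE-style
walk, and the segment count quoted in the comment assumes the 1G test file
with alternating 4k data and hole segments described in Anna's mail. The
program name and structure are purely illustrative.]

	/*
	 * Userspace sketch: count the data/hole segments a
	 * SEEK_DATA/SEEK_HOLE walk sees in a file.  On a 1G file of
	 * alternating 4k data and 4k hole segments this visits roughly
	 * 260,000 segments, i.e. two lseek() calls per 4k of file - the
	 * sort of per-segment overhead a READ_PLUS encoder doing a
	 * similar walk would pay on the worst-case mixed layout.
	 * Requires _GNU_SOURCE for SEEK_DATA/SEEK_HOLE.
	 */
	#define _GNU_SOURCE
	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/stat.h>
	#include <unistd.h>

	int main(int argc, char **argv)
	{
		if (argc != 2) {
			fprintf(stderr, "usage: %s <file>\n", argv[0]);
			return 1;
		}

		int fd = open(argv[1], O_RDONLY);
		if (fd < 0) {
			perror("open");
			return 1;
		}

		struct stat st;
		if (fstat(fd, &st) < 0) {
			perror("fstat");
			return 1;
		}

		off_t pos = 0;
		long segments = 0;

		while (pos < st.st_size) {
			/* Find the start of the next data segment. */
			off_t data = lseek(fd, pos, SEEK_DATA);
			if (data < 0)
				break;		/* only a trailing hole left */
			if (data > pos)
				segments++;	/* hole segment [pos, data) */

			/* Find where that data segment ends. */
			off_t hole = lseek(fd, data, SEEK_HOLE);
			if (hole < 0) {
				perror("lseek(SEEK_HOLE)");
				return 1;
			}
			segments++;		/* data segment [data, hole) */
			pos = hole;
		}

		printf("%ld segments in %lld bytes\n",
		       segments, (long long)st.st_size);
		close(fd);
		return 0;
	}

A mixed file like the one described could be built the same way round:
pwrite() every other 4k block of an otherwise empty file and ftruncate()
it to the full 1G size, leaving the untouched blocks as holes.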