Return-Path: linux-nfs-owner@vger.kernel.org
Received: from fieldses.org ([173.255.197.46]:42416 "EHLO fieldses.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752530AbbBKQWp (ORCPT ); Wed, 11 Feb 2015 11:22:45 -0500
Date: Wed, 11 Feb 2015 11:22:44 -0500
From: "J. Bruce Fields"
To: Trond Myklebust
Cc: Anna Schumaker, Christoph Hellwig, Linux NFS Mailing List,
	Thomas D Haynes
Subject: Re: [PATCH v2 2/4] NFSD: Add READ_PLUS support for data segments
Message-ID: <20150211162244.GH25696@fieldses.org>
References: <1422477777-27933-1-git-send-email-Anna.Schumaker@Netapp.com>
	<1422477777-27933-3-git-send-email-Anna.Schumaker@Netapp.com>
	<20150205141325.GC4522@infradead.org>
	<54D394EC.9030902@Netapp.com>
	<20150205162326.GA18977@infradead.org>
	<54D39DC2.9060808@Netapp.com>
	<20150205164832.GB4289@fieldses.org>
	<54DB7D72.5020001@Netapp.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To:
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Wed, Feb 11, 2015 at 11:13:38AM -0500, Trond Myklebust wrote:
> On Wed, Feb 11, 2015 at 11:04 AM, Anna Schumaker wrote:
> > I'm not seeing a huge performance increase with READ_PLUS compared to READ (in fact, it's often a bit slower than READ, even when using splice). My guess is that the problem is mostly on the client's end, since we have to do a memory shift on each segment to get everything lined up properly. I'm playing around with code that cuts down the number of memory shifts, but I still have a few bugs to work out before I'll know whether it actually helps.
> >
>
> I'm wondering if the right way to do READ_PLUS would instead have been
> to add a separate function, READ_SPARSE, that returns a list of all
> the sparse areas in the supplied range. We could even make that a
> READ_SAME, which could do the same for patterned data.

I worry about ending up with incoherent results, but perhaps it's no
different from the current behavior, since we're already piecing
together our idea of the file content from multiple reads sent in
parallel.

> The thing is that READ works just fine for what we want it to do. The
> real win here would be if, given a very large file, we could request a
> list of all the sparse areas in, say, a 100GB range, and then use that
> data to build up a bitmap of unallocated blocks for which we can skip
> the READ requests.

Can we start by having the server return a single data extent covering
the whole read request, with the single exception of the case where the
read falls entirely within a hole? I think that should help in the case
of large holes without interfering with the client's zero-copy logic in
the case where there are no large holes.

--b.
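
For illustration, the policy suggested above could look roughly like the
untested sketch below on the nfsd side. This is not the code from Anna's
patches: the helper name read_plus_whole_range_is_hole is made up, and it
assumes the exported filesystem supports SEEK_DATA through vfs_llseek().
The idea is simply to decide between replying with one hole segment (when
the entire requested range sits in a hole) or one data segment covering
the whole range.

/*
 * Untested sketch, not the actual READ_PLUS encoder: classify a read
 * range as "entirely hole" or "treat as one data segment".  Assumes
 * SEEK_DATA support in the underlying filesystem.
 */
#include <linux/fs.h>
#include <linux/errno.h>

static int read_plus_whole_range_is_hole(struct file *file, loff_t offset,
					 unsigned long count, bool *is_hole)
{
	loff_t data;

	/* Find the first byte of data at or after the read offset. */
	data = vfs_llseek(file, offset, SEEK_DATA);
	if (data == -ENXIO) {
		/* No data at or beyond offset: the whole range is a hole. */
		*is_hole = true;
		return 0;
	}
	if (data < 0)
		return data;	/* e.g. -EINVAL if SEEK_DATA is unsupported */

	/*
	 * If the first data byte lies past the end of the requested range,
	 * the read falls entirely within a hole and a single hole segment
	 * can be returned; otherwise encode the whole range as one data
	 * segment and let the ordinary READ-style (zero-copy) path run.
	 */
	*is_hole = (data >= offset + (loff_t)count);
	return 0;
}

With a policy like that the reply never contains more than one segment,
so the client's existing zero-copy READ handling would still apply
whenever the range contains any data at all.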