Return-Path:
Received: from mx141.netapp.com ([216.240.21.12]:46881 "EHLO
	mx141.netapp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755685AbbCRSQb (ORCPT );
	Wed, 18 Mar 2015 14:16:31 -0400
Message-ID: <5509C0FD.70309@Netapp.com>
Date: Wed, 18 Mar 2015 14:16:29 -0400
From: Anna Schumaker
MIME-Version: 1.0
To: "J. Bruce Fields"
CC:
Subject: Re: [PATCH v3 3/3] NFSD: Add support for encoding multiple segments
References: <1426540688-32095-1-git-send-email-Anna.Schumaker@Netapp.com>
	<1426540688-32095-4-git-send-email-Anna.Schumaker@Netapp.com>
	<20150317195633.GC29843@fieldses.org>
	<20150317200738.GD29843@fieldses.org>
	<20150317213654.GE29843@fieldses.org>
In-Reply-To: <20150317213654.GE29843@fieldses.org>
Content-Type: text/plain; charset="utf-8"
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On 03/17/2015 05:36 PM, J. Bruce Fields wrote:
> On Tue, Mar 17, 2015 at 04:07:38PM -0400, J. Bruce Fields wrote:
>> On Tue, Mar 17, 2015 at 03:56:33PM -0400, J. Bruce Fields wrote:
>>> On Mon, Mar 16, 2015 at 05:18:08PM -0400, Anna Schumaker wrote:
>>>> This patch implements sending an array of segments back to the client.
>>>> Clients should be prepared to handle multiple segment reads to make this
>>>> useful.  We try to splice the first data segment into the XDR result,
>>>> and remaining segments are encoded directly.
>>>
>>> I'm still interested in what would happen if we started with an
>>> implementation like:
>>>
>>> 	- if the entire requested range falls within a hole, return that
>>> 	  single hole.
>>> 	- otherwise, just treat the thing as one big data segment.
>>>
>>> That would provide a benefit in the case there are large-ish holes
>>> with minimal impact otherwise.
>>>
>>> (Though patches for full support are still useful even if only for
>>> client-testing purposes.)
>>
>> Also, looks like
>>
>> 	xfs_io -c "fiemap -v" <file>
>>
>> will give hole sizes for a given <file>.  (Thanks, esandeen.)
>> Running that on a few of my test vm images shows a fair number of large
>> (hundreds of megs) files, which suggests identifying only >=rwsize holes
>> might still be useful.
>
> Just for fun.... I wrote the following test program and ran it on my
> collection of testing vm's.  Some looked like this:
>
> 	f21-1.qcow2
> 	144784 -rw-------. 1 qemu qemu 8591507456 Mar 16 10:13 f21-1.qcow2
> 	total hole bytes:      8443252736 (98%)
> 	in aligned 1MB chunks: 8428453888 (98%)
>
> So, basically, read_plus would save transferring most of the data even
> when only handling 1MB holes.
>
> But some looked like this:
>
> 	501524 -rw-------. 1 qemu qemu 8589934592 May 20  2014 rhel6-1-1.img
> 	total hole bytes:      8077516800 (94%)
> 	in aligned 1MB chunks: 0 (0%)
>
> So the READ_PLUS that caught every hole might save a lot, the one that
> only caught 1MB holes wouldn't help at all.
>
> And there were lots of examples in between those two extremes.

I tested with three different 512 MB files:  100% data, 100% hole, and
alternating every megabyte.  The results were surprising:

      |  v4.1  |  v4.2
-------------------------
data  | 0.685s |  0.714s
hole  | 0.485s | 15.547s
mixed | 1.283s |  0.448s

From what I can tell, the 100% hole case takes so long because of the
SEEK_DATA call in nfsd4_encode_read_plus_hole().  I took this out to
trick the function into thinking that the entire file was already a
hole, and runtime dropped to the levels of v4.1 and v4.2.  I wonder if
this is filesystem dependent?  My server is exporting ext4.

Anna

>
> (But, check my math, I haven't tested this carefully.)
>
> --b.
>
> #define _GNU_SOURCE
> /* headers needed by open(), lseek(), err()/errx(), errno and printf() */
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <errno.h>
> #include <err.h>
>
> long round_up(long n, long b)
> {
> 	return ((n + b - 1)/b) * b;
> }
>
> long round_down(long n, long b)
> {
> 	return (n/b) * b;
> }
>
> long hbytes = 0;	/* total bytes found in holes */
> long rplusbytes = 0;	/* hole bytes covering whole aligned 1MB chunks */
>
> void do_stats(off_t hole_start, off_t hole_end)
> {
> 	off_t hole_start_up, hole_end_down;
>
> 	hole_start_up = round_up(hole_start, 1024*1024);
> 	hole_end_down = round_down(hole_end, 1024*1024);
>
> 	hbytes += hole_end - hole_start;
> 	if (hole_start_up < hole_end_down)
> 		rplusbytes += hole_end_down - hole_start_up;
> }
>
> int main(int argc, char *argv[])
> {
> 	off_t hole_start, hole_end;
> 	int fd;
> 	char *name;
>
> 	/* Map out holes with SEEK_HOLE, SEEK_DATA */
> 	/* Useful statistics:
> 	 *	- what percentage of file is in holes?
> 	 *	- what percentage of file would be skipped if we read it
> 	 *	  sequentially in 1MB chunks?
> 	 */
>
> 	if (argc != 2)
> 		errx(1, "usage: %s <file>\n", argv[0]);
> 	name = argv[1];
> 	fd = open(name, O_RDONLY);
> 	if (fd == -1)
> 		err(1, "open");
>
> 	hole_end = 0;
> 	while (1) {
> 		hole_start = lseek(fd, hole_end, SEEK_HOLE);
> 		if (hole_start == -1)
> 			err(1, "lseek");
> 		hole_end = lseek(fd, hole_start, SEEK_DATA);
> 		if (hole_end == -1) {
> 			/* ENXIO: no data after hole_start, so the
> 			 * last hole runs to end of file */
> 			if (errno == ENXIO)
> 				break;
> 			err(1, "lseek");
> 		}
> 		do_stats(hole_start, hole_end);
> 	}
> 	hole_end = lseek(fd, 0, SEEK_END);
> 	do_stats(hole_start, hole_end);
> 	printf("total hole bytes: %ld (%.0f%%)\n", hbytes,
> 		100 * (float)hbytes/hole_end);
> 	printf("in aligned 1MB chunks: %ld (%.0f%%)\n", rplusbytes,
> 		100 * (float)rplusbytes/hole_end);
> 	return 0;
> }
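
For anyone who wants to repeat the data / hole / mixed comparison, here
is a rough sketch of how three such test files could be generated.  It
is not the tool used for the timings above.  The function name
make_test_file(), the layout enum, and the choice to put data in the
even-numbered megabytes of the alternating file are all assumptions made
for illustration.  Whether the unwritten ranges really end up as holes
depends on the filesystem being exported.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MB (1024L * 1024L)

/* Hypothetical layouts matching the three files described above. */
enum layout { ALL_DATA, ALL_HOLE, ALTERNATING };

/*
 * Create (or truncate) `path` so that it is `nmb` megabytes long.
 * Returns 0 on success and -1 on failure.
 */
int make_test_file(const char *path, long nmb, enum layout layout)
{
	static char buf[1024 * 1024];
	long i;
	int fd;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return -1;

	/* Set the final size first; ranges never written stay sparse. */
	if (ftruncate(fd, (off_t)nmb * MB) == -1)
		goto fail;

	/* Non-zero fill so data chunks are distinguishable from holes. */
	memset(buf, 0xab, sizeof(buf));

	for (i = 0; i < nmb; i++) {
		if (layout == ALL_HOLE)
			break;
		/* Assumption: even megabytes hold data, odd ones are holes. */
		if (layout == ALTERNATING && (i % 2) == 1)
			continue;
		if (pwrite(fd, buf, sizeof(buf), (off_t)i * MB)
				!= (ssize_t)sizeof(buf))
			goto fail;
	}
	return close(fd);
fail:
	close(fd);
	return -1;
}
```

Calling make_test_file("mixed.img", 512, ALTERNATING) and the same with
ALL_DATA and ALL_HOLE should give three 512 MB files; running the quoted
hole-counting program on each is a quick way to check that the layout
came out as intended on a given filesystem.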