Subject: Re: [RFC][PATCH] Vector read/write support for NFS (DIO) client
From: Trond Myklebust
To: Badari Pulavarty
Cc: linux-nfs@vger.kernel.org
Date: Tue, 12 Apr 2011 11:49:29 -0400
Message-ID: <1302623369.4801.28.camel@lade.trondhjem.org>
In-Reply-To: <1302622335.3877.62.camel@badari-desktop>

On Tue, 2011-04-12 at 08:32 -0700, Badari Pulavarty wrote:
> Hi,
>
> We recently ran into a serious performance issue with the NFS client.
> It turned out to be due to the lack of readv/writev support in the
> NFS (O_DIRECT) client.
>
> Here is our use case:
>
> In our cloud environment, our storage is over NFS. Files on NFS are
> passed as block devices to the guest (using O_DIRECT). When the guest
> does I/O on these block devices, the requests end up as O_DIRECT
> writes to NFS (on the KVM host).
>
> QEMU (on the host) gets a vector from the virtio ring and submits it.
> Old versions of QEMU linearized the vector they got from KVM (copied
> it into a single buffer) and submitted that buffer, so the NFS client
> always received a single buffer.
>
> Later versions of QEMU eliminated this copy and submit the vector
> directly using preadv()/pwritev().
>
> The NFS client loops through the vector and submits each element as a
> separate request whenever the I/O is smaller than wsize. In our case
> (negotiated wsize=1MB), a 256K I/O arrives as 64 vectors of 4K each,
> so we end up submitting 64 separate 4K FILE_SYNC I/Os, and the server
> handles each 4K write synchronously. This causes serious performance
> degradation. We are trying to see whether the performance improves if
> we convert the I/Os to ASYNC, but our initial results don't look good.
>
> Supporting readv/writev in the NFS client for all possible cases is
> hard. However, if all vectors are page-aligned and all I/O sizes are
> page multiples, the support fits the current code easily. Luckily,
> the QEMU use case meets these requirements.
>
> Here is the patch to add this support. Comments?

Your approach goes in the direction of further special-casing O_DIRECT
in the NFS client. I'd like to move away from that and towards
integration with the ordinary read/write code paths, so that aside from
adding request coalescing, we can also enable pNFS support.

Cheers,
  Trond