Return-Path:
Received: from mail-qy0-f174.google.com ([209.85.216.174]:38684 "EHLO mail-qy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756966Ab1FJMri (ORCPT ); Fri, 10 Jun 2011 08:47:38 -0400
Received: by qyk7 with SMTP id 7so3048647qyk.19 for ; Fri, 10 Jun 2011 05:47:37 -0700 (PDT)
Message-ID: <4DF21267.9060706@panasas.com>
Date: Fri, 10 Jun 2011 08:47:35 -0400
From: Benny Halevy
To: tao.peng@emc.com
CC: bergwolf@gmail.com, rees@umich.edu, linux-nfs@vger.kernel.org, honey@citi.umich.edu
Subject: Re: [PATCH 87/88] Add configurable prefetch size for layoutget
References: <09142112ff0115f7f22124a69ead7b9bb5e0958f.1307464382.git.rees@umich.edu> <4DEED80A.4000102@panasas.com> <20110608021852.GA20998@merit.edu> <4DF062D6.7010304@panasas.com> <4DF13B65.5030401@panasas.com>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-nfs-owner@vger.kernel.org
List-ID:
MIME-Version: 1.0

On 2011-06-10 02:02, tao.peng@emc.com wrote:
> Hi, Benny,
>
> Cheers,
> -Bergwolf
>
>
> -----Original Message-----
> From: linux-nfs-owner@vger.kernel.org [mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of Benny Halevy
> Sent: Friday, June 10, 2011 5:30 AM
> To: Peng Tao
> Cc: Jim Rees; linux-nfs@vger.kernel.org; peter honeyman
> Subject: Re: [PATCH 87/88] Add configurable prefetch size for layoutget
>
> On 2011-06-09 07:54, Peng Tao wrote:
>> On Thu, Jun 9, 2011 at 2:06 PM, Benny Halevy wrote:
>>> On 2011-06-08 03:15, Peng Tao wrote:
>>>> On 6/8/11, Jim Rees wrote:
>>>>> Benny Halevy wrote:
>>>>>
>>>>> NAK.
>>>>> This affects all layout types. In particular it is undesired
>>>>> for write layouts that extend the file with the objects layout.
>>>>> The server can extend the layout segments range
>>>>> over what the client requested so why would the client
>>>>> ask for artificially large layouts?
>>>>>
>>>>> This has actually been the subject of some debate over Thursday night
>>>>> beers.
>>>>> The problem we're trying to solve is that the client is spending 98%
>>>>> of its time in layoutget. This patch gives us something like a 10x
>>>>> speedup. But many of us think it's not the right fix. I suggest we discuss
>>>>> next week.
>>>>>
>>>
>>> Sure.
>>>
>>>>> But note that this patch doesn't change anything unless you set the sysctl.
>>>> there is a default value of 2M. maybe we can set it to page size by
>>>> default so other layout are not affected and block layout can let
>>>> users set it by hand if they care about performance. does this make
>>>> sense?
>>>
>>> If doing it at all why use a sysctl rather than a mount option?
>> The purpose of using a sysctl is to give client the ability to change
>> it on the fly. In theory, layout prefetching can benefit all layout
>> types. So the patch tries to solve it in the pnfs generic layer.
>>
>
> But the need for this varies per-server and many times per application.
> Think sequential vs. random I/O. Therefore a mount option would help
> tuning the behavior on a per-use basis. Global behavior must be implemented
> using a dynamic algorithm that would take both the workload and the server
> observed behavior into account.
> [PT] Indeed. Dynamic algorithm is supposed to be able to solve all this.
> And it often takes longer to be designed/accepted. It has to prove to be
> better in most scenarios and does not hurt the left. We need to find an
> acceptable solution to push this driver upstream.

I understand that developing a dynamic algorithm in the given time frame
is too big of a challenge, but hacking in yet another client tunable is
out of the question too.

For testing at the Bakeathon I'd consider taking a DEVONLY version of this
patch that is enabled using a config option and defaults to zero, so that it
has no effect at run time until the sysctl sets it differently.
But keep in mind this is not suitable for pushing upstream.

Benny