Date: Fri, 13 Aug 2010 08:36:35 -0400
From: "J. Bruce Fields"
To: "Patrick J. LoPresti"
Cc: linux-nfs@vger.kernel.org, Trond Myklebust, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] nfs: lookupcache coherence bugs in WCC update path (revised)
Message-ID: <20100813123635.GA8945@fieldses.org>
References: <87lj8ckb1e.fsf@patl.com> <8762zgmmer.fsf@patl.com> <1281628195.2873.12.camel@heimdal.trondhjem.org> <1281634004.14329.14.camel@heimdal.trondhjem.org>

On Thu, Aug 12, 2010 at 10:16:57PM -0700, Patrick J. LoPresti wrote:
> OK, I found the problem.  Whether it is a bug depends on your point of
> view, I suppose.
>
> Although I am using XFS on my file server, and XFS has
> nanosecond-granularity timestamps, the true granularity of ctime/mtime
> is ultimately determined by the resolution of current_fs_time() which
> calls current_kernel_time(); i.e. jiffies; i.e. 1/HZ.
>
> On my system (SLES 11 SP1), HZ is 250.  In my failing application, 4
> ms is long enough for many filesystem operations, even over NFS.  (My
> network is 10GigE with a 300 microsecond round trip time, and my
> systems are very new.)
>
> Anyway, I instrumented the VFS code on the NFS server to catch it in
> the act; specifically, I saw the following sequence:
>
>   file A is created on server, updating directory mtime
>   NFS client does LOOKUP on file B, gets nfserr_noent
>   file B created on server, does not update directory's mtime
>
> ...all within 4 milliseconds (which is why the creation of file B did
> not update the directory's mtime).
>
> The result is that the lookup cache on the client is stale and stays
> stale until some other client (or the server) updates the directory.
> Even making changes from the client does not invalidate the cache,
> thanks to the clever WCC logic that Trond had to explain to me
> earlier.
>
> This is not exactly an NFS specific question, but I will ask anyway...
> If I were to propose modifying current_fs_time() to call
> getnstimeofday() instead of current_kernel_time(), would the VFS folks
> laugh me out the door?

Good question....  Cc'ing linux-fsdevel.

If you have an easy reproducer it might also be worth experimenting with
NFSv4 exports of an ext4 system mounted with the i_version option.

Actually: should the NFSv4 server always be using i_version as the
change attribute for directories?  (Does every exportable filesystem
update it on every directory modification?)

(And if so, maybe on directories we should factor the i_version into the
low bits of the mtime reported to NFSv3 client?)

--b.

>
> 1/HZ granularity for file timestamps just seems so... 90s or
> something.  4ms really is a lot of time these days.
>
>  - Pat
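
For anyone following along, the sequence Pat describes can be sketched as a toy model. This is only an illustration under loud assumptions: Directory, Client, coarse() and run() are made-up names, the client is reduced to "drop negative lookups whenever the directory's validator changes", and nothing here is the actual NFS client or VFS code. The validator is either the tick-quantized mtime (the 1/HZ case, HZ=250) or a counter bumped on every change (roughly the i_version idea).

```python
from dataclasses import dataclass, field

TICK_NS = 4_000_000  # 1/HZ with HZ=250, i.e. 4 ms (value taken from the thread)


def coarse(now_ns, tick_ns=TICK_NS):
    """Round a time down to the last tick, like a jiffies-based clock."""
    return now_ns - now_ns % tick_ns


@dataclass
class Directory:
    """Hypothetical server-side directory: mtime is quantized to tick_ns."""
    tick_ns: int = TICK_NS
    mtime_ns: int = 0
    version: int = 0  # stand-in for a per-change counter such as i_version
    entries: set = field(default_factory=set)

    def create(self, name, now_ns):
        self.entries.add(name)
        self.mtime_ns = coarse(now_ns, self.tick_ns)
        self.version += 1


class Client:
    """Toy client: trusts cached negative lookups while the validator is unchanged."""

    def __init__(self, use_version=False):
        self.use_version = use_version
        self.cached_validator = None
        self.negative = set()

    def lookup(self, directory, name):
        validator = directory.version if self.use_version else directory.mtime_ns
        if validator != self.cached_validator:
            # Directory looks changed: throw away cached "no such file" answers.
            self.negative.clear()
            self.cached_validator = validator
        if name in self.negative:
            return False  # served from the (possibly stale) negative cache
        found = name in directory.entries
        if not found:
            self.negative.add(name)
        return found


def run(tick_ns, use_version=False):
    """Replay: create A, LOOKUP B (miss), create B 300 us later, LOOKUP B again."""
    d = Directory(tick_ns=tick_ns)
    c = Client(use_version=use_version)
    t = 1_000_000_000
    d.create("A", t)
    first = c.lookup(d, "B")
    d.create("B", t + 300_000)  # lands in the same 4 ms tick as the creation of A
    second = c.lookup(d, "B")
    return first, second


if __name__ == "__main__":
    print("4 ms mtime:     ", run(TICK_NS))
    print("1 ns mtime:     ", run(1))
    print("change counter: ", run(TICK_NS, use_version=True))
```

With the 4 ms tick the second LOOKUP still reports B as missing, because the directory mtime did not move; with a 1 ns tick or the change counter the stale negative entry is dropped. The model ignores attribute cache timeouts and the WCC handling mentioned above.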