From: Jeremy Fitzhardinge
Subject: Re: [Ext2-devel] Re: [NFS] htree+NFS (NFS client bug?)
Date: 28 Nov 2002 12:00:27 -0800
Sender: linux-kernel-owner@vger.kernel.org
Message-ID: <1038513627.1464.44.camel@ixodes.goop.org>
References: <1038354285.1302.144.camel@sherkaner.pao.digeo.com>
    <1038387522.31021.188.camel@ixodes.goop.org>
    <20021127150053.A2948@redhat.com>
    <15845.10815.450247.316196@charged.uio.no>
    <20021127205554.J2948@redhat.com>
    <20021128164439.E2362@redhat.com>
    <20021128171324.G2362@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain
Cc: Trond Myklebust, Ext2 devel, NFS maillist, Linux Kernel List
To: "Stephen C. Tweedie"
In-Reply-To: <20021128171324.G2362@redhat.com>

On Thu, 2002-11-28 at 09:13, Stephen C. Tweedie wrote:
> In fact, it's not clear what we _can_ return as f_pos after the last
> dirent.
>
> We're only using 31-bit hashes right now. Trond, how will other NFS
> clients react if we return an NFS cookie 32-bits wide? We could
> easily use something like 0x80000000 as an f_pos to represent EOF in
> the Linux side of things, but will that cookie work if passed over the
> wire on NFSv2?
>
> The alternative is to hack in a special case so that (for example) we
> consider a major htree hash of 0x7fffffff to map to an f_pos of
> 0x7ffffffe and just consider that a possible collision, so that
> 0x7fffffff is a unique EOF for the htree tree walker.

Even if you fix this, there's another problem. It seems that htree
basically can't work with NFS in its current state - it only works at
all on small directories, which aren't hashed and therefore use the
non-htree cookie scheme. This can be fixed by creating a distinct EOF
cookie.

However, in the transformation from a non-hashed to a hashed directory
the cookie scheme completely changes, which in effect invalidates all
cookies currently known by clients. The obvious problem is that
sometimes adding a single entry to a directory will kill all concurrent
readdirs.
I know that changing a directory while scanning it has at least some
undefined effects (allowed to miss entries, but not allowed to
duplicate, if I remember correctly), but if you add a single entry to a
directory, is it allowed to completely break any pending readdir
operation?

One solution I can think of is to always use name hashes as directory
cookies, even for non-hashed directories. This means that scans of a
small directory will require linear searching to find the entry
corresponding to a particular cookie, but since the directory is small
by definition it shouldn't be a bad performance hit.

    J
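
To make the hash-as-cookie idea concrete, here is a rough user-space
sketch in C. It is only an illustration under stated assumptions, not
ext3 code: name_cookie(), dir_next(), dir_walk() and COOKIE_EOF are
made-up names, and FNV-1a stands in for the real htree hash. It folds
0x7fffffff onto 0x7ffffffe as Stephen suggests, so that 0x7fffffff is
free to mean EOF, and it resumes a scan of an unsorted (non-hashed)
directory by linear search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 0x7fffffff is reserved as EOF and never names an entry. */
#define COOKIE_EOF 0x7fffffffu

/* Stand-in hash (FNV-1a), NOT the real htree hash; folded to 31 bits. */
uint32_t name_cookie(const char *name)
{
    uint32_t h = 2166136261u;

    for (; *name; name++) {
        h ^= (unsigned char)*name;
        h *= 16777619u;
    }
    h &= 0x7fffffffu;
    /* Map 0x7fffffff onto 0x7ffffffe and accept that as a collision. */
    return h == COOKIE_EOF ? COOKIE_EOF - 1 : h;
}

/*
 * Linear scan of an unsorted directory: return the index of the entry
 * with the smallest cookie >= pos, or -1 if there is none (EOF).
 */
int dir_next(const char *const *names, size_t n, uint32_t pos)
{
    int best = -1;
    uint32_t best_cookie = COOKIE_EOF;

    for (size_t i = 0; i < n; i++) {
        uint32_t c = name_cookie(names[i]);

        if (c >= pos && c < best_cookie) {
            best = (int)i;
            best_cookie = c;
        }
    }
    return best;
}

/* Walk the directory in cookie order; return how many entries were seen. */
size_t dir_walk(const char *const *names, size_t n)
{
    size_t seen = 0;
    uint32_t pos = 0;
    int i;

    while ((i = dir_next(names, n, pos)) >= 0) {
        uint32_t c = name_cookie(names[i]);

        assert(c != COOKIE_EOF);
        pos = c + 1;    /* may reach COOKIE_EOF, which ends the walk */
        seen++;
    }
    return seen;
}
```

Because the cookie depends only on the name, a cookie handed out while
the directory was small would stay meaningful after the directory is
converted to a hashed one. Names whose cookies collide need extra
handling that this sketch leaves out: as written, the second of two
colliding names would be skipped.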