From: Trond Myklebust Subject: Re: [NFS] [PATCH] locks: provide a file lease method enabling cluster-coherent leases Date: Tue, 05 Jun 2007 18:53:07 -0400 Message-ID: <1181083987.6108.6.camel@heimdal.trondhjem.org> References: <1180808482.6687.30.camel@heimdal.trondhjem.org> <20070604135920.GA2233@fieldses.org> Mime-Version: 1.0 Content-Type: text/plain Cc: "J. Bruce Fields" , Marc Eshel , linux-fsdevel@vger.kernel.org, nfs@lists.sourceforge.net, David Teigland To: Robert Rappaport Return-path: In-Reply-To: Sender: linux-fsdevel-owner@vger.kernel.org List-ID: Why isn't the existing setlease() export sufficient? The plan is to more or less get rid of __setlease(). Trond On Tue, 2007-06-05 at 18:04 -0400, Robert Rappaport wrote: > I have had some previous communications with Bruce on these topics, > and I am generally pleased with the proposed modifications that are > presented here. > > I am working on a clustered file system and there is one small > additional modification that would be of great use to me. That would > be to export the __setlease symbol. > > In my implementation, my file system specific set_lease() function > first determines whether the granting of the requested lease on the > given inode is compatible with the cluster state of this inode, and > then if it is, I simply invoke the __setlease() routine and have the > kernel build the associated infrastructure. > > Having this symbol globally available greatly simplifies things. > > On 6/4/07, J. Bruce Fields wrote: > On Sat, Jun 02, 2007 at 02:21:22PM -0400, Trond Myklebust > wrote: > > Currently, the lease handling is done all in the VFS, and is > done prior > > to calling any filesystem operations. Bruce's break_lease() > inode > > operation allows the VFS to notify the filesystem that some > operation is > > going to be called that requires the lease to be broken. > > > > My point is that in doing so, you are not atomic with the > operation that > > requires the lease to be broken. Some different node may > re-establish a > > lease while we're calling back down into the filesystem to > perform the > > operation. > > So I agree with you. The break_lease() inode operation isn't > going to > > work. The filesystem is going to have to figure out for > itself when it > > needs to notify other nodes that the lease needs breaking, > and it needs > > to figure out its own methods for ensuring atomicity. > > OK, I agree with you both, thanks for the explanations. > > It looks to me like there's probably a race in the existing > code that > will allow conflicting opens and leases to be granted > simultaneously if > the lease request is handled just after may_open() is > called. These > checks at the beginning of __setlease() are an attempt to > prevent that > race: > > if ((arg == F_RDLCK) && > (atomic_read(&inode->i_writecount) > 0)) > goto out; > if ((arg == F_WRLCK) > && ((atomic_read(&dentry->d_count) > 1) > || (atomic_read(&inode->i_count) > 1))) > goto out; > > But, for example, in the case of a simultaneous write open and > RDLCK > lease request, I believe the call to setlease could come after > the > may_open() but before the call to get_write_access() that > bumps > i_writecount. > > --b. > - > To unsubscribe from this list: send the line "unsubscribe > linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info > at http://vger.kernel.org/majordomo-info.html >