On 17.11.2007, at 21.46, Trond Myklebust wrote:
> On Sat, 2007-11-17 at 02:11 +0200, Timo Sirainen wrote:
>> Solaris and BSDs support safely flushing the attribute cache using
>> fchown(fd, (uid_t)-1, (gid_t)-1). Could Linux be changed to support
>> this as well? If I'm reading the sources right, this might work
>> (completely untested):
>>
>> --- inode.c.old	2007-11-16 22:18:46.000000000 +0200
>> +++ inode.c	2007-11-16 22:19:44.000000000 +0200
>> @@ -322,6 +322,7 @@
>>  nfs_setattr(struct dentry *dentry, struct iattr *attr)
>>  {
>>  	struct inode *inode = dentry->d_inode;
>> +	struct nfs_inode *nfsi = NFS_I(inode);
>>  	struct nfs_fattr fattr;
>>  	int error;
>> 
>> @@ -334,8 +335,10 @@
>> 
>>  	/* Optimization: if the end result is no change, don't RPC */
>>  	attr->ia_valid &= NFS_VALID_ATTRS;
>> -	if (attr->ia_valid == 0)
>> +	if (attr->ia_valid == 0) {
>> +		nfsi->cache_validity |= NFS_INO_INVALID_ATTR;
>>  		return 0;
>> +	}
>> 
>>  	lock_kernel();
>>  	nfs_begin_data_update(inode);
>
> Why is this needed?

Do you mean why is flushing the attribute cache needed, or why is this
particular way of flushing it needed?

I need to be able to find out if a file has changed, so I need to get
its attribute cache flushed. fchown()ing to -1, -1 would work safely in
all situations because it's guaranteed not to change the file in any
way.
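
To make the intent concrete, the userspace side would look roughly like
this (a sketch; the helper name is mine and error handling is omitted):

#include <sys/stat.h>
#include <unistd.h>

/* Force the NFS client to revalidate a file's attributes with a no-op
 * setattr, then read the now-fresh attributes. */
static int fstat_fresh(int fd, struct stat *st)
{
	/* (uid_t)-1 / (gid_t)-1 mean "leave owner/group unchanged",
	 * so this call is guaranteed not to modify the file. */
	if (fchown(fd, (uid_t)-1, (gid_t)-1) < 0)
		return -1;
	/* With the attribute cache flushed, fstat() reflects the
	 * attributes currently on the server. */
	return fstat(fd, st);
}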
>> Another problem I have is that it's difficult to get a file's data
>> cache flushed. The only way I found was to successfully fcntl() lock
>> the file. This is pretty bad from a performance point of view, since
>> often I don't want/need to lock the file.
>>
>> Solaris and BSDs also invalidate a file's data cache when its
>> attribute cache is invalidated. It would be nicer if there were a
>> separate way, but I'd settle for fchown(fd, (uid_t)-1, (gid_t)-1)
>> invalidating the data cache as well.
>>
>> Actually I did also look at posix_fadvise(fd, 0, 0,
>> POSIX_FADV_DONTNEED). It appears to work, but I'm a bit worried
>> about race conditions that cause pages to randomly not get dropped,
>> because invalidate_mapping_pages() doesn't drop locked pages.
>
> Again, why do you need this level of data cache invalidation? If you
> don't want cached i/o, then use O_DIRECT.

Because O_DIRECT is optional and I can't trust that support for it has
been compiled into the kernel. It defaults to off, and even its
description makes it sound like it's a bad idea to enable normally.

Also, O_DIRECT is a bit too much for my use case. I do want the file to
be cached for the most part, but occasionally parts of it can be
overwritten, and I need to make sure that in those situations the
newest data is read.
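
For completeness, here is roughly what the two data-cache workarounds I
mentioned look like (a sketch; function names are illustrative and
error handling is omitted):

#define _XOPEN_SOURCE 600	/* for posix_fadvise() */
#include <fcntl.h>
#include <unistd.h>

/* Workaround 1: take and release a whole-file fcntl() lock.
 * Successfully acquiring a POSIX lock makes the Linux NFS client
 * invalidate the file's cached data, forcing a re-read. */
static int flush_cache_via_lock(int fd)
{
	struct flock fl = {
		.l_type   = F_RDLCK,
		.l_whence = SEEK_SET,
		.l_start  = 0,
		.l_len    = 0,	/* 0 = lock the whole file */
	};

	if (fcntl(fd, F_SETLKW, &fl) < 0)
		return -1;
	fl.l_type = F_UNLCK;
	return fcntl(fd, F_SETLK, &fl);
}

/* Workaround 2: ask the kernel to drop the cached pages. This is the
 * racy variant: invalidate_mapping_pages() skips locked pages, so
 * some pages may randomly survive. */
static int flush_cache_via_fadvise(int fd)
{
	return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}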

If you want a wider description of what I'm trying to do: I'm
developing the Dovecot IMAP server. A lot of people store mails on NFS
and want multiple IMAP servers to be able to access the mails. Dovecot
uses somewhat complex index files to speed up accessing the mailboxes,
and it's mainly for these index files that I need this explicit control
over caching. If two servers access the same mailbox at the same time,
the index files easily get corrupted if I can't control the caching.
On Sat, 2007-11-17 at 22:12 +0200, Timo Sirainen wrote:
> On 17.11.2007, at 21.46, Trond Myklebust wrote:
> > Why is this needed?
>
> Do you mean why is flushing the attribute cache needed, or why is
> this particular way of flushing it needed?
>
> I need to be able to find out if a file has changed, so I need to get
> its attribute cache flushed. fchown()ing to -1, -1 would work safely
> in all situations because it's guaranteed not to change the file in
> any way.

Why can't you simply close(), and then re-open() the file? That is _the_
standard way to force an attribute cache revalidation on all NFS
versions. The close-to-open caching model, which is implemented on most
NFS clients, guarantees this.
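
For instance (illustrative only; error handling omitted):

#include <fcntl.h>
#include <unistd.h>

/* Under close-to-open semantics, open() revalidates the cached
 * attributes and data against the server. */
static int reopen_fresh(const char *path, int fd)
{
	close(fd);
	return open(path, O_RDWR);
}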
> Also, O_DIRECT is a bit too much for my use case. I do want the file
> to be cached for the most part, but occasionally parts of it can be
> overwritten, and I need to make sure that in those situations the
> newest data is read.
>
> If you want a wider description of what I'm trying to do: I'm
> developing the Dovecot IMAP server. A lot of people store mails on
> NFS and want multiple IMAP servers to be able to access the mails.
> Dovecot uses somewhat complex index files to speed up accessing the
> mailboxes, and it's mainly for these index files that I need this
> explicit control over caching. If two servers access the same mailbox
> at the same time, the index files easily get corrupted if I can't
> control the caching.
So how are you ensuring that both servers don't try writing to the same
locations? You must have some form of synchronisation scheme for this to
work.
Trond