2004-03-22 01:35:10

by Rüdiger Klaehn

Subject: File change notification (enhanced dnotify)

Hi everybody,

I am working on a mechanism to let programs watch for file system
changes in large directory trees or on the whole system. Since my last
post in January I have been trying various approaches.

The current dnotify mechanism is very limited: it does not work for
whole directory trees, and it reports little useful information. For
example, to watch for changes in the /home tree you would have to open
every single directory in the tree, which would probably not even work
since it would require more than the maximum number of file handles. If
you have a directory with many files in it, the only thing dnotify tells
you is that something has changed in the directory, so you have to
rescan the whole directory to find out which file has changed. Kind of
defeats the purpose of change notification...
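To make the rescan cost concrete, here is a minimal userspace sketch of what a dnotify client has to do after every signal: snapshot the directory, then diff against the previous snapshot. It does not use dnotify itself; the helper names (snapshot, diff, demo) and the temporary directory are assumptions made up for this illustration.

```python
import os
import tempfile


def snapshot(directory):
    """Map each entry name in `directory` to its (mtime_ns, size)."""
    result = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            info = entry.stat(follow_symlinks=False)
            result[entry.name] = (info.st_mtime_ns, info.st_size)
    return result


def diff(before, after):
    """Return (created, deleted, modified) name sets between two snapshots."""
    created = set(after) - set(before)
    deleted = set(before) - set(after)
    modified = {name for name in set(before) & set(after)
                if before[name] != after[name]}
    return created, deleted, modified


def demo():
    """Change a scratch directory, then rescan it to find out what happened."""
    with tempfile.TemporaryDirectory() as watched:
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(watched, name), "w") as handle:
                handle.write("old")
        before = snapshot(watched)

        # Pretend a dnotify signal arrived here: something changed, but what?
        with open(os.path.join(watched, "a.txt"), "w") as handle:
            handle.write("new contents")
        with open(os.path.join(watched, "c.txt"), "w") as handle:
            handle.write("created")
        os.remove(os.path.join(watched, "b.txt"))

        # The only way to find out is to stat every entry again.
        return diff(before, snapshot(watched))
```

Every notification costs one stat per directory entry, no matter how small the change was, which is the overhead the extended information is meant to avoid.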

My current approach is compatible with the existing dnotify mechanism,
but extends it to work recursively and makes it possible to find out
what exactly has happened in the directory. It works on the dcache
level, so unlike my first approach it does not require unique inode
numbers to uniquely identify a file.

It works like this: When a program wants to watch for changes for a
directory or file, it does an fcntl call (F_NOTIFY), as in the original
dnotify mechanism. But there are some additional flags for that call:

DN_RECURSIVE means that all subdirectories of the watched directory will
be watched. A limitation is that this does not work over mount
boundaries, so if you want to watch for changes on the whole system you
will have to watch each mount point.

DN_EXTENDED means that extended information for the type of change is
gathered. The information you get depends on the kind of event that
happened. For example, for a read access you get information about the
file that was accessed and the offset and size of the region that was
read.

As in the original dnotify mechanism, whenever one of the watched files
changes the userspace program gets a signal. The program can then do
another ioctl to find out what exactly has happened. The information
passed to the userspace program is in a very compact form, but the
program can reconstruct the path of the file and other interesting
information. See the userspace program for how this works.

Programs that could benefit very much from this mechanism would be the
fam daemon, KDE/GNOME, various security tools etc. I am using KDE, and
it is using the original dnotify mechanism quite extensively. When I
start KDE it calls the original dnotify fcntl for *256* different
directories! With the new extension it would be enough to watch three or
four directories recursively.

A few remarks about the code:

This is experimental code. There might be some nasty deadlocks or race
conditions.

Just like the original dnotify mechanism, this does not work with hard
links.

For development, I divided the mechanism into a stub that has to be
compiled into the kernel and a module that contains the bulk of the
mechanism. That way I can try new things without rebooting every five
minutes. This separation will no longer be necessary when the mechanism
is stable.

If somebody is interested, I will explain how exactly it works. The most
important code is in the module, especially postevent and consumeevent.
postevent traverses the dentry tree upwards until it finds everybody who
is interested in an event. consumeevent adds the event to the buffers of
all interested parties.
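The upward walk can be modelled in a few lines of userspace code. This is only a toy sketch of the idea, not the module's code: the Dentry and Watcher classes, the function signatures and the rule that a non-recursive watch covers the watched entry and its direct children are all assumptions made for the illustration.

```python
class Dentry:
    """Toy stand-in for a dcache entry: a name, a parent, and watchers."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.watchers = []  # list of (Watcher, recursive) pairs

    def path(self):
        """Rebuild the path by following parent pointers up to the root."""
        parts = []
        node = self
        while node.parent is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))


class Watcher:
    """An interested party with its own event buffer."""

    def __init__(self, label):
        self.label = label
        self.events = []


def watch(dentry, watcher, recursive=False):
    dentry.watchers.append((watcher, recursive))


def consume_event(watcher, event):
    """Append the event to the buffer of one interested party."""
    watcher.events.append(event)


def post_event(dentry, kind):
    """Walk from the changed dentry up to the root and deliver the event."""
    event = (kind, dentry.path())
    node = dentry
    depth = 0  # 0 = the dentry itself, 1 = its directory, and so on
    while node is not None:
        for watcher, recursive in node.watchers:
            # Assumed rule: a plain watch sees the entry and its direct
            # children; a recursive watch sees everything below it.
            if recursive or depth <= 1:
                consume_event(watcher, event)
        node = node.parent
        depth += 1


def demo():
    root = Dentry("")
    home = Dentry("home", root)
    rudi = Dentry("rudi", home)
    notes = Dentry("notes.txt", rudi)

    tree_watcher = Watcher("recursive watch on /home")
    flat_watcher = Watcher("plain watch on /home")
    watch(home, tree_watcher, recursive=True)
    watch(home, flat_watcher)

    post_event(notes, "modify")  # two levels below /home
    post_event(rudi, "create")   # direct child of /home
    return tree_watcher.events, flat_watcher.events
```

In this model the recursive watcher sees both events, while the plain watcher only sees the one for the direct child of /home. The cost of posting an event is proportional to the depth of the changed entry, not to the size of the watched tree.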

The module can be compiled in a separate directory. It has a tiny make
script called "make".

The userspace program can be compiled using
"g++ -o dnotify dnotify.cpp"

patch against 2.6.3 for the stub:
<http://www.lambda-computing.com/~rudi/lkml/dnotify-0.1.patch>

source for the module:
<http://www.lambda-computing.com/~rudi/lkml/dnotify-module-0.1.tgz>

userspace program (c++) for testing:
<http://www.lambda-computing.com/~rudi/lkml/dnotify-user-0.1.tgz>

I would like to have feedback on the general approach before I spend
more time refining this.

best regards,

Rüdiger Klaehn


2004-03-22 02:18:09

by Horst H. von Brand

Subject: Re: File change notification (enhanced dnotify)

Rüdiger Klaehn <[email protected]> said:
> I am working on a mechanism to let programs watch for file system
> changes in large directory trees or on the whole system. Since my last
> post in January I have been trying various approaches.

How do you propose to handle the fact that there are changes to _files_,
which happen to be pointed to by entries in directories? There is no
"change in the directory tree" in Unix...
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2004-03-22 15:00:27

by Horst H. von Brand

Subject: Re: File change notification (enhanced dnotify)

Rüdiger Klaehn <[email protected]> said:
> Horst von Brand wrote:
> > Rüdiger Klaehn <[email protected]> said:
> >>I am working on a mechanism to let programs watch for file system
> >>changes in large directory trees or on the whole system. Since my last
> >>post in January I have been trying various approaches.

> > How do you propose to handle the fact that there are changes to _files_,
> > which happen to be pointed to by entries in directories? There is no
> > "change in the directory tree" in Unix...

> Of course it is files that change. But as you say each file is pointed
> to by one or more dentry, so I use the dentry hierarchy to propagate the
> information about the change. Just like the old dnotify.

dentries just keep the path travelled by hard links to get to the file in
memory for fast future access. So if you have, say:

dir1      dir2
 |         |
 .         .
 .         .
 .         .
  \       /
   somefile

and you referenced somefile by the path through dir1, if you monitor dir2
you won't notice the change. There is no on-disk data to trace back through
all the directories that reference the file, and reading all of the
filesystem's metadata to find this out is ludicrous (ever seen fsck(8)
taking an hour or so to make much the same?).
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2004-03-22 16:00:31

by Rüdiger Klaehn

Subject: Re: File change notification (enhanced dnotify)

Horst von Brand wrote:
> Rüdiger Klaehn <[email protected]> said:
>
>>Horst von Brand wrote:
>>
>>>Rüdiger Klaehn <[email protected]> said:
>>>
>>>>I am working on a mechanism to let programs watch for file system
>>>>changes in large directory trees or on the whole system. Since my last
>>>>post in January I have been trying various approaches.
>
>
>>>How do you propose to handle the fact that there are changes to _files_,
>>>which happen to be pointed to by entries in directories? There is no
>>>"change in the directory tree" in Unix...
>
>
>>Of course it is files that change. But as you say each file is pointed
>>to by one or more dentry, so I use the dentry hierarchy to propagate the
>>information about the change. Just like the old dnotify.
>
>
> dentries just keep the path travelled by hard links to get to the file in
> memory for fast future access. So if you have, say:
>
> dir1      dir2
>  |         |
>  .         .
>  .         .
>  .         .
>   \       /
>    somefile
>
> and you referenced somefile by the path through dir1, if you monitor dir2
> you won't notice the change. There is no on-disk data to trace back through
> all the directories that reference the file, and reading all of the
> filesystem's metadata to find this out is ludicrous (ever seen fsck(8)
> taking an hour or so to make much the same?).

I am aware of that. As I mentioned, this approach does not work with
hard links, just like the original dnotify.

From the current "Documentation/dnotify.txt":
"In order to make the impact on the file system code as small as
possible, the problem of hard links to files has been ignored."

There would be some ways to solve this for hard links. But I don't think
that it would be worth it since it would involve a big performance
overhead for little gain.

Note that if you watch for changes in the root of a file system, you
will get notified exactly once for each file change in the file system
regardless of hard links.

In your example if you have one file which can be accessed via two
different paths "/dir1/somefile" and "/dir2/somefile" and you watch "/"
you would get notified for "/dir1/somefile" or "/dir2/somefile"
depending on how the changing program accesses the file.

Figuring out that "/dir1/somefile" and "/dir2/somefile" refer to the
same file should IMHO be done in userspace. If inode numbers were unique
and persistent on all file systems it might be possible to do this
efficiently in kernel space, but unfortunately this is not the case.
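As a rough illustration of the userspace side, the sketch below groups paths by their (device, inode) pair as reported by stat. It assumes a file system where inode numbers are unique and stable, which is exactly the assumption that does not hold everywhere; the helper names and the scratch directory layout are made up for the example.

```python
import os
import tempfile
from collections import defaultdict


def group_by_identity(paths):
    """Group paths that share a (st_dev, st_ino) pair, i.e. hard links."""
    groups = defaultdict(list)
    for path in paths:
        info = os.lstat(path)
        groups[(info.st_dev, info.st_ino)].append(path)
    # Only groups with more than one path are hard links to the same file.
    return [sorted(group) for group in groups.values() if len(group) > 1]


def demo():
    """Recreate the dir1/dir2 example in a scratch directory."""
    with tempfile.TemporaryDirectory() as root:
        dir1 = os.path.join(root, "dir1")
        dir2 = os.path.join(root, "dir2")
        os.mkdir(dir1)
        os.mkdir(dir2)

        first = os.path.join(dir1, "somefile")
        second = os.path.join(dir2, "somefile")
        other = os.path.join(dir2, "otherfile")
        with open(first, "w") as handle:
            handle.write("shared contents")
        os.link(first, second)  # second name for the same inode
        with open(other, "w") as handle:
            handle.write("unrelated")

        groups = group_by_identity([first, second, other])
        return [[os.path.relpath(path, root) for path in group]
                for group in groups]
```

A watcher on the root of the file system could apply the same check to the path in each event to decide whether it already knows the file under another name.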

My original approach assumed that inode numbers were unique, and it
would have worked with hard links. But I think it is much more important
to have a mechanism that works for all file systems than to solve the
problem of hard links.

best regards,

Rüdiger

By the way: I just made a small website for my enhanced dnotify
mechanism. I will post my latest code there. It can be found at
<http://www.lambda-computing.com/~rudi/dnotify/>

2004-03-22 16:42:41

by Mike Waychison

Subject: Re: File change notification (enhanced dnotify)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rüdiger Klaehn wrote:

|
| My original approach assumed that inode numbers were unique, and it
| would have worked with hard links. But I think it is much more important
| to have a mechanism that works for all file systems than to solve the
| problem of hard links.
|

Inode numbers are guaranteed to be unique on a given filesystem other
than for hard links. Where is this assumption broken otherwise?

Mike Waychison
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFAXxdJdQs4kOxk3/MRAjdqAKCIJ20IxRgq0PmBcV7IIKITI9FhRQCggZDm
IvcRYGtqB5ss+jhoLNIj2So=
=J6qR
-----END PGP SIGNATURE-----

2004-03-22 16:53:54

by Rüdiger Klaehn

Subject: Re: File change notification (enhanced dnotify)

Mike Waychison wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Rüdiger Klaehn wrote:
>
> |
> | My original approach assumed that inode numbers were unique, and it
> | would have worked with hard links. But I think it is much more important
> | to have a mechanism that works for all file systems than to solve the
> | problem of hard links.
> |
>
> Inode numbers are guaranteed to be unique on a given filesystem other
> than for hard links. Where is this assumption broken otherwise?
>
To quote Jan Harkes, who replied to my original proposal:

"Inode numbers are not necessarily unique per filesystem. Any filesystem
that uses iget4 can have several objects that have the same inode
number. For instance, Coda uses 128-bit file-identifiers and the i_ino
number is a simple hash that is 'typically' unique. There are also
filesystems that invent inode numbers whenever inodes are brought into
the cache, but which have no persistency when the inode_cache is pruned.
So the next time you see the same object, it could have a different
(unique) inode number."
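The hashing point is easy to see with a toy example. The fold below is a hypothetical hash invented for this illustration, not the one Coda actually uses: it XORs the four 32-bit words of a 128-bit identifier, so two different identifiers can end up with the same 32-bit number.

```python
def fold_to_32_bits(file_id):
    """Hypothetical hash: XOR the four 32-bit words of a 128-bit identifier."""
    mask = 0xFFFFFFFF
    result = 0
    for shift in (0, 32, 64, 96):
        result ^= (file_id >> shift) & mask
    return result


# Two distinct 128-bit identifiers whose words are a permutation of each other.
first_id = 0x00000001_00000002_00000003_00000004
second_id = 0x00000002_00000001_00000003_00000004

# Both fold to the same 32-bit value, so an inode number derived this way
# cannot tell the two objects apart.
first_ino = fold_to_32_bits(first_id)
second_ino = fold_to_32_bits(second_id)
```

Any mapping from 128 bits down to 32 bits must have such collisions; a better hash only makes them rarer, which matches the 'typically' unique wording in the quote.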

But I think the current approach is much better anyway since it works
for a single directory and its subdirectories instead of globally like
my initial attempt. If you want to catch hard link issues you will have
to watch the root of the file system, but that is still not that
expensive. And on "sane" file systems that use unique inode numbers, you
can use those inode numbers in userspace to solve the hard link problem.

best regards,

Rüdiger

2004-03-22 17:19:14

by Horst H. von Brand

Subject: Re: File change notification (enhanced dnotify)

Mike Waychison <[email protected]> said:
> Rüdiger Klaehn wrote:
> | My original approach assumed that inode numbers were unique, and it
> | would have worked with hard links. But I think it is much more important
> | to have a mechanism that works for all file systems than to solve the
> | problem of hard links.

> Inode numbers are guaranteed to be unique on a given filesystem other
> than for hard links. Where is this assumption broken otherwise?

On some non-Unix filesystems there isn't an equivalent to (invariant) inode
numbers.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513