MIME-Version: 1.0
In-Reply-To: <20170314145801.qbiybrfpnaff2xmc@rgvaio>
References: <a0fa4e7890ea16e7f90a17524e817755f7dde8b5.1489445257.git.p@regnarg.cz>
 <a7ad913cf04481b0295832e3e201159a7e66ea03.1489445257.git.p@regnarg.cz>
 <CAOQ4uxgt9oNPzufR2U-+DwwLWU4PdqiG9WQdwYUprxSeE3MJ=w@mail.gmail.com> <20170314145801.qbiybrfpnaff2xmc@rgvaio>
From: Amir Goldstein <amir73il@gmail.com>
Date: Tue, 14 Mar 2017 17:35:20 +0200
Message-ID: <CAOQ4uxg+N+HVb5d2eV2TnJYb1ObEiJZZ+GmTcYPyQz+9d7PU-A@mail.gmail.com>
Subject: Re: [RFC 2/2] fanotify: emit FAN_MODIFY_DIR on filesystem changes
To: =?UTF-8?B?RmlsaXAgxaB0xJtkcm9uc2vDvQ==?= <r.lkml@regnarg.cz>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
        linux-kernel <linux-kernel@vger.kernel.org>, Jan Kara <jack@suse.cz>,
        Alexander Viro <viro@zeniv.linux.org.uk>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Transfer-Encoding: 8bit
Content-Length: 3723
Lines: 87

On Tue, Mar 14, 2017 at 4:58 PM, Filip Štědronský <r.lkml@regnarg.cz> wrote:
> Hi,
>
> On Tue, Mar 14, 2017 at 01:18:01PM +0200, Amir Goldstein wrote:
>> I claim that fanotify filters events by mount not because it
>> was a requirement, but because it was an implementation challenge
>> to do otherwise.
>>
>> And I claim that what mount watchers are really interested in is
>> "all the changes that happen in the file system in the area
>>  that is visible to me through this mount point".
>>
>> In other words, an indexer that sits in the container host namespace
>> needs to know if files were modified/created/deleted, regardless of
>> whether those files were modified from within a container
>> namespace.
>>
>> It's not a matter of security/isolation. It's a matter of functionality.
>> I agree that for some events (e.g. permission events) it is possible
>> to argue both ways (i.e. that the namespace context should be used
>> as a filter for events).
>> But for the new proposed events (FS_MODIFY_DIR), I really don't
>> see the point in isolation by mount/namespace.
>
> There are basically two classes of uses for a fanotify-like
> interface:
>
> (1) Keeping an up-to-date representation of the file system.
>     For this, superblock watches are clearly what you want.
>
>       * You are interested to know the current state of the
>         filesystem so you need to know about every change,
>         regardless of where it came from.
>       * As I mentioned earlier, in case of remote, distributed
>         and virtual filesystems, the change might come from
>         within the filesystem itself (if the protocol supports
>         reporting such changes). This can probably be
>         implemented only with superblock-scoped watches because
>         the change is fundamentally not related to any mount.
>       * Some filesystems might also support change journalling
>         and it might be conceivable to extend the API in the
>         future to report "past" events (for example by passing
>         sequence number of last seen event or similar).
>       * The argument about containers escaping change notification
>         you mentioned earlier.
>
>     All those factors speak greatly in favour of superblock
>     watches.
>
> (2) Tracking filesystem *activity*. Now you are not building
>     an image of current filesystem state but rather a log of
>     what happened. Perhaps you are also interested in who
>     (user/process/...) did what. Permission events also fit
>     mostly in this category.
>
>     For those it *might* make sense to have mount-scoped
>     watches, for example if you want to monitor only one
>     container or a subset of processes.
>
> We both concentrate on the first but we shouldn't forget about
> the second, which was one of the original motivations for
> fanotify.
>
> Thus I conclude that it might be desirable to implement
> mount-scoped filename events in the long run, even though
> I agree that the sb-scoped events are more important because
> they cover more use cases and you can do additional filtering
> (e.g. by pid) if deemed necessary.
>
> This would require:
>
> (a) Sprinkling the callers of vfs_* with fanotify calls
>     as I did, or
> (b) Creating wrapper functions like vfs_path_unlink & co.
>     that would make the necessary fanotify call (and probably
>     tell the lower function not to generate another
>     notification), as I suggested earlier, or
> (c) Giving the vfs_* functions an *optional* vfsmount argument.
>
> In the end I probably find (c) the most elegant but this
> can be discussed later, even after your changes are merged.
>

Agreed. That is an independent question.
Thanks for the thorough summary.
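
To make the two scopes from your summary concrete, here is a toy
userspace model of the matching rule. It is only a sketch, not kernel
code: every name in it (watch_scope, fs_event, watch, watch_matches,
NO_MOUNT) is made up for illustration, and it assumes an event carries
a superblock id plus an optional mount id.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */

#define NO_MOUNT (-1)   /* change did not come through any mount */

enum watch_scope {
	WATCH_MOUNT,    /* case (2): activity seen through one mount */
	WATCH_SB,       /* case (1): every change to the filesystem */
};

struct fs_event {
	int sb_id;      /* filesystem (superblock) that changed */
	int mnt_id;     /* mount the change came through, or NO_MOUNT */
};

struct watch {
	enum watch_scope scope;
	int sb_id;      /* used by WATCH_SB */
	int mnt_id;     /* used by WATCH_MOUNT */
};

/* Return true if the watch should be told about the event. */
bool watch_matches(const struct watch *w, const struct fs_event *ev)
{
	if (w->scope == WATCH_SB)
		/* The origin of the change does not matter, only the sb. */
		return ev->sb_id == w->sb_id;

	/*
	 * A mount-scoped watch cannot see changes that have no mount
	 * attached, e.g. ones reported by a remote filesystem itself.
	 */
	return ev->mnt_id != NO_MOUNT && ev->mnt_id == w->mnt_id;
}
```

In this model a WATCH_SB watch matches a change made through a
container's mount and a change with mnt_id == NO_MOUNT alike, while a
WATCH_MOUNT watch matches only changes made through its own mount.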

Amir.