Date: Tue, 14 Mar 2017 15:58:01 +0100
From: Filip Štědronský
To: Amir Goldstein
Cc: linux-fsdevel, linux-kernel, Jan Kara, Alexander Viro
Subject: Re: [RFC 2/2] fanotify: emit FAN_MODIFY_DIR on filesystem changes
Message-ID: <20170314145801.qbiybrfpnaff2xmc@rgvaio>

Hi,

On Tue, Mar 14, 2017 at 01:18:01PM +0200, Amir Goldstein wrote:
> I claim that fanotify filters event by mount not because it
> was a requirement, but because it was an implementation challenge
> to do otherwise.
>
> And I claim that what mount watchers are really interested in is
> "all the changes that happen in the file system in the area
> that is visible to me through this mount point".
>
> In other words, an indexer needs to know if files were modified\
> create/deleted if that indexer sits in container host namespace
> regardless if those files were modified from within a container
> namespace.
>
> It's not a matter of security/isolation. It's a matter of functionality.
> I agree that for some event (e.g. permission events) it is possible
> to argue both ways (i.e. that the namespace context should be used
> as a filter for events).
> But for the new proposed events (FS_MODIFY_DIR), I really don't
> see the point in isolation by mount/namespace.

There are basically two classes of uses for a fanotify-like interface:

(1) Keeping an up-to-date representation of the file system. For this,
    superblock watches are clearly what you want.

    * You are interested in the current state of the filesystem, so
      you need to know about every change, regardless of where it
      came from.

    * As I mentioned earlier, in the case of remote, distributed and
      virtual filesystems, the change might come from within the
      filesystem itself (if the protocol supports reporting such
      changes). This can probably be implemented only with
      superblock-scoped watches because the change is fundamentally
      not related to any mount.

    * Some filesystems might also support change journalling, and it
      might be conceivable to extend the API in the future to report
      "past" events (for example by passing the sequence number of
      the last seen event or similar).

    * The argument about containers escaping change notification
      that you mentioned earlier.

    All those factors speak strongly in favour of superblock watches.

(2) Tracking filesystem *activity*. Now you are not building an image
    of the current filesystem state but rather a log of what happened.
    Perhaps you are also interested in who (user/process/...) did
    what. Permission events also fit mostly in this category.

    For those it *might* make sense to have mount-scoped watches,
    for example if you want to monitor only one container or a
    subset of processes.

We both concentrate on the first but we shouldn't forget about the
second, which was one of the original motivations for fanotify.

Thus I conclude that it might be desirable to implement mount-scoped
filename events in the long run, even though I agree that the
sb-scoped events are more important because they cover more use cases
and you can do additional filtering (e.g. by pid) if deemed necessary.

This would require:

  (a) Sprinkling the callers of vfs_* with fanotify calls as I did, or

  (b) Creating wrapper functions like vfs_path_unlink & co. that would
      make the necessary fanotify call (and probably tell the lower
      function not to generate another notification), as I suggested
      earlier, or

  (c) Giving the vfs_* functions an *optional* vfsmount argument.

In the end I probably find (c) the most elegant but this can be
discussed later, even after your changes are merged.

Filip
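
P.S. To make the difference between (b) and (c) a bit more concrete,
here is a rough userspace mock. It is not kernel code and not a
proposed patch: every name in it (mock_sb, mock_mount,
mock_notify_modify_dir, mock_vfs_unlink, mock_vfs_path_unlink) is made
up for illustration, and the counters merely stand in for "a watcher
of this scope would have received an event". The only point is the
shape of the interface: superblock-scoped watchers always hear about a
change, mount-scoped ones only when the caller could name a mount.

```c
/* Userspace mock only: none of these names are real kernel APIs. */
#include <assert.h>
#include <stddef.h>

/* Stands in for a superblock; counts events seen by sb-scoped watchers. */
struct mock_sb {
    int sb_events;
};

/* Stands in for a vfsmount; counts events seen by mount-scoped watchers. */
struct mock_mount {
    struct mock_sb *sb;
    int mnt_events;
};

/* Hypothetical notification hook: sb watchers always get the event,
 * mount watchers only if the caller supplied a mount. */
void mock_notify_modify_dir(struct mock_sb *sb, struct mock_mount *mnt)
{
    assert(sb != NULL);
    sb->sb_events++;
    if (mnt != NULL)
        mnt->mnt_events++;
}

/* Option (c): an unlink-like operation with an *optional* mount argument.
 * Callers with no mount context (e.g. a change originating inside the
 * filesystem itself) pass NULL. */
int mock_vfs_unlink(struct mock_sb *sb, struct mock_mount *mnt)
{
    assert(mnt == NULL || mnt->sb == sb);
    /* ... the actual directory modification would happen here ... */
    mock_notify_modify_dir(sb, mnt);
    return 0;
}

/* Option (b): a path-based wrapper that knows the mount and passes it down,
 * so the lower function emits exactly one notification. */
int mock_vfs_path_unlink(struct mock_mount *mnt)
{
    assert(mnt != NULL);
    return mock_vfs_unlink(mnt->sb, mnt);
}
```

In this mock, an unlink through a mount bumps both the mount's and the
superblock's counter, while a call with a NULL mount is visible only to
superblock-scoped watchers. With (c) the wrapper from (b) becomes a
one-liner, which is roughly why I find (c) the tidier of the two.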