2021-06-15 23:57:07

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 00/14] File system wide monitoring

Hi,

This follows up on the previous version [2]. There is not much design
changes, except to attend to Amir and Darrick's reviews. Thank you both
so much for the detailed feedback. It is really appreciated.

I decided not to implement FID mode reporting in this version, since it
can be extended later without prejudice to the user API.

This version also includes a small fix to fsnotify_add_event on patch 1
to prevent adding overflow events to the hash map.

This was tested with LTP for regressions, and also using the sample on
the last patch, with a corrupted image. I can publish the bad image upon
request.

I also pushed the full series to:

https://gitlab.collabora.com/krisman/linux -b fanotify-notifications-single-slot

Thank you

Original cover letter
---------------------
Hi,

This series follow up on my previous proposal [1] to support file system
wide monitoring. As suggested by Amir, this proposal drops the ring
buffer in favor of a single slot associated with each mark. This
simplifies a bit the implementation, as you can see in the code.

As a reminder, This proposal is limited to an interface for
administrators to monitor the health of a file system, instead of a
generic inteface for file errors. Therefore, this doesn't solve the
problem of writeback errors or the need to watch a specific subtree.

In comparison to the previous RFC, this implementation also drops the
per-fs data and location, and leave those as future extensions.

* Implementation

The feature is implemented on top of fanotify, as a new type of fanotify
mark, FAN_ERROR, which a file system monitoring tool can register to
receive error notifications. When an error occurs a new notification is
generated, in addition followed by this info field:

- FS generic data: A file system agnostic structure that has a generic
error code and identifies the filesystem. Basically, it let's
userspace know something happened on a monitored filesystem. Since
only the first error is recorded since the last read, this also
includes a counter of errors that happened since the last read.

* Testing

This was tested by watching notifications flowing from an intentionally
corrupted filesystem in different places. In addition, other events
were watched in an attempt to detect regressions.

Is there a specific testsuite for fanotify I should be running?

* Patches

This patchset is divided as follows: Patch 1 through 5 are refactoring
to fsnotify/fanotify in preparation for FS_ERROR/FAN_ERROR; patch 6 and
7 implement the FS_ERROR API for filesystems to report error; patch 8
add support for FAN_ERROR in fanotify; Patch 9 is an example
implementation for ext4; patch 10 and 11 provide a sample userspace code
and documentation.

I also pushed the full series to:

https://gitlab.collabora.com/krisman/linux -b fanotify-notifications-single-slot

[1] https://lwn.net/Articles/854545/
[2] https://lwn.net/Articles/856916/

Cc: Darrick J. Wong <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: [email protected]
To: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

Amir Goldstein (1):
fsnotify: pass arguments of fsnotify() in struct fsnotify_event_info

Gabriel Krisman Bertazi (13):
fsnotify: Don't call insert hook for overflow events
fanotify: Fold event size calculation to its own function
fanotify: Split fsid check from other fid mode checks
fanotify: Split superblock marks out to a new cache
inotify: Don't force FS_IN_IGNORED
fsnotify: Add helper to detect overflow_event
fsnotify: Support passing argument to insert callback on add_event
fsnotify: Support FS_ERROR event type
fsnotify: Introduce helpers to send error_events
fanotify: Introduce FAN_FS_ERROR event
ext4: Send notifications on error
samples: Add fs error monitoring example
Documentation: Document the FAN_FS_ERROR event

.../admin-guide/filesystem-monitoring.rst | 63 +++++
Documentation/admin-guide/index.rst | 1 +
fs/ext4/super.c | 8 +
fs/notify/fanotify/fanotify.c | 187 ++++++++++++---
fs/notify/fanotify/fanotify.h | 44 +++-
fs/notify/fanotify/fanotify_user.c | 226 +++++++++++++++---
fs/notify/fsnotify.c | 85 ++++---
fs/notify/inotify/inotify_fsnotify.c | 2 +-
fs/notify/inotify/inotify_user.c | 6 +-
fs/notify/notification.c | 12 +-
include/linux/fanotify.h | 6 +-
include/linux/fsnotify.h | 35 ++-
include/linux/fsnotify_backend.h | 104 ++++++--
include/uapi/linux/fanotify.h | 11 +
samples/Kconfig | 9 +
samples/Makefile | 1 +
samples/fanotify/Makefile | 3 +
samples/fanotify/fs-monitor.c | 95 ++++++++
18 files changed, 752 insertions(+), 146 deletions(-)
create mode 100644 Documentation/admin-guide/filesystem-monitoring.rst
create mode 100644 samples/fanotify/Makefile
create mode 100644 samples/fanotify/fs-monitor.c

--
2.31.0


2021-06-15 23:57:15

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 03/14] fanotify: Split fsid check from other fid mode checks

FAN_ERROR will require fsid, but not necessarily require the filesystem
to expose a file handle. Split those checks into different functions, so
they can be used separately when setting up an event.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

---
Changes since v1:
(Amir)
- Sort hunks to simplify diff.
Changes since RFC:
(Amir)
- Rename fanotify_check_path_fsid -> fanotify_test_fsid.
- Use dentry directly instead of path.
---
fs/notify/fanotify/fanotify_user.c | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 0da4e5dcab0f..af518790a80f 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1192,16 +1192,15 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
return fd;
}

-/* Check if filesystem can encode a unique fid */
-static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
+static int fanotify_test_fsid(struct dentry *dentry, __kernel_fsid_t *fsid)
{
__kernel_fsid_t root_fsid;
int err;

/*
- * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
+ * Make sure dentry is not of a filesystem with zero fsid (e.g. tmpfs).
*/
- err = vfs_get_fsid(path->dentry, fsid);
+ err = vfs_get_fsid(dentry, fsid);
if (err)
return err;

@@ -1209,10 +1208,10 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
return -ENODEV;

/*
- * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
+ * Make sure dentry is not of a filesystem subvolume (e.g. btrfs)
* which uses a different fsid than sb root.
*/
- err = vfs_get_fsid(path->dentry->d_sb->s_root, &root_fsid);
+ err = vfs_get_fsid(dentry->d_sb->s_root, &root_fsid);
if (err)
return err;

@@ -1220,6 +1219,12 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
root_fsid.val[1] != fsid->val[1])
return -EXDEV;

+ return 0;
+}
+
+/* Check if filesystem can encode a unique fid */
+static int fanotify_test_fid(struct dentry *dentry)
+{
/*
* We need to make sure that the file system supports at least
* encoding a file handle so user can use name_to_handle_at() to
@@ -1227,8 +1232,8 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
* objects. However, name_to_handle_at() requires that the
* filesystem also supports decoding file handles.
*/
- if (!path->dentry->d_sb->s_export_op ||
- !path->dentry->d_sb->s_export_op->fh_to_dentry)
+ if (!dentry->d_sb->s_export_op ||
+ !dentry->d_sb->s_export_op->fh_to_dentry)
return -EOPNOTSUPP;

return 0;
@@ -1379,7 +1384,11 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
}

if (fid_mode) {
- ret = fanotify_test_fid(&path, &__fsid);
+ ret = fanotify_test_fsid(path.dentry, &__fsid);
+ if (ret)
+ goto path_put_and_out;
+
+ ret = fanotify_test_fid(path.dentry);
if (ret)
goto path_put_and_out;

--
2.31.0

2021-06-15 23:57:24

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 04/14] fanotify: Split superblock marks out to a new cache

FAN_ERROR will require an error structure to be stored per mark. But,
since FAN_ERROR doesn't apply to inode/mount marks, it should suffice to
only expose this information for superblock marks. Therefore, wrap this
kind of marks into a container and plumb it for the future.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

---
Changes since v1:
- Only extend superblock marks
---
fs/notify/fanotify/fanotify.c | 10 ++++++++--
fs/notify/fanotify/fanotify.h | 11 +++++++++++
fs/notify/fanotify/fanotify_user.c | 29 ++++++++++++++++++++++++++++-
include/linux/fsnotify_backend.h | 1 +
4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 057abd2cf887..f85efb24cfb4 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -867,9 +867,15 @@ static void fanotify_freeing_mark(struct fsnotify_mark *mark,
dec_ucount(group->fanotify_data.ucounts, UCOUNT_FANOTIFY_MARKS);
}

-static void fanotify_free_mark(struct fsnotify_mark *fsn_mark)
+static void fanotify_free_mark(struct fsnotify_mark *mark)
{
- kmem_cache_free(fanotify_mark_cache, fsn_mark);
+ if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
+ struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);
+
+ kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
+ } else {
+ kmem_cache_free(fanotify_mark_cache, mark);
+ }
}

const struct fsnotify_ops fanotify_fsnotify_ops = {
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 4a5e555dc3d2..aec05e21d5a9 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -6,6 +6,7 @@
#include <linux/hashtable.h>

extern struct kmem_cache *fanotify_mark_cache;
+extern struct kmem_cache *fanotify_sb_mark_cache;
extern struct kmem_cache *fanotify_fid_event_cachep;
extern struct kmem_cache *fanotify_path_event_cachep;
extern struct kmem_cache *fanotify_perm_event_cachep;
@@ -129,6 +130,16 @@ static inline void fanotify_info_copy_name(struct fanotify_info *info,
name->name);
}

+struct fanotify_sb_mark {
+ struct fsnotify_mark fsn_mark;
+};
+
+static inline
+struct fanotify_sb_mark *FANOTIFY_SB_MARK(struct fsnotify_mark *mark)
+{
+ return container_of(mark, struct fanotify_sb_mark, fsn_mark);
+}
+
/*
* Common structure for fanotify events. Concrete structs are allocated in
* fanotify_handle_event() and freed when the information is retrieved by
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index af518790a80f..db378480f1b1 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -99,6 +99,7 @@ struct ctl_table fanotify_table[] = {
extern const struct fsnotify_ops fanotify_fsnotify_ops;

struct kmem_cache *fanotify_mark_cache __read_mostly;
+struct kmem_cache *fanotify_sb_mark_cache __read_mostly;
struct kmem_cache *fanotify_fid_event_cachep __read_mostly;
struct kmem_cache *fanotify_path_event_cachep __read_mostly;
struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
@@ -915,6 +916,27 @@ static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
return mask & ~oldmask;
}

+static struct fsnotify_mark *fanotify_alloc_mark(unsigned int type)
+{
+ struct fanotify_sb_mark *sb_mark;
+
+ switch (type) {
+ case FSNOTIFY_OBJ_TYPE_SB:
+ sb_mark = kmem_cache_zalloc(fanotify_sb_mark_cache, GFP_KERNEL);
+ if (!sb_mark)
+ return NULL;
+ return &sb_mark->fsn_mark;
+
+ case FSNOTIFY_OBJ_TYPE_INODE:
+ case FSNOTIFY_OBJ_TYPE_PARENT:
+ case FSNOTIFY_OBJ_TYPE_VFSMOUNT:
+ return kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
+ default:
+ WARN_ON(1);
+ return NULL;
+ }
+}
+
static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
fsnotify_connp_t *connp,
unsigned int type,
@@ -933,13 +955,16 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
!inc_ucount(ucounts->ns, ucounts->uid, UCOUNT_FANOTIFY_MARKS))
return ERR_PTR(-ENOSPC);

- mark = kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
+ mark = fanotify_alloc_mark(type);
if (!mark) {
ret = -ENOMEM;
goto out_dec_ucounts;
}

fsnotify_init_mark(mark, group);
+ if (type == FSNOTIFY_OBJ_TYPE_SB)
+ mark->flags |= FSNOTIFY_MARK_FLAG_SB;
+
ret = fsnotify_add_mark_locked(mark, connp, type, 0, fsid);
if (ret) {
fsnotify_put_mark(mark);
@@ -1497,6 +1522,8 @@ static int __init fanotify_user_setup(void)

fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
SLAB_PANIC|SLAB_ACCOUNT);
+ fanotify_sb_mark_cache = KMEM_CACHE(fanotify_sb_mark,
+ SLAB_PANIC|SLAB_ACCOUNT);
fanotify_fid_event_cachep = KMEM_CACHE(fanotify_fid_event,
SLAB_PANIC);
fanotify_path_event_cachep = KMEM_CACHE(fanotify_path_event,
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 1ce66748a2d2..c4473b467c28 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -401,6 +401,7 @@ struct fsnotify_mark {
#define FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY 0x01
#define FSNOTIFY_MARK_FLAG_ALIVE 0x02
#define FSNOTIFY_MARK_FLAG_ATTACHED 0x04
+#define FSNOTIFY_MARK_FLAG_SB 0x08
unsigned int flags; /* flags [mark->lock] */
};

--
2.31.0

2021-06-15 23:57:24

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 05/14] inotify: Don't force FS_IN_IGNORED

According to Amir:

"FS_IN_IGNORED is completely internal to inotify and there is no need
to set it in i_fsnotify_mask at all, so if we remove the bit from the
output of inotify_arg_to_mask() no functionality will change and we will
be able to overload the event bit for FS_ERROR."

This is done in preparation to overload FS_ERROR with the notification
mechanism in fanotify.

Suggested-by: Amir Goldstein <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Amir Goldstein <[email protected]>
---
fs/notify/inotify/inotify_user.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 98f61b31745a..4d17be6dd58d 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -89,10 +89,10 @@ static inline __u32 inotify_arg_to_mask(struct inode *inode, u32 arg)
__u32 mask;

/*
- * Everything should accept their own ignored and should receive events
- * when the inode is unmounted. All directories care about children.
+ * Everything should receive events when the inode is unmounted.
+ * All directories care about children.
*/
- mask = (FS_IN_IGNORED | FS_UNMOUNT);
+ mask = (FS_UNMOUNT);
if (S_ISDIR(inode->i_mode))
mask |= FS_EVENT_ON_CHILD;

--
2.31.0

2021-06-15 23:57:36

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 09/14] fsnotify: Support FS_ERROR event type

Expose a new type of fsnotify event for filesystems to report errors for
userspace monitoring tools. fanotify will send this type of
notification for FAN_ERROR events.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

---
Changes since v1:
- Overload FS_ERROR with FS_IN_IGNORED
- IMplement support for this type on fsnotify_data_inode
---
include/linux/fsnotify_backend.h | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 8222fe12a6c9..ea5f5c7cc381 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -42,6 +42,12 @@

#define FS_UNMOUNT 0x00002000 /* inode on umount fs */
#define FS_Q_OVERFLOW 0x00004000 /* Event queued overflowed */
+#define FS_ERROR 0x00008000 /* Filesystem Error (fanotify) */
+
+/*
+ * FS_IN_IGNORED overloads FS_ERROR. It is only used internally by inotify
+ * which does not support FS_ERROR.
+ */
#define FS_IN_IGNORED 0x00008000 /* last inotify event here */

#define FS_OPEN_PERM 0x00010000 /* open event in an permission hook */
@@ -95,7 +101,8 @@
#define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS | \
FS_EVENTS_POSS_ON_CHILD | \
FS_DELETE_SELF | FS_MOVE_SELF | FS_DN_RENAME | \
- FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED)
+ FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED | \
+ FS_ERROR)

/* Extra flags that may be reported with event or control handling of events */
#define ALL_FSNOTIFY_FLAGS (FS_EXCL_UNLINK | FS_ISDIR | FS_IN_ONESHOT | \
@@ -263,6 +270,12 @@ enum fsnotify_data_type {
FSNOTIFY_EVENT_NONE,
FSNOTIFY_EVENT_PATH,
FSNOTIFY_EVENT_INODE,
+ FSNOTIFY_EVENT_ERROR,
+};
+
+struct fs_error_report {
+ int error;
+ struct inode *inode;
};

static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
@@ -272,6 +285,8 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
return (struct inode *)data;
case FSNOTIFY_EVENT_PATH:
return d_inode(((const struct path *)data)->dentry);
+ case FSNOTIFY_EVENT_ERROR:
+ return ((struct fs_error_report *)data)->inode;
default:
return NULL;
}
--
2.31.0

2021-06-15 23:57:44

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 10/14] fsnotify: Introduce helpers to send error_events

Introduce helpers for filesystems interested in reporting FS_ERROR
events.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

---
Changes since v1:
- Use the inode argument (Amir)
- Protect s_fsnotify_marks with ifdef guard
---
fs/notify/fsnotify.c | 2 +-
include/linux/fsnotify.h | 20 ++++++++++++++++++++
include/linux/fsnotify_backend.h | 1 +
3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 36205a769dde..ac05eb3fb368 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -491,7 +491,7 @@ int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
*/
parent = event_info->dir;
}
- sb = inode->i_sb;
+ sb = event_info->sb ?: inode->i_sb;

/*
* Optimization: srcu_read_lock() has a memory barrier which can
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 8c2c681b4495..c0dbc5a65381 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -326,4 +326,24 @@ static inline void fsnotify_change(struct dentry *dentry, unsigned int ia_valid)
fsnotify_dentry(dentry, mask);
}

+static inline int fsnotify_sb_error(struct super_block *sb, struct inode *inode,
+ int error)
+{
+#ifdef CONFIG_FSNOTIFY
+ if (sb->s_fsnotify_marks) {
+ struct fs_error_report report = {
+ .error = error,
+ .inode = inode,
+ };
+
+ return __fsnotify(FS_ERROR, &(struct fsnotify_event_info) {
+ .data = &report,
+ .data_type = FSNOTIFY_EVENT_ERROR,
+ .inode = NULL, .cookie = 0, .sb = sb
+ });
+ }
+#endif
+ return 0;
+}
+
#endif /* _LINUX_FS_NOTIFY_H */
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index ea5f5c7cc381..5a32c5010f45 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -138,6 +138,7 @@ struct fsnotify_event_info {
struct inode *dir;
const struct qstr *name;
struct inode *inode;
+ struct super_block *sb;
u32 cookie;
};

--
2.31.0

2021-06-15 23:58:17

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 11/14] fanotify: Introduce FAN_FS_ERROR event

The FAN_FS_ERROR event is used by filesystem wide monitoring tools to
receive notifications of type FS_ERROR_EVENT, emited by filesystems when
a problem is detected. The error notification includes a generic error
descriptor.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

---
Changes since v1:
- Pass dentry to fanotify_check_fsid (Amir)
- FANOTIFY_EVENT_TYPE_ERROR -> FANOTIFY_EVENT_TYPE_FS_ERROR
- Merge previous patch into it
- Use a single slot
- Move fanotify_mark.error_event definition to this commit
- Rename FAN_ERROR -> FAN_FS_ERROR
- Restrict FAN_FS_ERROR to FAN_MARK_FILESYSTEM
---
fs/notify/fanotify/fanotify.c | 115 ++++++++++++++++++++++--
fs/notify/fanotify/fanotify.h | 30 +++++++
fs/notify/fanotify/fanotify_user.c | 137 ++++++++++++++++++++++++++---
include/linux/fanotify.h | 6 +-
include/uapi/linux/fanotify.h | 11 +++
5 files changed, 281 insertions(+), 18 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index f64234489811..58b2dd01f1ae 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -247,6 +247,14 @@ static int fanotify_get_response(struct fsnotify_group *group,
return ret;
}

+static bool fanotify_event_reports_path(const struct fsnotify_group *group,
+ u32 mask)
+{
+ unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
+
+ return !fid_mode && !fanotify_is_error_event(mask);
+}
+
/*
* This function returns a mask for an event that only contains the flags
* that have been specifically requested by the user. Flags that may have
@@ -261,7 +269,6 @@ static u32 fanotify_group_event_mask(
__u32 marks_mask = 0, marks_ignored_mask = 0;
__u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS |
FANOTIFY_EVENT_FLAGS;
- const struct path *path = fsnotify_event_info_path(event_info);
unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
struct fsnotify_mark *mark;
int type;
@@ -270,14 +277,17 @@ static u32 fanotify_group_event_mask(
__func__, iter_info->report_mask, event_mask,
event_info->data, event_info->data_type);

- if (!fid_mode) {
+ if (fanotify_event_reports_path(group, event_mask)) {
+ const struct path *path;
+
/* Do we have path to open a file descriptor? */
+ path = fsnotify_event_info_path(event_info);
if (!path)
return 0;
/* Path type events are only relevant for files and dirs */
if (!d_is_reg(path->dentry) && !d_can_lookup(path->dentry))
return 0;
- } else if (!(fid_mode & FAN_REPORT_FID)) {
+ } else if (fid_mode && !(fid_mode & FAN_REPORT_FID)) {
/* Do we have a directory inode to report? */
if (!event_info->dir && !(event_mask & FS_ISDIR))
return 0;
@@ -658,6 +668,78 @@ static struct fanotify_event *fanotify_alloc_event(
return event;
}

+static void fanotify_init_error_event(struct fanotify_error_event *event,
+ __kernel_fsid_t fsid,
+ const struct fs_error_report *report)
+{
+ event->fae.type = FANOTIFY_EVENT_TYPE_FS_ERROR;
+ event->err_count = 1;
+ event->fsid = fsid;
+ event->error = report->error;
+ event->ino = (report->inode) ? report->inode->i_ino : 0;
+ event->ino_generation =
+ (report->inode) ? report->inode->i_generation : 0;
+}
+
+struct fanotify_insert_error_data {
+ const struct fs_error_report *report;
+ __kernel_fsid_t fsid;
+};
+
+static void fanotify_insert_error_event(struct fsnotify_group *group,
+ struct fsnotify_event *event,
+ const void *data)
+{
+ struct fanotify_event *fae = FANOTIFY_E(event);
+ struct fanotify_error_event *error_event = FANOTIFY_EE(fae);
+ const struct fanotify_insert_error_data *idata =
+ ((struct fanotify_insert_error_data *) data);
+
+ fsnotify_get_mark(error_event->mark);
+ fanotify_init_error_event(error_event, idata->fsid, idata->report);
+}
+
+static int fanotify_merge_error_event(struct fsnotify_group *group,
+ struct fsnotify_event *event)
+{
+ struct fanotify_event *fae = FANOTIFY_E(event);
+
+ if (!list_empty(&event->list)) {
+ FANOTIFY_EE(fae)->err_count++;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int fanotify_queue_error_event(struct fsnotify_iter_info *iter_info,
+ struct fsnotify_group *group,
+ __kernel_fsid_t fsid,
+ const struct fs_error_report *report)
+{
+ struct fanotify_error_event *error_event;
+ struct fsnotify_mark *mark = fsnotify_iter_sb_mark(iter_info);
+ int ret = -ENOMEM;
+
+ if (!mark || !FANOTIFY_SB_MARK(mark)->error_event)
+ return ret;
+
+ spin_lock(&mark->lock);
+ error_event = FANOTIFY_SB_MARK(mark)->error_event;
+ if (error_event) {
+ ret = fsnotify_add_event(group, &error_event->fae.fse,
+ fanotify_merge_error_event,
+ fanotify_insert_error_event,
+ &(struct fanotify_insert_error_data) {
+ .fsid = fsid,
+ .report = report
+ });
+ }
+ spin_unlock(&mark->lock);
+
+ return ret;
+}
+
/*
* Get cached fsid of the filesystem containing the object from any connector.
* All connectors are supposed to have the same fsid, but we do not verify that
@@ -738,8 +820,9 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
BUILD_BUG_ON(FAN_ONDIR != FS_ISDIR);
BUILD_BUG_ON(FAN_OPEN_EXEC != FS_OPEN_EXEC);
BUILD_BUG_ON(FAN_OPEN_EXEC_PERM != FS_OPEN_EXEC_PERM);
+ BUILD_BUG_ON(FAN_FS_ERROR != FS_ERROR);

- BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
+ BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 20);

mask = fanotify_group_event_mask(group, mask, event_info, iter_info);
if (!mask)
@@ -756,13 +839,20 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
return 0;
}

- if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS)) {
+ if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS) ||
+ fanotify_is_error_event(mask)) {
fsid = fanotify_get_fsid(iter_info);
/* Racing with mark destruction or creation? */
if (!fsid.val[0] && !fsid.val[1])
return 0;
}

+ if (fanotify_is_error_event(mask)) {
+ ret = fanotify_queue_error_event(iter_info, group, fsid,
+ event_info->data);
+ goto finish;
+ }
+
event = fanotify_alloc_event(group, mask, event_info, &fsid);
ret = -ENOMEM;
if (unlikely(!event)) {
@@ -831,6 +921,17 @@ static void fanotify_free_name_event(struct fanotify_event *event)
kfree(FANOTIFY_NE(event));
}

+static void fanotify_free_error_event(struct fanotify_event *event)
+{
+ /*
+ * Just drop the reference acquired by
+ * fanotify_queue_error_event.
+ *
+ * The actual memory is freed with the mark.
+ */
+ fsnotify_put_mark(FANOTIFY_EE(event)->mark);
+}
+
static void fanotify_free_event(struct fsnotify_event *fsn_event)
{
struct fanotify_event *event;
@@ -853,6 +954,9 @@ static void fanotify_free_event(struct fsnotify_event *fsn_event)
case FANOTIFY_EVENT_TYPE_OVERFLOW:
kfree(event);
break;
+ case FANOTIFY_EVENT_TYPE_FS_ERROR:
+ fanotify_free_error_event(event);
+ break;
default:
WARN_ON_ONCE(1);
}
@@ -870,6 +974,7 @@ static void fanotify_free_mark(struct fsnotify_mark *mark)
if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);

+ kfree(fa_mark->error_event);
kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
} else {
kmem_cache_free(fanotify_mark_cache, mark);
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 7e00c05a979a..882d056b3a7a 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -132,11 +132,14 @@ static inline void fanotify_info_copy_name(struct fanotify_info *info,

struct fanotify_sb_mark {
struct fsnotify_mark fsn_mark;
+ struct fanotify_error_event *error_event;
};

static inline
struct fanotify_sb_mark *FANOTIFY_SB_MARK(struct fsnotify_mark *mark)
{
+ WARN_ON(!(mark->flags & FSNOTIFY_MARK_FLAG_SB));
+
return container_of(mark, struct fanotify_sb_mark, fsn_mark);
}

@@ -152,6 +155,7 @@ enum fanotify_event_type {
FANOTIFY_EVENT_TYPE_PATH,
FANOTIFY_EVENT_TYPE_PATH_PERM,
FANOTIFY_EVENT_TYPE_OVERFLOW, /* struct fanotify_event */
+ FANOTIFY_EVENT_TYPE_FS_ERROR, /* struct fanotify_error_event */
__FANOTIFY_EVENT_TYPE_NUM
};

@@ -207,12 +211,32 @@ FANOTIFY_NE(struct fanotify_event *event)
return container_of(event, struct fanotify_name_event, fae);
}

+struct fanotify_error_event {
+ struct fanotify_event fae;
+ s32 error;
+ u32 err_count;
+ __kernel_fsid_t fsid;
+ u64 ino;
+ u32 ino_generation;
+
+ /* Back reference to the mark this error refers to. */
+ struct fsnotify_mark *mark;
+};
+
+static inline struct fanotify_error_event *
+FANOTIFY_EE(struct fanotify_event *event)
+{
+ return container_of(event, struct fanotify_error_event, fae);
+}
+
static inline __kernel_fsid_t *fanotify_event_fsid(struct fanotify_event *event)
{
if (event->type == FANOTIFY_EVENT_TYPE_FID)
return &FANOTIFY_FE(event)->fsid;
else if (event->type == FANOTIFY_EVENT_TYPE_FID_NAME)
return &FANOTIFY_NE(event)->fsid;
+ else if (event->type == FANOTIFY_EVENT_TYPE_FS_ERROR)
+ return &FANOTIFY_EE(event)->fsid;
else
return NULL;
}
@@ -298,6 +322,11 @@ static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
return container_of(fse, struct fanotify_event, fse);
}

+static inline bool fanotify_is_error_event(u32 mask)
+{
+ return mask & FANOTIFY_ERROR_EVENTS;
+}
+
static inline bool fanotify_event_has_path(struct fanotify_event *event)
{
return event->type == FANOTIFY_EVENT_TYPE_PATH ||
@@ -327,6 +356,7 @@ static inline struct path *fanotify_event_path(struct fanotify_event *event)
static inline bool fanotify_is_hashed_event(u32 mask)
{
return !(fanotify_is_perm_event(mask) ||
+ fanotify_is_error_event(mask) ||
fsnotify_is_overflow_event(mask));
}

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index db378480f1b1..7ec99bec2746 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -107,6 +107,8 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
#define FANOTIFY_EVENT_ALIGN 4
#define FANOTIFY_INFO_HDR_LEN \
(sizeof(struct fanotify_event_info_fid) + sizeof(struct file_handle))
+#define FANOTIFY_INFO_ERROR_LEN \
+ (sizeof(struct fanotify_event_info_error))

static int fanotify_fid_info_len(int fh_len, int name_len)
{
@@ -127,6 +129,9 @@ static size_t fanotify_event_len(struct fanotify_event *event,
int fh_len;
int dot_len = 0;

+ if (fanotify_is_error_event(event->mask))
+ return event_len + FANOTIFY_INFO_ERROR_LEN;
+
if (!fid_mode)
return event_len;

@@ -150,6 +155,35 @@ static size_t fanotify_event_len(struct fanotify_event *event,
return event_len;
}

+static struct fanotify_event *fanotify_dequeue_error_event(
+ struct fsnotify_group *group,
+ struct fanotify_event *event,
+ struct fanotify_error_event *dest)
+{
+ struct fanotify_error_event *error_event = FANOTIFY_EE(event);
+
+ /*
+ * In order to avoid missing an error count update, the
+ * queued event is de-queued and duplicated to an
+ * in-stack fanotify_error_event while still inside
+ * mark->lock. Once the event is dequeued, it can be
+ * immediately re-used for a new event.
+ *
+ * The ownership of the mark reference is dropped later
+ * by destroy_event.
+ */
+ spin_lock(&error_event->mark->lock);
+
+ memcpy(dest, error_event, sizeof(*error_event));
+ fsnotify_init_event(&dest->fae.fse);
+
+ fsnotify_remove_queued_event(group, &event->fse);
+
+ spin_unlock(&error_event->mark->lock);
+
+ return &dest->fae;
+}
+
/*
* Remove an hashed event from merge hash table.
*/
@@ -173,8 +207,10 @@ static void fanotify_unhash_event(struct fsnotify_group *group,
* is not large enough. When permission event is dequeued, its state is
* updated accordingly.
*/
-static struct fanotify_event *get_one_event(struct fsnotify_group *group,
- size_t count)
+static struct fanotify_event *get_one_event(
+ struct fsnotify_group *group,
+ size_t count,
+ struct fanotify_error_event *error_event)
{
size_t event_size;
struct fanotify_event *event = NULL;
@@ -198,9 +234,14 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,

/*
* Held the notification_lock the whole time, so this is the
- * same event we peeked above.
+ * same event we peeked above, unless it is copied to
+ * error_event.
*/
- fsnotify_remove_first_event(group);
+ if (fanotify_is_error_event(event->mask))
+ event = fanotify_dequeue_error_event(group, event, error_event);
+ else
+ fsnotify_remove_first_event(group);
+
if (fanotify_is_perm_event(event->mask))
FANOTIFY_PERM(event)->state = FAN_EVENT_REPORTED;
if (fanotify_is_hashed_event(event->mask))
@@ -310,6 +351,31 @@ static int process_access_response(struct fsnotify_group *group,
return -ENOENT;
}

+static size_t copy_error_info_to_user(struct fanotify_event *event,
+ char __user *buf, int count)
+{
+ struct fanotify_event_info_error info;
+ struct fanotify_error_event *fee = FANOTIFY_EE(event);
+
+ info.hdr.info_type = FAN_EVENT_INFO_TYPE_ERROR;
+ info.hdr.pad = 0;
+ info.hdr.len = sizeof(struct fanotify_event_info_error);
+
+ if (WARN_ON(count < info.hdr.len))
+ return -EFAULT;
+
+ info.fsid = fee->fsid;
+ info.error = fee->error;
+ info.ino = fee->ino;
+ info.ino_generation = fee->ino_generation;
+ info.error_count = fee->err_count;
+
+ if (copy_to_user(buf, &info, sizeof(info)))
+ return -EFAULT;
+
+ return info.hdr.len;
+}
+
static int copy_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
int info_type, const char *name, size_t name_len,
char __user *buf, size_t count)
@@ -531,6 +597,14 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
count -= ret;
}

+ if (fanotify_is_error_event(event->mask)) {
+ ret = copy_error_info_to_user(event, buf, count);
+ if (ret < 0)
+ return ret;
+ buf += ret;
+ count -= ret;
+ }
+
return metadata.event_len;

out_close_fd:
@@ -559,6 +633,7 @@ static __poll_t fanotify_poll(struct file *file, poll_table *wait)
static ssize_t fanotify_read(struct file *file, char __user *buf,
size_t count, loff_t *pos)
{
+ struct fanotify_error_event error_event;
struct fsnotify_group *group;
struct fanotify_event *event;
char __user *start;
@@ -577,7 +652,7 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
* in case there are lots of available events.
*/
cond_resched();
- event = get_one_event(group, count);
+ event = get_one_event(group, count, &error_event);
if (IS_ERR(event)) {
ret = PTR_ERR(event);
break;
@@ -896,16 +971,34 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
flags, umask);
}

-static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
- __u32 mask,
- unsigned int flags)
+static int fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
+ __u32 mask, unsigned int flags,
+ __u32 *added_mask)
{
+ struct fanotify_error_event *error_event = NULL;
+ bool ignored = flags & FAN_MARK_IGNORED_MASK;
__u32 oldmask = -1;

+ /* Only pre-alloc error_event if needed. */
+ if (!ignored && (mask & FAN_FS_ERROR) &&
+ (fsn_mark->flags & FSNOTIFY_MARK_FLAG_SB) &&
+ !FANOTIFY_SB_MARK(fsn_mark)->error_event) {
+ error_event = kzalloc(sizeof(*error_event), GFP_KERNEL);
+ if (!error_event)
+ return -ENOMEM;
+ fanotify_init_event(&error_event->fae, 0, FS_ERROR);
+ error_event->mark = fsn_mark;
+ }
+
spin_lock(&fsn_mark->lock);
- if (!(flags & FAN_MARK_IGNORED_MASK)) {
+ if (!ignored) {
oldmask = fsn_mark->mask;
fsn_mark->mask |= mask;
+
+ if (error_event && !FANOTIFY_SB_MARK(fsn_mark)->error_event) {
+ FANOTIFY_SB_MARK(fsn_mark)->error_event = error_event;
+ error_event = NULL;
+ }
} else {
fsn_mark->ignored_mask |= mask;
if (flags & FAN_MARK_IGNORED_SURV_MODIFY)
@@ -913,7 +1006,10 @@ static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
}
spin_unlock(&fsn_mark->lock);

- return mask & ~oldmask;
+ kfree(error_event);
+
+ *added_mask = mask & ~oldmask;
+ return 0;
}

static struct fsnotify_mark *fanotify_alloc_mark(unsigned int type)
@@ -987,6 +1083,7 @@ static int fanotify_add_mark(struct fsnotify_group *group,
{
struct fsnotify_mark *fsn_mark;
__u32 added;
+ int ret = 0;

mutex_lock(&group->mark_mutex);
fsn_mark = fsnotify_find_mark(connp, group);
@@ -997,13 +1094,18 @@ static int fanotify_add_mark(struct fsnotify_group *group,
return PTR_ERR(fsn_mark);
}
}
- added = fanotify_mark_add_to_mask(fsn_mark, mask, flags);
+ ret = fanotify_mark_add_to_mask(fsn_mark, mask, flags, &added);
+ if (ret)
+ goto out;
+
if (added & ~fsnotify_conn_mask(fsn_mark->connector))
fsnotify_recalc_mask(fsn_mark->connector);
+
+out:
mutex_unlock(&group->mark_mutex);

fsnotify_put_mark(fsn_mark);
- return 0;
+ return ret;
}

static int fanotify_add_vfsmount_mark(struct fsnotify_group *group,
@@ -1419,6 +1521,17 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,

fsid = &__fsid;
}
+ if (mask & FAN_FS_ERROR) {
+ if (mark_type != FAN_MARK_FILESYSTEM) {
+ ret = -EINVAL;
+ goto path_put_and_out;
+ }
+
+ ret = fanotify_test_fsid(path.dentry, &__fsid);
+ if (ret)
+ goto path_put_and_out;
+ fsid = &__fsid;
+ }

/* inode held in place by reference to path; group by fget on fd */
if (mark_type == FAN_MARK_INODE)
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index a16dbeced152..3bdfe227e2c2 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -85,9 +85,12 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
#define FANOTIFY_INODE_EVENTS (FANOTIFY_DIRENT_EVENTS | \
FAN_ATTRIB | FAN_MOVE_SELF | FAN_DELETE_SELF)

+#define FANOTIFY_ERROR_EVENTS (FAN_FS_ERROR)
+
/* Events that user can request to be notified on */
#define FANOTIFY_EVENTS (FANOTIFY_PATH_EVENTS | \
- FANOTIFY_INODE_EVENTS)
+ FANOTIFY_INODE_EVENTS | \
+ FANOTIFY_ERROR_EVENTS)

/* Events that require a permission response from user */
#define FANOTIFY_PERM_EVENTS (FAN_OPEN_PERM | FAN_ACCESS_PERM | \
@@ -99,6 +102,7 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
/* Events that may be reported to user */
#define FANOTIFY_OUTGOING_EVENTS (FANOTIFY_EVENTS | \
FANOTIFY_PERM_EVENTS | \
+ FANOTIFY_ERROR_EVENTS | \
FAN_Q_OVERFLOW | FAN_ONDIR)

#define ALL_FANOTIFY_EVENT_BITS (FANOTIFY_OUTGOING_EVENTS | \
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index fbf9c5c7dd59..a987150a446c 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -20,6 +20,7 @@
#define FAN_OPEN_EXEC 0x00001000 /* File was opened for exec */

#define FAN_Q_OVERFLOW 0x00004000 /* Event queued overflowed */
+#define FAN_FS_ERROR 0x00008000 /* Filesystem error */

#define FAN_OPEN_PERM 0x00010000 /* File open in perm check */
#define FAN_ACCESS_PERM 0x00020000 /* File accessed in perm check */
@@ -123,6 +124,7 @@ struct fanotify_event_metadata {
#define FAN_EVENT_INFO_TYPE_FID 1
#define FAN_EVENT_INFO_TYPE_DFID_NAME 2
#define FAN_EVENT_INFO_TYPE_DFID 3
+#define FAN_EVENT_INFO_TYPE_ERROR 4

/* Variable length info record following event metadata */
struct fanotify_event_info_header {
@@ -148,6 +150,15 @@ struct fanotify_event_info_fid {
unsigned char handle[0];
};

+struct fanotify_event_info_error {
+ struct fanotify_event_info_header hdr;
+ __s32 error;
+ __u32 error_count;
+ __kernel_fsid_t fsid;
+ __u64 ino;
+ __u32 ino_generation;
+};
+
struct fanotify_response {
__s32 fd;
__u32 response;
--
2.31.0

2021-06-15 23:59:06

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 12/14] ext4: Send notifications on error

Send a FS_ERROR message via fsnotify to a userspace monitoring tool
whenever a ext4 error condition is triggered. This follows the existing
error conditions in ext4, so it is hooked to the ext4_error* functions.

It also follows the current dmesg reporting in the format. The
filesystem message is composed mostly by the string that would be
otherwise printed in dmesg.

A new ext4 specific record format is exposed in the uapi, such that a
monitoring tool knows what to expect when listening errors of an ext4
filesystem.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Amir Goldstein <[email protected]>
---
fs/ext4/super.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d29f6aa7d96e..b5a8c4bab3ab 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -46,6 +46,7 @@
#include <linux/part_stat.h>
#include <linux/kthread.h>
#include <linux/freezer.h>
+#include <linux/fsnotify.h>

#include "ext4.h"
#include "ext4_extents.h" /* Needed for trace points definition */
@@ -752,6 +753,8 @@ void __ext4_error(struct super_block *sb, const char *function,
sb->s_id, function, line, current->comm, &vaf);
va_end(args);
}
+ fsnotify_sb_error(sb, NULL, error);
+
ext4_handle_error(sb, force_ro, error, 0, block, function, line);
}

@@ -782,6 +785,8 @@ void __ext4_error_inode(struct inode *inode, const char *function,
current->comm, &vaf);
va_end(args);
}
+ fsnotify_sb_error(inode->i_sb, inode, error);
+
ext4_handle_error(inode->i_sb, false, error, inode->i_ino, block,
function, line);
}
@@ -820,6 +825,8 @@ void __ext4_error_file(struct file *file, const char *function,
current->comm, path, &vaf);
va_end(args);
}
+ fsnotify_sb_error(inode->i_sb, inode, EFSCORRUPTED);
+
ext4_handle_error(inode->i_sb, false, EFSCORRUPTED, inode->i_ino, block,
function, line);
}
@@ -887,6 +894,7 @@ void __ext4_std_error(struct super_block *sb, const char *function,
printk(KERN_CRIT "EXT4-fs error (device %s) in %s:%d: %s\n",
sb->s_id, function, line, errstr);
}
+ fsnotify_sb_error(sb, sb->s_root->d_inode, errno);

ext4_handle_error(sb, false, -errno, 0, 0, function, line);
}
--
2.31.0

2021-06-15 23:59:06

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 13/14] samples: Add fs error monitoring example

Introduce an example of a FAN_FS_ERROR fanotify user to track filesystem
errors.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

---
Changes since v1:
- minor fixes
---
samples/Kconfig | 9 ++++
samples/Makefile | 1 +
samples/fanotify/Makefile | 3 ++
samples/fanotify/fs-monitor.c | 95 +++++++++++++++++++++++++++++++++++
4 files changed, 108 insertions(+)
create mode 100644 samples/fanotify/Makefile
create mode 100644 samples/fanotify/fs-monitor.c

diff --git a/samples/Kconfig b/samples/Kconfig
index b5a1a7aa7e23..f2f9c939035f 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -120,6 +120,15 @@ config SAMPLE_CONNECTOR
with it.
See also Documentation/driver-api/connector.rst

+config SAMPLE_FANOTIFY_ERROR
+ bool "Build fanotify error monitoring sample"
+ depends on FANOTIFY
+ help
+ When enabled, this builds an example code that uses the
+ FAN_FS_ERROR fanotify mechanism to monitor filesystem
+ errors.
+ See also Documentation/admin-guide/filesystem-monitoring.rst.
+
config SAMPLE_HIDRAW
bool "hidraw sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
diff --git a/samples/Makefile b/samples/Makefile
index 087e0988ccc5..931a81847c48 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -5,6 +5,7 @@ subdir-$(CONFIG_SAMPLE_AUXDISPLAY) += auxdisplay
subdir-$(CONFIG_SAMPLE_ANDROID_BINDERFS) += binderfs
obj-$(CONFIG_SAMPLE_CONFIGFS) += configfs/
obj-$(CONFIG_SAMPLE_CONNECTOR) += connector/
+obj-$(CONFIG_SAMPLE_FANOTIFY_ERROR) += fanotify/
subdir-$(CONFIG_SAMPLE_HIDRAW) += hidraw
obj-$(CONFIG_SAMPLE_HW_BREAKPOINT) += hw_breakpoint/
obj-$(CONFIG_SAMPLE_KDB) += kdb/
diff --git a/samples/fanotify/Makefile b/samples/fanotify/Makefile
new file mode 100644
index 000000000000..b3d5cc826e6f
--- /dev/null
+++ b/samples/fanotify/Makefile
@@ -0,0 +1,3 @@
+userprogs-always-y += fs-monitor
+
+userccflags += -I usr/include
diff --git a/samples/fanotify/fs-monitor.c b/samples/fanotify/fs-monitor.c
new file mode 100644
index 000000000000..cb23a2592337
--- /dev/null
+++ b/samples/fanotify/fs-monitor.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2021, Collabora Ltd.
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <err.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <sys/fanotify.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+
+#ifndef FAN_FS_ERROR
+#define FAN_FS_ERROR 0x00008000
+#define FAN_EVENT_INFO_TYPE_ERROR 4
+
+struct fanotify_event_info_error {
+ struct fanotify_event_info_header hdr;
+ __s32 error;
+ __u32 error_count;
+ __kernel_fsid_t fsid;
+ __u64 ino;
+ __u32 ino_generation;
+};
+#endif
+
+static void handle_notifications(char *buffer, int len)
+{
+ struct fanotify_event_metadata *metadata;
+ struct fanotify_event_info_error *error;
+
+ for (metadata = (struct fanotify_event_metadata *) buffer;
+ FAN_EVENT_OK(metadata, len);
+ metadata = FAN_EVENT_NEXT(metadata, len)) {
+ if (metadata->mask != FAN_FS_ERROR) {
+ printf("unexpected FAN MARK: %llx\n", metadata->mask);
+ continue;
+ } else if (metadata->fd != FAN_NOFD) {
+ printf("Unexpected fd (!= FAN_NOFD)\n");
+ continue;
+ }
+
+ printf("FAN_FS_ERROR found len=%d\n", metadata->event_len);
+
+ error = (struct fanotify_event_info_error *) (metadata+1);
+ if (error->hdr.info_type != FAN_EVENT_INFO_TYPE_ERROR) {
+ printf("unknown record: %d\n", error->hdr.info_type);
+ continue;
+ }
+
+ printf("\tGeneric Error Record: len=%d\n", error->hdr.len);
+ printf("\tfsid: %x%x\n", error->fsid.val[0],
+ error->fsid.val[1]);
+ printf("\terror: %d\n", error->error);
+ printf("\tinode: %llu\tgen:%u\n", error->ino,
+ error->ino_generation);
+ printf("\terror_count: %d\n", error->error_count);
+ }
+}
+
+int main(int argc, char **argv)
+{
+ int fd;
+ char buffer[BUFSIZ];
+
+ if (argc < 2) {
+ printf("Missing path argument\n");
+ return 1;
+ }
+
+ fd = fanotify_init(FAN_CLASS_NOTIF, O_RDONLY);
+ if (fd < 0)
+ errx(1, "fanotify_init");
+
+ if (fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM,
+ FAN_FS_ERROR, AT_FDCWD, argv[1])) {
+ errx(1, "fanotify_mark");
+ }
+
+ while (1) {
+ int n = read(fd, buffer, BUFSIZ);
+
+ if (n < 0)
+ errx(1, "read");
+
+ handle_notifications(buffer, n);
+ }
+
+ return 0;
+}
--
2.31.0

2021-06-15 23:59:13

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 02/14] fanotify: Fold event size calculation to its own function

Every time this function is invoked, it is immediately added to
FAN_EVENT_METADATA_LEN, since there is no need to just calculate the
length of info records. This minor clean up folds the rest of the
calculation into the function, which now operates in terms of events,
returning the size of the entire event, including metadata.

Reviewed-by: Amir Goldstein <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

---
Changes since v1:
- rebased on top of hashing patches
---
fs/notify/fanotify/fanotify_user.c | 33 +++++++++++++++++-------------
1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index be5b6d2c01e7..0da4e5dcab0f 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -117,17 +117,24 @@ static int fanotify_fid_info_len(int fh_len, int name_len)
return roundup(FANOTIFY_INFO_HDR_LEN + info_len, FANOTIFY_EVENT_ALIGN);
}

-static int fanotify_event_info_len(unsigned int fid_mode,
- struct fanotify_event *event)
+static size_t fanotify_event_len(struct fanotify_event *event,
+ unsigned int fid_mode)
{
- struct fanotify_info *info = fanotify_event_info(event);
- int dir_fh_len = fanotify_event_dir_fh_len(event);
- int fh_len = fanotify_event_object_fh_len(event);
- int info_len = 0;
+ size_t event_len = FAN_EVENT_METADATA_LEN;
+ struct fanotify_info *info;
+ int dir_fh_len;
+ int fh_len;
int dot_len = 0;

+ if (!fid_mode)
+ return event_len;
+
+ info = fanotify_event_info(event);
+ dir_fh_len = fanotify_event_dir_fh_len(event);
+ fh_len = fanotify_event_object_fh_len(event);
+
if (dir_fh_len) {
- info_len += fanotify_fid_info_len(dir_fh_len, info->name_len);
+ event_len += fanotify_fid_info_len(dir_fh_len, info->name_len);
} else if ((fid_mode & FAN_REPORT_NAME) && (event->mask & FAN_ONDIR)) {
/*
* With group flag FAN_REPORT_NAME, if name was not recorded in
@@ -137,9 +144,9 @@ static int fanotify_event_info_len(unsigned int fid_mode,
}

if (fh_len)
- info_len += fanotify_fid_info_len(fh_len, dot_len);
+ event_len += fanotify_fid_info_len(fh_len, dot_len);

- return info_len;
+ return event_len;
}

/*
@@ -168,7 +175,7 @@ static void fanotify_unhash_event(struct fsnotify_group *group,
static struct fanotify_event *get_one_event(struct fsnotify_group *group,
size_t count)
{
- size_t event_size = FAN_EVENT_METADATA_LEN;
+ size_t event_size;
struct fanotify_event *event = NULL;
struct fsnotify_event *fsn_event;
unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
@@ -181,8 +188,7 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
goto out;

event = FANOTIFY_E(fsn_event);
- if (fid_mode)
- event_size += fanotify_event_info_len(fid_mode, event);
+ event_size = fanotify_event_len(event, fid_mode);

if (event_size > count) {
event = ERR_PTR(-EINVAL);
@@ -412,8 +418,7 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,

pr_debug("%s: group=%p event=%p\n", __func__, group, event);

- metadata.event_len = FAN_EVENT_METADATA_LEN +
- fanotify_event_info_len(fid_mode, event);
+ metadata.event_len = fanotify_event_len(event, fid_mode);
metadata.metadata_len = FAN_EVENT_METADATA_LEN;
metadata.vers = FANOTIFY_METADATA_VERSION;
metadata.reserved = 0;
--
2.31.0

2021-06-15 23:59:16

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 14/14] Documentation: Document the FAN_FS_ERROR event

Document the FAN_FS_ERROR event for user administrators and user space
developers.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>

---
Changes since v1:
- Drop references to location record
- Explain that the inode field is optional
- Explain we are reporting only the first error
---
.../admin-guide/filesystem-monitoring.rst | 63 +++++++++++++++++++
Documentation/admin-guide/index.rst | 1 +
2 files changed, 64 insertions(+)
create mode 100644 Documentation/admin-guide/filesystem-monitoring.rst

diff --git a/Documentation/admin-guide/filesystem-monitoring.rst b/Documentation/admin-guide/filesystem-monitoring.rst
new file mode 100644
index 000000000000..3710302676af
--- /dev/null
+++ b/Documentation/admin-guide/filesystem-monitoring.rst
@@ -0,0 +1,63 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+File system Monitoring with fanotify
+====================================
+
+fanotify supports the FAN_FS_ERROR mark for file system-wide error
+reporting. It is meant to be used by file system health monitoring
+daemons who listen on that interface and take actions (notify sysadmin,
+start recovery) when a file system problem is detected by the kernel.
+
+By design, A FAN_FS_ERROR notification exposes sufficient information for a
+monitoring tool to know a problem in the file system has happened. It
+doesn't necessarily provide a user space application with semantics to
+verify an IO operation was successfully executed. That is outside of
+scope of this feature. Instead, it is only meant as a framework for
+early file system problem detection and reporting recovery tools.
+
+When a file system operation fails, it is common for dozens of kernel
+errors to cascade after the initial failure, hiding the original failure
+log, which is usually the most useful debug data to troubleshoot the
+problem. For this reason, FAN_FS_ERROR only reports the first error that
+occurred since the last notification, and it simply counts addition
+errors. This ensures that the most important piece of error information
+is never lost.
+
+At the time of this writing, the only file system that emits FAN_FS_ERROR
+notifications is Ext4.
+
+A user space example code is provided at ``samples/fanotify/fs-monitor.c``.
+
+Notification structure
+======================
+
+A FAN_FS_ERROR Notification has the following format::
+
+ [ Notification Metadata (Mandatory) ]
+ [ Generic Error Record (Mandatory) ]
+
+Generic error record
+--------------------
+
+The generic error record provides enough information for a file system
+agnostic tool to learn about a problem in the file system, without
+providing any additional details about the problem. This record is
+identified by ``struct fanotify_event_info_header.info_type`` being set
+to FAN_EVENT_INFO_TYPE_ERROR.
+
+ struct fanotify_event_info_error {
+ struct fanotify_event_info_header hdr;
+ int error;
+ __u32 error_count;
+ __kernel_fsid_t fsid;
+ __u64 ino;
+ __u32 ino_generation;
+ };
+
+The `error` field identifies the type of error. `fsid` identifies the
+file system that originated the error, since multiple file systems might
+be reporting to the same fanotify group. The `inode` field is optional,
+as it depends on whether the error is linked to an specific inode.
+`error_count` count tracks the number of errors that occurred and were
+suppressed to preserve the original error, since the last notification.
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index dc00afcabb95..1bedab498104 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -82,6 +82,7 @@ configure specific aspects of kernel behavior to your liking.
edid
efi-stub
ext4
+ filesystem-monitoring
nfs/index
gpio/index
highuid
--
2.31.0

2021-06-15 23:59:18

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 08/14] fsnotify: Support passing argument to insert callback on add_event

FAN_FS_ERROR requires some initialization to happen from inside the
insert hook. This allows a user of fanotify_add_event to pass an
argument to be sent to the insert callback.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/notify/fanotify/fanotify.c | 5 +++--
fs/notify/inotify/inotify_fsnotify.c | 2 +-
fs/notify/notification.c | 7 +++++--
include/linux/fsnotify_backend.h | 7 +++++--
4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 3822e46fc18a..f64234489811 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -695,7 +695,8 @@ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
* Add an event to hash table for faster merge.
*/
static void fanotify_insert_event(struct fsnotify_group *group,
- struct fsnotify_event *fsn_event)
+ struct fsnotify_event *fsn_event,
+ const void *data)
{
struct fanotify_event *event = FANOTIFY_E(fsn_event);
unsigned int bucket = fanotify_event_hash_bucket(group, event);
@@ -777,7 +778,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
fsn_event = &event->fse;
ret = fsnotify_add_event(group, fsn_event, fanotify_merge,
fanotify_is_hashed_event(mask) ?
- fanotify_insert_event : NULL);
+ fanotify_insert_event : NULL, NULL);
if (ret) {
/* Permission events shouldn't be merged */
BUG_ON(ret == 1 && mask & FANOTIFY_PERM_EVENTS);
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index d1a64daa0171..a003a64ff8ee 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -116,7 +116,7 @@ int inotify_handle_inode_event(struct fsnotify_mark *inode_mark, u32 mask,
if (len)
strcpy(event->name, name->name);

- ret = fsnotify_add_event(group, fsn_event, inotify_merge, NULL);
+ ret = fsnotify_add_event(group, fsn_event, inotify_merge, NULL, NULL);
if (ret) {
/* Our event wasn't used in the end. Free it. */
fsnotify_destroy_event(group, fsn_event);
diff --git a/fs/notify/notification.c b/fs/notify/notification.c
index 033294669e07..73db3e7f1735 100644
--- a/fs/notify/notification.c
+++ b/fs/notify/notification.c
@@ -83,7 +83,9 @@ int fsnotify_add_event(struct fsnotify_group *group,
int (*merge)(struct fsnotify_group *,
struct fsnotify_event *),
void (*insert)(struct fsnotify_group *,
- struct fsnotify_event *))
+ struct fsnotify_event *,
+ const void *),
+ const void *insert_data)
{
int ret = 0;
struct list_head *list = &group->notification_list;
@@ -111,6 +113,7 @@ int fsnotify_add_event(struct fsnotify_group *group,
* them in the merge hash.
*/
insert = NULL;
+ insert_data = NULL;
goto queue;
}

@@ -126,7 +129,7 @@ int fsnotify_add_event(struct fsnotify_group *group,
group->q_len++;
list_add_tail(&event->list, list);
if (insert)
- insert(group, event);
+ insert(group, event, insert_data);
spin_unlock(&group->notification_lock);

wake_up(&group->notification_waitq);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index b1590f654ade..8222fe12a6c9 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -526,11 +526,14 @@ extern int fsnotify_add_event(struct fsnotify_group *group,
int (*merge)(struct fsnotify_group *,
struct fsnotify_event *),
void (*insert)(struct fsnotify_group *,
- struct fsnotify_event *));
+ struct fsnotify_event *,
+ const void *),
+ const void *insert_data);
+
/* Queue overflow event to a notification group */
static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
{
- fsnotify_add_event(group, group->overflow_event, NULL, NULL);
+ fsnotify_add_event(group, group->overflow_event, NULL, NULL, NULL);
}

static inline bool fsnotify_is_overflow_event(u32 mask)
--
2.31.0

2021-06-15 23:59:24

by Gabriel Krisman Bertazi

[permalink] [raw]
Subject: [PATCH v2 07/14] fsnotify: pass arguments of fsnotify() in struct fsnotify_event_info

From: Amir Goldstein <[email protected]>

There are a lot of arguments to fsnotify() and the handle_event() method.
Pass them in a const struct instead of on the argument list.

Apart from being more tidy, this is needed for passing userns to backend.
__fsnotify_parent() argument list was intentionally left untouched,
because its argument list is still short enough and because most of the
event info arguments are initialized inside __fsnotify_parent().

Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/notify/fanotify/fanotify.c | 59 +++++++++++------------
fs/notify/fsnotify.c | 83 +++++++++++++++++---------------
include/linux/fsnotify.h | 15 ++++--
include/linux/fsnotify_backend.h | 73 +++++++++++++++++++++-------
4 files changed, 140 insertions(+), 90 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index f85efb24cfb4..3822e46fc18a 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -253,21 +253,22 @@ static int fanotify_get_response(struct fsnotify_group *group,
* been included within the event mask, but have not been explicitly
* requested by the user, will not be present in the returned mask.
*/
-static u32 fanotify_group_event_mask(struct fsnotify_group *group,
- struct fsnotify_iter_info *iter_info,
- u32 event_mask, const void *data,
- int data_type, struct inode *dir)
+static u32 fanotify_group_event_mask(
+ struct fsnotify_group *group, u32 event_mask,
+ const struct fsnotify_event_info *event_info,
+ struct fsnotify_iter_info *iter_info)
{
__u32 marks_mask = 0, marks_ignored_mask = 0;
__u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS |
FANOTIFY_EVENT_FLAGS;
- const struct path *path = fsnotify_data_path(data, data_type);
+ const struct path *path = fsnotify_event_info_path(event_info);
unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
struct fsnotify_mark *mark;
int type;

pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
- __func__, iter_info->report_mask, event_mask, data, data_type);
+ __func__, iter_info->report_mask, event_mask,
+ event_info->data, event_info->data_type);

if (!fid_mode) {
/* Do we have path to open a file descriptor? */
@@ -278,7 +279,7 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
return 0;
} else if (!(fid_mode & FAN_REPORT_FID)) {
/* Do we have a directory inode to report? */
- if (!dir && !(event_mask & FS_ISDIR))
+ if (!event_info->dir && !(event_mask & FS_ISDIR))
return 0;
}

@@ -427,13 +428,13 @@ static int fanotify_encode_fh(struct fanotify_fh *fh, struct inode *inode,
* FS_ATTRIB reports the child inode even if reported on a watched parent.
* FS_CREATE reports the modified dir inode and not the created inode.
*/
-static struct inode *fanotify_fid_inode(u32 event_mask, const void *data,
- int data_type, struct inode *dir)
+static struct inode *fanotify_fid_inode(u32 event_mask,
+ const struct fsnotify_event_info *event_info)
{
if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
- return dir;
+ return event_info->dir;

- return fsnotify_data_inode(data, data_type);
+ return fsnotify_event_info_inode(event_info);
}

/*
@@ -444,18 +445,18 @@ static struct inode *fanotify_fid_inode(u32 event_mask, const void *data,
* reported to parent.
* Otherwise, do not report dir fid.
*/
-static struct inode *fanotify_dfid_inode(u32 event_mask, const void *data,
- int data_type, struct inode *dir)
+static struct inode *fanotify_dfid_inode(u32 event_mask,
+ const struct fsnotify_event_info *event_info)
{
- struct inode *inode = fsnotify_data_inode(data, data_type);
+ struct inode *inode = fsnotify_event_info_inode(event_info);

if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
- return dir;
+ return event_info->dir;

if (S_ISDIR(inode->i_mode))
return inode;

- return dir;
+ return event_info->dir;
}

static struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
@@ -563,17 +564,17 @@ static struct fanotify_event *fanotify_alloc_name_event(struct inode *id,
return &fne->fae;
}

-static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
- u32 mask, const void *data,
- int data_type, struct inode *dir,
- const struct qstr *file_name,
- __kernel_fsid_t *fsid)
+static struct fanotify_event *fanotify_alloc_event(
+ struct fsnotify_group *group, u32 mask,
+ const struct fsnotify_event_info *event_info,
+ __kernel_fsid_t *fsid)
{
struct fanotify_event *event = NULL;
gfp_t gfp = GFP_KERNEL_ACCOUNT;
- struct inode *id = fanotify_fid_inode(mask, data, data_type, dir);
- struct inode *dirid = fanotify_dfid_inode(mask, data, data_type, dir);
- const struct path *path = fsnotify_data_path(data, data_type);
+ struct inode *id = fanotify_fid_inode(mask, event_info);
+ struct inode *dirid = fanotify_dfid_inode(mask, event_info);
+ const struct path *path = fsnotify_event_info_path(event_info);
+ const struct qstr *file_name = event_info->name;
unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
struct mem_cgroup *old_memcg;
struct inode *child = NULL;
@@ -709,9 +710,7 @@ static void fanotify_insert_event(struct fsnotify_group *group,
}

static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
- const void *data, int data_type,
- struct inode *dir,
- const struct qstr *file_name, u32 cookie,
+ const struct fsnotify_event_info *event_info,
struct fsnotify_iter_info *iter_info)
{
int ret = 0;
@@ -741,8 +740,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,

BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);

- mask = fanotify_group_event_mask(group, iter_info, mask, data,
- data_type, dir);
+ mask = fanotify_group_event_mask(group, mask, event_info, iter_info);
if (!mask)
return 0;

@@ -764,8 +762,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
return 0;
}

- event = fanotify_alloc_event(group, mask, data, data_type, dir,
- file_name, &fsid);
+ event = fanotify_alloc_event(group, mask, event_info, &fsid);
ret = -ENOMEM;
if (unlikely(!event)) {
/*
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 30d422b8c0fc..36205a769dde 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -177,8 +177,8 @@ static bool fsnotify_event_needs_parent(struct inode *inode, struct mount *mnt,
* Notify only the child without name info if parent is not watching and
* inode/sb/mount are not interested in events with parent and name info.
*/
-int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
- int data_type)
+int __fsnotify_parent(struct dentry *dentry, __u32 mask,
+ const void *data, int data_type)
{
const struct path *path = fsnotify_data_path(data, data_type);
struct mount *mnt = path ? real_mount(path->mnt) : NULL;
@@ -229,7 +229,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
}

notify:
- ret = fsnotify(mask, data, data_type, p_inode, file_name, inode, 0);
+ ret = __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = data, .data_type = data_type,
+ .dir = p_inode, .name = file_name,
+ .inode = inode,
+ });

if (file_name)
release_dentry_name_snapshot(&name);
@@ -240,13 +244,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
EXPORT_SYMBOL_GPL(__fsnotify_parent);

static int fsnotify_handle_inode_event(struct fsnotify_group *group,
- struct fsnotify_mark *inode_mark,
- u32 mask, const void *data, int data_type,
- struct inode *dir, const struct qstr *name,
- u32 cookie)
+ struct fsnotify_mark *inode_mark, u32 mask,
+ const struct fsnotify_event_info *event_info)
{
- const struct path *path = fsnotify_data_path(data, data_type);
- struct inode *inode = fsnotify_data_inode(data, data_type);
+ const struct path *path = fsnotify_event_info_path(event_info);
+ struct inode *inode = fsnotify_event_info_inode(event_info);
const struct fsnotify_ops *ops = group->ops;

if (WARN_ON_ONCE(!ops->handle_inode_event))
@@ -260,16 +262,17 @@ static int fsnotify_handle_inode_event(struct fsnotify_group *group,
if (!(mask & inode_mark->mask & ALL_FSNOTIFY_EVENTS))
return 0;

- return ops->handle_inode_event(inode_mark, mask, inode, dir, name, cookie);
+ return ops->handle_inode_event(inode_mark, mask, inode, event_info->dir,
+ event_info->name, event_info->cookie);
}

static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
- const void *data, int data_type,
- struct inode *dir, const struct qstr *name,
- u32 cookie, struct fsnotify_iter_info *iter_info)
+ const struct fsnotify_event_info *event_info,
+ struct fsnotify_iter_info *iter_info)
{
struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
struct fsnotify_mark *parent_mark = fsnotify_iter_parent_mark(iter_info);
+ struct fsnotify_event_info child_event_info = { };
int ret;

if (WARN_ON_ONCE(fsnotify_iter_sb_mark(iter_info)) ||
@@ -284,8 +287,8 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
* interested in this event?
*/
if (parent_mark->mask & FS_EVENT_ON_CHILD) {
- ret = fsnotify_handle_inode_event(group, parent_mark, mask,
- data, data_type, dir, name, 0);
+ ret = fsnotify_handle_inode_event(group, parent_mark,
+ mask, event_info);
if (ret)
return ret;
}
@@ -302,18 +305,22 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
* The child watcher is expecting an event without a file name
* and without the FS_EVENT_ON_CHILD flag.
*/
+ if (WARN_ON_ONCE(!event_info->inode))
+ return 0;
+
mask &= ~FS_EVENT_ON_CHILD;
- dir = NULL;
- name = NULL;
+ child_event_info = *event_info;
+ child_event_info.dir = NULL;
+ child_event_info.name = NULL;
+ event_info = &child_event_info;
}

- return fsnotify_handle_inode_event(group, inode_mark, mask, data, data_type,
- dir, name, cookie);
+ return fsnotify_handle_inode_event(group, inode_mark, mask, event_info);
}

-static int send_to_group(__u32 mask, const void *data, int data_type,
- struct inode *dir, const struct qstr *file_name,
- u32 cookie, struct fsnotify_iter_info *iter_info)
+static int send_to_group(__u32 mask,
+ const struct fsnotify_event_info *event_info,
+ struct fsnotify_iter_info *iter_info)
{
struct fsnotify_group *group = NULL;
__u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
@@ -351,18 +358,18 @@ static int send_to_group(__u32 mask, const void *data, int data_type,

pr_debug("%s: group=%p mask=%x marks_mask=%x marks_ignored_mask=%x data=%p data_type=%d dir=%p cookie=%d\n",
__func__, group, mask, marks_mask, marks_ignored_mask,
- data, data_type, dir, cookie);
+ event_info->data, event_info->data_type, event_info->dir,
+ event_info->cookie);

if (!(test_mask & marks_mask & ~marks_ignored_mask))
return 0;

if (group->ops->handle_event) {
- return group->ops->handle_event(group, mask, data, data_type, dir,
- file_name, cookie, iter_info);
+ return group->ops->handle_event(group, mask, event_info,
+ iter_info);
}

- return fsnotify_handle_event(group, mask, data, data_type, dir,
- file_name, cookie, iter_info);
+ return fsnotify_handle_event(group, mask, event_info, iter_info);
}

static struct fsnotify_mark *fsnotify_first_mark(struct fsnotify_mark_connector **connp)
@@ -448,21 +455,22 @@ static void fsnotify_iter_next(struct fsnotify_iter_info *iter_info)
* in whatever means they feel necessary.
*
* @mask: event type and flags
+ * Input args in struct fsnotify_event_info:
* @data: object that event happened on
* @data_type: type of object for fanotify_data_XXX() accessors
* @dir: optional directory associated with event -
- * if @file_name is not NULL, this is the directory that
- * @file_name is relative to
- * @file_name: optional file name associated with event
+ * if @name is not NULL, this is the directory that
+ * @name is relative to
+ * @name: optional file name associated with event
* @inode: optional inode associated with event -
* either @dir or @inode must be non-NULL.
* if both are non-NULL event may be reported to both.
* @cookie: inotify rename cookie
*/
-int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
- const struct qstr *file_name, struct inode *inode, u32 cookie)
+int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
{
- const struct path *path = fsnotify_data_path(data, data_type);
+ const struct path *path = fsnotify_event_info_path(event_info);
+ struct inode *inode = event_info->inode;
struct fsnotify_iter_info iter_info = {};
struct super_block *sb;
struct mount *mnt = NULL;
@@ -475,13 +483,13 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,

if (!inode) {
/* Dirent event - report on TYPE_INODE to dir */
- inode = dir;
+ inode = event_info->dir;
} else if (mask & FS_EVENT_ON_CHILD) {
/*
* Event on child - report on TYPE_PARENT to dir if it is
* watching children and on TYPE_INODE to child.
*/
- parent = dir;
+ parent = event_info->dir;
}
sb = inode->i_sb;

@@ -538,8 +546,7 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
* That's why this traversal is so complicated...
*/
while (fsnotify_iter_select_report_types(&iter_info)) {
- ret = send_to_group(mask, data, data_type, dir, file_name,
- cookie, &iter_info);
+ ret = send_to_group(mask, event_info, &iter_info);

if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
goto out;
@@ -552,7 +559,7 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,

return ret;
}
-EXPORT_SYMBOL_GPL(fsnotify);
+EXPORT_SYMBOL_GPL(__fsnotify);

static __init int fsnotify_init(void)
{
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index f8acddcf54fb..8c2c681b4495 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -30,7 +30,10 @@ static inline void fsnotify_name(struct inode *dir, __u32 mask,
struct inode *child,
const struct qstr *name, u32 cookie)
{
- fsnotify(mask, child, FSNOTIFY_EVENT_INODE, dir, name, NULL, cookie);
+ __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = child, .data_type = FSNOTIFY_EVENT_INODE,
+ .dir = dir, .name = name, .cookie = cookie,
+ });
}

static inline void fsnotify_dirent(struct inode *dir, struct dentry *dentry,
@@ -44,7 +47,10 @@ static inline void fsnotify_inode(struct inode *inode, __u32 mask)
if (S_ISDIR(inode->i_mode))
mask |= FS_ISDIR;

- fsnotify(mask, inode, FSNOTIFY_EVENT_INODE, NULL, NULL, inode, 0);
+ __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = inode, .data_type = FSNOTIFY_EVENT_INODE,
+ .inode = inode,
+ });
}

/* Notify this dentry's parent about a child's events. */
@@ -68,7 +74,10 @@ static inline int fsnotify_parent(struct dentry *dentry, __u32 mask,
return __fsnotify_parent(dentry, mask, data, data_type);

notify_child:
- return fsnotify(mask, data, data_type, NULL, NULL, inode, 0);
+ return __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = data, .data_type = data_type,
+ .inode = inode,
+ });
}

/*
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index f9e2c6cd0f7d..b1590f654ade 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -112,6 +112,28 @@ struct fsnotify_iter_info;

struct mem_cgroup;

+/*
+ * Event info args passed to fsnotify() and to backends on handle_event():
+ * @data: object that event happened on
+ * @data_type: type of object for fanotify_data_XXX() accessors
+ * @dir: optional directory associated with event -
+ * if @name is not NULL, this is the directory that
+ * @name is relative to
+ * @name: optional file name associated with event
+ * @inode: optional inode associated with event -
+ * either @dir or @inode must be non-NULL.
+ * if both are non-NULL event may be reported to both.
+ * @cookie: inotify rename cookie
+ */
+struct fsnotify_event_info {
+ const void *data;
+ int data_type;
+ struct inode *dir;
+ const struct qstr *name;
+ struct inode *inode;
+ u32 cookie;
+};
+
/*
* Each group much define these ops. The fsnotify infrastructure will call
* these operations for each relevant group.
@@ -119,13 +141,7 @@ struct mem_cgroup;
* handle_event - main call for a group to handle an fs event
* @group: group to notify
* @mask: event type and flags
- * @data: object that event happened on
- * @data_type: type of object for fanotify_data_XXX() accessors
- * @dir: optional directory associated with event -
- * if @file_name is not NULL, this is the directory that
- * @file_name is relative to
- * @file_name: optional file name associated with event
- * @cookie: inotify rename cookie
+ * @event_info: information attached to the event
* @iter_info: array of marks from this group that are interested in the event
*
* handle_inode_event - simple variant of handle_event() for groups that only
@@ -147,8 +163,7 @@ struct mem_cgroup;
*/
struct fsnotify_ops {
int (*handle_event)(struct fsnotify_group *group, u32 mask,
- const void *data, int data_type, struct inode *dir,
- const struct qstr *file_name, u32 cookie,
+ const struct fsnotify_event_info *event_info,
struct fsnotify_iter_info *iter_info);
int (*handle_inode_event)(struct fsnotify_mark *mark, u32 mask,
struct inode *inode, struct inode *dir,
@@ -262,6 +277,12 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
}
}

+static inline struct inode *fsnotify_event_info_inode(
+ const struct fsnotify_event_info *event_info)
+{
+ return fsnotify_data_inode(event_info->data, event_info->data_type);
+}
+
static inline const struct path *fsnotify_data_path(const void *data,
int data_type)
{
@@ -273,6 +294,12 @@ static inline const struct path *fsnotify_data_path(const void *data,
}
}

+static inline const struct path *fsnotify_event_info_path(
+ const struct fsnotify_event_info *event_info)
+{
+ return fsnotify_data_path(event_info->data, event_info->data_type);
+}
+
enum fsnotify_obj_type {
FSNOTIFY_OBJ_TYPE_INODE,
FSNOTIFY_OBJ_TYPE_PARENT,
@@ -410,11 +437,22 @@ struct fsnotify_mark {
/* called from the vfs helpers */

/* main fsnotify call to send events */
-extern int fsnotify(__u32 mask, const void *data, int data_type,
- struct inode *dir, const struct qstr *name,
- struct inode *inode, u32 cookie);
-extern int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
- int data_type);
+extern int __fsnotify(__u32 mask,
+ const struct fsnotify_event_info *event_info);
+extern int __fsnotify_parent(struct dentry *dentry, __u32 mask,
+ const void *data, int data_type);
+
+static inline int fsnotify(__u32 mask, const void *data, int data_type,
+ struct inode *dir, const struct qstr *name,
+ struct inode *inode, u32 cookie)
+{
+ return __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = data, .data_type = data_type,
+ .dir = dir, .name = name, .inode = inode,
+ .cookie = cookie,
+ });
+}
+
extern void __fsnotify_inode_delete(struct inode *inode);
extern void __fsnotify_vfsmount_delete(struct vfsmount *mnt);
extern void fsnotify_sb_delete(struct super_block *sb);
@@ -594,15 +632,14 @@ static inline void fsnotify_init_event(struct fsnotify_event *event)

#else

-static inline int fsnotify(__u32 mask, const void *data, int data_type,
- struct inode *dir, const struct qstr *name,
- struct inode *inode, u32 cookie)
+static inline int fsnotify(__u32 mask,
+ const struct fsnotify_event_info *event_info)
{
return 0;
}

static inline int __fsnotify_parent(struct dentry *dentry, __u32 mask,
- const void *data, int data_type)
+ const void *data, int data_type)
{
return 0;
}
--
2.31.0

2021-06-16 08:07:52

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH v2 03/14] fanotify: Split fsid check from other fid mode checks

On Wed, Jun 16, 2021 at 2:56 AM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> FAN_ERROR will require fsid, but not necessarily require the filesystem

FAN_FS_ERROR

> to expose a file handle. Split those checks into different functions, so
> they can be used separately when setting up an event.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ---
> Changes since v1:
> (Amir)
> - Sort hunks to simplify diff.
> Changes since RFC:
> (Amir)
> - Rename fanotify_check_path_fsid -> fanotify_test_fsid.
> - Use dentry directly instead of path.
> ---
> fs/notify/fanotify/fanotify_user.c | 27 ++++++++++++++++++---------
> 1 file changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 0da4e5dcab0f..af518790a80f 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -1192,16 +1192,15 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> return fd;
> }
>
> -/* Check if filesystem can encode a unique fid */
> -static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
> +static int fanotify_test_fsid(struct dentry *dentry, __kernel_fsid_t *fsid)
> {
> __kernel_fsid_t root_fsid;
> int err;
>
> /*
> - * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
> + * Make sure dentry is not of a filesystem with zero fsid (e.g. tmpfs).

Sorry.... I forgot to update this comment
59cda49ecf6c ("shmem: allow reporting fanotify events with file
handles on tmpfs")
Not your fault, but best we fix the comment if we change it to be correct.
It can be changed to (e.g. fuse)

Thanks,
Amir.

2021-06-16 09:19:37

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH v2 04/14] fanotify: Split superblock marks out to a new cache

On Wed, Jun 16, 2021 at 2:56 AM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> FAN_ERROR will require an error structure to be stored per mark. But,

FAN_FS_ERROR

> since FAN_ERROR doesn't apply to inode/mount marks, it should suffice to
> only expose this information for superblock marks. Therefore, wrap this
> kind of marks into a container and plumb it for the future.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ---
> Changes since v1:
> - Only extend superblock marks
> ---
> fs/notify/fanotify/fanotify.c | 10 ++++++++--
> fs/notify/fanotify/fanotify.h | 11 +++++++++++
> fs/notify/fanotify/fanotify_user.c | 29 ++++++++++++++++++++++++++++-
> include/linux/fsnotify_backend.h | 1 +
> 4 files changed, 48 insertions(+), 3 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 057abd2cf887..f85efb24cfb4 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -867,9 +867,15 @@ static void fanotify_freeing_mark(struct fsnotify_mark *mark,
> dec_ucount(group->fanotify_data.ucounts, UCOUNT_FANOTIFY_MARKS);
> }
>
> -static void fanotify_free_mark(struct fsnotify_mark *fsn_mark)
> +static void fanotify_free_mark(struct fsnotify_mark *mark)
> {
> - kmem_cache_free(fanotify_mark_cache, fsn_mark);
> + if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
> + struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);
> +
> + kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
> + } else {
> + kmem_cache_free(fanotify_mark_cache, mark);
> + }
> }
>
> const struct fsnotify_ops fanotify_fsnotify_ops = {
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index 4a5e555dc3d2..aec05e21d5a9 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -6,6 +6,7 @@
> #include <linux/hashtable.h>
>
> extern struct kmem_cache *fanotify_mark_cache;
> +extern struct kmem_cache *fanotify_sb_mark_cache;
> extern struct kmem_cache *fanotify_fid_event_cachep;
> extern struct kmem_cache *fanotify_path_event_cachep;
> extern struct kmem_cache *fanotify_perm_event_cachep;
> @@ -129,6 +130,16 @@ static inline void fanotify_info_copy_name(struct fanotify_info *info,
> name->name);
> }
>
> +struct fanotify_sb_mark {
> + struct fsnotify_mark fsn_mark;
> +};
> +
> +static inline
> +struct fanotify_sb_mark *FANOTIFY_SB_MARK(struct fsnotify_mark *mark)
> +{
> + return container_of(mark, struct fanotify_sb_mark, fsn_mark);
> +}
> +
> /*
> * Common structure for fanotify events. Concrete structs are allocated in
> * fanotify_handle_event() and freed when the information is retrieved by
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index af518790a80f..db378480f1b1 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -99,6 +99,7 @@ struct ctl_table fanotify_table[] = {
> extern const struct fsnotify_ops fanotify_fsnotify_ops;
>
> struct kmem_cache *fanotify_mark_cache __read_mostly;
> +struct kmem_cache *fanotify_sb_mark_cache __read_mostly;
> struct kmem_cache *fanotify_fid_event_cachep __read_mostly;
> struct kmem_cache *fanotify_path_event_cachep __read_mostly;
> struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
> @@ -915,6 +916,27 @@ static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
> return mask & ~oldmask;
> }
>
> +static struct fsnotify_mark *fanotify_alloc_mark(unsigned int type)
> +{
> + struct fanotify_sb_mark *sb_mark;
> +
> + switch (type) {
> + case FSNOTIFY_OBJ_TYPE_SB:
> + sb_mark = kmem_cache_zalloc(fanotify_sb_mark_cache, GFP_KERNEL);
> + if (!sb_mark)
> + return NULL;
> + return &sb_mark->fsn_mark;
> +
> + case FSNOTIFY_OBJ_TYPE_INODE:
> + case FSNOTIFY_OBJ_TYPE_PARENT:
> + case FSNOTIFY_OBJ_TYPE_VFSMOUNT:
> + return kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
> + default:
> + WARN_ON(1);
> + return NULL;
> + }
> +}
> +
> static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
> fsnotify_connp_t *connp,
> unsigned int type,
> @@ -933,13 +955,16 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
> !inc_ucount(ucounts->ns, ucounts->uid, UCOUNT_FANOTIFY_MARKS))
> return ERR_PTR(-ENOSPC);
>
> - mark = kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
> + mark = fanotify_alloc_mark(type);
> if (!mark) {
> ret = -ENOMEM;
> goto out_dec_ucounts;
> }
>
> fsnotify_init_mark(mark, group);
> + if (type == FSNOTIFY_OBJ_TYPE_SB)
> + mark->flags |= FSNOTIFY_MARK_FLAG_SB;
> +

Please make sure to set the flag inside fanotify_alloc_mark() similar to
how fanotify_alloc_*_event() set the event type that is checked
in fanotify_free_event().

It mean passing group to fanotify_alloc_mark() and calling
fsnotify_init_mark() inside fanotify_alloc_mark().

Thanks,
Amir.

2021-06-16 09:26:51

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH v2 07/14] fsnotify: pass arguments of fsnotify() in struct fsnotify_event_info

On Wed, Jun 16, 2021 at 2:56 AM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> From: Amir Goldstein <[email protected]>
>
> There are a lot of arguments to fsnotify() and the handle_event() method.
> Pass them in a const struct instead of on the argument list.
>
> Apart from being more tidy, this is needed for passing userns to backend.

Remove this line or change to "needed for <your thing>"
Because I don't know if passing userns to backend is ever going to happen.

Thanks,
Amir.

> __fsnotify_parent() argument list was intentionally left untouched,
> because its argument list is still short enough and because most of the
> event info arguments are initialized inside __fsnotify_parent().
>
> Signed-off-by: Amir Goldstein <[email protected]>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> fs/notify/fanotify/fanotify.c | 59 +++++++++++------------
> fs/notify/fsnotify.c | 83 +++++++++++++++++---------------
> include/linux/fsnotify.h | 15 ++++--
> include/linux/fsnotify_backend.h | 73 +++++++++++++++++++++-------
> 4 files changed, 140 insertions(+), 90 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index f85efb24cfb4..3822e46fc18a 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -253,21 +253,22 @@ static int fanotify_get_response(struct fsnotify_group *group,
> * been included within the event mask, but have not been explicitly
> * requested by the user, will not be present in the returned mask.
> */
> -static u32 fanotify_group_event_mask(struct fsnotify_group *group,
> - struct fsnotify_iter_info *iter_info,
> - u32 event_mask, const void *data,
> - int data_type, struct inode *dir)
> +static u32 fanotify_group_event_mask(
> + struct fsnotify_group *group, u32 event_mask,
> + const struct fsnotify_event_info *event_info,
> + struct fsnotify_iter_info *iter_info)
> {
> __u32 marks_mask = 0, marks_ignored_mask = 0;
> __u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS |
> FANOTIFY_EVENT_FLAGS;
> - const struct path *path = fsnotify_data_path(data, data_type);
> + const struct path *path = fsnotify_event_info_path(event_info);
> unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> struct fsnotify_mark *mark;
> int type;
>
> pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
> - __func__, iter_info->report_mask, event_mask, data, data_type);
> + __func__, iter_info->report_mask, event_mask,
> + event_info->data, event_info->data_type);
>
> if (!fid_mode) {
> /* Do we have path to open a file descriptor? */
> @@ -278,7 +279,7 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
> return 0;
> } else if (!(fid_mode & FAN_REPORT_FID)) {
> /* Do we have a directory inode to report? */
> - if (!dir && !(event_mask & FS_ISDIR))
> + if (!event_info->dir && !(event_mask & FS_ISDIR))
> return 0;
> }
>
> @@ -427,13 +428,13 @@ static int fanotify_encode_fh(struct fanotify_fh *fh, struct inode *inode,
> * FS_ATTRIB reports the child inode even if reported on a watched parent.
> * FS_CREATE reports the modified dir inode and not the created inode.
> */
> -static struct inode *fanotify_fid_inode(u32 event_mask, const void *data,
> - int data_type, struct inode *dir)
> +static struct inode *fanotify_fid_inode(u32 event_mask,
> + const struct fsnotify_event_info *event_info)
> {
> if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
> - return dir;
> + return event_info->dir;
>
> - return fsnotify_data_inode(data, data_type);
> + return fsnotify_event_info_inode(event_info);
> }
>
> /*
> @@ -444,18 +445,18 @@ static struct inode *fanotify_fid_inode(u32 event_mask, const void *data,
> * reported to parent.
> * Otherwise, do not report dir fid.
> */
> -static struct inode *fanotify_dfid_inode(u32 event_mask, const void *data,
> - int data_type, struct inode *dir)
> +static struct inode *fanotify_dfid_inode(u32 event_mask,
> + const struct fsnotify_event_info *event_info)
> {
> - struct inode *inode = fsnotify_data_inode(data, data_type);
> + struct inode *inode = fsnotify_event_info_inode(event_info);
>
> if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
> - return dir;
> + return event_info->dir;
>
> if (S_ISDIR(inode->i_mode))
> return inode;
>
> - return dir;
> + return event_info->dir;
> }
>
> static struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
> @@ -563,17 +564,17 @@ static struct fanotify_event *fanotify_alloc_name_event(struct inode *id,
> return &fne->fae;
> }
>
> -static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
> - u32 mask, const void *data,
> - int data_type, struct inode *dir,
> - const struct qstr *file_name,
> - __kernel_fsid_t *fsid)
> +static struct fanotify_event *fanotify_alloc_event(
> + struct fsnotify_group *group, u32 mask,
> + const struct fsnotify_event_info *event_info,
> + __kernel_fsid_t *fsid)
> {
> struct fanotify_event *event = NULL;
> gfp_t gfp = GFP_KERNEL_ACCOUNT;
> - struct inode *id = fanotify_fid_inode(mask, data, data_type, dir);
> - struct inode *dirid = fanotify_dfid_inode(mask, data, data_type, dir);
> - const struct path *path = fsnotify_data_path(data, data_type);
> + struct inode *id = fanotify_fid_inode(mask, event_info);
> + struct inode *dirid = fanotify_dfid_inode(mask, event_info);
> + const struct path *path = fsnotify_event_info_path(event_info);
> + const struct qstr *file_name = event_info->name;
> unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> struct mem_cgroup *old_memcg;
> struct inode *child = NULL;
> @@ -709,9 +710,7 @@ static void fanotify_insert_event(struct fsnotify_group *group,
> }
>
> static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
> - const void *data, int data_type,
> - struct inode *dir,
> - const struct qstr *file_name, u32 cookie,
> + const struct fsnotify_event_info *event_info,
> struct fsnotify_iter_info *iter_info)
> {
> int ret = 0;
> @@ -741,8 +740,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>
> BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
>
> - mask = fanotify_group_event_mask(group, iter_info, mask, data,
> - data_type, dir);
> + mask = fanotify_group_event_mask(group, mask, event_info, iter_info);
> if (!mask)
> return 0;
>
> @@ -764,8 +762,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
> return 0;
> }
>
> - event = fanotify_alloc_event(group, mask, data, data_type, dir,
> - file_name, &fsid);
> + event = fanotify_alloc_event(group, mask, event_info, &fsid);
> ret = -ENOMEM;
> if (unlikely(!event)) {
> /*
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 30d422b8c0fc..36205a769dde 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -177,8 +177,8 @@ static bool fsnotify_event_needs_parent(struct inode *inode, struct mount *mnt,
> * Notify only the child without name info if parent is not watching and
> * inode/sb/mount are not interested in events with parent and name info.
> */
> -int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> - int data_type)
> +int __fsnotify_parent(struct dentry *dentry, __u32 mask,
> + const void *data, int data_type)
> {
> const struct path *path = fsnotify_data_path(data, data_type);
> struct mount *mnt = path ? real_mount(path->mnt) : NULL;
> @@ -229,7 +229,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> }
>
> notify:
> - ret = fsnotify(mask, data, data_type, p_inode, file_name, inode, 0);
> + ret = __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = data, .data_type = data_type,
> + .dir = p_inode, .name = file_name,
> + .inode = inode,
> + });
>
> if (file_name)
> release_dentry_name_snapshot(&name);
> @@ -240,13 +244,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> EXPORT_SYMBOL_GPL(__fsnotify_parent);
>
> static int fsnotify_handle_inode_event(struct fsnotify_group *group,
> - struct fsnotify_mark *inode_mark,
> - u32 mask, const void *data, int data_type,
> - struct inode *dir, const struct qstr *name,
> - u32 cookie)
> + struct fsnotify_mark *inode_mark, u32 mask,
> + const struct fsnotify_event_info *event_info)
> {
> - const struct path *path = fsnotify_data_path(data, data_type);
> - struct inode *inode = fsnotify_data_inode(data, data_type);
> + const struct path *path = fsnotify_event_info_path(event_info);
> + struct inode *inode = fsnotify_event_info_inode(event_info);
> const struct fsnotify_ops *ops = group->ops;
>
> if (WARN_ON_ONCE(!ops->handle_inode_event))
> @@ -260,16 +262,17 @@ static int fsnotify_handle_inode_event(struct fsnotify_group *group,
> if (!(mask & inode_mark->mask & ALL_FSNOTIFY_EVENTS))
> return 0;
>
> - return ops->handle_inode_event(inode_mark, mask, inode, dir, name, cookie);
> + return ops->handle_inode_event(inode_mark, mask, inode, event_info->dir,
> + event_info->name, event_info->cookie);
> }
>
> static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
> - const void *data, int data_type,
> - struct inode *dir, const struct qstr *name,
> - u32 cookie, struct fsnotify_iter_info *iter_info)
> + const struct fsnotify_event_info *event_info,
> + struct fsnotify_iter_info *iter_info)
> {
> struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
> struct fsnotify_mark *parent_mark = fsnotify_iter_parent_mark(iter_info);
> + struct fsnotify_event_info child_event_info = { };
> int ret;
>
> if (WARN_ON_ONCE(fsnotify_iter_sb_mark(iter_info)) ||
> @@ -284,8 +287,8 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
> * interested in this event?
> */
> if (parent_mark->mask & FS_EVENT_ON_CHILD) {
> - ret = fsnotify_handle_inode_event(group, parent_mark, mask,
> - data, data_type, dir, name, 0);
> + ret = fsnotify_handle_inode_event(group, parent_mark,
> + mask, event_info);
> if (ret)
> return ret;
> }
> @@ -302,18 +305,22 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
> * The child watcher is expecting an event without a file name
> * and without the FS_EVENT_ON_CHILD flag.
> */
> + if (WARN_ON_ONCE(!event_info->inode))
> + return 0;
> +
> mask &= ~FS_EVENT_ON_CHILD;
> - dir = NULL;
> - name = NULL;
> + child_event_info = *event_info;
> + child_event_info.dir = NULL;
> + child_event_info.name = NULL;
> + event_info = &child_event_info;
> }
>
> - return fsnotify_handle_inode_event(group, inode_mark, mask, data, data_type,
> - dir, name, cookie);
> + return fsnotify_handle_inode_event(group, inode_mark, mask, event_info);
> }
>
> -static int send_to_group(__u32 mask, const void *data, int data_type,
> - struct inode *dir, const struct qstr *file_name,
> - u32 cookie, struct fsnotify_iter_info *iter_info)
> +static int send_to_group(__u32 mask,
> + const struct fsnotify_event_info *event_info,
> + struct fsnotify_iter_info *iter_info)
> {
> struct fsnotify_group *group = NULL;
> __u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
> @@ -351,18 +358,18 @@ static int send_to_group(__u32 mask, const void *data, int data_type,
>
> pr_debug("%s: group=%p mask=%x marks_mask=%x marks_ignored_mask=%x data=%p data_type=%d dir=%p cookie=%d\n",
> __func__, group, mask, marks_mask, marks_ignored_mask,
> - data, data_type, dir, cookie);
> + event_info->data, event_info->data_type, event_info->dir,
> + event_info->cookie);
>
> if (!(test_mask & marks_mask & ~marks_ignored_mask))
> return 0;
>
> if (group->ops->handle_event) {
> - return group->ops->handle_event(group, mask, data, data_type, dir,
> - file_name, cookie, iter_info);
> + return group->ops->handle_event(group, mask, event_info,
> + iter_info);
> }
>
> - return fsnotify_handle_event(group, mask, data, data_type, dir,
> - file_name, cookie, iter_info);
> + return fsnotify_handle_event(group, mask, event_info, iter_info);
> }
>
> static struct fsnotify_mark *fsnotify_first_mark(struct fsnotify_mark_connector **connp)
> @@ -448,21 +455,22 @@ static void fsnotify_iter_next(struct fsnotify_iter_info *iter_info)
> * in whatever means they feel necessary.
> *
> * @mask: event type and flags
> + * Input args in struct fsnotify_event_info:
> * @data: object that event happened on
> * @data_type: type of object for fanotify_data_XXX() accessors
> * @dir: optional directory associated with event -
> - * if @file_name is not NULL, this is the directory that
> - * @file_name is relative to
> - * @file_name: optional file name associated with event
> + * if @name is not NULL, this is the directory that
> + * @name is relative to
> + * @name: optional file name associated with event
> * @inode: optional inode associated with event -
> * either @dir or @inode must be non-NULL.
> * if both are non-NULL event may be reported to both.
> * @cookie: inotify rename cookie
> */
> -int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
> - const struct qstr *file_name, struct inode *inode, u32 cookie)
> +int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
> {
> - const struct path *path = fsnotify_data_path(data, data_type);
> + const struct path *path = fsnotify_event_info_path(event_info);
> + struct inode *inode = event_info->inode;
> struct fsnotify_iter_info iter_info = {};
> struct super_block *sb;
> struct mount *mnt = NULL;
> @@ -475,13 +483,13 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
>
> if (!inode) {
> /* Dirent event - report on TYPE_INODE to dir */
> - inode = dir;
> + inode = event_info->dir;
> } else if (mask & FS_EVENT_ON_CHILD) {
> /*
> * Event on child - report on TYPE_PARENT to dir if it is
> * watching children and on TYPE_INODE to child.
> */
> - parent = dir;
> + parent = event_info->dir;
> }
> sb = inode->i_sb;
>
> @@ -538,8 +546,7 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
> * That's why this traversal is so complicated...
> */
> while (fsnotify_iter_select_report_types(&iter_info)) {
> - ret = send_to_group(mask, data, data_type, dir, file_name,
> - cookie, &iter_info);
> + ret = send_to_group(mask, event_info, &iter_info);
>
> if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
> goto out;
> @@ -552,7 +559,7 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
>
> return ret;
> }
> -EXPORT_SYMBOL_GPL(fsnotify);
> +EXPORT_SYMBOL_GPL(__fsnotify);
>
> static __init int fsnotify_init(void)
> {
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index f8acddcf54fb..8c2c681b4495 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -30,7 +30,10 @@ static inline void fsnotify_name(struct inode *dir, __u32 mask,
> struct inode *child,
> const struct qstr *name, u32 cookie)
> {
> - fsnotify(mask, child, FSNOTIFY_EVENT_INODE, dir, name, NULL, cookie);
> + __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = child, .data_type = FSNOTIFY_EVENT_INODE,
> + .dir = dir, .name = name, .cookie = cookie,
> + });
> }
>
> static inline void fsnotify_dirent(struct inode *dir, struct dentry *dentry,
> @@ -44,7 +47,10 @@ static inline void fsnotify_inode(struct inode *inode, __u32 mask)
> if (S_ISDIR(inode->i_mode))
> mask |= FS_ISDIR;
>
> - fsnotify(mask, inode, FSNOTIFY_EVENT_INODE, NULL, NULL, inode, 0);
> + __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = inode, .data_type = FSNOTIFY_EVENT_INODE,
> + .inode = inode,
> + });
> }
>
> /* Notify this dentry's parent about a child's events. */
> @@ -68,7 +74,10 @@ static inline int fsnotify_parent(struct dentry *dentry, __u32 mask,
> return __fsnotify_parent(dentry, mask, data, data_type);
>
> notify_child:
> - return fsnotify(mask, data, data_type, NULL, NULL, inode, 0);
> + return __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = data, .data_type = data_type,
> + .inode = inode,
> + });
> }
>
> /*
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index f9e2c6cd0f7d..b1590f654ade 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -112,6 +112,28 @@ struct fsnotify_iter_info;
>
> struct mem_cgroup;
>
> +/*
> + * Event info args passed to fsnotify() and to backends on handle_event():
> + * @data: object that event happened on
> + * @data_type: type of object for fanotify_data_XXX() accessors
> + * @dir: optional directory associated with event -
> + * if @name is not NULL, this is the directory that
> + * @name is relative to
> + * @name: optional file name associated with event
> + * @inode: optional inode associated with event -
> + * either @dir or @inode must be non-NULL.
> + * if both are non-NULL event may be reported to both.
> + * @cookie: inotify rename cookie
> + */
> +struct fsnotify_event_info {
> + const void *data;
> + int data_type;
> + struct inode *dir;
> + const struct qstr *name;
> + struct inode *inode;
> + u32 cookie;
> +};
> +
> /*
> * Each group much define these ops. The fsnotify infrastructure will call
> * these operations for each relevant group.
> @@ -119,13 +141,7 @@ struct mem_cgroup;
> * handle_event - main call for a group to handle an fs event
> * @group: group to notify
> * @mask: event type and flags
> - * @data: object that event happened on
> - * @data_type: type of object for fanotify_data_XXX() accessors
> - * @dir: optional directory associated with event -
> - * if @file_name is not NULL, this is the directory that
> - * @file_name is relative to
> - * @file_name: optional file name associated with event
> - * @cookie: inotify rename cookie
> + * @event_info: information attached to the event
> * @iter_info: array of marks from this group that are interested in the event
> *
> * handle_inode_event - simple variant of handle_event() for groups that only
> @@ -147,8 +163,7 @@ struct mem_cgroup;
> */
> struct fsnotify_ops {
> int (*handle_event)(struct fsnotify_group *group, u32 mask,
> - const void *data, int data_type, struct inode *dir,
> - const struct qstr *file_name, u32 cookie,
> + const struct fsnotify_event_info *event_info,
> struct fsnotify_iter_info *iter_info);
> int (*handle_inode_event)(struct fsnotify_mark *mark, u32 mask,
> struct inode *inode, struct inode *dir,
> @@ -262,6 +277,12 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
> }
> }
>
> +static inline struct inode *fsnotify_event_info_inode(
> + const struct fsnotify_event_info *event_info)
> +{
> + return fsnotify_data_inode(event_info->data, event_info->data_type);
> +}
> +
> static inline const struct path *fsnotify_data_path(const void *data,
> int data_type)
> {
> @@ -273,6 +294,12 @@ static inline const struct path *fsnotify_data_path(const void *data,
> }
> }
>
> +static inline const struct path *fsnotify_event_info_path(
> + const struct fsnotify_event_info *event_info)
> +{
> + return fsnotify_data_path(event_info->data, event_info->data_type);
> +}
> +
> enum fsnotify_obj_type {
> FSNOTIFY_OBJ_TYPE_INODE,
> FSNOTIFY_OBJ_TYPE_PARENT,
> @@ -410,11 +437,22 @@ struct fsnotify_mark {
> /* called from the vfs helpers */
>
> /* main fsnotify call to send events */
> -extern int fsnotify(__u32 mask, const void *data, int data_type,
> - struct inode *dir, const struct qstr *name,
> - struct inode *inode, u32 cookie);
> -extern int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> - int data_type);
> +extern int __fsnotify(__u32 mask,
> + const struct fsnotify_event_info *event_info);
> +extern int __fsnotify_parent(struct dentry *dentry, __u32 mask,
> + const void *data, int data_type);
> +
> +static inline int fsnotify(__u32 mask, const void *data, int data_type,
> + struct inode *dir, const struct qstr *name,
> + struct inode *inode, u32 cookie)
> +{
> + return __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = data, .data_type = data_type,
> + .dir = dir, .name = name, .inode = inode,
> + .cookie = cookie,
> + });
> +}
> +
> extern void __fsnotify_inode_delete(struct inode *inode);
> extern void __fsnotify_vfsmount_delete(struct vfsmount *mnt);
> extern void fsnotify_sb_delete(struct super_block *sb);
> @@ -594,15 +632,14 @@ static inline void fsnotify_init_event(struct fsnotify_event *event)
>
> #else
>
> -static inline int fsnotify(__u32 mask, const void *data, int data_type,
> - struct inode *dir, const struct qstr *name,
> - struct inode *inode, u32 cookie)
> +static inline int fsnotify(__u32 mask,
> + const struct fsnotify_event_info *event_info)
> {
> return 0;
> }
>
> static inline int __fsnotify_parent(struct dentry *dentry, __u32 mask,
> - const void *data, int data_type)
> + const void *data, int data_type)
> {
> return 0;
> }
> --
> 2.31.0
>

2021-06-16 09:30:24

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH v2 09/14] fsnotify: Support FS_ERROR event type

On Wed, Jun 16, 2021 at 2:56 AM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Expose a new type of fsnotify event for filesystems to report errors for
> userspace monitoring tools. fanotify will send this type of
> notification for FAN_ERROR events.

FAN_FS_ERROR... it's all over the place ;-)

Otherwise
Reviewed-by: Amir Goldstein <[email protected]>

>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ---
> Changes since v1:
> - Overload FS_ERROR with FS_IN_IGNORED
> - IMplement support for this type on fsnotify_data_inode
> ---
> include/linux/fsnotify_backend.h | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index 8222fe12a6c9..ea5f5c7cc381 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -42,6 +42,12 @@
>
> #define FS_UNMOUNT 0x00002000 /* inode on umount fs */
> #define FS_Q_OVERFLOW 0x00004000 /* Event queued overflowed */
> +#define FS_ERROR 0x00008000 /* Filesystem Error (fanotify) */
> +
> +/*
> + * FS_IN_IGNORED overloads FS_ERROR. It is only used internally by inotify
> + * which does not support FS_ERROR.
> + */
> #define FS_IN_IGNORED 0x00008000 /* last inotify event here */
>
> #define FS_OPEN_PERM 0x00010000 /* open event in an permission hook */
> @@ -95,7 +101,8 @@
> #define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS | \
> FS_EVENTS_POSS_ON_CHILD | \
> FS_DELETE_SELF | FS_MOVE_SELF | FS_DN_RENAME | \
> - FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED)
> + FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED | \
> + FS_ERROR)
>
> /* Extra flags that may be reported with event or control handling of events */
> #define ALL_FSNOTIFY_FLAGS (FS_EXCL_UNLINK | FS_ISDIR | FS_IN_ONESHOT | \
> @@ -263,6 +270,12 @@ enum fsnotify_data_type {
> FSNOTIFY_EVENT_NONE,
> FSNOTIFY_EVENT_PATH,
> FSNOTIFY_EVENT_INODE,
> + FSNOTIFY_EVENT_ERROR,
> +};
> +
> +struct fs_error_report {
> + int error;
> + struct inode *inode;
> };
>
> static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
> @@ -272,6 +285,8 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
> return (struct inode *)data;
> case FSNOTIFY_EVENT_PATH:
> return d_inode(((const struct path *)data)->dentry);
> + case FSNOTIFY_EVENT_ERROR:
> + return ((struct fs_error_report *)data)->inode;
> default:
> return NULL;
> }
> --
> 2.31.0
>

2021-06-16 09:40:11

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH v2 10/14] fsnotify: Introduce helpers to send error_events

On Wed, Jun 16, 2021 at 2:56 AM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Introduce helpers for filesystems interested in reporting FS_ERROR
> events.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ---
> Changes since v1:
> - Use the inode argument (Amir)
> - Protect s_fsnotify_marks with ifdef guard
> ---
> fs/notify/fsnotify.c | 2 +-
> include/linux/fsnotify.h | 20 ++++++++++++++++++++
> include/linux/fsnotify_backend.h | 1 +
> 3 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 36205a769dde..ac05eb3fb368 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -491,7 +491,7 @@ int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
> */
> parent = event_info->dir;
> }
> - sb = inode->i_sb;
> + sb = event_info->sb ?: inode->i_sb;
>
> /*
> * Optimization: srcu_read_lock() has a memory barrier which can
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index 8c2c681b4495..c0dbc5a65381 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -326,4 +326,24 @@ static inline void fsnotify_change(struct dentry *dentry, unsigned int ia_valid)
> fsnotify_dentry(dentry, mask);
> }
>
> +static inline int fsnotify_sb_error(struct super_block *sb, struct inode *inode,
> + int error)
> +{
> +#ifdef CONFIG_FSNOTIFY
> + if (sb->s_fsnotify_marks) {

__fsnotify() has this optimization very early
so you do not need it here and you do not need the ifdef.
performance of fsnotify_sb_error() is utterly not important.

> + struct fs_error_report report = {
> + .error = error,
> + .inode = inode,
> + };
> +
> + return __fsnotify(FS_ERROR, &(struct fsnotify_event_info) {
> + .data = &report,
> + .data_type = FSNOTIFY_EVENT_ERROR,
> + .inode = NULL, .cookie = 0, .sb = sb

No need to set members to 0/NULL with this type of initializer.

Thanks,
Amir.

2021-06-16 10:12:02

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 07/14] fsnotify: pass arguments of fsnotify() in struct fsnotify_event_info

Hi Gabriel,

I love your patch! Yet something to improve:

[auto build test ERROR on ext3/fsnotify]
[also build test ERROR on linus/master v5.13-rc6 next-20210615]
[cannot apply to ext4/dev]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210616-155248
base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
config: nios2-allnoconfig (attached as .config)
compiler: nios2-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/53c00b0ce1aa2f3fd9e16a892cb44df278969b2d
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210616-155248
git checkout 53c00b0ce1aa2f3fd9e16a892cb44df278969b2d
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nios2

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from fs/kernfs/file.c:16:
include/linux/fsnotify.h: In function 'fsnotify_name':
>> include/linux/fsnotify.h:33:2: error: implicit declaration of function '__fsnotify'; did you mean 'fsnotify'? [-Werror=implicit-function-declaration]
33 | __fsnotify(mask, &(struct fsnotify_event_info) {
| ^~~~~~~~~~
| fsnotify
fs/kernfs/file.c: In function 'kernfs_notify_workfn':
>> fs/kernfs/file.c:887:7: error: passing argument 2 of 'fsnotify' from incompatible pointer type [-Werror=incompatible-pointer-types]
887 | inode, FSNOTIFY_EVENT_INODE,
| ^~~~~
| |
| struct inode *
In file included from include/linux/fsnotify.h:15,
from fs/kernfs/file.c:16:
include/linux/fsnotify_backend.h:636:41: note: expected 'const struct fsnotify_event_info *' but argument is of type 'struct inode *'
636 | const struct fsnotify_event_info *event_info)
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
>> fs/kernfs/file.c:886:5: error: too many arguments to function 'fsnotify'
886 | fsnotify(FS_MODIFY | FS_EVENT_ON_CHILD,
| ^~~~~~~~
In file included from include/linux/fsnotify.h:15,
from fs/kernfs/file.c:16:
include/linux/fsnotify_backend.h:635:19: note: declared here
635 | static inline int fsnotify(__u32 mask,
| ^~~~~~~~
cc1: some warnings being treated as errors


vim +33 include/linux/fsnotify.h

19
20 /*
21 * Notify this @dir inode about a change in a child directory entry.
22 * The directory entry may have turned positive or negative or its inode may
23 * have changed (i.e. renamed over).
24 *
25 * Unlike fsnotify_parent(), the event will be reported regardless of the
26 * FS_EVENT_ON_CHILD mask on the parent inode and will not be reported if only
27 * the child is interested and not the parent.
28 */
29 static inline void fsnotify_name(struct inode *dir, __u32 mask,
30 struct inode *child,
31 const struct qstr *name, u32 cookie)
32 {
> 33 __fsnotify(mask, &(struct fsnotify_event_info) {
34 .data = child, .data_type = FSNOTIFY_EVENT_INODE,
35 .dir = dir, .name = name, .cookie = cookie,
36 });
37 }
38

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (3.91 kB)
.config.gz (5.67 kB)
Download all attachments

2021-06-16 11:03:33

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH v2 11/14] fanotify: Introduce FAN_FS_ERROR event

On Wed, Jun 16, 2021 at 2:56 AM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> The FAN_FS_ERROR event is used by filesystem wide monitoring tools to
> receive notifications of type FS_ERROR_EVENT, emited by filesystems when
> a problem is detected. The error notification includes a generic error
> descriptor.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ---
> Changes since v1:
> - Pass dentry to fanotify_check_fsid (Amir)
> - FANOTIFY_EVENT_TYPE_ERROR -> FANOTIFY_EVENT_TYPE_FS_ERROR
> - Merge previous patch into it
> - Use a single slot
> - Move fanotify_mark.error_event definition to this commit
> - Rename FAN_ERROR -> FAN_FS_ERROR
> - Restrict FAN_FS_ERROR to FAN_MARK_FILESYSTEM
> ---
> fs/notify/fanotify/fanotify.c | 115 ++++++++++++++++++++++--
> fs/notify/fanotify/fanotify.h | 30 +++++++
> fs/notify/fanotify/fanotify_user.c | 137 ++++++++++++++++++++++++++---
> include/linux/fanotify.h | 6 +-
> include/uapi/linux/fanotify.h | 11 +++
> 5 files changed, 281 insertions(+), 18 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index f64234489811..58b2dd01f1ae 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -247,6 +247,14 @@ static int fanotify_get_response(struct fsnotify_group *group,
> return ret;
> }
>
> +static bool fanotify_event_reports_path(const struct fsnotify_group *group,
> + u32 mask)
> +{
> + unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> +
> + return !fid_mode && !fanotify_is_error_event(mask);
> +}

It would be so much simpler if FAN_FS_ERROR required fid_mode...

> +
> /*
> * This function returns a mask for an event that only contains the flags
> * that have been specifically requested by the user. Flags that may have
> @@ -261,7 +269,6 @@ static u32 fanotify_group_event_mask(
> __u32 marks_mask = 0, marks_ignored_mask = 0;
> __u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS |
> FANOTIFY_EVENT_FLAGS;
> - const struct path *path = fsnotify_event_info_path(event_info);
> unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> struct fsnotify_mark *mark;
> int type;
> @@ -270,14 +277,17 @@ static u32 fanotify_group_event_mask(
> __func__, iter_info->report_mask, event_mask,
> event_info->data, event_info->data_type);
>
> - if (!fid_mode) {
> + if (fanotify_event_reports_path(group, event_mask)) {
> + const struct path *path;
> +
> /* Do we have path to open a file descriptor? */
> + path = fsnotify_event_info_path(event_info);
> if (!path)
> return 0;
> /* Path type events are only relevant for files and dirs */
> if (!d_is_reg(path->dentry) && !d_can_lookup(path->dentry))
> return 0;
> - } else if (!(fid_mode & FAN_REPORT_FID)) {
> + } else if (fid_mode && !(fid_mode & FAN_REPORT_FID)) {
> /* Do we have a directory inode to report? */
> if (!event_info->dir && !(event_mask & FS_ISDIR))
> return 0;
> @@ -658,6 +668,78 @@ static struct fanotify_event *fanotify_alloc_event(
> return event;
> }
>
> +static void fanotify_init_error_event(struct fanotify_error_event *event,
> + __kernel_fsid_t fsid,
> + const struct fs_error_report *report)
> +{
> + event->fae.type = FANOTIFY_EVENT_TYPE_FS_ERROR;
> + event->err_count = 1;
> + event->fsid = fsid;
> + event->error = report->error;
> + event->ino = (report->inode) ? report->inode->i_ino : 0;
> + event->ino_generation =
> + (report->inode) ? report->inode->i_generation : 0;
> +}
> +
> +struct fanotify_insert_error_data {
> + const struct fs_error_report *report;
> + __kernel_fsid_t fsid;
> +};
> +
> +static void fanotify_insert_error_event(struct fsnotify_group *group,
> + struct fsnotify_event *event,
> + const void *data)
> +{
> + struct fanotify_event *fae = FANOTIFY_E(event);
> + struct fanotify_error_event *error_event = FANOTIFY_EE(fae);
> + const struct fanotify_insert_error_data *idata =
> + ((struct fanotify_insert_error_data *) data);
> +
> + fsnotify_get_mark(error_event->mark);
> + fanotify_init_error_event(error_event, idata->fsid, idata->report);
> +}
> +
> +static int fanotify_merge_error_event(struct fsnotify_group *group,
> + struct fsnotify_event *event)
> +{
> + struct fanotify_event *fae = FANOTIFY_E(event);
> +
> + if (!list_empty(&event->list)) {
> + FANOTIFY_EE(fae)->err_count++;
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +static int fanotify_queue_error_event(struct fsnotify_iter_info *iter_info,
> + struct fsnotify_group *group,
> + __kernel_fsid_t fsid,
> + const struct fs_error_report *report)
> +{
> + struct fanotify_error_event *error_event;
> + struct fsnotify_mark *mark = fsnotify_iter_sb_mark(iter_info);
> + int ret = -ENOMEM;
> +
> + if (!mark || !FANOTIFY_SB_MARK(mark)->error_event)
> + return ret;
> +
> + spin_lock(&mark->lock);

What is the reason for this lock?
It does not look right to be holding it through fsnotify_add_event()

> + error_event = FANOTIFY_SB_MARK(mark)->error_event;
> + if (error_event) {
> + ret = fsnotify_add_event(group, &error_event->fae.fse,
> + fanotify_merge_error_event,
> + fanotify_insert_error_event,
> + &(struct fanotify_insert_error_data) {
> + .fsid = fsid,
> + .report = report
> + });

I wonder if that is the best API to serve the needs of "single slot event"
I have no better idea at the moment.
Note that fsid does not need to be passed in the data.
It is available via error_event->mark->connector->fsid so no need
to duplicate a copy in error_event->fsid.

> + }
> + spin_unlock(&mark->lock);
> +
> + return ret;
> +}
> +
> /*
> * Get cached fsid of the filesystem containing the object from any connector.
> * All connectors are supposed to have the same fsid, but we do not verify that
> @@ -738,8 +820,9 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
> BUILD_BUG_ON(FAN_ONDIR != FS_ISDIR);
> BUILD_BUG_ON(FAN_OPEN_EXEC != FS_OPEN_EXEC);
> BUILD_BUG_ON(FAN_OPEN_EXEC_PERM != FS_OPEN_EXEC_PERM);
> + BUILD_BUG_ON(FAN_FS_ERROR != FS_ERROR);
>
> - BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
> + BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 20);
>
> mask = fanotify_group_event_mask(group, mask, event_info, iter_info);
> if (!mask)
> @@ -756,13 +839,20 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
> return 0;
> }
>
> - if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS)) {
> + if (FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS) ||
> + fanotify_is_error_event(mask)) {

It would be so much simpler if FAN_FS_ERROR required fid_mode...

> fsid = fanotify_get_fsid(iter_info);
> /* Racing with mark destruction or creation? */
> if (!fsid.val[0] && !fsid.val[1])
> return 0;
> }
>
> + if (fanotify_is_error_event(mask)) {
> + ret = fanotify_queue_error_event(iter_info, group, fsid,
> + event_info->data);
> + goto finish;
> + }
> +
> event = fanotify_alloc_event(group, mask, event_info, &fsid);
> ret = -ENOMEM;
> if (unlikely(!event)) {
> @@ -831,6 +921,17 @@ static void fanotify_free_name_event(struct fanotify_event *event)
> kfree(FANOTIFY_NE(event));
> }
>
> +static void fanotify_free_error_event(struct fanotify_event *event)
> +{
> + /*
> + * Just drop the reference acquired by
> + * fanotify_queue_error_event.
> + *
> + * The actual memory is freed with the mark.
> + */
> + fsnotify_put_mark(FANOTIFY_EE(event)->mark);
> +}
> +
> static void fanotify_free_event(struct fsnotify_event *fsn_event)
> {
> struct fanotify_event *event;
> @@ -853,6 +954,9 @@ static void fanotify_free_event(struct fsnotify_event *fsn_event)
> case FANOTIFY_EVENT_TYPE_OVERFLOW:
> kfree(event);
> break;
> + case FANOTIFY_EVENT_TYPE_FS_ERROR:
> + fanotify_free_error_event(event);
> + break;
> default:
> WARN_ON_ONCE(1);
> }
> @@ -870,6 +974,7 @@ static void fanotify_free_mark(struct fsnotify_mark *mark)
> if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
> struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);
>
> + kfree(fa_mark->error_event);
> kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
> } else {
> kmem_cache_free(fanotify_mark_cache, mark);
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index 7e00c05a979a..882d056b3a7a 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -132,11 +132,14 @@ static inline void fanotify_info_copy_name(struct fanotify_info *info,
>
> struct fanotify_sb_mark {
> struct fsnotify_mark fsn_mark;
> + struct fanotify_error_event *error_event;
> };
>
> static inline
> struct fanotify_sb_mark *FANOTIFY_SB_MARK(struct fsnotify_mark *mark)
> {
> + WARN_ON(!(mark->flags & FSNOTIFY_MARK_FLAG_SB));
> +
> return container_of(mark, struct fanotify_sb_mark, fsn_mark);
> }
>
> @@ -152,6 +155,7 @@ enum fanotify_event_type {
> FANOTIFY_EVENT_TYPE_PATH,
> FANOTIFY_EVENT_TYPE_PATH_PERM,
> FANOTIFY_EVENT_TYPE_OVERFLOW, /* struct fanotify_event */
> + FANOTIFY_EVENT_TYPE_FS_ERROR, /* struct fanotify_error_event */
> __FANOTIFY_EVENT_TYPE_NUM
> };
>
> @@ -207,12 +211,32 @@ FANOTIFY_NE(struct fanotify_event *event)
> return container_of(event, struct fanotify_name_event, fae);
> }
>
> +struct fanotify_error_event {
> + struct fanotify_event fae;
> + s32 error;
> + u32 err_count;
> + __kernel_fsid_t fsid;
> + u64 ino;
> + u32 ino_generation;

'generation' would have been sufficiently clear name.

> +
> + /* Back reference to the mark this error refers to. */
> + struct fsnotify_mark *mark;
> +};
> +
> +static inline struct fanotify_error_event *
> +FANOTIFY_EE(struct fanotify_event *event)
> +{
> + return container_of(event, struct fanotify_error_event, fae);
> +}
> +
> static inline __kernel_fsid_t *fanotify_event_fsid(struct fanotify_event *event)
> {
> if (event->type == FANOTIFY_EVENT_TYPE_FID)
> return &FANOTIFY_FE(event)->fsid;
> else if (event->type == FANOTIFY_EVENT_TYPE_FID_NAME)
> return &FANOTIFY_NE(event)->fsid;
> + else if (event->type == FANOTIFY_EVENT_TYPE_FS_ERROR)
> + return &FANOTIFY_EE(event)->fsid;
> else
> return NULL;
> }
> @@ -298,6 +322,11 @@ static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
> return container_of(fse, struct fanotify_event, fse);
> }
>
> +static inline bool fanotify_is_error_event(u32 mask)
> +{
> + return mask & FANOTIFY_ERROR_EVENTS;
> +}
> +
> static inline bool fanotify_event_has_path(struct fanotify_event *event)
> {
> return event->type == FANOTIFY_EVENT_TYPE_PATH ||
> @@ -327,6 +356,7 @@ static inline struct path *fanotify_event_path(struct fanotify_event *event)
> static inline bool fanotify_is_hashed_event(u32 mask)
> {
> return !(fanotify_is_perm_event(mask) ||
> + fanotify_is_error_event(mask) ||
> fsnotify_is_overflow_event(mask));
> }
>
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index db378480f1b1..7ec99bec2746 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -107,6 +107,8 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
> #define FANOTIFY_EVENT_ALIGN 4
> #define FANOTIFY_INFO_HDR_LEN \
> (sizeof(struct fanotify_event_info_fid) + sizeof(struct file_handle))
> +#define FANOTIFY_INFO_ERROR_LEN \
> + (sizeof(struct fanotify_event_info_error))
>
> static int fanotify_fid_info_len(int fh_len, int name_len)
> {
> @@ -127,6 +129,9 @@ static size_t fanotify_event_len(struct fanotify_event *event,
> int fh_len;
> int dot_len = 0;
>
> + if (fanotify_is_error_event(event->mask))
> + return event_len + FANOTIFY_INFO_ERROR_LEN;
> +
> if (!fid_mode)
> return event_len;
>
> @@ -150,6 +155,35 @@ static size_t fanotify_event_len(struct fanotify_event *event,
> return event_len;
> }
>
> +static struct fanotify_event *fanotify_dequeue_error_event(
> + struct fsnotify_group *group,
> + struct fanotify_event *event,
> + struct fanotify_error_event *dest)
> +{
> + struct fanotify_error_event *error_event = FANOTIFY_EE(event);
> +
> + /*
> + * In order to avoid missing an error count update, the
> + * queued event is de-queued and duplicated to an
> + * in-stack fanotify_error_event while still inside
> + * mark->lock. Once the event is dequeued, it can be
> + * immediately re-used for a new event.
> + *
> + * The ownership of the mark reference is dropped later
> + * by destroy_event.
> + */
> + spin_lock(&error_event->mark->lock);
> +
> + memcpy(dest, error_event, sizeof(*error_event));
> + fsnotify_init_event(&dest->fae.fse);
> +
> + fsnotify_remove_queued_event(group, &event->fse);
> +
> + spin_unlock(&error_event->mark->lock);
> +
> + return &dest->fae;
> +}
> +
> /*
> * Remove an hashed event from merge hash table.
> */
> @@ -173,8 +207,10 @@ static void fanotify_unhash_event(struct fsnotify_group *group,
> * is not large enough. When permission event is dequeued, its state is
> * updated accordingly.
> */
> -static struct fanotify_event *get_one_event(struct fsnotify_group *group,
> - size_t count)
> +static struct fanotify_event *get_one_event(
> + struct fsnotify_group *group,
> + size_t count,
> + struct fanotify_error_event *error_event)
> {
> size_t event_size;
> struct fanotify_event *event = NULL;
> @@ -198,9 +234,14 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
>
> /*
> * Held the notification_lock the whole time, so this is the
> - * same event we peeked above.
> + * same event we peeked above, unless it is copied to
> + * error_event.
> */
> - fsnotify_remove_first_event(group);
> + if (fanotify_is_error_event(event->mask))
> + event = fanotify_dequeue_error_event(group, event, error_event);
> + else
> + fsnotify_remove_first_event(group);
> +
> if (fanotify_is_perm_event(event->mask))
> FANOTIFY_PERM(event)->state = FAN_EVENT_REPORTED;
> if (fanotify_is_hashed_event(event->mask))
> @@ -310,6 +351,31 @@ static int process_access_response(struct fsnotify_group *group,
> return -ENOENT;
> }
>
> +static size_t copy_error_info_to_user(struct fanotify_event *event,
> + char __user *buf, int count)
> +{
> + struct fanotify_event_info_error info;
> + struct fanotify_error_event *fee = FANOTIFY_EE(event);
> +
> + info.hdr.info_type = FAN_EVENT_INFO_TYPE_ERROR;
> + info.hdr.pad = 0;
> + info.hdr.len = sizeof(struct fanotify_event_info_error);
> +
> + if (WARN_ON(count < info.hdr.len))
> + return -EFAULT;
> +
> + info.fsid = fee->fsid;
> + info.error = fee->error;
> + info.ino = fee->ino;
> + info.ino_generation = fee->ino_generation;
> + info.error_count = fee->err_count;
> +
> + if (copy_to_user(buf, &info, sizeof(info)))
> + return -EFAULT;
> +
> + return info.hdr.len;
> +}
> +
> static int copy_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
> int info_type, const char *name, size_t name_len,
> char __user *buf, size_t count)
> @@ -531,6 +597,14 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> count -= ret;
> }
>
> + if (fanotify_is_error_event(event->mask)) {
> + ret = copy_error_info_to_user(event, buf, count);
> + if (ret < 0)
> + return ret;

goto out_close_fd, even though there is not supposed to be an open fd
makes me wonder also about the validity of event->pid in the context of
an fs error event...

> + buf += ret;
> + count -= ret;
> + }
> +
> return metadata.event_len;
>
> out_close_fd:
> @@ -559,6 +633,7 @@ static __poll_t fanotify_poll(struct file *file, poll_table *wait)
> static ssize_t fanotify_read(struct file *file, char __user *buf,
> size_t count, loff_t *pos)
> {
> + struct fanotify_error_event error_event;
> struct fsnotify_group *group;
> struct fanotify_event *event;
> char __user *start;
> @@ -577,7 +652,7 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
> * in case there are lots of available events.
> */
> cond_resched();
> - event = get_one_event(group, count);
> + event = get_one_event(group, count, &error_event);
> if (IS_ERR(event)) {
> ret = PTR_ERR(event);
> break;
> @@ -896,16 +971,34 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
> flags, umask);
> }
>
> -static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
> - __u32 mask,
> - unsigned int flags)
> +static int fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
> + __u32 mask, unsigned int flags,
> + __u32 *added_mask)
> {
> + struct fanotify_error_event *error_event = NULL;
> + bool ignored = flags & FAN_MARK_IGNORED_MASK;
> __u32 oldmask = -1;
>
> + /* Only pre-alloc error_event if needed. */
> + if (!ignored && (mask & FAN_FS_ERROR) &&
> + (fsn_mark->flags & FSNOTIFY_MARK_FLAG_SB) &&
> + !FANOTIFY_SB_MARK(fsn_mark)->error_event) {

I am not sure that this check and the assignment below are safe
w.r.t memory order.
Perhaps this needs READ_ONCE() - need Jan to comment on this.


> + error_event = kzalloc(sizeof(*error_event), GFP_KERNEL);
> + if (!error_event)
> + return -ENOMEM;
> + fanotify_init_event(&error_event->fae, 0, FS_ERROR);
> + error_event->mark = fsn_mark;
> + }
> +
> spin_lock(&fsn_mark->lock);
> - if (!(flags & FAN_MARK_IGNORED_MASK)) {
> + if (!ignored) {
> oldmask = fsn_mark->mask;
> fsn_mark->mask |= mask;
> +
> + if (error_event && !FANOTIFY_SB_MARK(fsn_mark)->error_event) {
> + FANOTIFY_SB_MARK(fsn_mark)->error_event = error_event;
> + error_event = NULL;
> + }
> } else {
> fsn_mark->ignored_mask |= mask;
> if (flags & FAN_MARK_IGNORED_SURV_MODIFY)
> @@ -913,7 +1006,10 @@ static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
> }
> spin_unlock(&fsn_mark->lock);
>
> - return mask & ~oldmask;
> + kfree(error_event);
> +
> + *added_mask = mask & ~oldmask;
> + return 0;
> }
>
> static struct fsnotify_mark *fanotify_alloc_mark(unsigned int type)
> @@ -987,6 +1083,7 @@ static int fanotify_add_mark(struct fsnotify_group *group,
> {
> struct fsnotify_mark *fsn_mark;
> __u32 added;
> + int ret = 0;
>
> mutex_lock(&group->mark_mutex);
> fsn_mark = fsnotify_find_mark(connp, group);
> @@ -997,13 +1094,18 @@ static int fanotify_add_mark(struct fsnotify_group *group,
> return PTR_ERR(fsn_mark);
> }
> }
> - added = fanotify_mark_add_to_mask(fsn_mark, mask, flags);
> + ret = fanotify_mark_add_to_mask(fsn_mark, mask, flags, &added);
> + if (ret)
> + goto out;
> +
> if (added & ~fsnotify_conn_mask(fsn_mark->connector))
> fsnotify_recalc_mask(fsn_mark->connector);
> +
> +out:
> mutex_unlock(&group->mark_mutex);
>
> fsnotify_put_mark(fsn_mark);
> - return 0;
> + return ret;
> }
>
> static int fanotify_add_vfsmount_mark(struct fsnotify_group *group,
> @@ -1419,6 +1521,17 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
>
> fsid = &__fsid;
> }
> + if (mask & FAN_FS_ERROR) {
> + if (mark_type != FAN_MARK_FILESYSTEM) {
> + ret = -EINVAL;
> + goto path_put_and_out;
> + }
> +
> + ret = fanotify_test_fsid(path.dentry, &__fsid);
> + if (ret)
> + goto path_put_and_out;
> + fsid = &__fsid;
> + }
>
> /* inode held in place by reference to path; group by fget on fd */
> if (mark_type == FAN_MARK_INODE)
> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> index a16dbeced152..3bdfe227e2c2 100644
> --- a/include/linux/fanotify.h
> +++ b/include/linux/fanotify.h
> @@ -85,9 +85,12 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
> #define FANOTIFY_INODE_EVENTS (FANOTIFY_DIRENT_EVENTS | \
> FAN_ATTRIB | FAN_MOVE_SELF | FAN_DELETE_SELF)
>
> +#define FANOTIFY_ERROR_EVENTS (FAN_FS_ERROR)
> +
> /* Events that user can request to be notified on */
> #define FANOTIFY_EVENTS (FANOTIFY_PATH_EVENTS | \
> - FANOTIFY_INODE_EVENTS)
> + FANOTIFY_INODE_EVENTS | \
> + FANOTIFY_ERROR_EVENTS)
>
> /* Events that require a permission response from user */
> #define FANOTIFY_PERM_EVENTS (FAN_OPEN_PERM | FAN_ACCESS_PERM | \
> @@ -99,6 +102,7 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
> /* Events that may be reported to user */
> #define FANOTIFY_OUTGOING_EVENTS (FANOTIFY_EVENTS | \
> FANOTIFY_PERM_EVENTS | \
> + FANOTIFY_ERROR_EVENTS | \
> FAN_Q_OVERFLOW | FAN_ONDIR)
>
> #define ALL_FANOTIFY_EVENT_BITS (FANOTIFY_OUTGOING_EVENTS | \
> diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
> index fbf9c5c7dd59..a987150a446c 100644
> --- a/include/uapi/linux/fanotify.h
> +++ b/include/uapi/linux/fanotify.h
> @@ -20,6 +20,7 @@
> #define FAN_OPEN_EXEC 0x00001000 /* File was opened for exec */
>
> #define FAN_Q_OVERFLOW 0x00004000 /* Event queued overflowed */
> +#define FAN_FS_ERROR 0x00008000 /* Filesystem error */
>
> #define FAN_OPEN_PERM 0x00010000 /* File open in perm check */
> #define FAN_ACCESS_PERM 0x00020000 /* File accessed in perm check */
> @@ -123,6 +124,7 @@ struct fanotify_event_metadata {
> #define FAN_EVENT_INFO_TYPE_FID 1
> #define FAN_EVENT_INFO_TYPE_DFID_NAME 2
> #define FAN_EVENT_INFO_TYPE_DFID 3
> +#define FAN_EVENT_INFO_TYPE_ERROR 4

I don't know if I commented about this one specifically,
but I meant that this constant should also be disambiguate
FAN_EVENT_INFO_TYPE_FS_ERROR
but if fsid and inode info are going away, the generic name could stay.

>
> /* Variable length info record following event metadata */
> struct fanotify_event_info_header {
> @@ -148,6 +150,15 @@ struct fanotify_event_info_fid {
> unsigned char handle[0];
> };
>
> +struct fanotify_event_info_error {
> + struct fanotify_event_info_header hdr;
> + __s32 error;
> + __u32 error_count;
> + __kernel_fsid_t fsid;
> + __u64 ino;
> + __u32 ino_generation;

Just 'generation' would be a more conventional name.

But I must say, I really dislike that fact that this struct duplicates
the information that would be available with fanotify_event_info_fid
record otherwise.

I realize that this would not allow reporting fs errors for filesystems
that do not support exportfs ops, but is that really a concern?
For which filesystem would that be a problem?

It would be so much simpler if FAN_FS_ERROR required fid_mode...

I also realize that file handles are supposed to be opaque blobs whose
format userspace should not be concerned about, but since we are talking
about and interface to be used by fs monitoring tools which may need to
follow up with filesystem specific actions anyway, I don't know if reporting
a file handle instead of inode and generation is a big limitation.

Thanks,
Amir.

2021-06-16 13:06:08

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 13/14] samples: Add fs error monitoring example

Hi Gabriel,

I love your patch! Yet something to improve:

[auto build test ERROR on ext3/fsnotify]
[also build test ERROR on linus/master v5.13-rc6 next-20210615]
[cannot apply to ext4/dev]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210616-155248
base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
config: m68k-allmodconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/cefa2ac082849b4b7bb22317625cf3371a52faaf
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210616-155248
git checkout cefa2ac082849b4b7bb22317625cf3371a52faaf
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> samples/fanotify/fs-monitor.c:7:10: fatal error: errno.h: No such file or directory
7 | #include <errno.h>
| ^~~~~~~~~
compilation terminated.


vim +7 samples/fanotify/fs-monitor.c

> 7 #include <errno.h>
8 #include <err.h>
9 #include <stdlib.h>
10 #include <stdio.h>
11 #include <fcntl.h>
12 #include <sys/fanotify.h>
13 #include <sys/types.h>
14 #include <unistd.h>
15 #include <sys/stat.h>
16 #include <sys/types.h>
17

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (2.07 kB)
.config.gz (57.71 kB)
Download all attachments