Hi,
This is the third version of the FAN_FS_ERROR patches. The main change
in this version is the inode information being reported through an FID
record, which means it requires the group to be created with
FAN_REPORT_FID. It indeed simplifies a lot the FAN_FS_ERROR patch
itself.
This change raises the question of how we report non-inode errors. On
one hand, we could omit the FID report, but then fsid would also be
ommited. I chose to report these kind of errors against the root
inode.
The other changes in this iteration were made to attend to Amir
feedback. Thank you again for your very detailed input. It is really
appreciated.
This was tested with LTP for regressions, and also using the sample on
the last patch, with a corrupted image. I can publish the bad image
upon request.
I also pushed the full series to:
https://gitlab.collabora.com/krisman/linux -b fanotify-notifications-single-slot
Thank you
Original cover letter
---------------------
Hi,
This series follow up on my previous proposal [1] to support file system
wide monitoring. As suggested by Amir, this proposal drops the ring
buffer in favor of a single slot associated with each mark. This
simplifies a bit the implementation, as you can see in the code.
As a reminder, This proposal is limited to an interface for
administrators to monitor the health of a file system, instead of a
generic inteface for file errors. Therefore, this doesn't solve the
problem of writeback errors or the need to watch a specific subtree.
In comparison to the previous RFC, this implementation also drops the
per-fs data and location, and leave those as future extensions.
* Implementation
The feature is implemented on top of fanotify, as a new type of fanotify
mark, FAN_ERROR, which a file system monitoring tool can register to
receive error notifications. When an error occurs a new notification is
generated, in addition followed by this info field:
- FS generic data: A file system agnostic structure that has a generic
error code and identifies the filesystem. Basically, it let's
userspace know something happened on a monitored filesystem. Since
only the first error is recorded since the last read, this also
includes a counter of errors that happened since the last read.
* Testing
This was tested by watching notifications flowing from an intentionally
corrupted filesystem in different places. In addition, other events
were watched in an attempt to detect regressions.
Is there a specific testsuite for fanotify I should be running?
* Patches
This patchset is divided as follows: Patch 1 through 5 are refactoring
to fsnotify/fanotify in preparation for FS_ERROR/FAN_ERROR; patch 6 and
7 implement the FS_ERROR API for filesystems to report error; patch 8
add support for FAN_ERROR in fanotify; Patch 9 is an example
implementation for ext4; patch 10 and 11 provide a sample userspace code
and documentation.
I also pushed the full series to:
https://gitlab.collabora.com/krisman/linux -b fanotify-notifications-single-slot
[1] https://lwn.net/Articles/854545/
[2] https://lwn.net/Articles/856916/
Cc: Darrick J. Wong <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: [email protected]
To: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Amir Goldstein (1):
fsnotify: pass arguments of fsnotify() in struct fsnotify_event_info
Gabriel Krisman Bertazi (14):
fsnotify: Don't insert unmergeable events in hashtable
fanotify: Fold event size calculation to its own function
fanotify: Split fsid check from other fid mode checks
fanotify: Split superblock marks out to a new cache
inotify: Don't force FS_IN_IGNORED
fsnotify: Add helper to detect overflow_event
fsnotify: Support passing argument to insert callback on add_event
fsnotify: Always run the merge hook
fsnotify: Support FS_ERROR event type
fsnotify: Introduce helpers to send error_events
fanotify: Introduce FAN_FS_ERROR event
ext4: Send notifications on error
samples: Add fs error monitoring example
docs: Document the FAN_FS_ERROR event
.../admin-guide/filesystem-monitoring.rst | 70 +++++++
Documentation/admin-guide/index.rst | 1 +
fs/ext4/super.c | 8 +
fs/notify/fanotify/fanotify.c | 189 ++++++++++++-----
fs/notify/fanotify/fanotify.h | 69 ++++++-
fs/notify/fanotify/fanotify_user.c | 195 +++++++++++++++---
fs/notify/fsnotify.c | 85 ++++----
fs/notify/inotify/inotify_fsnotify.c | 5 +-
fs/notify/inotify/inotify_user.c | 6 +-
fs/notify/notification.c | 8 +-
include/linux/fanotify.h | 11 +-
include/linux/fsnotify.h | 28 ++-
include/linux/fsnotify_backend.h | 104 ++++++++--
include/uapi/linux/fanotify.h | 8 +
samples/Kconfig | 9 +
samples/Makefile | 1 +
samples/fanotify/Makefile | 3 +
samples/fanotify/fs-monitor.c | 134 ++++++++++++
18 files changed, 777 insertions(+), 157 deletions(-)
create mode 100644 Documentation/admin-guide/filesystem-monitoring.rst
create mode 100644 samples/fanotify/Makefile
create mode 100644 samples/fanotify/fs-monitor.c
--
2.32.0
Every time this function is invoked, it is immediately added to
FAN_EVENT_METADATA_LEN, since there is no need to just calculate the
length of info records. This minor clean up folds the rest of the
calculation into the function, which now operates in terms of events,
returning the size of the entire event, including metadata.
Reviewed-by: Amir Goldstein <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v1:
- rebased on top of hashing patches
---
fs/notify/fanotify/fanotify_user.c | 33 +++++++++++++++++-------------
1 file changed, 19 insertions(+), 14 deletions(-)
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 64864fb40b40..68a53d3534f8 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -117,17 +117,24 @@ static int fanotify_fid_info_len(int fh_len, int name_len)
return roundup(FANOTIFY_INFO_HDR_LEN + info_len, FANOTIFY_EVENT_ALIGN);
}
-static int fanotify_event_info_len(unsigned int fid_mode,
- struct fanotify_event *event)
+static size_t fanotify_event_len(struct fanotify_event *event,
+ unsigned int fid_mode)
{
- struct fanotify_info *info = fanotify_event_info(event);
- int dir_fh_len = fanotify_event_dir_fh_len(event);
- int fh_len = fanotify_event_object_fh_len(event);
- int info_len = 0;
+ size_t event_len = FAN_EVENT_METADATA_LEN;
+ struct fanotify_info *info;
+ int dir_fh_len;
+ int fh_len;
int dot_len = 0;
+ if (!fid_mode)
+ return event_len;
+
+ info = fanotify_event_info(event);
+ dir_fh_len = fanotify_event_dir_fh_len(event);
+ fh_len = fanotify_event_object_fh_len(event);
+
if (dir_fh_len) {
- info_len += fanotify_fid_info_len(dir_fh_len, info->name_len);
+ event_len += fanotify_fid_info_len(dir_fh_len, info->name_len);
} else if ((fid_mode & FAN_REPORT_NAME) && (event->mask & FAN_ONDIR)) {
/*
* With group flag FAN_REPORT_NAME, if name was not recorded in
@@ -137,9 +144,9 @@ static int fanotify_event_info_len(unsigned int fid_mode,
}
if (fh_len)
- info_len += fanotify_fid_info_len(fh_len, dot_len);
+ event_len += fanotify_fid_info_len(fh_len, dot_len);
- return info_len;
+ return event_len;
}
/*
@@ -168,7 +175,7 @@ static void fanotify_unhash_event(struct fsnotify_group *group,
static struct fanotify_event *get_one_event(struct fsnotify_group *group,
size_t count)
{
- size_t event_size = FAN_EVENT_METADATA_LEN;
+ size_t event_size;
struct fanotify_event *event = NULL;
struct fsnotify_event *fsn_event;
unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
@@ -181,8 +188,7 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
goto out;
event = FANOTIFY_E(fsn_event);
- if (fid_mode)
- event_size += fanotify_event_info_len(fid_mode, event);
+ event_size = fanotify_event_len(event, fid_mode);
if (event_size > count) {
event = ERR_PTR(-EINVAL);
@@ -412,8 +418,7 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
pr_debug("%s: group=%p event=%p\n", __func__, group, event);
- metadata.event_len = FAN_EVENT_METADATA_LEN +
- fanotify_event_info_len(fid_mode, event);
+ metadata.event_len = fanotify_event_len(event, fid_mode);
metadata.metadata_len = FAN_EVENT_METADATA_LEN;
metadata.vers = FANOTIFY_METADATA_VERSION;
metadata.reserved = 0;
--
2.32.0
FAN_FS_ERROR will require fsid, but not necessarily require the
filesystem to expose a file handle. Split those checks into different
functions, so they can be used separately when setting up an event.
While there, update a comment about tmpfs having 0 fsid, which is no
longer true.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v2:
- FAN_ERROR -> FAN_FS_ERROR (Amir)
- Update comment (Amir)
Changes since v1:
(Amir)
- Sort hunks to simplify diff.
Changes since RFC:
(Amir)
- Rename fanotify_check_path_fsid -> fanotify_test_fsid.
- Use dentry directly instead of path.
---
fs/notify/fanotify/fanotify_user.c | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 68a53d3534f8..67b18dfe0025 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1192,16 +1192,15 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
return fd;
}
-/* Check if filesystem can encode a unique fid */
-static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
+static int fanotify_test_fsid(struct dentry *dentry, __kernel_fsid_t *fsid)
{
__kernel_fsid_t root_fsid;
int err;
/*
- * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
+ * Make sure dentry is not of a filesystem with zero fsid (e.g. fuse).
*/
- err = vfs_get_fsid(path->dentry, fsid);
+ err = vfs_get_fsid(dentry, fsid);
if (err)
return err;
@@ -1209,10 +1208,10 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
return -ENODEV;
/*
- * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
+ * Make sure dentry is not of a filesystem subvolume (e.g. btrfs)
* which uses a different fsid than sb root.
*/
- err = vfs_get_fsid(path->dentry->d_sb->s_root, &root_fsid);
+ err = vfs_get_fsid(dentry->d_sb->s_root, &root_fsid);
if (err)
return err;
@@ -1220,6 +1219,12 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
root_fsid.val[1] != fsid->val[1])
return -EXDEV;
+ return 0;
+}
+
+/* Check if filesystem can encode a unique fid */
+static int fanotify_test_fid(struct dentry *dentry)
+{
/*
* We need to make sure that the file system supports at least
* encoding a file handle so user can use name_to_handle_at() to
@@ -1227,8 +1232,8 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
* objects. However, name_to_handle_at() requires that the
* filesystem also supports decoding file handles.
*/
- if (!path->dentry->d_sb->s_export_op ||
- !path->dentry->d_sb->s_export_op->fh_to_dentry)
+ if (!dentry->d_sb->s_export_op ||
+ !dentry->d_sb->s_export_op->fh_to_dentry)
return -EOPNOTSUPP;
return 0;
@@ -1379,7 +1384,11 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
}
if (fid_mode) {
- ret = fanotify_test_fid(&path, &__fsid);
+ ret = fanotify_test_fsid(path.dentry, &__fsid);
+ if (ret)
+ goto path_put_and_out;
+
+ ret = fanotify_test_fid(path.dentry);
if (ret)
goto path_put_and_out;
--
2.32.0
According to Amir:
"FS_IN_IGNORED is completely internal to inotify and there is no need
to set it in i_fsnotify_mask at all, so if we remove the bit from the
output of inotify_arg_to_mask() no functionality will change and we will
be able to overload the event bit for FS_ERROR."
This is done in preparation to overload FS_ERROR with the notification
mechanism in fanotify.
Suggested-by: Amir Goldstein <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Amir Goldstein <[email protected]>
---
fs/notify/inotify/inotify_user.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 98f61b31745a..4d17be6dd58d 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -89,10 +89,10 @@ static inline __u32 inotify_arg_to_mask(struct inode *inode, u32 arg)
__u32 mask;
/*
- * Everything should accept their own ignored and should receive events
- * when the inode is unmounted. All directories care about children.
+ * Everything should receive events when the inode is unmounted.
+ * All directories care about children.
*/
- mask = (FS_IN_IGNORED | FS_UNMOUNT);
+ mask = (FS_UNMOUNT);
if (S_ISDIR(inode->i_mode))
mask |= FS_EVENT_ON_CHILD;
--
2.32.0
Similarly to fanotify_is_perm_event and friends, provide a helper
predicate to say whether a mask is of an overflow event.
Suggested-by: Amir Goldstein <[email protected]>
Reviewed-by: Amir Goldstein <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/notify/fanotify/fanotify.h | 3 ++-
include/linux/fsnotify_backend.h | 5 +++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index fd125a949187..2e005b3a75f2 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -328,7 +328,8 @@ static inline struct path *fanotify_event_path(struct fanotify_event *event)
*/
static inline bool fanotify_is_hashed_event(u32 mask)
{
- return !fanotify_is_perm_event(mask) && !(mask & FS_Q_OVERFLOW);
+ return !(fanotify_is_perm_event(mask) ||
+ fsnotify_is_overflow_event(mask));
}
static inline unsigned int fanotify_event_hash_bucket(
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index c4473b467c28..f9e2c6cd0f7d 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -495,6 +495,11 @@ static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
fsnotify_add_event(group, group->overflow_event, NULL, NULL);
}
+static inline bool fsnotify_is_overflow_event(u32 mask)
+{
+ return mask & FS_Q_OVERFLOW;
+}
+
static inline bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group)
{
assert_spin_locked(&group->notification_lock);
--
2.32.0
Expose a new type of fsnotify event for filesystems to report errors for
userspace monitoring tools. fanotify will send this type of
notification for FAN_FS_ERROR events.
Reviewed-by: Amir Goldstein <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v2:
- FAN_ERROR->FAN_FS_ERROR (Amir)
Changes since v1:
- Overload FS_ERROR with FS_IN_IGNORED
- Implement support for this type on fsnotify_data_inode (Amir)
---
include/linux/fsnotify_backend.h | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 8222fe12a6c9..ea5f5c7cc381 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -42,6 +42,12 @@
#define FS_UNMOUNT 0x00002000 /* inode on umount fs */
#define FS_Q_OVERFLOW 0x00004000 /* Event queued overflowed */
+#define FS_ERROR 0x00008000 /* Filesystem Error (fanotify) */
+
+/*
+ * FS_IN_IGNORED overloads FS_ERROR. It is only used internally by inotify
+ * which does not support FS_ERROR.
+ */
#define FS_IN_IGNORED 0x00008000 /* last inotify event here */
#define FS_OPEN_PERM 0x00010000 /* open event in an permission hook */
@@ -95,7 +101,8 @@
#define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS | \
FS_EVENTS_POSS_ON_CHILD | \
FS_DELETE_SELF | FS_MOVE_SELF | FS_DN_RENAME | \
- FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED)
+ FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED | \
+ FS_ERROR)
/* Extra flags that may be reported with event or control handling of events */
#define ALL_FSNOTIFY_FLAGS (FS_EXCL_UNLINK | FS_ISDIR | FS_IN_ONESHOT | \
@@ -263,6 +270,12 @@ enum fsnotify_data_type {
FSNOTIFY_EVENT_NONE,
FSNOTIFY_EVENT_PATH,
FSNOTIFY_EVENT_INODE,
+ FSNOTIFY_EVENT_ERROR,
+};
+
+struct fs_error_report {
+ int error;
+ struct inode *inode;
};
static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
@@ -272,6 +285,8 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
return (struct inode *)data;
case FSNOTIFY_EVENT_PATH:
return d_inode(((const struct path *)data)->dentry);
+ case FSNOTIFY_EVENT_ERROR:
+ return ((struct fs_error_report *)data)->inode;
default:
return NULL;
}
--
2.32.0
FS_FAN_ERROR must be able to merge events even in the short window after
they've been unqueued but prior to being read. Move the list_empty
check into the merge hooks, such that merge() is always invoked if
existing.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/notify/fanotify/fanotify.c | 3 ++-
fs/notify/inotify/inotify_fsnotify.c | 3 +++
fs/notify/notification.c | 2 +-
3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index aba06b84da91..769703ef2b9a 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -168,7 +168,8 @@ static int fanotify_merge(struct fsnotify_group *group,
* the event structure we have created in fanotify_handle_event() is the
* one we should check for permission response.
*/
- if (fanotify_is_perm_event(new->mask))
+ if (list_empty(&group->notification_list) ||
+ fanotify_is_perm_event(new->mask))
return 0;
hlist_for_each_entry(old, hlist, merge_list) {
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index a003a64ff8ee..2f357b4933c2 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -52,6 +52,9 @@ static int inotify_merge(struct fsnotify_group *group,
struct list_head *list = &group->notification_list;
struct fsnotify_event *last_event;
+ if (list_empty(list))
+ return 0;
+
last_event = list_entry(list->prev, struct fsnotify_event, list);
return event_compare(last_event, event);
}
diff --git a/fs/notify/notification.c b/fs/notify/notification.c
index 0d9ba592d725..1d06e0728a24 100644
--- a/fs/notify/notification.c
+++ b/fs/notify/notification.c
@@ -111,7 +111,7 @@ int fsnotify_add_event(struct fsnotify_group *group,
goto queue;
}
- if (!list_empty(list) && merge) {
+ if (merge) {
ret = merge(group, event);
if (ret) {
spin_unlock(&group->notification_lock);
--
2.32.0
Introduce helpers for filesystems interested in reporting FS_ERROR
events. When notifying errors, the file system might not have an inode
to report on the error. To support this, allow the caller to specify
the superblock to which the error applies.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v2:
- Drop reference to s_fnotify_marks and guards (Amir)
Changes since v1:
- Use the inode argument (Amir)
- Protect s_fsnotify_marks with ifdef guard
---
fs/notify/fsnotify.c | 2 +-
include/linux/fsnotify.h | 13 +++++++++++++
include/linux/fsnotify_backend.h | 1 +
3 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 36205a769dde..ac05eb3fb368 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -491,7 +491,7 @@ int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
*/
parent = event_info->dir;
}
- sb = inode->i_sb;
+ sb = event_info->sb ?: inode->i_sb;
/*
* Optimization: srcu_read_lock() has a memory barrier which can
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index 8c2c681b4495..684c79ca01b2 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -326,4 +326,17 @@ static inline void fsnotify_change(struct dentry *dentry, unsigned int ia_valid)
fsnotify_dentry(dentry, mask);
}
+static inline int fsnotify_sb_error(struct super_block *sb, struct inode *inode,
+ int error)
+{
+ struct fs_error_report report = {
+ .error = error,
+ .inode = inode,
+ };
+
+ return __fsnotify(FS_ERROR, &(struct fsnotify_event_info) {
+ .data = &report, .data_type = FSNOTIFY_EVENT_ERROR,
+ .sb = sb});
+}
+
#endif /* _LINUX_FS_NOTIFY_H */
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index ea5f5c7cc381..5a32c5010f45 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -138,6 +138,7 @@ struct fsnotify_event_info {
struct inode *dir;
const struct qstr *name;
struct inode *inode;
+ struct super_block *sb;
u32 cookie;
};
--
2.32.0
Send a FS_ERROR message via fsnotify to a userspace monitoring tool
whenever a ext4 error condition is triggered. This follows the existing
error conditions in ext4, so it is hooked to the ext4_error* functions.
It also follows the current dmesg reporting in the format. The
filesystem message is composed mostly by the string that would be
otherwise printed in dmesg.
A new ext4 specific record format is exposed in the uapi, such that a
monitoring tool knows what to expect when listening errors of an ext4
filesystem.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Amir Goldstein <[email protected]>
---
fs/ext4/super.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d29f6aa7d96e..b5a8c4bab3ab 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -46,6 +46,7 @@
#include <linux/part_stat.h>
#include <linux/kthread.h>
#include <linux/freezer.h>
+#include <linux/fsnotify.h>
#include "ext4.h"
#include "ext4_extents.h" /* Needed for trace points definition */
@@ -752,6 +753,8 @@ void __ext4_error(struct super_block *sb, const char *function,
sb->s_id, function, line, current->comm, &vaf);
va_end(args);
}
+ fsnotify_sb_error(sb, NULL, error);
+
ext4_handle_error(sb, force_ro, error, 0, block, function, line);
}
@@ -782,6 +785,8 @@ void __ext4_error_inode(struct inode *inode, const char *function,
current->comm, &vaf);
va_end(args);
}
+ fsnotify_sb_error(inode->i_sb, inode, error);
+
ext4_handle_error(inode->i_sb, false, error, inode->i_ino, block,
function, line);
}
@@ -820,6 +825,8 @@ void __ext4_error_file(struct file *file, const char *function,
current->comm, path, &vaf);
va_end(args);
}
+ fsnotify_sb_error(inode->i_sb, inode, EFSCORRUPTED);
+
ext4_handle_error(inode->i_sb, false, EFSCORRUPTED, inode->i_ino, block,
function, line);
}
@@ -887,6 +894,7 @@ void __ext4_std_error(struct super_block *sb, const char *function,
printk(KERN_CRIT "EXT4-fs error (device %s) in %s:%d: %s\n",
sb->s_id, function, line, errstr);
}
+ fsnotify_sb_error(sb, sb->s_root->d_inode, errno);
ext4_handle_error(sb, false, -errno, 0, 0, function, line);
}
--
2.32.0
Introduce an example of a FAN_FS_ERROR fanotify user to track filesystem
errors.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v1:
- minor fixes
---
samples/Kconfig | 9 +++
samples/Makefile | 1 +
samples/fanotify/Makefile | 3 +
samples/fanotify/fs-monitor.c | 134 ++++++++++++++++++++++++++++++++++
4 files changed, 147 insertions(+)
create mode 100644 samples/fanotify/Makefile
create mode 100644 samples/fanotify/fs-monitor.c
diff --git a/samples/Kconfig b/samples/Kconfig
index b5a1a7aa7e23..f2f9c939035f 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -120,6 +120,15 @@ config SAMPLE_CONNECTOR
with it.
See also Documentation/driver-api/connector.rst
+config SAMPLE_FANOTIFY_ERROR
+ bool "Build fanotify error monitoring sample"
+ depends on FANOTIFY
+ help
+ When enabled, this builds an example code that uses the
+ FAN_FS_ERROR fanotify mechanism to monitor filesystem
+ errors.
+ See also Documentation/admin-guide/filesystem-monitoring.rst.
+
config SAMPLE_HIDRAW
bool "hidraw sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
diff --git a/samples/Makefile b/samples/Makefile
index 087e0988ccc5..931a81847c48 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -5,6 +5,7 @@ subdir-$(CONFIG_SAMPLE_AUXDISPLAY) += auxdisplay
subdir-$(CONFIG_SAMPLE_ANDROID_BINDERFS) += binderfs
obj-$(CONFIG_SAMPLE_CONFIGFS) += configfs/
obj-$(CONFIG_SAMPLE_CONNECTOR) += connector/
+obj-$(CONFIG_SAMPLE_FANOTIFY_ERROR) += fanotify/
subdir-$(CONFIG_SAMPLE_HIDRAW) += hidraw
obj-$(CONFIG_SAMPLE_HW_BREAKPOINT) += hw_breakpoint/
obj-$(CONFIG_SAMPLE_KDB) += kdb/
diff --git a/samples/fanotify/Makefile b/samples/fanotify/Makefile
new file mode 100644
index 000000000000..b3d5cc826e6f
--- /dev/null
+++ b/samples/fanotify/Makefile
@@ -0,0 +1,3 @@
+userprogs-always-y += fs-monitor
+
+userccflags += -I usr/include
diff --git a/samples/fanotify/fs-monitor.c b/samples/fanotify/fs-monitor.c
new file mode 100644
index 000000000000..f949ea00271d
--- /dev/null
+++ b/samples/fanotify/fs-monitor.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2021, Collabora Ltd.
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <err.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <sys/fanotify.h>
+#include <sys/types.h>
+#include <unistd.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+
+#ifndef FAN_FS_ERROR
+#define FAN_FS_ERROR 0x00008000
+#define FAN_EVENT_INFO_TYPE_ERROR 4
+
+int mount_fd;
+
+struct fanotify_event_info_error {
+ struct fanotify_event_info_header hdr;
+ __s32 error;
+ __u32 error_count;
+};
+#endif
+
+static void handle_notifications(char *buffer, int len)
+{
+ struct fanotify_event_metadata *metadata;
+ struct fanotify_event_info_error *error;
+ struct fanotify_event_info_fid *fid;
+ struct file_handle *file_handle;
+ int bad_file;
+ int ret;
+ struct stat stat;
+ char *next;
+
+ for (metadata = (struct fanotify_event_metadata *) buffer;
+ FAN_EVENT_OK(metadata, len);
+ metadata = FAN_EVENT_NEXT(metadata, len)) {
+ next = (char *)metadata + metadata->event_len;
+ if (metadata->mask != FAN_FS_ERROR) {
+ printf("unexpected FAN MARK: %llx\n", metadata->mask);
+ goto next_event;
+ } else if (metadata->fd != FAN_NOFD) {
+ printf("Unexpected fd (!= FAN_NOFD)\n");
+ goto next_event;
+ }
+
+ printf("FAN_FS_ERROR found len=%d\n", metadata->event_len);
+
+ error = (struct fanotify_event_info_error *) (metadata+1);
+ if (error->hdr.info_type != FAN_EVENT_INFO_TYPE_ERROR) {
+ printf("unknown record: %d (Expecting TYPE_ERROR)\n",
+ error->hdr.info_type);
+ goto next_event;
+ }
+
+ printf("\tGeneric Error Record: len=%d\n", error->hdr.len);
+ printf("\terror: %d\n", error->error);
+ printf("\terror_count: %d\n", error->error_count);
+
+ fid = (struct fanotify_event_info_fid *) (error + 1);
+
+ if ((char *) fid >= next) {
+ printf("Event doesn't have FID\n");
+ goto next_event;
+ }
+ printf("FID record found\n");
+
+ if (fid->hdr.info_type != FAN_EVENT_INFO_TYPE_FID) {
+ printf("unknown record: %d (Expecting TYPE_FID)\n",
+ fid->hdr.info_type);
+ goto next_event;
+ }
+ printf("\tfsid: %x%x\n", fid->fsid.val[0], fid->fsid.val[1]);
+
+
+ file_handle = (struct file_handle *) &fid->handle;
+ bad_file = open_by_handle_at(mount_fd, file_handle, O_PATH);
+ if (bad_file < 0) {
+ printf("open_by_handle_at %d\n", errno);
+ goto next_event;
+ }
+
+ ret = fstat(bad_file, &stat);
+ if (ret < 0)
+ printf("fstat %d\n", errno);
+
+ printf("\tinode=%ld\n", stat.st_ino);
+
+next_event:
+ printf("---\n\n");
+ }
+}
+
+int main(int argc, char **argv)
+{
+ int fd;
+ char buffer[BUFSIZ];
+
+ if (argc < 2) {
+ printf("Missing path argument\n");
+ return 1;
+ }
+
+ mount_fd = open(argv[1], O_RDONLY);
+ if (mount_fd < 0)
+ errx(1, "mount_fd");
+
+ fd = fanotify_init(FAN_CLASS_NOTIF|FAN_REPORT_FID, O_RDONLY);
+ if (fd < 0)
+ errx(1, "fanotify_init");
+
+ if (fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM,
+ FAN_FS_ERROR, AT_FDCWD, argv[1])) {
+ errx(1, "fanotify_mark");
+ }
+
+ while (1) {
+ int n = read(fd, buffer, BUFSIZ);
+
+ if (n < 0)
+ errx(1, "read");
+
+ handle_notifications(buffer, n);
+ }
+
+ return 0;
+}
--
2.32.0
Document the FAN_FS_ERROR event for user administrators and user space
developers.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v1:
- Drop references to location record
- Explain that the inode field is optional
- Explain we are reporting only the first error
---
.../admin-guide/filesystem-monitoring.rst | 70 +++++++++++++++++++
Documentation/admin-guide/index.rst | 1 +
2 files changed, 71 insertions(+)
create mode 100644 Documentation/admin-guide/filesystem-monitoring.rst
diff --git a/Documentation/admin-guide/filesystem-monitoring.rst b/Documentation/admin-guide/filesystem-monitoring.rst
new file mode 100644
index 000000000000..c0ab1ad268b8
--- /dev/null
+++ b/Documentation/admin-guide/filesystem-monitoring.rst
@@ -0,0 +1,70 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+File system Monitoring with fanotify
+====================================
+
+fanotify supports the FAN_FS_ERROR mark for file system-wide error
+reporting. It is meant to be used by file system health monitoring
+daemons who listen on that interface and take actions (notify sysadmin,
+start recovery) when a file system problem is detected by the kernel.
+
+By design, A FAN_FS_ERROR notification exposes sufficient information for a
+monitoring tool to know a problem in the file system has happened. It
+doesn't necessarily provide a user space application with semantics to
+verify an IO operation was successfully executed. That is outside of
+scope of this feature. Instead, it is only meant as a framework for
+early file system problem detection and reporting recovery tools.
+
+When a file system operation fails, it is common for dozens of kernel
+errors to cascade after the initial failure, hiding the original failure
+log, which is usually the most useful debug data to troubleshoot the
+problem. For this reason, FAN_FS_ERROR only reports the first error that
+occurred since the last notification, and it simply counts addition
+errors. This ensures that the most important piece of error information
+is never lost.
+
+FAN_FS_ERROR requires the fanotify group to be setup with the
+FAN_REPORT_FID flag.
+
+At the time of this writing, the only file system that emits FAN_FS_ERROR
+notifications is Ext4.
+
+A user space example code is provided at ``samples/fanotify/fs-monitor.c``.
+
+Notification structure
+======================
+
+A FAN_FS_ERROR Notification has the following format::
+
+ [ Notification Metadata (Mandatory) ]
+ [ Generic Error Record (Mandatory) ]
+ [ FID record (Mandatory) ]
+
+Generic error record
+--------------------
+
+The generic error record provides enough information for a file system
+agnostic tool to learn about a problem in the file system, without
+providing any additional details about the problem. This record is
+identified by ``struct fanotify_event_info_header.info_type`` being set
+to FAN_EVENT_INFO_TYPE_ERROR.
+
+ struct fanotify_event_info_error {
+ struct fanotify_event_info_header hdr;
+ __s32 error;
+ __u32 error_count;
+ };
+
+The `error` field identifies the type of error. `error_count` count
+tracks the number of errors that occurred and were suppressed to
+preserve the original error, since the last notification.
+
+FID record
+----------
+
+The FID record can be used to uniquely identify the inode that triggered
+the error through the combination of fsid and file handler. A
+filesystem specific handler can use that information to attempt a
+recovery procedure. Errors that are not related to an inode are
+reported against the root inode.
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index dc00afcabb95..1bedab498104 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -82,6 +82,7 @@ configure specific aspects of kernel behavior to your liking.
edid
efi-stub
ext4
+ filesystem-monitoring
nfs/index
gpio/index
highuid
--
2.32.0
The FAN_FS_ERROR event is a new inode event used by filesystem wide
monitoring tools to receive notifications of type FS_ERROR_EVENT,
emitted directly by filesystems when a problem is detected. The error
notification includes a generic error descriptor and a FID identifying
the file affected.
FID is sent for every FAN_FS_ERROR. Errors not linked to a regular inode
are reported against the root inode.
An error reporting structure is attached per-mark, and only a single
error can be stored at a time. This is ok, since once an error occurs,
it is common for a stream of related errors to be reported. We only log
accumulate the total of errors occurred since the last notification.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v2:
- Support and equire FID mode (amir)
- Goto error path instead of early return (amir)
- Simplify get_one_event (me)
- Base merging on error_count
- drop fanotify_queue_error_event
Changes since v1:
- Pass dentry to fanotify_check_fsid (Amir)
- FANOTIFY_EVENT_TYPE_ERROR -> FANOTIFY_EVENT_TYPE_FS_ERROR
- Merge previous patch into it
- Use a single slot
- Move fanotify_mark.error_event definition to this commit
- Rename FAN_ERROR -> FAN_FS_ERROR
- Restrict FAN_FS_ERROR to FAN_MARK_FILESYSTEM
---
fs/notify/fanotify/fanotify.c | 108 +++++++++++++++++++++++------
fs/notify/fanotify/fanotify.h | 53 ++++++++++++++
fs/notify/fanotify/fanotify_user.c | 97 +++++++++++++++++++++++++-
include/linux/fanotify.h | 11 ++-
include/uapi/linux/fanotify.h | 8 +++
5 files changed, 253 insertions(+), 24 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 769703ef2b9a..854adcbeb95d 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -336,23 +336,6 @@ static u32 fanotify_group_event_mask(
return test_mask & user_mask;
}
-/*
- * Check size needed to encode fanotify_fh.
- *
- * Return size of encoded fh without fanotify_fh header.
- * Return 0 on failure to encode.
- */
-static int fanotify_encode_fh_len(struct inode *inode)
-{
- int dwords = 0;
-
- if (!inode)
- return 0;
-
- exportfs_encode_inode_fh(inode, NULL, &dwords, NULL);
-
- return dwords << 2;
-}
/*
* Encode fanotify_fh.
@@ -405,8 +388,12 @@ static int fanotify_encode_fh(struct fanotify_fh *fh, struct inode *inode,
fh->type = type;
fh->len = fh_len;
- /* Mix fh into event merge key */
- *hash ^= fanotify_hash_fh(fh);
+ /*
+ * Mix fh into event merge key. Hash might be NULL in case of
+ * unhashed FID events (i.e. FAN_FS_ERROR).
+ */
+ if (hash)
+ *hash ^= fanotify_hash_fh(fh);
return FANOTIFY_FH_HDR_LEN + fh_len;
@@ -692,6 +679,60 @@ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
return fsid;
}
+static int fanotify_merge_error_event(struct fsnotify_group *group,
+ struct fsnotify_event *event)
+{
+ struct fanotify_event *fae = FANOTIFY_E(event);
+
+ /*
+ * When err_count > 0, the reporting slot is full. Just account
+ * the additional error and abort the insertion.
+ */
+ if (atomic_fetch_inc(&FANOTIFY_EE(fae)->err_count) != 0)
+ return 1;
+
+ return 0;
+}
+
+static void fanotify_insert_error_event(struct fsnotify_group *group,
+ struct fsnotify_event *event,
+ const void *data)
+{
+ const struct fsnotify_event_info *ei =
+ (struct fsnotify_event_info *) data;
+ struct fanotify_event *fae = FANOTIFY_E(event);
+ const struct fs_error_report *report;
+ struct fanotify_error_event *fee;
+ struct inode *inode;
+ int fh_len;
+
+ /* This might be an unexpected type of event (i.e. overflow). */
+ if (!fanotify_is_error_event(fae->mask))
+ return;
+
+ report = (struct fs_error_report *) ei->data;
+ inode = report->inode ?: ei->sb->s_root->d_inode;
+
+ fee = FANOTIFY_EE(fae);
+ fee->fae.type = FANOTIFY_EVENT_TYPE_FS_ERROR;
+ fee->error = report->error;
+ fee->fsid = fee->mark->connector->fsid;
+
+ fsnotify_get_mark(fee->mark);
+
+ /*
+ * Error reporting needs to happen in atomic context. If this
+ * inode's file handler is more than we initially predicted,
+ * there is nothing better we can do than report the error with
+ * a bad FH.
+ */
+ fh_len = fanotify_encode_fh_len(inode);
+ if (WARN_ON(fh_len > fee->max_fh_len))
+ return;
+
+ fanotify_encode_fh(&fee->object_fh, inode, fh_len, NULL, 0);
+}
+
/*
* Add an event to hash table for faster merge.
*/
@@ -742,8 +783,9 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
BUILD_BUG_ON(FAN_ONDIR != FS_ISDIR);
BUILD_BUG_ON(FAN_OPEN_EXEC != FS_OPEN_EXEC);
BUILD_BUG_ON(FAN_OPEN_EXEC_PERM != FS_OPEN_EXEC_PERM);
+ BUILD_BUG_ON(FAN_FS_ERROR != FS_ERROR);
- BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
+ BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 20);
mask = fanotify_group_event_mask(group, mask, event_info, iter_info);
if (!mask)
@@ -767,6 +809,17 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
return 0;
}
+ if (fanotify_is_error_event(mask)) {
+ struct fanotify_sb_mark *sb_mark =
+ FANOTIFY_SB_MARK(fsnotify_iter_sb_mark(iter_info));
+
+ ret = fsnotify_add_event(group, &sb_mark->error_event->fae.fse,
+ fanotify_merge_error_event,
+ fanotify_insert_error_event,
+ event_info);
+ goto finish;
+ }
+
event = fanotify_alloc_event(group, mask, event_info, &fsid);
ret = -ENOMEM;
if (unlikely(!event)) {
@@ -834,6 +887,17 @@ static void fanotify_free_name_event(struct fanotify_event *event)
kfree(FANOTIFY_NE(event));
}
+static void fanotify_free_error_event(struct fanotify_event *event)
+{
+ /*
+ * Just drop the reference acquired by
+ * fanotify_insert_error_event.
+ *
+ * The actual memory is freed with the mark.
+ */
+ fsnotify_put_mark(FANOTIFY_EE(event)->mark);
+}
+
static void fanotify_free_event(struct fsnotify_event *fsn_event)
{
struct fanotify_event *event;
@@ -856,6 +920,9 @@ static void fanotify_free_event(struct fsnotify_event *fsn_event)
case FANOTIFY_EVENT_TYPE_OVERFLOW:
kfree(event);
break;
+ case FANOTIFY_EVENT_TYPE_FS_ERROR:
+ fanotify_free_error_event(event);
+ break;
default:
WARN_ON_ONCE(1);
}
@@ -873,6 +940,7 @@ static void fanotify_free_mark(struct fsnotify_mark *mark)
if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);
+ kfree(fa_mark->error_event);
kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
} else {
kmem_cache_free(fanotify_mark_cache, mark);
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 2e005b3a75f2..0259d631b8d8 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -132,6 +132,7 @@ static inline void fanotify_info_copy_name(struct fanotify_info *info,
struct fanotify_sb_mark {
struct fsnotify_mark fsn_mark;
+ struct fanotify_error_event *error_event;
};
static inline
@@ -154,6 +155,7 @@ enum fanotify_event_type {
FANOTIFY_EVENT_TYPE_PATH,
FANOTIFY_EVENT_TYPE_PATH_PERM,
FANOTIFY_EVENT_TYPE_OVERFLOW, /* struct fanotify_event */
+ FANOTIFY_EVENT_TYPE_FS_ERROR, /* struct fanotify_error_event */
__FANOTIFY_EVENT_TYPE_NUM
};
@@ -209,12 +211,37 @@ FANOTIFY_NE(struct fanotify_event *event)
return container_of(event, struct fanotify_name_event, fae);
}
+struct fanotify_error_event {
+ struct fanotify_event fae;
+ s32 error; /* Error reported by the Filesystem. */
+ atomic_t err_count; /* Suppressed errors count */
+ __kernel_fsid_t fsid; /* FSID this error refers to. */
+
+ struct fsnotify_mark *mark; /* Back reference to the mark. */
+ int max_fh_len; /* Maximum object_fh buffer size. */
+
+ /*
+ * object_fh is followed by a variable sized buffer, so it must
+ * be the last element of this structure.
+ */
+ struct fanotify_fh object_fh;
+};
+
+
+static inline struct fanotify_error_event *
+FANOTIFY_EE(struct fanotify_event *event)
+{
+ return container_of(event, struct fanotify_error_event, fae);
+}
+
static inline __kernel_fsid_t *fanotify_event_fsid(struct fanotify_event *event)
{
if (event->type == FANOTIFY_EVENT_TYPE_FID)
return &FANOTIFY_FE(event)->fsid;
else if (event->type == FANOTIFY_EVENT_TYPE_FID_NAME)
return &FANOTIFY_NE(event)->fsid;
+ else if (event->type == FANOTIFY_EVENT_TYPE_FS_ERROR)
+ return &FANOTIFY_EE(event)->fsid;
else
return NULL;
}
@@ -226,6 +253,8 @@ static inline struct fanotify_fh *fanotify_event_object_fh(
return &FANOTIFY_FE(event)->object_fh;
else if (event->type == FANOTIFY_EVENT_TYPE_FID_NAME)
return fanotify_info_file_fh(&FANOTIFY_NE(event)->info);
+ else if (event->type == FANOTIFY_EVENT_TYPE_FS_ERROR)
+ return &FANOTIFY_EE(event)->object_fh;
else
return NULL;
}
@@ -300,6 +329,11 @@ static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
return container_of(fse, struct fanotify_event, fse);
}
+static inline bool fanotify_is_error_event(u32 mask)
+{
+ return mask & FANOTIFY_ERROR_EVENTS;
+}
+
static inline bool fanotify_event_has_path(struct fanotify_event *event)
{
return event->type == FANOTIFY_EVENT_TYPE_PATH ||
@@ -329,6 +363,7 @@ static inline struct path *fanotify_event_path(struct fanotify_event *event)
static inline bool fanotify_is_hashed_event(u32 mask)
{
return !(fanotify_is_perm_event(mask) ||
+ fanotify_is_error_event(mask) ||
fsnotify_is_overflow_event(mask));
}
@@ -338,3 +373,21 @@ static inline unsigned int fanotify_event_hash_bucket(
{
return event->hash & FANOTIFY_HTABLE_MASK;
}
+
+/*
+ * Check size needed to encode fanotify_fh.
+ *
+ * Return size of encoded fh without fanotify_fh header.
+ * Return 0 on failure to encode.
+ */
+static inline int fanotify_encode_fh_len(struct inode *inode)
+{
+ int dwords = 0;
+
+ if (!inode)
+ return 0;
+
+ exportfs_encode_inode_fh(inode, NULL, &dwords, NULL);
+
+ return dwords << 2;
+}
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index a42521e140e6..8bcce26e21df 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -107,6 +107,8 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
#define FANOTIFY_EVENT_ALIGN 4
#define FANOTIFY_INFO_HDR_LEN \
(sizeof(struct fanotify_event_info_fid) + sizeof(struct file_handle))
+#define FANOTIFY_INFO_ERROR_LEN \
+ (sizeof(struct fanotify_event_info_error))
static int fanotify_fid_info_len(int fh_len, int name_len)
{
@@ -130,6 +132,9 @@ static size_t fanotify_event_len(struct fanotify_event *event,
if (!fid_mode)
return event_len;
+ if (fanotify_is_error_event(event->mask))
+ event_len += FANOTIFY_INFO_ERROR_LEN;
+
info = fanotify_event_info(event);
dir_fh_len = fanotify_event_dir_fh_len(event);
fh_len = fanotify_event_object_fh_len(event);
@@ -310,6 +315,30 @@ static int process_access_response(struct fsnotify_group *group,
return -ENOENT;
}
+static size_t copy_error_info_to_user(struct fanotify_event *event,
+ char __user *buf, int count)
+{
+ struct fanotify_event_info_error info;
+ struct fanotify_error_event *fee = FANOTIFY_EE(event);
+
+ info.hdr.info_type = FAN_EVENT_INFO_TYPE_ERROR;
+ info.hdr.pad = 0;
+ info.hdr.len = sizeof(struct fanotify_event_info_error);
+
+ if (WARN_ON(count < info.hdr.len))
+ return -EFAULT;
+
+ info.error = fee->error;
+
+ /* This effectively releases the event for logging another error */
+ info.error_count = atomic_xchg(&fee->err_count, 0);
+
+ if (copy_to_user(buf, &info, sizeof(info)))
+ return -EFAULT;
+
+ return info.hdr.len;
+}
+
static int copy_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
int info_type, const char *name, size_t name_len,
char __user *buf, size_t count)
@@ -468,6 +497,14 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
if (f)
fd_install(fd, f);
+ if (fanotify_is_error_event(event->mask)) {
+ ret = copy_error_info_to_user(event, buf, count);
+ if (ret < 0)
+ goto out_close_fd;
+ buf += ret;
+ count -= ret;
+ }
+
/* Event info records order is: dir fid + name, child fid */
if (fanotify_event_dir_fh_len(event)) {
info_type = info->name_len ? FAN_EVENT_INFO_TYPE_DFID_NAME :
@@ -896,6 +933,43 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
flags, umask);
}
+static int fanotify_create_fs_error_event(struct fsnotify_mark *fsn_mark,
+ fsnotify_connp_t *connp)
+{
+ struct fanotify_sb_mark *sb_mark = FANOTIFY_SB_MARK(fsn_mark);
+ struct super_block *sb =
+ container_of(connp, struct super_block, s_fsnotify_marks);
+ struct fanotify_error_event *fee;
+ int fh_len;
+
+ /*
+ * Since the allocation is done holding group->mark_mutex, the
+ * error event allocation is guaranteed not to race with itself.
+ */
+ if (READ_ONCE(sb_mark->error_event))
+ return 0;
+
+ /* Since, for error events, every memory must be preallocated,
+ * the FH buffer size is predicted to be the same as the root
+ * inode file handler size. This should work for file systems
+ * without variable sized FH.
+ */
+ fh_len = fanotify_encode_fh_len(sb->s_root->d_inode);
+
+ fee = kzalloc(sizeof(*fee) + fh_len, GFP_KERNEL);
+ if (!fee)
+ return -ENOMEM;
+
+ fanotify_init_event(&fee->fae, 0, FS_ERROR);
+ fee->mark = fsn_mark;
+ fee->max_fh_len = fh_len;
+
+ WRITE_ONCE(sb_mark->error_event, fee);
+
+ return 0;
+
+}
+
static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
__u32 mask,
unsigned int flags)
@@ -994,6 +1068,7 @@ static int fanotify_add_mark(struct fsnotify_group *group,
{
struct fsnotify_mark *fsn_mark;
__u32 added;
+ int ret = 0;
mutex_lock(&group->mark_mutex);
fsn_mark = fsnotify_find_mark(connp, group);
@@ -1004,13 +1079,28 @@ static int fanotify_add_mark(struct fsnotify_group *group,
return PTR_ERR(fsn_mark);
}
}
+
+ /*
+ * Error events are allocated per super-block mark, but only if
+ * strictly needed (i.e. FAN_FS_ERROR was requested).
+ */
+ if (type == FSNOTIFY_OBJ_TYPE_SB && !(flags & FAN_MARK_IGNORED_MASK) &&
+ (mask & FAN_FS_ERROR)) {
+ if (fanotify_create_fs_error_event(fsn_mark, connp)) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ }
+
added = fanotify_mark_add_to_mask(fsn_mark, mask, flags);
if (added & ~fsnotify_conn_mask(fsn_mark->connector))
fsnotify_recalc_mask(fsn_mark->connector);
+
+out:
mutex_unlock(&group->mark_mutex);
fsnotify_put_mark(fsn_mark);
- return 0;
+ return ret;
}
static int fanotify_add_vfsmount_mark(struct fsnotify_group *group,
@@ -1427,6 +1517,11 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
fsid = &__fsid;
}
+ if (mask & FAN_FS_ERROR && mark_type != FAN_MARK_FILESYSTEM) {
+ ret = -EINVAL;
+ goto path_put_and_out;
+ }
+
/* inode held in place by reference to path; group by fget on fd */
if (mark_type == FAN_MARK_INODE)
inode = path.dentry->d_inode;
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index a16dbeced152..d086a19aff63 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -81,13 +81,17 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
*/
#define FANOTIFY_DIRENT_EVENTS (FAN_MOVE | FAN_CREATE | FAN_DELETE)
-/* Events that can only be reported with data type FSNOTIFY_EVENT_INODE */
+#define FANOTIFY_ERROR_EVENTS (FAN_FS_ERROR)
+
+/* Events that can only be reported to groups that support FID mode */
#define FANOTIFY_INODE_EVENTS (FANOTIFY_DIRENT_EVENTS | \
- FAN_ATTRIB | FAN_MOVE_SELF | FAN_DELETE_SELF)
+ FAN_ATTRIB | FAN_MOVE_SELF | \
+ FAN_DELETE_SELF | FAN_FS_ERROR)
/* Events that user can request to be notified on */
#define FANOTIFY_EVENTS (FANOTIFY_PATH_EVENTS | \
- FANOTIFY_INODE_EVENTS)
+ FANOTIFY_INODE_EVENTS | \
+ FANOTIFY_ERROR_EVENTS)
/* Events that require a permission response from user */
#define FANOTIFY_PERM_EVENTS (FAN_OPEN_PERM | FAN_ACCESS_PERM | \
@@ -99,6 +103,7 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
/* Events that may be reported to user */
#define FANOTIFY_OUTGOING_EVENTS (FANOTIFY_EVENTS | \
FANOTIFY_PERM_EVENTS | \
+ FANOTIFY_ERROR_EVENTS | \
FAN_Q_OVERFLOW | FAN_ONDIR)
#define ALL_FANOTIFY_EVENT_BITS (FANOTIFY_OUTGOING_EVENTS | \
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index fbf9c5c7dd59..80040a92e9d9 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -20,6 +20,7 @@
#define FAN_OPEN_EXEC 0x00001000 /* File was opened for exec */
#define FAN_Q_OVERFLOW 0x00004000 /* Event queued overflowed */
+#define FAN_FS_ERROR 0x00008000 /* Filesystem error */
#define FAN_OPEN_PERM 0x00010000 /* File open in perm check */
#define FAN_ACCESS_PERM 0x00020000 /* File accessed in perm check */
@@ -123,6 +124,7 @@ struct fanotify_event_metadata {
#define FAN_EVENT_INFO_TYPE_FID 1
#define FAN_EVENT_INFO_TYPE_DFID_NAME 2
#define FAN_EVENT_INFO_TYPE_DFID 3
+#define FAN_EVENT_INFO_TYPE_ERROR 4
/* Variable length info record following event metadata */
struct fanotify_event_info_header {
@@ -148,6 +150,12 @@ struct fanotify_event_info_fid {
unsigned char handle[0];
};
+struct fanotify_event_info_error {
+ struct fanotify_event_info_header hdr;
+ __s32 error;
+ __u32 error_count;
+};
+
struct fanotify_response {
__s32 fd;
__u32 response;
--
2.32.0
FAN_ERROR will require an error structure to be stored per mark. But,
since FAN_ERROR doesn't apply to inode/mount marks, it should suffice to
only expose this information for superblock marks. Therefore, wrap this
kind of marks into a container and plumb it for the future.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v2:
- Move mark initialization to fanotify_alloc_mark (Amir)
Changes since v1:
- Only extend superblock marks (Amir)
---
fs/notify/fanotify/fanotify.c | 10 ++++++--
fs/notify/fanotify/fanotify.h | 13 ++++++++++
fs/notify/fanotify/fanotify_user.c | 38 ++++++++++++++++++++++++++++--
include/linux/fsnotify_backend.h | 1 +
4 files changed, 58 insertions(+), 4 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 310246f8d3f1..0f760770d4c9 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -869,9 +869,15 @@ static void fanotify_freeing_mark(struct fsnotify_mark *mark,
dec_ucount(group->fanotify_data.ucounts, UCOUNT_FANOTIFY_MARKS);
}
-static void fanotify_free_mark(struct fsnotify_mark *fsn_mark)
+static void fanotify_free_mark(struct fsnotify_mark *mark)
{
- kmem_cache_free(fanotify_mark_cache, fsn_mark);
+ if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
+ struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);
+
+ kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
+ } else {
+ kmem_cache_free(fanotify_mark_cache, mark);
+ }
}
const struct fsnotify_ops fanotify_fsnotify_ops = {
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 4a5e555dc3d2..fd125a949187 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -6,6 +6,7 @@
#include <linux/hashtable.h>
extern struct kmem_cache *fanotify_mark_cache;
+extern struct kmem_cache *fanotify_sb_mark_cache;
extern struct kmem_cache *fanotify_fid_event_cachep;
extern struct kmem_cache *fanotify_path_event_cachep;
extern struct kmem_cache *fanotify_perm_event_cachep;
@@ -129,6 +130,18 @@ static inline void fanotify_info_copy_name(struct fanotify_info *info,
name->name);
}
+struct fanotify_sb_mark {
+ struct fsnotify_mark fsn_mark;
+};
+
+static inline
+struct fanotify_sb_mark *FANOTIFY_SB_MARK(struct fsnotify_mark *mark)
+{
+ WARN_ON(!(mark->flags & FSNOTIFY_MARK_FLAG_SB));
+
+ return container_of(mark, struct fanotify_sb_mark, fsn_mark);
+}
+
/*
* Common structure for fanotify events. Concrete structs are allocated in
* fanotify_handle_event() and freed when the information is retrieved by
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 67b18dfe0025..a42521e140e6 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -99,6 +99,7 @@ struct ctl_table fanotify_table[] = {
extern const struct fsnotify_ops fanotify_fsnotify_ops;
struct kmem_cache *fanotify_mark_cache __read_mostly;
+struct kmem_cache *fanotify_sb_mark_cache __read_mostly;
struct kmem_cache *fanotify_fid_event_cachep __read_mostly;
struct kmem_cache *fanotify_path_event_cachep __read_mostly;
struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
@@ -915,6 +916,38 @@ static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
return mask & ~oldmask;
}
+static struct fsnotify_mark *fanotify_alloc_mark(struct fsnotify_group *group,
+ unsigned int type)
+{
+ struct fanotify_sb_mark *sb_mark;
+ struct fsnotify_mark *mark;
+
+ switch (type) {
+ case FSNOTIFY_OBJ_TYPE_SB:
+ sb_mark = kmem_cache_zalloc(fanotify_sb_mark_cache, GFP_KERNEL);
+ if (!sb_mark)
+ return NULL;
+ mark = &sb_mark->fsn_mark;
+ break;
+
+ case FSNOTIFY_OBJ_TYPE_INODE:
+ case FSNOTIFY_OBJ_TYPE_PARENT:
+ case FSNOTIFY_OBJ_TYPE_VFSMOUNT:
+ mark = kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
+ break;
+ default:
+ WARN_ON(1);
+ return NULL;
+ }
+
+ fsnotify_init_mark(mark, group);
+
+ if (type == FSNOTIFY_OBJ_TYPE_SB)
+ mark->flags |= FSNOTIFY_MARK_FLAG_SB;
+
+ return mark;
+}
+
static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
fsnotify_connp_t *connp,
unsigned int type,
@@ -933,13 +966,12 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
!inc_ucount(ucounts->ns, ucounts->uid, UCOUNT_FANOTIFY_MARKS))
return ERR_PTR(-ENOSPC);
- mark = kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
+ mark = fanotify_alloc_mark(group, type);
if (!mark) {
ret = -ENOMEM;
goto out_dec_ucounts;
}
- fsnotify_init_mark(mark, group);
ret = fsnotify_add_mark_locked(mark, connp, type, 0, fsid);
if (ret) {
fsnotify_put_mark(mark);
@@ -1497,6 +1529,8 @@ static int __init fanotify_user_setup(void)
fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
SLAB_PANIC|SLAB_ACCOUNT);
+ fanotify_sb_mark_cache = KMEM_CACHE(fanotify_sb_mark,
+ SLAB_PANIC|SLAB_ACCOUNT);
fanotify_fid_event_cachep = KMEM_CACHE(fanotify_fid_event,
SLAB_PANIC);
fanotify_path_event_cachep = KMEM_CACHE(fanotify_path_event,
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index 1ce66748a2d2..c4473b467c28 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -401,6 +401,7 @@ struct fsnotify_mark {
#define FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY 0x01
#define FSNOTIFY_MARK_FLAG_ALIVE 0x02
#define FSNOTIFY_MARK_FLAG_ATTACHED 0x04
+#define FSNOTIFY_MARK_FLAG_SB 0x08
unsigned int flags; /* flags [mark->lock] */
};
--
2.32.0
Some events, like the overflow event, are not mergeable, so they are not
hashed. But, when failing inside fsnotify_add_event for lack of space,
fsnotify_add_event() still calls the insert hook, which adds the
overflow event to the merge list. Add a check to prevent any kind of
unmergeable event to be inserted in the hashtable.
Fixes: 94e00d28a680 ("fsnotify: use hash table for faster events merge")
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
Changes since v2:
- Do check for hashed events inside the insert hook (Amir)
---
fs/notify/fanotify/fanotify.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 057abd2cf887..310246f8d3f1 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -702,6 +702,9 @@ static void fanotify_insert_event(struct fsnotify_group *group,
assert_spin_locked(&group->notification_lock);
+ if (!fanotify_is_hashed_event(event->mask))
+ return;
+
pr_debug("%s: group=%p event=%p bucket=%u\n", __func__,
group, event, bucket);
@@ -779,8 +782,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
fsn_event = &event->fse;
ret = fsnotify_add_event(group, fsn_event, fanotify_merge,
- fanotify_is_hashed_event(mask) ?
- fanotify_insert_event : NULL);
+ fanotify_insert_event);
if (ret) {
/* Permission events shouldn't be merged */
BUG_ON(ret == 1 && mask & FANOTIFY_PERM_EVENTS);
--
2.32.0
From: Amir Goldstein <[email protected]>
There are a lot of arguments to fsnotify() and the handle_event() method.
Pass them in a const struct instead of on the argument list.
Apart from being more tidy, this helps with passing error reports to the
backend. __fsnotify_parent() argument list was intentionally left
untouched, because its argument list is still short enough and because
most of the event info arguments are initialized inside
__fsnotify_parent().
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/notify/fanotify/fanotify.c | 59 +++++++++++------------
fs/notify/fsnotify.c | 83 +++++++++++++++++---------------
include/linux/fsnotify.h | 15 ++++--
include/linux/fsnotify_backend.h | 73 +++++++++++++++++++++-------
4 files changed, 140 insertions(+), 90 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 0f760770d4c9..4f2febb15e94 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -253,21 +253,22 @@ static int fanotify_get_response(struct fsnotify_group *group,
* been included within the event mask, but have not been explicitly
* requested by the user, will not be present in the returned mask.
*/
-static u32 fanotify_group_event_mask(struct fsnotify_group *group,
- struct fsnotify_iter_info *iter_info,
- u32 event_mask, const void *data,
- int data_type, struct inode *dir)
+static u32 fanotify_group_event_mask(
+ struct fsnotify_group *group, u32 event_mask,
+ const struct fsnotify_event_info *event_info,
+ struct fsnotify_iter_info *iter_info)
{
__u32 marks_mask = 0, marks_ignored_mask = 0;
__u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS |
FANOTIFY_EVENT_FLAGS;
- const struct path *path = fsnotify_data_path(data, data_type);
+ const struct path *path = fsnotify_event_info_path(event_info);
unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
struct fsnotify_mark *mark;
int type;
pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
- __func__, iter_info->report_mask, event_mask, data, data_type);
+ __func__, iter_info->report_mask, event_mask,
+ event_info->data, event_info->data_type);
if (!fid_mode) {
/* Do we have path to open a file descriptor? */
@@ -278,7 +279,7 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
return 0;
} else if (!(fid_mode & FAN_REPORT_FID)) {
/* Do we have a directory inode to report? */
- if (!dir && !(event_mask & FS_ISDIR))
+ if (!event_info->dir && !(event_mask & FS_ISDIR))
return 0;
}
@@ -427,13 +428,13 @@ static int fanotify_encode_fh(struct fanotify_fh *fh, struct inode *inode,
* FS_ATTRIB reports the child inode even if reported on a watched parent.
* FS_CREATE reports the modified dir inode and not the created inode.
*/
-static struct inode *fanotify_fid_inode(u32 event_mask, const void *data,
- int data_type, struct inode *dir)
+static struct inode *fanotify_fid_inode(u32 event_mask,
+ const struct fsnotify_event_info *event_info)
{
if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
- return dir;
+ return event_info->dir;
- return fsnotify_data_inode(data, data_type);
+ return fsnotify_event_info_inode(event_info);
}
/*
@@ -444,18 +445,18 @@ static struct inode *fanotify_fid_inode(u32 event_mask, const void *data,
* reported to parent.
* Otherwise, do not report dir fid.
*/
-static struct inode *fanotify_dfid_inode(u32 event_mask, const void *data,
- int data_type, struct inode *dir)
+static struct inode *fanotify_dfid_inode(u32 event_mask,
+ const struct fsnotify_event_info *event_info)
{
- struct inode *inode = fsnotify_data_inode(data, data_type);
+ struct inode *inode = fsnotify_event_info_inode(event_info);
if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
- return dir;
+ return event_info->dir;
if (S_ISDIR(inode->i_mode))
return inode;
- return dir;
+ return event_info->dir;
}
static struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
@@ -563,17 +564,17 @@ static struct fanotify_event *fanotify_alloc_name_event(struct inode *id,
return &fne->fae;
}
-static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
- u32 mask, const void *data,
- int data_type, struct inode *dir,
- const struct qstr *file_name,
- __kernel_fsid_t *fsid)
+static struct fanotify_event *fanotify_alloc_event(
+ struct fsnotify_group *group, u32 mask,
+ const struct fsnotify_event_info *event_info,
+ __kernel_fsid_t *fsid)
{
struct fanotify_event *event = NULL;
gfp_t gfp = GFP_KERNEL_ACCOUNT;
- struct inode *id = fanotify_fid_inode(mask, data, data_type, dir);
- struct inode *dirid = fanotify_dfid_inode(mask, data, data_type, dir);
- const struct path *path = fsnotify_data_path(data, data_type);
+ struct inode *id = fanotify_fid_inode(mask, event_info);
+ struct inode *dirid = fanotify_dfid_inode(mask, event_info);
+ const struct path *path = fsnotify_event_info_path(event_info);
+ const struct qstr *file_name = event_info->name;
unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
struct mem_cgroup *old_memcg;
struct inode *child = NULL;
@@ -712,9 +713,7 @@ static void fanotify_insert_event(struct fsnotify_group *group,
}
static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
- const void *data, int data_type,
- struct inode *dir,
- const struct qstr *file_name, u32 cookie,
+ const struct fsnotify_event_info *event_info,
struct fsnotify_iter_info *iter_info)
{
int ret = 0;
@@ -744,8 +743,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
- mask = fanotify_group_event_mask(group, iter_info, mask, data,
- data_type, dir);
+ mask = fanotify_group_event_mask(group, mask, event_info, iter_info);
if (!mask)
return 0;
@@ -767,8 +765,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
return 0;
}
- event = fanotify_alloc_event(group, mask, data, data_type, dir,
- file_name, &fsid);
+ event = fanotify_alloc_event(group, mask, event_info, &fsid);
ret = -ENOMEM;
if (unlikely(!event)) {
/*
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 30d422b8c0fc..36205a769dde 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -177,8 +177,8 @@ static bool fsnotify_event_needs_parent(struct inode *inode, struct mount *mnt,
* Notify only the child without name info if parent is not watching and
* inode/sb/mount are not interested in events with parent and name info.
*/
-int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
- int data_type)
+int __fsnotify_parent(struct dentry *dentry, __u32 mask,
+ const void *data, int data_type)
{
const struct path *path = fsnotify_data_path(data, data_type);
struct mount *mnt = path ? real_mount(path->mnt) : NULL;
@@ -229,7 +229,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
}
notify:
- ret = fsnotify(mask, data, data_type, p_inode, file_name, inode, 0);
+ ret = __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = data, .data_type = data_type,
+ .dir = p_inode, .name = file_name,
+ .inode = inode,
+ });
if (file_name)
release_dentry_name_snapshot(&name);
@@ -240,13 +244,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
EXPORT_SYMBOL_GPL(__fsnotify_parent);
static int fsnotify_handle_inode_event(struct fsnotify_group *group,
- struct fsnotify_mark *inode_mark,
- u32 mask, const void *data, int data_type,
- struct inode *dir, const struct qstr *name,
- u32 cookie)
+ struct fsnotify_mark *inode_mark, u32 mask,
+ const struct fsnotify_event_info *event_info)
{
- const struct path *path = fsnotify_data_path(data, data_type);
- struct inode *inode = fsnotify_data_inode(data, data_type);
+ const struct path *path = fsnotify_event_info_path(event_info);
+ struct inode *inode = fsnotify_event_info_inode(event_info);
const struct fsnotify_ops *ops = group->ops;
if (WARN_ON_ONCE(!ops->handle_inode_event))
@@ -260,16 +262,17 @@ static int fsnotify_handle_inode_event(struct fsnotify_group *group,
if (!(mask & inode_mark->mask & ALL_FSNOTIFY_EVENTS))
return 0;
- return ops->handle_inode_event(inode_mark, mask, inode, dir, name, cookie);
+ return ops->handle_inode_event(inode_mark, mask, inode, event_info->dir,
+ event_info->name, event_info->cookie);
}
static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
- const void *data, int data_type,
- struct inode *dir, const struct qstr *name,
- u32 cookie, struct fsnotify_iter_info *iter_info)
+ const struct fsnotify_event_info *event_info,
+ struct fsnotify_iter_info *iter_info)
{
struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
struct fsnotify_mark *parent_mark = fsnotify_iter_parent_mark(iter_info);
+ struct fsnotify_event_info child_event_info = { };
int ret;
if (WARN_ON_ONCE(fsnotify_iter_sb_mark(iter_info)) ||
@@ -284,8 +287,8 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
* interested in this event?
*/
if (parent_mark->mask & FS_EVENT_ON_CHILD) {
- ret = fsnotify_handle_inode_event(group, parent_mark, mask,
- data, data_type, dir, name, 0);
+ ret = fsnotify_handle_inode_event(group, parent_mark,
+ mask, event_info);
if (ret)
return ret;
}
@@ -302,18 +305,22 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
* The child watcher is expecting an event without a file name
* and without the FS_EVENT_ON_CHILD flag.
*/
+ if (WARN_ON_ONCE(!event_info->inode))
+ return 0;
+
mask &= ~FS_EVENT_ON_CHILD;
- dir = NULL;
- name = NULL;
+ child_event_info = *event_info;
+ child_event_info.dir = NULL;
+ child_event_info.name = NULL;
+ event_info = &child_event_info;
}
- return fsnotify_handle_inode_event(group, inode_mark, mask, data, data_type,
- dir, name, cookie);
+ return fsnotify_handle_inode_event(group, inode_mark, mask, event_info);
}
-static int send_to_group(__u32 mask, const void *data, int data_type,
- struct inode *dir, const struct qstr *file_name,
- u32 cookie, struct fsnotify_iter_info *iter_info)
+static int send_to_group(__u32 mask,
+ const struct fsnotify_event_info *event_info,
+ struct fsnotify_iter_info *iter_info)
{
struct fsnotify_group *group = NULL;
__u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
@@ -351,18 +358,18 @@ static int send_to_group(__u32 mask, const void *data, int data_type,
pr_debug("%s: group=%p mask=%x marks_mask=%x marks_ignored_mask=%x data=%p data_type=%d dir=%p cookie=%d\n",
__func__, group, mask, marks_mask, marks_ignored_mask,
- data, data_type, dir, cookie);
+ event_info->data, event_info->data_type, event_info->dir,
+ event_info->cookie);
if (!(test_mask & marks_mask & ~marks_ignored_mask))
return 0;
if (group->ops->handle_event) {
- return group->ops->handle_event(group, mask, data, data_type, dir,
- file_name, cookie, iter_info);
+ return group->ops->handle_event(group, mask, event_info,
+ iter_info);
}
- return fsnotify_handle_event(group, mask, data, data_type, dir,
- file_name, cookie, iter_info);
+ return fsnotify_handle_event(group, mask, event_info, iter_info);
}
static struct fsnotify_mark *fsnotify_first_mark(struct fsnotify_mark_connector **connp)
@@ -448,21 +455,22 @@ static void fsnotify_iter_next(struct fsnotify_iter_info *iter_info)
* in whatever means they feel necessary.
*
* @mask: event type and flags
+ * Input args in struct fsnotify_event_info:
* @data: object that event happened on
* @data_type: type of object for fanotify_data_XXX() accessors
* @dir: optional directory associated with event -
- * if @file_name is not NULL, this is the directory that
- * @file_name is relative to
- * @file_name: optional file name associated with event
+ * if @name is not NULL, this is the directory that
+ * @name is relative to
+ * @name: optional file name associated with event
* @inode: optional inode associated with event -
* either @dir or @inode must be non-NULL.
* if both are non-NULL event may be reported to both.
* @cookie: inotify rename cookie
*/
-int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
- const struct qstr *file_name, struct inode *inode, u32 cookie)
+int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
{
- const struct path *path = fsnotify_data_path(data, data_type);
+ const struct path *path = fsnotify_event_info_path(event_info);
+ struct inode *inode = event_info->inode;
struct fsnotify_iter_info iter_info = {};
struct super_block *sb;
struct mount *mnt = NULL;
@@ -475,13 +483,13 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
if (!inode) {
/* Dirent event - report on TYPE_INODE to dir */
- inode = dir;
+ inode = event_info->dir;
} else if (mask & FS_EVENT_ON_CHILD) {
/*
* Event on child - report on TYPE_PARENT to dir if it is
* watching children and on TYPE_INODE to child.
*/
- parent = dir;
+ parent = event_info->dir;
}
sb = inode->i_sb;
@@ -538,8 +546,7 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
* That's why this traversal is so complicated...
*/
while (fsnotify_iter_select_report_types(&iter_info)) {
- ret = send_to_group(mask, data, data_type, dir, file_name,
- cookie, &iter_info);
+ ret = send_to_group(mask, event_info, &iter_info);
if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
goto out;
@@ -552,7 +559,7 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
return ret;
}
-EXPORT_SYMBOL_GPL(fsnotify);
+EXPORT_SYMBOL_GPL(__fsnotify);
static __init int fsnotify_init(void)
{
diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index f8acddcf54fb..8c2c681b4495 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -30,7 +30,10 @@ static inline void fsnotify_name(struct inode *dir, __u32 mask,
struct inode *child,
const struct qstr *name, u32 cookie)
{
- fsnotify(mask, child, FSNOTIFY_EVENT_INODE, dir, name, NULL, cookie);
+ __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = child, .data_type = FSNOTIFY_EVENT_INODE,
+ .dir = dir, .name = name, .cookie = cookie,
+ });
}
static inline void fsnotify_dirent(struct inode *dir, struct dentry *dentry,
@@ -44,7 +47,10 @@ static inline void fsnotify_inode(struct inode *inode, __u32 mask)
if (S_ISDIR(inode->i_mode))
mask |= FS_ISDIR;
- fsnotify(mask, inode, FSNOTIFY_EVENT_INODE, NULL, NULL, inode, 0);
+ __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = inode, .data_type = FSNOTIFY_EVENT_INODE,
+ .inode = inode,
+ });
}
/* Notify this dentry's parent about a child's events. */
@@ -68,7 +74,10 @@ static inline int fsnotify_parent(struct dentry *dentry, __u32 mask,
return __fsnotify_parent(dentry, mask, data, data_type);
notify_child:
- return fsnotify(mask, data, data_type, NULL, NULL, inode, 0);
+ return __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = data, .data_type = data_type,
+ .inode = inode,
+ });
}
/*
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index f9e2c6cd0f7d..b1590f654ade 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -112,6 +112,28 @@ struct fsnotify_iter_info;
struct mem_cgroup;
+/*
+ * Event info args passed to fsnotify() and to backends on handle_event():
+ * @data: object that event happened on
+ * @data_type: type of object for fanotify_data_XXX() accessors
+ * @dir: optional directory associated with event -
+ * if @name is not NULL, this is the directory that
+ * @name is relative to
+ * @name: optional file name associated with event
+ * @inode: optional inode associated with event -
+ * either @dir or @inode must be non-NULL.
+ * if both are non-NULL event may be reported to both.
+ * @cookie: inotify rename cookie
+ */
+struct fsnotify_event_info {
+ const void *data;
+ int data_type;
+ struct inode *dir;
+ const struct qstr *name;
+ struct inode *inode;
+ u32 cookie;
+};
+
/*
* Each group much define these ops. The fsnotify infrastructure will call
* these operations for each relevant group.
@@ -119,13 +141,7 @@ struct mem_cgroup;
* handle_event - main call for a group to handle an fs event
* @group: group to notify
* @mask: event type and flags
- * @data: object that event happened on
- * @data_type: type of object for fanotify_data_XXX() accessors
- * @dir: optional directory associated with event -
- * if @file_name is not NULL, this is the directory that
- * @file_name is relative to
- * @file_name: optional file name associated with event
- * @cookie: inotify rename cookie
+ * @event_info: information attached to the event
* @iter_info: array of marks from this group that are interested in the event
*
* handle_inode_event - simple variant of handle_event() for groups that only
@@ -147,8 +163,7 @@ struct mem_cgroup;
*/
struct fsnotify_ops {
int (*handle_event)(struct fsnotify_group *group, u32 mask,
- const void *data, int data_type, struct inode *dir,
- const struct qstr *file_name, u32 cookie,
+ const struct fsnotify_event_info *event_info,
struct fsnotify_iter_info *iter_info);
int (*handle_inode_event)(struct fsnotify_mark *mark, u32 mask,
struct inode *inode, struct inode *dir,
@@ -262,6 +277,12 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
}
}
+static inline struct inode *fsnotify_event_info_inode(
+ const struct fsnotify_event_info *event_info)
+{
+ return fsnotify_data_inode(event_info->data, event_info->data_type);
+}
+
static inline const struct path *fsnotify_data_path(const void *data,
int data_type)
{
@@ -273,6 +294,12 @@ static inline const struct path *fsnotify_data_path(const void *data,
}
}
+static inline const struct path *fsnotify_event_info_path(
+ const struct fsnotify_event_info *event_info)
+{
+ return fsnotify_data_path(event_info->data, event_info->data_type);
+}
+
enum fsnotify_obj_type {
FSNOTIFY_OBJ_TYPE_INODE,
FSNOTIFY_OBJ_TYPE_PARENT,
@@ -410,11 +437,22 @@ struct fsnotify_mark {
/* called from the vfs helpers */
/* main fsnotify call to send events */
-extern int fsnotify(__u32 mask, const void *data, int data_type,
- struct inode *dir, const struct qstr *name,
- struct inode *inode, u32 cookie);
-extern int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
- int data_type);
+extern int __fsnotify(__u32 mask,
+ const struct fsnotify_event_info *event_info);
+extern int __fsnotify_parent(struct dentry *dentry, __u32 mask,
+ const void *data, int data_type);
+
+static inline int fsnotify(__u32 mask, const void *data, int data_type,
+ struct inode *dir, const struct qstr *name,
+ struct inode *inode, u32 cookie)
+{
+ return __fsnotify(mask, &(struct fsnotify_event_info) {
+ .data = data, .data_type = data_type,
+ .dir = dir, .name = name, .inode = inode,
+ .cookie = cookie,
+ });
+}
+
extern void __fsnotify_inode_delete(struct inode *inode);
extern void __fsnotify_vfsmount_delete(struct vfsmount *mnt);
extern void fsnotify_sb_delete(struct super_block *sb);
@@ -594,15 +632,14 @@ static inline void fsnotify_init_event(struct fsnotify_event *event)
#else
-static inline int fsnotify(__u32 mask, const void *data, int data_type,
- struct inode *dir, const struct qstr *name,
- struct inode *inode, u32 cookie)
+static inline int fsnotify(__u32 mask,
+ const struct fsnotify_event_info *event_info)
{
return 0;
}
static inline int __fsnotify_parent(struct dentry *dentry, __u32 mask,
- const void *data, int data_type)
+ const void *data, int data_type)
{
return 0;
}
--
2.32.0
FAN_FS_ERROR requires some initialization to happen from inside the
insert hook. This allows a user of fanotify_add_event to pass an
argument to be sent to the insert callback.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
---
fs/notify/fanotify/fanotify.c | 5 +++--
fs/notify/inotify/inotify_fsnotify.c | 2 +-
fs/notify/notification.c | 6 ++++--
include/linux/fsnotify_backend.h | 7 +++++--
4 files changed, 13 insertions(+), 7 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 4f2febb15e94..aba06b84da91 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -695,7 +695,8 @@ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
* Add an event to hash table for faster merge.
*/
static void fanotify_insert_event(struct fsnotify_group *group,
- struct fsnotify_event *fsn_event)
+ struct fsnotify_event *fsn_event,
+ const void *data)
{
struct fanotify_event *event = FANOTIFY_E(fsn_event);
unsigned int bucket = fanotify_event_hash_bucket(group, event);
@@ -779,7 +780,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
fsn_event = &event->fse;
ret = fsnotify_add_event(group, fsn_event, fanotify_merge,
- fanotify_insert_event);
+ fanotify_insert_event, NULL);
if (ret) {
/* Permission events shouldn't be merged */
BUG_ON(ret == 1 && mask & FANOTIFY_PERM_EVENTS);
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index d1a64daa0171..a003a64ff8ee 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -116,7 +116,7 @@ int inotify_handle_inode_event(struct fsnotify_mark *inode_mark, u32 mask,
if (len)
strcpy(event->name, name->name);
- ret = fsnotify_add_event(group, fsn_event, inotify_merge, NULL);
+ ret = fsnotify_add_event(group, fsn_event, inotify_merge, NULL, NULL);
if (ret) {
/* Our event wasn't used in the end. Free it. */
fsnotify_destroy_event(group, fsn_event);
diff --git a/fs/notify/notification.c b/fs/notify/notification.c
index 32f45543b9c6..0d9ba592d725 100644
--- a/fs/notify/notification.c
+++ b/fs/notify/notification.c
@@ -83,7 +83,9 @@ int fsnotify_add_event(struct fsnotify_group *group,
int (*merge)(struct fsnotify_group *,
struct fsnotify_event *),
void (*insert)(struct fsnotify_group *,
- struct fsnotify_event *))
+ struct fsnotify_event *,
+ const void *),
+ const void *insert_data)
{
int ret = 0;
struct list_head *list = &group->notification_list;
@@ -121,7 +123,7 @@ int fsnotify_add_event(struct fsnotify_group *group,
group->q_len++;
list_add_tail(&event->list, list);
if (insert)
- insert(group, event);
+ insert(group, event, insert_data);
spin_unlock(&group->notification_lock);
wake_up(&group->notification_waitq);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index b1590f654ade..8222fe12a6c9 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -526,11 +526,14 @@ extern int fsnotify_add_event(struct fsnotify_group *group,
int (*merge)(struct fsnotify_group *,
struct fsnotify_event *),
void (*insert)(struct fsnotify_group *,
- struct fsnotify_event *));
+ struct fsnotify_event *,
+ const void *),
+ const void *insert_data);
+
/* Queue overflow event to a notification group */
static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
{
- fsnotify_add_event(group, group->overflow_event, NULL, NULL);
+ fsnotify_add_event(group, group->overflow_event, NULL, NULL, NULL);
}
static inline bool fsnotify_is_overflow_event(u32 mask)
--
2.32.0
Hi Gabriel,
I love your patch! Yet something to improve:
[auto build test ERROR on ext3/fsnotify]
[also build test ERROR on ext4/dev linus/master v5.13 next-20210629]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
config: i386-tinyconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/dca640915c7b840656b052e138effd501bd5d5e1
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
git checkout dca640915c7b840656b052e138effd501bd5d5e1
# save the attached .config to linux build tree
make W=1 ARCH=i386
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
In file included from fs/open.c:12:
include/linux/fsnotify.h: In function 'fsnotify_name':
>> include/linux/fsnotify.h:33:2: error: implicit declaration of function '__fsnotify'; did you mean 'fsnotify'? [-Werror=implicit-function-declaration]
33 | __fsnotify(mask, &(struct fsnotify_event_info) {
| ^~~~~~~~~~
| fsnotify
cc1: some warnings being treated as errors
vim +33 include/linux/fsnotify.h
19
20 /*
21 * Notify this @dir inode about a change in a child directory entry.
22 * The directory entry may have turned positive or negative or its inode may
23 * have changed (i.e. renamed over).
24 *
25 * Unlike fsnotify_parent(), the event will be reported regardless of the
26 * FS_EVENT_ON_CHILD mask on the parent inode and will not be reported if only
27 * the child is interested and not the parent.
28 */
29 static inline void fsnotify_name(struct inode *dir, __u32 mask,
30 struct inode *child,
31 const struct qstr *name, u32 cookie)
32 {
> 33 __fsnotify(mask, &(struct fsnotify_event_info) {
34 .data = child, .data_type = FSNOTIFY_EVENT_INODE,
35 .dir = dir, .name = name, .cookie = cookie,
36 });
37 }
38
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
Hi Gabriel,
I love your patch! Yet something to improve:
[auto build test ERROR on ext3/fsnotify]
[also build test ERROR on ext4/dev linus/master v5.13 next-20210629]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
config: nios2-defconfig (attached as .config)
compiler: nios2-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/dca640915c7b840656b052e138effd501bd5d5e1
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
git checkout dca640915c7b840656b052e138effd501bd5d5e1
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nios2
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
In file included from fs/kernfs/file.c:16:
include/linux/fsnotify.h: In function 'fsnotify_name':
include/linux/fsnotify.h:33:2: error: implicit declaration of function '__fsnotify'; did you mean 'fsnotify'? [-Werror=implicit-function-declaration]
33 | __fsnotify(mask, &(struct fsnotify_event_info) {
| ^~~~~~~~~~
| fsnotify
fs/kernfs/file.c: In function 'kernfs_notify_workfn':
>> fs/kernfs/file.c:887:7: error: passing argument 2 of 'fsnotify' from incompatible pointer type [-Werror=incompatible-pointer-types]
887 | inode, FSNOTIFY_EVENT_INODE,
| ^~~~~
| |
| struct inode *
In file included from include/linux/fsnotify.h:15,
from fs/kernfs/file.c:16:
include/linux/fsnotify_backend.h:636:41: note: expected 'const struct fsnotify_event_info *' but argument is of type 'struct inode *'
636 | const struct fsnotify_event_info *event_info)
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
>> fs/kernfs/file.c:886:5: error: too many arguments to function 'fsnotify'
886 | fsnotify(FS_MODIFY | FS_EVENT_ON_CHILD,
| ^~~~~~~~
In file included from include/linux/fsnotify.h:15,
from fs/kernfs/file.c:16:
include/linux/fsnotify_backend.h:635:19: note: declared here
635 | static inline int fsnotify(__u32 mask,
| ^~~~~~~~
cc1: some warnings being treated as errors
vim +/fsnotify +887 fs/kernfs/file.c
414985ae23c031 Tejun Heo 2013-11-28 845
ecca47ce829484 Tejun Heo 2014-07-01 846 static void kernfs_notify_workfn(struct work_struct *work)
414985ae23c031 Tejun Heo 2013-11-28 847 {
ecca47ce829484 Tejun Heo 2014-07-01 848 struct kernfs_node *kn;
d911d98748018f Tejun Heo 2014-04-09 849 struct kernfs_super_info *info;
ecca47ce829484 Tejun Heo 2014-07-01 850 repeat:
ecca47ce829484 Tejun Heo 2014-07-01 851 /* pop one off the notify_list */
ecca47ce829484 Tejun Heo 2014-07-01 852 spin_lock_irq(&kernfs_notify_lock);
ecca47ce829484 Tejun Heo 2014-07-01 853 kn = kernfs_notify_list;
ecca47ce829484 Tejun Heo 2014-07-01 854 if (kn == KERNFS_NOTIFY_EOL) {
ecca47ce829484 Tejun Heo 2014-07-01 855 spin_unlock_irq(&kernfs_notify_lock);
d911d98748018f Tejun Heo 2014-04-09 856 return;
ecca47ce829484 Tejun Heo 2014-07-01 857 }
ecca47ce829484 Tejun Heo 2014-07-01 858 kernfs_notify_list = kn->attr.notify_next;
ecca47ce829484 Tejun Heo 2014-07-01 859 kn->attr.notify_next = NULL;
ecca47ce829484 Tejun Heo 2014-07-01 860 spin_unlock_irq(&kernfs_notify_lock);
d911d98748018f Tejun Heo 2014-04-09 861
d911d98748018f Tejun Heo 2014-04-09 862 /* kick fsnotify */
d911d98748018f Tejun Heo 2014-04-09 863 mutex_lock(&kernfs_mutex);
d911d98748018f Tejun Heo 2014-04-09 864
ecca47ce829484 Tejun Heo 2014-07-01 865 list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
df6a58c5c5aa8e Tejun Heo 2016-06-17 866 struct kernfs_node *parent;
497b0c5a7c0688 Amir Goldstein 2020-07-16 867 struct inode *p_inode = NULL;
d911d98748018f Tejun Heo 2014-04-09 868 struct inode *inode;
25b229dff4ffff Al Viro 2019-04-26 869 struct qstr name;
d911d98748018f Tejun Heo 2014-04-09 870
df6a58c5c5aa8e Tejun Heo 2016-06-17 871 /*
df6a58c5c5aa8e Tejun Heo 2016-06-17 872 * We want fsnotify_modify() on @kn but as the
df6a58c5c5aa8e Tejun Heo 2016-06-17 873 * modifications aren't originating from userland don't
df6a58c5c5aa8e Tejun Heo 2016-06-17 874 * have the matching @file available. Look up the inodes
df6a58c5c5aa8e Tejun Heo 2016-06-17 875 * and generate the events manually.
df6a58c5c5aa8e Tejun Heo 2016-06-17 876 */
67c0496e87d193 Tejun Heo 2019-11-04 877 inode = ilookup(info->sb, kernfs_ino(kn));
d911d98748018f Tejun Heo 2014-04-09 878 if (!inode)
d911d98748018f Tejun Heo 2014-04-09 879 continue;
d911d98748018f Tejun Heo 2014-04-09 880
25b229dff4ffff Al Viro 2019-04-26 881 name = (struct qstr)QSTR_INIT(kn->name, strlen(kn->name));
df6a58c5c5aa8e Tejun Heo 2016-06-17 882 parent = kernfs_get_parent(kn);
df6a58c5c5aa8e Tejun Heo 2016-06-17 883 if (parent) {
67c0496e87d193 Tejun Heo 2019-11-04 884 p_inode = ilookup(info->sb, kernfs_ino(parent));
df6a58c5c5aa8e Tejun Heo 2016-06-17 885 if (p_inode) {
40a100d3adc1ad Amir Goldstein 2020-07-22 @886 fsnotify(FS_MODIFY | FS_EVENT_ON_CHILD,
40a100d3adc1ad Amir Goldstein 2020-07-22 @887 inode, FSNOTIFY_EVENT_INODE,
40a100d3adc1ad Amir Goldstein 2020-07-22 888 p_inode, &name, inode, 0);
df6a58c5c5aa8e Tejun Heo 2016-06-17 889 iput(p_inode);
d911d98748018f Tejun Heo 2014-04-09 890 }
d911d98748018f Tejun Heo 2014-04-09 891
df6a58c5c5aa8e Tejun Heo 2016-06-17 892 kernfs_put(parent);
df6a58c5c5aa8e Tejun Heo 2016-06-17 893 }
df6a58c5c5aa8e Tejun Heo 2016-06-17 894
82ace1efb3cb1d Amir Goldstein 2020-07-22 895 if (!p_inode)
82ace1efb3cb1d Amir Goldstein 2020-07-22 896 fsnotify_inode(inode, FS_MODIFY);
497b0c5a7c0688 Amir Goldstein 2020-07-16 897
d911d98748018f Tejun Heo 2014-04-09 898 iput(inode);
d911d98748018f Tejun Heo 2014-04-09 899 }
d911d98748018f Tejun Heo 2014-04-09 900
d911d98748018f Tejun Heo 2014-04-09 901 mutex_unlock(&kernfs_mutex);
ecca47ce829484 Tejun Heo 2014-07-01 902 kernfs_put(kn);
ecca47ce829484 Tejun Heo 2014-07-01 903 goto repeat;
ecca47ce829484 Tejun Heo 2014-07-01 904 }
ecca47ce829484 Tejun Heo 2014-07-01 905
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
Hi Gabriel,
I love your patch! Yet something to improve:
[auto build test ERROR on ext3/fsnotify]
[also build test ERROR on ext4/dev linus/master v5.13 next-20210629]
[cannot apply to tytso-fscrypt/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
config: arm64-allyesconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/746524d8db08a041fed90e41b15c8e8ca69cb22d
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
git checkout 746524d8db08a041fed90e41b15c8e8ca69cb22d
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross O=build_dir ARCH=arm64 SHELL=/bin/bash samples/
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
>> samples/fanotify/fs-monitor.c:7:10: fatal error: errno.h: No such file or directory
7 | #include <errno.h>
| ^~~~~~~~~
compilation terminated.
vim +7 samples/fanotify/fs-monitor.c
> 7 #include <errno.h>
8 #include <err.h>
9 #include <stdlib.h>
10 #include <stdio.h>
11 #include <fcntl.h>
12 #include <sys/fanotify.h>
13 #include <sys/types.h>
14 #include <unistd.h>
15 #include <sys/stat.h>
16 #include <sys/types.h>
17
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
On Tue, Jun 29, 2021 at 10:11 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Some events, like the overflow event, are not mergeable, so they are not
> hashed. But, when failing inside fsnotify_add_event for lack of space,
> fsnotify_add_event() still calls the insert hook, which adds the
> overflow event to the merge list. Add a check to prevent any kind of
> unmergeable event to be inserted in the hashtable.
>
> Fixes: 94e00d28a680 ("fsnotify: use hash table for faster events merge")
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
Reviewed-by: Amir Goldstein <[email protected]>
> ---
> Changes since v2:
> - Do check for hashed events inside the insert hook (Amir)
> ---
> fs/notify/fanotify/fanotify.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 057abd2cf887..310246f8d3f1 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -702,6 +702,9 @@ static void fanotify_insert_event(struct fsnotify_group *group,
>
> assert_spin_locked(&group->notification_lock);
>
> + if (!fanotify_is_hashed_event(event->mask))
> + return;
> +
> pr_debug("%s: group=%p event=%p bucket=%u\n", __func__,
> group, event, bucket);
>
> @@ -779,8 +782,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>
> fsn_event = &event->fse;
> ret = fsnotify_add_event(group, fsn_event, fanotify_merge,
> - fanotify_is_hashed_event(mask) ?
> - fanotify_insert_event : NULL);
> + fanotify_insert_event);
> if (ret) {
> /* Permission events shouldn't be merged */
> BUG_ON(ret == 1 && mask & FANOTIFY_PERM_EVENTS);
> --
> 2.32.0
>
On Tue, Jun 29, 2021 at 10:12 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> FAN_FS_ERROR will require fsid, but not necessarily require the
> filesystem to expose a file handle. Split those checks into different
> functions, so they can be used separately when setting up an event.
>
> While there, update a comment about tmpfs having 0 fsid, which is no
> longer true.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
Reviewed-by: Amir Goldstein <[email protected]>
> ---
> Changes since v2:
> - FAN_ERROR -> FAN_FS_ERROR (Amir)
> - Update comment (Amir)
>
> Changes since v1:
> (Amir)
> - Sort hunks to simplify diff.
> Changes since RFC:
> (Amir)
> - Rename fanotify_check_path_fsid -> fanotify_test_fsid.
> - Use dentry directly instead of path.
> ---
> fs/notify/fanotify/fanotify_user.c | 27 ++++++++++++++++++---------
> 1 file changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 68a53d3534f8..67b18dfe0025 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -1192,16 +1192,15 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> return fd;
> }
>
> -/* Check if filesystem can encode a unique fid */
> -static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
> +static int fanotify_test_fsid(struct dentry *dentry, __kernel_fsid_t *fsid)
> {
> __kernel_fsid_t root_fsid;
> int err;
>
> /*
> - * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
> + * Make sure dentry is not of a filesystem with zero fsid (e.g. fuse).
> */
> - err = vfs_get_fsid(path->dentry, fsid);
> + err = vfs_get_fsid(dentry, fsid);
> if (err)
> return err;
>
> @@ -1209,10 +1208,10 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
> return -ENODEV;
>
> /*
> - * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
> + * Make sure dentry is not of a filesystem subvolume (e.g. btrfs)
> * which uses a different fsid than sb root.
> */
> - err = vfs_get_fsid(path->dentry->d_sb->s_root, &root_fsid);
> + err = vfs_get_fsid(dentry->d_sb->s_root, &root_fsid);
> if (err)
> return err;
>
> @@ -1220,6 +1219,12 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
> root_fsid.val[1] != fsid->val[1])
> return -EXDEV;
>
> + return 0;
> +}
> +
> +/* Check if filesystem can encode a unique fid */
> +static int fanotify_test_fid(struct dentry *dentry)
> +{
> /*
> * We need to make sure that the file system supports at least
> * encoding a file handle so user can use name_to_handle_at() to
> @@ -1227,8 +1232,8 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
> * objects. However, name_to_handle_at() requires that the
> * filesystem also supports decoding file handles.
> */
> - if (!path->dentry->d_sb->s_export_op ||
> - !path->dentry->d_sb->s_export_op->fh_to_dentry)
> + if (!dentry->d_sb->s_export_op ||
> + !dentry->d_sb->s_export_op->fh_to_dentry)
> return -EOPNOTSUPP;
>
> return 0;
> @@ -1379,7 +1384,11 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> }
>
> if (fid_mode) {
> - ret = fanotify_test_fid(&path, &__fsid);
> + ret = fanotify_test_fsid(path.dentry, &__fsid);
> + if (ret)
> + goto path_put_and_out;
> +
> + ret = fanotify_test_fid(path.dentry);
> if (ret)
> goto path_put_and_out;
>
> --
> 2.32.0
>
On Tue, Jun 29, 2021 at 10:12 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> FAN_ERROR will require an error structure to be stored per mark. But,
> since FAN_ERROR doesn't apply to inode/mount marks, it should suffice to
FAN_FS_ERROR
> only expose this information for superblock marks. Therefore, wrap this
> kind of marks into a container and plumb it for the future.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
With typo fixes:
Reviewed-by: Amir Goldstein <[email protected]>
> ---
> Changes since v2:
> - Move mark initialization to fanotify_alloc_mark (Amir)
>
> Changes since v1:
> - Only extend superblock marks (Amir)
> ---
> fs/notify/fanotify/fanotify.c | 10 ++++++--
> fs/notify/fanotify/fanotify.h | 13 ++++++++++
> fs/notify/fanotify/fanotify_user.c | 38 ++++++++++++++++++++++++++++--
> include/linux/fsnotify_backend.h | 1 +
> 4 files changed, 58 insertions(+), 4 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 310246f8d3f1..0f760770d4c9 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -869,9 +869,15 @@ static void fanotify_freeing_mark(struct fsnotify_mark *mark,
> dec_ucount(group->fanotify_data.ucounts, UCOUNT_FANOTIFY_MARKS);
> }
>
> -static void fanotify_free_mark(struct fsnotify_mark *fsn_mark)
> +static void fanotify_free_mark(struct fsnotify_mark *mark)
> {
> - kmem_cache_free(fanotify_mark_cache, fsn_mark);
> + if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
> + struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);
I guess you meant sb_mark?
> +
> + kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
> + } else {
> + kmem_cache_free(fanotify_mark_cache, mark);
> + }
> }
>
> const struct fsnotify_ops fanotify_fsnotify_ops = {
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index 4a5e555dc3d2..fd125a949187 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -6,6 +6,7 @@
> #include <linux/hashtable.h>
>
> extern struct kmem_cache *fanotify_mark_cache;
> +extern struct kmem_cache *fanotify_sb_mark_cache;
> extern struct kmem_cache *fanotify_fid_event_cachep;
> extern struct kmem_cache *fanotify_path_event_cachep;
> extern struct kmem_cache *fanotify_perm_event_cachep;
> @@ -129,6 +130,18 @@ static inline void fanotify_info_copy_name(struct fanotify_info *info,
> name->name);
> }
>
> +struct fanotify_sb_mark {
> + struct fsnotify_mark fsn_mark;
> +};
> +
> +static inline
> +struct fanotify_sb_mark *FANOTIFY_SB_MARK(struct fsnotify_mark *mark)
> +{
> + WARN_ON(!(mark->flags & FSNOTIFY_MARK_FLAG_SB));
> +
> + return container_of(mark, struct fanotify_sb_mark, fsn_mark);
> +}
> +
> /*
> * Common structure for fanotify events. Concrete structs are allocated in
> * fanotify_handle_event() and freed when the information is retrieved by
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 67b18dfe0025..a42521e140e6 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -99,6 +99,7 @@ struct ctl_table fanotify_table[] = {
> extern const struct fsnotify_ops fanotify_fsnotify_ops;
>
> struct kmem_cache *fanotify_mark_cache __read_mostly;
> +struct kmem_cache *fanotify_sb_mark_cache __read_mostly;
> struct kmem_cache *fanotify_fid_event_cachep __read_mostly;
> struct kmem_cache *fanotify_path_event_cachep __read_mostly;
> struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
> @@ -915,6 +916,38 @@ static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
> return mask & ~oldmask;
> }
>
> +static struct fsnotify_mark *fanotify_alloc_mark(struct fsnotify_group *group,
> + unsigned int type)
> +{
> + struct fanotify_sb_mark *sb_mark;
> + struct fsnotify_mark *mark;
> +
> + switch (type) {
> + case FSNOTIFY_OBJ_TYPE_SB:
> + sb_mark = kmem_cache_zalloc(fanotify_sb_mark_cache, GFP_KERNEL);
> + if (!sb_mark)
> + return NULL;
> + mark = &sb_mark->fsn_mark;
> + break;
> +
> + case FSNOTIFY_OBJ_TYPE_INODE:
> + case FSNOTIFY_OBJ_TYPE_PARENT:
> + case FSNOTIFY_OBJ_TYPE_VFSMOUNT:
> + mark = kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
> + break;
> + default:
> + WARN_ON(1);
> + return NULL;
> + }
> +
> + fsnotify_init_mark(mark, group);
> +
> + if (type == FSNOTIFY_OBJ_TYPE_SB)
> + mark->flags |= FSNOTIFY_MARK_FLAG_SB;
> +
> + return mark;
> +}
> +
> static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
> fsnotify_connp_t *connp,
> unsigned int type,
> @@ -933,13 +966,12 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
> !inc_ucount(ucounts->ns, ucounts->uid, UCOUNT_FANOTIFY_MARKS))
> return ERR_PTR(-ENOSPC);
>
> - mark = kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
> + mark = fanotify_alloc_mark(group, type);
> if (!mark) {
> ret = -ENOMEM;
> goto out_dec_ucounts;
> }
>
> - fsnotify_init_mark(mark, group);
> ret = fsnotify_add_mark_locked(mark, connp, type, 0, fsid);
> if (ret) {
> fsnotify_put_mark(mark);
> @@ -1497,6 +1529,8 @@ static int __init fanotify_user_setup(void)
>
> fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
> SLAB_PANIC|SLAB_ACCOUNT);
> + fanotify_sb_mark_cache = KMEM_CACHE(fanotify_sb_mark,
> + SLAB_PANIC|SLAB_ACCOUNT);
> fanotify_fid_event_cachep = KMEM_CACHE(fanotify_fid_event,
> SLAB_PANIC);
> fanotify_path_event_cachep = KMEM_CACHE(fanotify_path_event,
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index 1ce66748a2d2..c4473b467c28 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -401,6 +401,7 @@ struct fsnotify_mark {
> #define FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY 0x01
> #define FSNOTIFY_MARK_FLAG_ALIVE 0x02
> #define FSNOTIFY_MARK_FLAG_ATTACHED 0x04
> +#define FSNOTIFY_MARK_FLAG_SB 0x08
> unsigned int flags; /* flags [mark->lock] */
> };
>
> --
> 2.32.0
>
()
On Tue, Jun 29, 2021 at 10:12 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> From: Amir Goldstein <[email protected]>
>
> There are a lot of arguments to fsnotify() and the handle_event() method.
> Pass them in a const struct instead of on the argument list.
>
> Apart from being more tidy, this helps with passing error reports to the
> backend. __fsnotify_parent() argument list was intentionally left
> untouched, because its argument list is still short enough and because
> most of the event info arguments are initialized inside
> __fsnotify_parent().
>
> Signed-off-by: Amir Goldstein <[email protected]>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> fs/notify/fanotify/fanotify.c | 59 +++++++++++------------
> fs/notify/fsnotify.c | 83 +++++++++++++++++---------------
> include/linux/fsnotify.h | 15 ++++--
> include/linux/fsnotify_backend.h | 73 +++++++++++++++++++++-------
> 4 files changed, 140 insertions(+), 90 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 0f760770d4c9..4f2febb15e94 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -253,21 +253,22 @@ static int fanotify_get_response(struct fsnotify_group *group,
> * been included within the event mask, but have not been explicitly
> * requested by the user, will not be present in the returned mask.
> */
> -static u32 fanotify_group_event_mask(struct fsnotify_group *group,
> - struct fsnotify_iter_info *iter_info,
> - u32 event_mask, const void *data,
> - int data_type, struct inode *dir)
> +static u32 fanotify_group_event_mask(
> + struct fsnotify_group *group, u32 event_mask,
> + const struct fsnotify_event_info *event_info,
> + struct fsnotify_iter_info *iter_info)
> {
> __u32 marks_mask = 0, marks_ignored_mask = 0;
> __u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS |
> FANOTIFY_EVENT_FLAGS;
> - const struct path *path = fsnotify_data_path(data, data_type);
> + const struct path *path = fsnotify_event_info_path(event_info);
> unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> struct fsnotify_mark *mark;
> int type;
>
> pr_debug("%s: report_mask=%x mask=%x data=%p data_type=%d\n",
> - __func__, iter_info->report_mask, event_mask, data, data_type);
> + __func__, iter_info->report_mask, event_mask,
> + event_info->data, event_info->data_type);
>
> if (!fid_mode) {
> /* Do we have path to open a file descriptor? */
> @@ -278,7 +279,7 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
> return 0;
> } else if (!(fid_mode & FAN_REPORT_FID)) {
> /* Do we have a directory inode to report? */
> - if (!dir && !(event_mask & FS_ISDIR))
> + if (!event_info->dir && !(event_mask & FS_ISDIR))
> return 0;
> }
>
> @@ -427,13 +428,13 @@ static int fanotify_encode_fh(struct fanotify_fh *fh, struct inode *inode,
> * FS_ATTRIB reports the child inode even if reported on a watched parent.
> * FS_CREATE reports the modified dir inode and not the created inode.
> */
> -static struct inode *fanotify_fid_inode(u32 event_mask, const void *data,
> - int data_type, struct inode *dir)
> +static struct inode *fanotify_fid_inode(u32 event_mask,
> + const struct fsnotify_event_info *event_info)
> {
> if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
> - return dir;
> + return event_info->dir;
>
> - return fsnotify_data_inode(data, data_type);
> + return fsnotify_event_info_inode(event_info);
> }
>
> /*
> @@ -444,18 +445,18 @@ static struct inode *fanotify_fid_inode(u32 event_mask, const void *data,
> * reported to parent.
> * Otherwise, do not report dir fid.
> */
> -static struct inode *fanotify_dfid_inode(u32 event_mask, const void *data,
> - int data_type, struct inode *dir)
> +static struct inode *fanotify_dfid_inode(u32 event_mask,
> + const struct fsnotify_event_info *event_info)
> {
> - struct inode *inode = fsnotify_data_inode(data, data_type);
> + struct inode *inode = fsnotify_event_info_inode(event_info);
>
> if (event_mask & ALL_FSNOTIFY_DIRENT_EVENTS)
> - return dir;
> + return event_info->dir;
>
> if (S_ISDIR(inode->i_mode))
> return inode;
>
> - return dir;
> + return event_info->dir;
> }
>
> static struct fanotify_event *fanotify_alloc_path_event(const struct path *path,
> @@ -563,17 +564,17 @@ static struct fanotify_event *fanotify_alloc_name_event(struct inode *id,
> return &fne->fae;
> }
>
> -static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
> - u32 mask, const void *data,
> - int data_type, struct inode *dir,
> - const struct qstr *file_name,
> - __kernel_fsid_t *fsid)
> +static struct fanotify_event *fanotify_alloc_event(
> + struct fsnotify_group *group, u32 mask,
> + const struct fsnotify_event_info *event_info,
> + __kernel_fsid_t *fsid)
> {
> struct fanotify_event *event = NULL;
> gfp_t gfp = GFP_KERNEL_ACCOUNT;
> - struct inode *id = fanotify_fid_inode(mask, data, data_type, dir);
> - struct inode *dirid = fanotify_dfid_inode(mask, data, data_type, dir);
> - const struct path *path = fsnotify_data_path(data, data_type);
> + struct inode *id = fanotify_fid_inode(mask, event_info);
> + struct inode *dirid = fanotify_dfid_inode(mask, event_info);
> + const struct path *path = fsnotify_event_info_path(event_info);
> + const struct qstr *file_name = event_info->name;
> unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> struct mem_cgroup *old_memcg;
> struct inode *child = NULL;
> @@ -712,9 +713,7 @@ static void fanotify_insert_event(struct fsnotify_group *group,
> }
>
> static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
> - const void *data, int data_type,
> - struct inode *dir,
> - const struct qstr *file_name, u32 cookie,
> + const struct fsnotify_event_info *event_info,
> struct fsnotify_iter_info *iter_info)
> {
> int ret = 0;
> @@ -744,8 +743,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>
> BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
>
> - mask = fanotify_group_event_mask(group, iter_info, mask, data,
> - data_type, dir);
> + mask = fanotify_group_event_mask(group, mask, event_info, iter_info);
> if (!mask)
> return 0;
>
> @@ -767,8 +765,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
> return 0;
> }
>
> - event = fanotify_alloc_event(group, mask, data, data_type, dir,
> - file_name, &fsid);
> + event = fanotify_alloc_event(group, mask, event_info, &fsid);
> ret = -ENOMEM;
> if (unlikely(!event)) {
> /*
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 30d422b8c0fc..36205a769dde 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -177,8 +177,8 @@ static bool fsnotify_event_needs_parent(struct inode *inode, struct mount *mnt,
> * Notify only the child without name info if parent is not watching and
> * inode/sb/mount are not interested in events with parent and name info.
> */
> -int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> - int data_type)
> +int __fsnotify_parent(struct dentry *dentry, __u32 mask,
> + const void *data, int data_type)
> {
> const struct path *path = fsnotify_data_path(data, data_type);
> struct mount *mnt = path ? real_mount(path->mnt) : NULL;
> @@ -229,7 +229,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> }
>
> notify:
> - ret = fsnotify(mask, data, data_type, p_inode, file_name, inode, 0);
> + ret = __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = data, .data_type = data_type,
> + .dir = p_inode, .name = file_name,
> + .inode = inode,
> + });
>
> if (file_name)
> release_dentry_name_snapshot(&name);
> @@ -240,13 +244,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> EXPORT_SYMBOL_GPL(__fsnotify_parent);
>
> static int fsnotify_handle_inode_event(struct fsnotify_group *group,
> - struct fsnotify_mark *inode_mark,
> - u32 mask, const void *data, int data_type,
> - struct inode *dir, const struct qstr *name,
> - u32 cookie)
> + struct fsnotify_mark *inode_mark, u32 mask,
> + const struct fsnotify_event_info *event_info)
> {
> - const struct path *path = fsnotify_data_path(data, data_type);
> - struct inode *inode = fsnotify_data_inode(data, data_type);
> + const struct path *path = fsnotify_event_info_path(event_info);
> + struct inode *inode = fsnotify_event_info_inode(event_info);
> const struct fsnotify_ops *ops = group->ops;
>
> if (WARN_ON_ONCE(!ops->handle_inode_event))
> @@ -260,16 +262,17 @@ static int fsnotify_handle_inode_event(struct fsnotify_group *group,
> if (!(mask & inode_mark->mask & ALL_FSNOTIFY_EVENTS))
> return 0;
>
> - return ops->handle_inode_event(inode_mark, mask, inode, dir, name, cookie);
> + return ops->handle_inode_event(inode_mark, mask, inode, event_info->dir,
> + event_info->name, event_info->cookie);
> }
>
> static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
> - const void *data, int data_type,
> - struct inode *dir, const struct qstr *name,
> - u32 cookie, struct fsnotify_iter_info *iter_info)
> + const struct fsnotify_event_info *event_info,
> + struct fsnotify_iter_info *iter_info)
> {
> struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
> struct fsnotify_mark *parent_mark = fsnotify_iter_parent_mark(iter_info);
> + struct fsnotify_event_info child_event_info = { };
> int ret;
>
> if (WARN_ON_ONCE(fsnotify_iter_sb_mark(iter_info)) ||
> @@ -284,8 +287,8 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
> * interested in this event?
> */
> if (parent_mark->mask & FS_EVENT_ON_CHILD) {
> - ret = fsnotify_handle_inode_event(group, parent_mark, mask,
> - data, data_type, dir, name, 0);
> + ret = fsnotify_handle_inode_event(group, parent_mark,
> + mask, event_info);
> if (ret)
> return ret;
> }
> @@ -302,18 +305,22 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
> * The child watcher is expecting an event without a file name
> * and without the FS_EVENT_ON_CHILD flag.
> */
> + if (WARN_ON_ONCE(!event_info->inode))
> + return 0;
> +
> mask &= ~FS_EVENT_ON_CHILD;
> - dir = NULL;
> - name = NULL;
> + child_event_info = *event_info;
> + child_event_info.dir = NULL;
> + child_event_info.name = NULL;
> + event_info = &child_event_info;
> }
>
> - return fsnotify_handle_inode_event(group, inode_mark, mask, data, data_type,
> - dir, name, cookie);
> + return fsnotify_handle_inode_event(group, inode_mark, mask, event_info);
> }
>
> -static int send_to_group(__u32 mask, const void *data, int data_type,
> - struct inode *dir, const struct qstr *file_name,
> - u32 cookie, struct fsnotify_iter_info *iter_info)
> +static int send_to_group(__u32 mask,
> + const struct fsnotify_event_info *event_info,
> + struct fsnotify_iter_info *iter_info)
> {
> struct fsnotify_group *group = NULL;
> __u32 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
> @@ -351,18 +358,18 @@ static int send_to_group(__u32 mask, const void *data, int data_type,
>
> pr_debug("%s: group=%p mask=%x marks_mask=%x marks_ignored_mask=%x data=%p data_type=%d dir=%p cookie=%d\n",
> __func__, group, mask, marks_mask, marks_ignored_mask,
> - data, data_type, dir, cookie);
> + event_info->data, event_info->data_type, event_info->dir,
> + event_info->cookie);
>
> if (!(test_mask & marks_mask & ~marks_ignored_mask))
> return 0;
>
> if (group->ops->handle_event) {
> - return group->ops->handle_event(group, mask, data, data_type, dir,
> - file_name, cookie, iter_info);
> + return group->ops->handle_event(group, mask, event_info,
> + iter_info);
> }
>
> - return fsnotify_handle_event(group, mask, data, data_type, dir,
> - file_name, cookie, iter_info);
> + return fsnotify_handle_event(group, mask, event_info, iter_info);
> }
>
> static struct fsnotify_mark *fsnotify_first_mark(struct fsnotify_mark_connector **connp)
> @@ -448,21 +455,22 @@ static void fsnotify_iter_next(struct fsnotify_iter_info *iter_info)
> * in whatever means they feel necessary.
> *
> * @mask: event type and flags
> + * Input args in struct fsnotify_event_info:
> * @data: object that event happened on
> * @data_type: type of object for fanotify_data_XXX() accessors
> * @dir: optional directory associated with event -
> - * if @file_name is not NULL, this is the directory that
> - * @file_name is relative to
> - * @file_name: optional file name associated with event
> + * if @name is not NULL, this is the directory that
> + * @name is relative to
> + * @name: optional file name associated with event
> * @inode: optional inode associated with event -
> * either @dir or @inode must be non-NULL.
> * if both are non-NULL event may be reported to both.
> * @cookie: inotify rename cookie
> */
> -int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
> - const struct qstr *file_name, struct inode *inode, u32 cookie)
> +int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
> {
> - const struct path *path = fsnotify_data_path(data, data_type);
> + const struct path *path = fsnotify_event_info_path(event_info);
> + struct inode *inode = event_info->inode;
> struct fsnotify_iter_info iter_info = {};
> struct super_block *sb;
> struct mount *mnt = NULL;
> @@ -475,13 +483,13 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
>
> if (!inode) {
> /* Dirent event - report on TYPE_INODE to dir */
> - inode = dir;
> + inode = event_info->dir;
> } else if (mask & FS_EVENT_ON_CHILD) {
> /*
> * Event on child - report on TYPE_PARENT to dir if it is
> * watching children and on TYPE_INODE to child.
> */
> - parent = dir;
> + parent = event_info->dir;
> }
> sb = inode->i_sb;
>
> @@ -538,8 +546,7 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
> * That's why this traversal is so complicated...
> */
> while (fsnotify_iter_select_report_types(&iter_info)) {
> - ret = send_to_group(mask, data, data_type, dir, file_name,
> - cookie, &iter_info);
> + ret = send_to_group(mask, event_info, &iter_info);
>
> if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
> goto out;
> @@ -552,7 +559,7 @@ int fsnotify(__u32 mask, const void *data, int data_type, struct inode *dir,
>
> return ret;
> }
> -EXPORT_SYMBOL_GPL(fsnotify);
> +EXPORT_SYMBOL_GPL(__fsnotify);
>
> static __init int fsnotify_init(void)
> {
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index f8acddcf54fb..8c2c681b4495 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -30,7 +30,10 @@ static inline void fsnotify_name(struct inode *dir, __u32 mask,
> struct inode *child,
> const struct qstr *name, u32 cookie)
> {
> - fsnotify(mask, child, FSNOTIFY_EVENT_INODE, dir, name, NULL, cookie);
> + __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = child, .data_type = FSNOTIFY_EVENT_INODE,
> + .dir = dir, .name = name, .cookie = cookie,
> + });
> }
>
> static inline void fsnotify_dirent(struct inode *dir, struct dentry *dentry,
> @@ -44,7 +47,10 @@ static inline void fsnotify_inode(struct inode *inode, __u32 mask)
> if (S_ISDIR(inode->i_mode))
> mask |= FS_ISDIR;
>
> - fsnotify(mask, inode, FSNOTIFY_EVENT_INODE, NULL, NULL, inode, 0);
> + __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = inode, .data_type = FSNOTIFY_EVENT_INODE,
> + .inode = inode,
> + });
> }
>
> /* Notify this dentry's parent about a child's events. */
> @@ -68,7 +74,10 @@ static inline int fsnotify_parent(struct dentry *dentry, __u32 mask,
> return __fsnotify_parent(dentry, mask, data, data_type);
>
> notify_child:
> - return fsnotify(mask, data, data_type, NULL, NULL, inode, 0);
> + return __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = data, .data_type = data_type,
> + .inode = inode,
> + });
> }
>
> /*
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index f9e2c6cd0f7d..b1590f654ade 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -112,6 +112,28 @@ struct fsnotify_iter_info;
>
> struct mem_cgroup;
>
> +/*
> + * Event info args passed to fsnotify() and to backends on handle_event():
> + * @data: object that event happened on
> + * @data_type: type of object for fanotify_data_XXX() accessors
> + * @dir: optional directory associated with event -
> + * if @name is not NULL, this is the directory that
> + * @name is relative to
> + * @name: optional file name associated with event
> + * @inode: optional inode associated with event -
> + * either @dir or @inode must be non-NULL.
> + * if both are non-NULL event may be reported to both.
> + * @cookie: inotify rename cookie
> + */
> +struct fsnotify_event_info {
> + const void *data;
> + int data_type;
> + struct inode *dir;
> + const struct qstr *name;
> + struct inode *inode;
> + u32 cookie;
> +};
> +
> /*
> * Each group much define these ops. The fsnotify infrastructure will call
> * these operations for each relevant group.
> @@ -119,13 +141,7 @@ struct mem_cgroup;
> * handle_event - main call for a group to handle an fs event
> * @group: group to notify
> * @mask: event type and flags
> - * @data: object that event happened on
> - * @data_type: type of object for fanotify_data_XXX() accessors
> - * @dir: optional directory associated with event -
> - * if @file_name is not NULL, this is the directory that
> - * @file_name is relative to
> - * @file_name: optional file name associated with event
> - * @cookie: inotify rename cookie
> + * @event_info: information attached to the event
> * @iter_info: array of marks from this group that are interested in the event
> *
> * handle_inode_event - simple variant of handle_event() for groups that only
> @@ -147,8 +163,7 @@ struct mem_cgroup;
> */
> struct fsnotify_ops {
> int (*handle_event)(struct fsnotify_group *group, u32 mask,
> - const void *data, int data_type, struct inode *dir,
> - const struct qstr *file_name, u32 cookie,
> + const struct fsnotify_event_info *event_info,
> struct fsnotify_iter_info *iter_info);
> int (*handle_inode_event)(struct fsnotify_mark *mark, u32 mask,
> struct inode *inode, struct inode *dir,
> @@ -262,6 +277,12 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
> }
> }
>
> +static inline struct inode *fsnotify_event_info_inode(
> + const struct fsnotify_event_info *event_info)
> +{
> + return fsnotify_data_inode(event_info->data, event_info->data_type);
> +}
> +
> static inline const struct path *fsnotify_data_path(const void *data,
> int data_type)
> {
> @@ -273,6 +294,12 @@ static inline const struct path *fsnotify_data_path(const void *data,
> }
> }
>
> +static inline const struct path *fsnotify_event_info_path(
> + const struct fsnotify_event_info *event_info)
> +{
> + return fsnotify_data_path(event_info->data, event_info->data_type);
> +}
> +
> enum fsnotify_obj_type {
> FSNOTIFY_OBJ_TYPE_INODE,
> FSNOTIFY_OBJ_TYPE_PARENT,
> @@ -410,11 +437,22 @@ struct fsnotify_mark {
> /* called from the vfs helpers */
>
> /* main fsnotify call to send events */
> -extern int fsnotify(__u32 mask, const void *data, int data_type,
> - struct inode *dir, const struct qstr *name,
> - struct inode *inode, u32 cookie);
> -extern int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> - int data_type);
> +extern int __fsnotify(__u32 mask,
> + const struct fsnotify_event_info *event_info);
> +extern int __fsnotify_parent(struct dentry *dentry, __u32 mask,
> + const void *data, int data_type);
> +
> +static inline int fsnotify(__u32 mask, const void *data, int data_type,
> + struct inode *dir, const struct qstr *name,
> + struct inode *inode, u32 cookie)
> +{
> + return __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = data, .data_type = data_type,
> + .dir = dir, .name = name, .inode = inode,
> + .cookie = cookie,
> + });
> +}
> +
> extern void __fsnotify_inode_delete(struct inode *inode);
> extern void __fsnotify_vfsmount_delete(struct vfsmount *mnt);
> extern void fsnotify_sb_delete(struct super_block *sb);
> @@ -594,15 +632,14 @@ static inline void fsnotify_init_event(struct fsnotify_event *event)
>
> #else
>
> -static inline int fsnotify(__u32 mask, const void *data, int data_type,
> - struct inode *dir, const struct qstr *name,
> - struct inode *inode, u32 cookie)
> +static inline int fsnotify(__u32 mask,
> + const struct fsnotify_event_info *event_info)
> {
> return 0;
> }
>
Did you miss the kernel test robot messages on v2?
These noop implementations in my patch are wrong.
noop implementation of fsnotify() should not change signature
and noop implementation of __fsnotify() should be added.
Thanks,
Amir.
> static inline int __fsnotify_parent(struct dentry *dentry, __u32 mask,
> - const void *data, int data_type)
> + const void *data, int data_type)
> {
> return 0;
> }
> --
> 2.32.0
>
On Tue, Jun 29, 2021 at 10:12 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> FAN_FS_ERROR requires some initialization to happen from inside the
> insert hook. This allows a user of fanotify_add_event to pass an
> argument to be sent to the insert callback.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Amir Goldstein <[email protected]>
with optional suggestion for improvement...
> ---
> fs/notify/fanotify/fanotify.c | 5 +++--
> fs/notify/inotify/inotify_fsnotify.c | 2 +-
> fs/notify/notification.c | 6 ++++--
> include/linux/fsnotify_backend.h | 7 +++++--
> 4 files changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 4f2febb15e94..aba06b84da91 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -695,7 +695,8 @@ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
> * Add an event to hash table for faster merge.
> */
> static void fanotify_insert_event(struct fsnotify_group *group,
> - struct fsnotify_event *fsn_event)
> + struct fsnotify_event *fsn_event,
> + const void *data)
> {
> struct fanotify_event *event = FANOTIFY_E(fsn_event);
> unsigned int bucket = fanotify_event_hash_bucket(group, event);
> @@ -779,7 +780,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>
> fsn_event = &event->fse;
> ret = fsnotify_add_event(group, fsn_event, fanotify_merge,
> - fanotify_insert_event);
> + fanotify_insert_event, NULL);
> if (ret) {
> /* Permission events shouldn't be merged */
> BUG_ON(ret == 1 && mask & FANOTIFY_PERM_EVENTS);
> diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
> index d1a64daa0171..a003a64ff8ee 100644
> --- a/fs/notify/inotify/inotify_fsnotify.c
> +++ b/fs/notify/inotify/inotify_fsnotify.c
> @@ -116,7 +116,7 @@ int inotify_handle_inode_event(struct fsnotify_mark *inode_mark, u32 mask,
> if (len)
> strcpy(event->name, name->name);
>
> - ret = fsnotify_add_event(group, fsn_event, inotify_merge, NULL);
> + ret = fsnotify_add_event(group, fsn_event, inotify_merge, NULL, NULL);
> if (ret) {
> /* Our event wasn't used in the end. Free it. */
> fsnotify_destroy_event(group, fsn_event);
> diff --git a/fs/notify/notification.c b/fs/notify/notification.c
> index 32f45543b9c6..0d9ba592d725 100644
> --- a/fs/notify/notification.c
> +++ b/fs/notify/notification.c
> @@ -83,7 +83,9 @@ int fsnotify_add_event(struct fsnotify_group *group,
> int (*merge)(struct fsnotify_group *,
> struct fsnotify_event *),
> void (*insert)(struct fsnotify_group *,
> - struct fsnotify_event *))
> + struct fsnotify_event *,
> + const void *),
> + const void *insert_data)
> {
> int ret = 0;
> struct list_head *list = &group->notification_list;
> @@ -121,7 +123,7 @@ int fsnotify_add_event(struct fsnotify_group *group,
> group->q_len++;
> list_add_tail(&event->list, list);
> if (insert)
> - insert(group, event);
> + insert(group, event, insert_data);
> spin_unlock(&group->notification_lock);
>
> wake_up(&group->notification_waitq);
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index b1590f654ade..8222fe12a6c9 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -526,11 +526,14 @@ extern int fsnotify_add_event(struct fsnotify_group *group,
> int (*merge)(struct fsnotify_group *,
> struct fsnotify_event *),
> void (*insert)(struct fsnotify_group *,
> - struct fsnotify_event *));
> + struct fsnotify_event *,
> + const void *),
> + const void *insert_data);
> +
> /* Queue overflow event to a notification group */
> static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
> {
> - fsnotify_add_event(group, group->overflow_event, NULL, NULL);
> + fsnotify_add_event(group, group->overflow_event, NULL, NULL, NULL);
> }
>
Looking back, when I added the insert() callback, it might have been wiser to
rename fsnotify_add_event() to fsnotify_insert_event() add a wrapper:
static inline int fsnotify_add_event(group, event, merge) {
return fsnotify_insert_event(group, event, merge, NULL);
}
But the two call sites did not seem to justify it.
Perhaps with the growing list of NULL arguments it is time to
reconsider this small tidy up?
Not critical though.
Thanks,
Amir.
On Tue, Jun 29, 2021 at 10:12 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> FS_FAN_ERROR must be able to merge events even in the short window after
> they've been unqueued but prior to being read. Move the list_empty
This smells like trouble.
Breaking abstractions like this should be done for a very good reason
and I am not sure this is the case at hand.
More on this on merge_error_event patch.
Thanks,
Amir.
> check into the merge hooks, such that merge() is always invoked if
> existing.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> fs/notify/fanotify/fanotify.c | 3 ++-
> fs/notify/inotify/inotify_fsnotify.c | 3 +++
> fs/notify/notification.c | 2 +-
> 3 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index aba06b84da91..769703ef2b9a 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -168,7 +168,8 @@ static int fanotify_merge(struct fsnotify_group *group,
> * the event structure we have created in fanotify_handle_event() is the
> * one we should check for permission response.
> */
> - if (fanotify_is_perm_event(new->mask))
> + if (list_empty(&group->notification_list) ||
> + fanotify_is_perm_event(new->mask))
> return 0;
>
> hlist_for_each_entry(old, hlist, merge_list) {
> diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
> index a003a64ff8ee..2f357b4933c2 100644
> --- a/fs/notify/inotify/inotify_fsnotify.c
> +++ b/fs/notify/inotify/inotify_fsnotify.c
> @@ -52,6 +52,9 @@ static int inotify_merge(struct fsnotify_group *group,
> struct list_head *list = &group->notification_list;
> struct fsnotify_event *last_event;
>
> + if (list_empty(list))
> + return 0;
> +
> last_event = list_entry(list->prev, struct fsnotify_event, list);
> return event_compare(last_event, event);
> }
> diff --git a/fs/notify/notification.c b/fs/notify/notification.c
> index 0d9ba592d725..1d06e0728a24 100644
> --- a/fs/notify/notification.c
> +++ b/fs/notify/notification.c
> @@ -111,7 +111,7 @@ int fsnotify_add_event(struct fsnotify_group *group,
> goto queue;
> }
>
> - if (!list_empty(list) && merge) {
> + if (merge) {
> ret = merge(group, event);
> if (ret) {
> spin_unlock(&group->notification_lock);
> --
> 2.32.0
>
On Tue, Jun 29, 2021 at 10:12 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Introduce helpers for filesystems interested in reporting FS_ERROR
> events. When notifying errors, the file system might not have an inode
> to report on the error. To support this, allow the caller to specify
> the superblock to which the error applies.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
Reviewed-by: Amir Goldstein <[email protected]>
> ---
> Changes since v2:
> - Drop reference to s_fnotify_marks and guards (Amir)
>
> Changes since v1:
> - Use the inode argument (Amir)
> - Protect s_fsnotify_marks with ifdef guard
> ---
> fs/notify/fsnotify.c | 2 +-
> include/linux/fsnotify.h | 13 +++++++++++++
> include/linux/fsnotify_backend.h | 1 +
> 3 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 36205a769dde..ac05eb3fb368 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -491,7 +491,7 @@ int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
> */
> parent = event_info->dir;
> }
> - sb = inode->i_sb;
> + sb = event_info->sb ?: inode->i_sb;
>
> /*
> * Optimization: srcu_read_lock() has a memory barrier which can
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index 8c2c681b4495..684c79ca01b2 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -326,4 +326,17 @@ static inline void fsnotify_change(struct dentry *dentry, unsigned int ia_valid)
> fsnotify_dentry(dentry, mask);
> }
>
> +static inline int fsnotify_sb_error(struct super_block *sb, struct inode *inode,
> + int error)
> +{
> + struct fs_error_report report = {
> + .error = error,
> + .inode = inode,
> + };
> +
> + return __fsnotify(FS_ERROR, &(struct fsnotify_event_info) {
> + .data = &report, .data_type = FSNOTIFY_EVENT_ERROR,
> + .sb = sb});
> +}
> +
> #endif /* _LINUX_FS_NOTIFY_H */
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index ea5f5c7cc381..5a32c5010f45 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -138,6 +138,7 @@ struct fsnotify_event_info {
> struct inode *dir;
> const struct qstr *name;
> struct inode *inode;
> + struct super_block *sb;
> u32 cookie;
> };
>
> --
> 2.32.0
>
Hi Gabriel,
I love your patch! Yet something to improve:
[auto build test ERROR on ext3/fsnotify]
[also build test ERROR on ext4/dev linus/master v5.13 next-20210629]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
config: x86_64-allyesconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/746524d8db08a041fed90e41b15c8e8ca69cb22d
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
git checkout 746524d8db08a041fed90e41b15c8e8ca69cb22d
# save the attached .config to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash samples/
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
samples/fanotify/fs-monitor.c: In function 'handle_notifications':
>> samples/fanotify/fs-monitor.c:84:32: error: 'mount_fd' undeclared (first use in this function)
84 | bad_file = open_by_handle_at(mount_fd, file_handle, O_PATH);
| ^~~~~~~~
samples/fanotify/fs-monitor.c:84:32: note: each undeclared identifier is reported only once for each function it appears in
samples/fanotify/fs-monitor.c: In function 'main':
samples/fanotify/fs-monitor.c:111:2: error: 'mount_fd' undeclared (first use in this function)
111 | mount_fd = open(argv[1], O_RDONLY);
| ^~~~~~~~
vim +/mount_fd +84 samples/fanotify/fs-monitor.c
30
31 static void handle_notifications(char *buffer, int len)
32 {
33 struct fanotify_event_metadata *metadata;
34 struct fanotify_event_info_error *error;
35 struct fanotify_event_info_fid *fid;
36 struct file_handle *file_handle;
37 int bad_file;
38 int ret;
39 struct stat stat;
40 char *next;
41
42 for (metadata = (struct fanotify_event_metadata *) buffer;
43 FAN_EVENT_OK(metadata, len);
44 metadata = FAN_EVENT_NEXT(metadata, len)) {
45 next = (char *)metadata + metadata->event_len;
46 if (metadata->mask != FAN_FS_ERROR) {
47 printf("unexpected FAN MARK: %llx\n", metadata->mask);
48 goto next_event;
49 } else if (metadata->fd != FAN_NOFD) {
50 printf("Unexpected fd (!= FAN_NOFD)\n");
51 goto next_event;
52 }
53
54 printf("FAN_FS_ERROR found len=%d\n", metadata->event_len);
55
56 error = (struct fanotify_event_info_error *) (metadata+1);
57 if (error->hdr.info_type != FAN_EVENT_INFO_TYPE_ERROR) {
58 printf("unknown record: %d (Expecting TYPE_ERROR)\n",
59 error->hdr.info_type);
60 goto next_event;
61 }
62
63 printf("\tGeneric Error Record: len=%d\n", error->hdr.len);
64 printf("\terror: %d\n", error->error);
65 printf("\terror_count: %d\n", error->error_count);
66
67 fid = (struct fanotify_event_info_fid *) (error + 1);
68
69 if ((char *) fid >= next) {
70 printf("Event doesn't have FID\n");
71 goto next_event;
72 }
73 printf("FID record found\n");
74
75 if (fid->hdr.info_type != FAN_EVENT_INFO_TYPE_FID) {
76 printf("unknown record: %d (Expecting TYPE_FID)\n",
77 fid->hdr.info_type);
78 goto next_event;
79 }
80 printf("\tfsid: %x%x\n", fid->fsid.val[0], fid->fsid.val[1]);
81
82
83 file_handle = (struct file_handle *) &fid->handle;
> 84 bad_file = open_by_handle_at(mount_fd, file_handle, O_PATH);
85 if (bad_file < 0) {
86 printf("open_by_handle_at %d\n", errno);
87 goto next_event;
88 }
89
90 ret = fstat(bad_file, &stat);
91 if (ret < 0)
92 printf("fstat %d\n", errno);
93
94 printf("\tinode=%ld\n", stat.st_ino);
95
96 next_event:
97 printf("---\n\n");
98 }
99 }
100
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
On Tue, Jun 29, 2021 at 10:13 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Introduce an example of a FAN_FS_ERROR fanotify user to track filesystem
> errors.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ---
> Changes since v1:
> - minor fixes
> ---
> samples/Kconfig | 9 +++
> samples/Makefile | 1 +
> samples/fanotify/Makefile | 3 +
> samples/fanotify/fs-monitor.c | 134 ++++++++++++++++++++++++++++++++++
> 4 files changed, 147 insertions(+)
> create mode 100644 samples/fanotify/Makefile
> create mode 100644 samples/fanotify/fs-monitor.c
>
> diff --git a/samples/Kconfig b/samples/Kconfig
> index b5a1a7aa7e23..f2f9c939035f 100644
> --- a/samples/Kconfig
> +++ b/samples/Kconfig
> @@ -120,6 +120,15 @@ config SAMPLE_CONNECTOR
> with it.
> See also Documentation/driver-api/connector.rst
>
> +config SAMPLE_FANOTIFY_ERROR
> + bool "Build fanotify error monitoring sample"
> + depends on FANOTIFY
> + help
> + When enabled, this builds an example code that uses the
> + FAN_FS_ERROR fanotify mechanism to monitor filesystem
> + errors.
> + See also Documentation/admin-guide/filesystem-monitoring.rst.
> +
> config SAMPLE_HIDRAW
> bool "hidraw sample"
> depends on CC_CAN_LINK && HEADERS_INSTALL
> diff --git a/samples/Makefile b/samples/Makefile
> index 087e0988ccc5..931a81847c48 100644
> --- a/samples/Makefile
> +++ b/samples/Makefile
> @@ -5,6 +5,7 @@ subdir-$(CONFIG_SAMPLE_AUXDISPLAY) += auxdisplay
> subdir-$(CONFIG_SAMPLE_ANDROID_BINDERFS) += binderfs
> obj-$(CONFIG_SAMPLE_CONFIGFS) += configfs/
> obj-$(CONFIG_SAMPLE_CONNECTOR) += connector/
> +obj-$(CONFIG_SAMPLE_FANOTIFY_ERROR) += fanotify/
> subdir-$(CONFIG_SAMPLE_HIDRAW) += hidraw
> obj-$(CONFIG_SAMPLE_HW_BREAKPOINT) += hw_breakpoint/
> obj-$(CONFIG_SAMPLE_KDB) += kdb/
> diff --git a/samples/fanotify/Makefile b/samples/fanotify/Makefile
> new file mode 100644
> index 000000000000..b3d5cc826e6f
> --- /dev/null
> +++ b/samples/fanotify/Makefile
> @@ -0,0 +1,3 @@
> +userprogs-always-y += fs-monitor
> +
> +userccflags += -I usr/include
> diff --git a/samples/fanotify/fs-monitor.c b/samples/fanotify/fs-monitor.c
> new file mode 100644
> index 000000000000..f949ea00271d
> --- /dev/null
> +++ b/samples/fanotify/fs-monitor.c
> @@ -0,0 +1,134 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright 2021, Collabora Ltd.
> + */
> +
> +#define _GNU_SOURCE
> +#include <errno.h>
Kernel test robot reported some problem with this include
> +#include <err.h>
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <fcntl.h>
> +#include <sys/fanotify.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
> +
> +#ifndef FAN_FS_ERROR
> +#define FAN_FS_ERROR 0x00008000
> +#define FAN_EVENT_INFO_TYPE_ERROR 4
> +
> +int mount_fd;
> +
> +struct fanotify_event_info_error {
> + struct fanotify_event_info_header hdr;
> + __s32 error;
> + __u32 error_count;
> +};
> +#endif
> +
> +static void handle_notifications(char *buffer, int len)
> +{
> + struct fanotify_event_metadata *metadata;
> + struct fanotify_event_info_error *error;
> + struct fanotify_event_info_fid *fid;
> + struct file_handle *file_handle;
> + int bad_file;
> + int ret;
> + struct stat stat;
> + char *next;
> +
> + for (metadata = (struct fanotify_event_metadata *) buffer;
> + FAN_EVENT_OK(metadata, len);
> + metadata = FAN_EVENT_NEXT(metadata, len)) {
> + next = (char *)metadata + metadata->event_len;
> + if (metadata->mask != FAN_FS_ERROR) {
> + printf("unexpected FAN MARK: %llx\n", metadata->mask);
> + goto next_event;
> + } else if (metadata->fd != FAN_NOFD) {
> + printf("Unexpected fd (!= FAN_NOFD)\n");
> + goto next_event;
> + }
> +
> + printf("FAN_FS_ERROR found len=%d\n", metadata->event_len);
> +
> + error = (struct fanotify_event_info_error *) (metadata+1);
> + if (error->hdr.info_type != FAN_EVENT_INFO_TYPE_ERROR) {
> + printf("unknown record: %d (Expecting TYPE_ERROR)\n",
> + error->hdr.info_type);
> + goto next_event;
> + }
> +
> + printf("\tGeneric Error Record: len=%d\n", error->hdr.len);
> + printf("\terror: %d\n", error->error);
> + printf("\terror_count: %d\n", error->error_count);
> +
> + fid = (struct fanotify_event_info_fid *) (error + 1);
> +
> + if ((char *) fid >= next) {
> + printf("Event doesn't have FID\n");
> + goto next_event;
> + }
> + printf("FID record found\n");
> +
> + if (fid->hdr.info_type != FAN_EVENT_INFO_TYPE_FID) {
> + printf("unknown record: %d (Expecting TYPE_FID)\n",
> + fid->hdr.info_type);
> + goto next_event;
> + }
> + printf("\tfsid: %x%x\n", fid->fsid.val[0], fid->fsid.val[1]);
> +
> +
> + file_handle = (struct file_handle *) &fid->handle;
> + bad_file = open_by_handle_at(mount_fd, file_handle, O_PATH);
I would not recommend in this practice at all and in a sample program
in particular.
While FID is mean to to be used as input for open_by_handle_at()
with "regular" events, with error events FID may belong to a corrupt
inode that cannot be opened, a filesystem that is already shutdown
due to error or point to irrelevant root inode.
I suggest that you stick to printing the FID value.
A filesystem monitoring tool will typically know which filesystem it is
watching and it will be easy for it to parse the inode and generation
out of the FID.
In fact, if the handle_type is the generic type FILEID_INO32_GEN
and handle_bytes is 8, it is safe for this sample fs-monitor to parse the
FID as <32bit ino; 32bit gen> and print those parsed values
worst case the values will be wrong.
Thanks,
Amir.
> + if (bad_file < 0) {
> + printf("open_by_handle_at %d\n", errno);
> + goto next_event;
> + }
> +
> + ret = fstat(bad_file, &stat);
> + if (ret < 0)
> + printf("fstat %d\n", errno);
> +
> + printf("\tinode=%ld\n", stat.st_ino);
> +
> +next_event:
> + printf("---\n\n");
> + }
> +}
> +
> +int main(int argc, char **argv)
> +{
> + int fd;
> + char buffer[BUFSIZ];
> +
> + if (argc < 2) {
> + printf("Missing path argument\n");
> + return 1;
> + }
> +
> + mount_fd = open(argv[1], O_RDONLY);
> + if (mount_fd < 0)
> + errx(1, "mount_fd");
> +
> + fd = fanotify_init(FAN_CLASS_NOTIF|FAN_REPORT_FID, O_RDONLY);
> + if (fd < 0)
> + errx(1, "fanotify_init");
> +
> + if (fanotify_mark(fd, FAN_MARK_ADD|FAN_MARK_FILESYSTEM,
> + FAN_FS_ERROR, AT_FDCWD, argv[1])) {
> + errx(1, "fanotify_mark");
> + }
> +
> + while (1) {
> + int n = read(fd, buffer, BUFSIZ);
> +
> + if (n < 0)
> + errx(1, "read");
> +
> + handle_notifications(buffer, n);
> + }
> +
> + return 0;
> +}
> --
> 2.32.0
>
On Tue, Jun 29, 2021 at 10:13 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Document the FAN_FS_ERROR event for user administrators and user space
> developers.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ---
> Changes since v1:
> - Drop references to location record
> - Explain that the inode field is optional
> - Explain we are reporting only the first error
> ---
> .../admin-guide/filesystem-monitoring.rst | 70 +++++++++++++++++++
> Documentation/admin-guide/index.rst | 1 +
> 2 files changed, 71 insertions(+)
> create mode 100644 Documentation/admin-guide/filesystem-monitoring.rst
>
> diff --git a/Documentation/admin-guide/filesystem-monitoring.rst b/Documentation/admin-guide/filesystem-monitoring.rst
> new file mode 100644
> index 000000000000..c0ab1ad268b8
> --- /dev/null
> +++ b/Documentation/admin-guide/filesystem-monitoring.rst
> @@ -0,0 +1,70 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +====================================
> +File system Monitoring with fanotify
> +====================================
It is great that you are adding an admin-guide book and it is surely
not your fault that there is no existing admin-guide for fanotify to add to.
However, the name of the book as well as this title are more generic than
what you describe.
You see, watching all the deleted files in a filesystem or all modified files
in a mount may also be considered as "filesystem monitoring" by some.
So my only request is that you keep the book name and title as is,
but place your content under a chapter about "filesystem error monitoring".
This way, other people can later fill the gaps.
(CC Matthew Borowski who indicated his interest in doing so)
Thanks,
Amir.
+CC linux-api
On Tue, Jun 29, 2021 at 10:10 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Hi,
>
> This is the third version of the FAN_FS_ERROR patches. The main change
> in this version is the inode information being reported through an FID
> record, which means it requires the group to be created with
> FAN_REPORT_FID. It indeed simplifies a lot the FAN_FS_ERROR patch
> itself.
I am glad that you took this path.
Uniformity across the UAPI is important.
>
> This change raises the question of how we report non-inode errors. On
> one hand, we could omit the FID report, but then fsid would also be
> ommited. I chose to report these kind of errors against the root
> inode.
>
There are other option to consider.
To avoid special casing error events in fanotify event read code,
it would is convenient to use a non-zero length FID, but you can
use a 8 bytes zero buffer as NULL-FID
If I am not mistaken, that amounts to 64 bytes of event_len
including the event_metadata and both records which is pretty
nicely aligned.
All 3 handle_type options below are valid options:
1. handle_type FILEID_ROOT
2. handle_type FILEID_INVALID
3. handle_type FILEID_INO32_GEN (i.e. ino=0;gen=0)
The advantage of option #3 is that the monitoring program
does not need to special case the NULL_FID case when
parsing the FID to informative user message.
> The other changes in this iteration were made to attend to Amir
> feedback. Thank you again for your very detailed input. It is really
> appreciated.
>
> This was tested with LTP for regressions, and also using the sample on
> the last patch, with a corrupted image. I can publish the bad image
> upon request.
Just to set expectations, we now have an official standard for fanotify [1]
where we require an LTP test and man page update patch before merge
of UAPI changes.
That should not stop us from continuing the review process - it's just
a heads up, but I think that we are down to implementation details in
the review anyway and that the UAPI (give or take root inode) is
pretty much clear at this point, so spreading the review of UAPI to
wider audience is not a bad idea.
w.r.t man page update, I know you have created the admin-guide book,
but it's not the same. For linux-api reviewers, reviewing the changed to
fanotify man pages is good way to make sure we did not miss any corners.
w.r.t LTP test, I don't think that using a corrupt image will be a good way
for an LTP test. LTP tests can prepare and mount an ext4 loop image.
Does ext4 have some debugging method to inject an error?
Because that would be the best way IMO.
If it doesn't, you can implement this in ext4 and use it in the test if that
debug file exists - skip the test otherwise - it's common practice.
Thanks,
Amir.
[1] https://lore.kernel.org/linux-fsdevel/[email protected]/
Hi Gabriel,
url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
config: x86_64-randconfig-m001-20210628 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
smatch warnings:
fs/notify/fsnotify.c:505 __fsnotify() warn: variable dereferenced before check 'inode' (see line 494)
vim +/inode +505 fs/notify/fsnotify.c
dca640915c7b84 Amir Goldstein 2021-06-29 470 int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
90586523eb4b34 Eric Paris 2009-05-21 471 {
dca640915c7b84 Amir Goldstein 2021-06-29 472 const struct path *path = fsnotify_event_info_path(event_info);
dca640915c7b84 Amir Goldstein 2021-06-29 473 struct inode *inode = event_info->inode;
3427ce71554123 Miklos Szeredi 2017-10-30 474 struct fsnotify_iter_info iter_info = {};
40a100d3adc1ad Amir Goldstein 2020-07-22 475 struct super_block *sb;
60f7ed8c7c4d06 Amir Goldstein 2018-09-01 476 struct mount *mnt = NULL;
fecc4559780d52 Amir Goldstein 2020-12-02 477 struct inode *parent = NULL;
9385a84d7e1f65 Jan Kara 2016-11-10 478 int ret = 0;
71d734103edfa2 Mel Gorman 2020-07-08 479 __u32 test_mask, marks_mask;
90586523eb4b34 Eric Paris 2009-05-21 480
71d734103edfa2 Mel Gorman 2020-07-08 481 if (path)
aa93bdc5500cc9 Amir Goldstein 2020-03-19 482 mnt = real_mount(path->mnt);
3a9fb89f4cd04c Eric Paris 2009-12-17 483
40a100d3adc1ad Amir Goldstein 2020-07-22 484 if (!inode) {
40a100d3adc1ad Amir Goldstein 2020-07-22 485 /* Dirent event - report on TYPE_INODE to dir */
dca640915c7b84 Amir Goldstein 2021-06-29 486 inode = event_info->dir;
^^^^^^^^^^^^^^^^^^^^^^^
Presumably this is non-NULL
40a100d3adc1ad Amir Goldstein 2020-07-22 487 } else if (mask & FS_EVENT_ON_CHILD) {
40a100d3adc1ad Amir Goldstein 2020-07-22 488 /*
fecc4559780d52 Amir Goldstein 2020-12-02 489 * Event on child - report on TYPE_PARENT to dir if it is
fecc4559780d52 Amir Goldstein 2020-12-02 490 * watching children and on TYPE_INODE to child.
40a100d3adc1ad Amir Goldstein 2020-07-22 491 */
dca640915c7b84 Amir Goldstein 2021-06-29 492 parent = event_info->dir;
40a100d3adc1ad Amir Goldstein 2020-07-22 493 }
40a100d3adc1ad Amir Goldstein 2020-07-22 @494 sb = inode->i_sb;
^^^^^^^^^^^^
Dereference
497b0c5a7c0688 Amir Goldstein 2020-07-16 495
7c49b8616460eb Dave Hansen 2015-09-04 496 /*
7c49b8616460eb Dave Hansen 2015-09-04 497 * Optimization: srcu_read_lock() has a memory barrier which can
7c49b8616460eb Dave Hansen 2015-09-04 498 * be expensive. It protects walking the *_fsnotify_marks lists.
7c49b8616460eb Dave Hansen 2015-09-04 499 * However, if we do not walk the lists, we do not have to do
7c49b8616460eb Dave Hansen 2015-09-04 500 * SRCU because we have no references to any objects and do not
7c49b8616460eb Dave Hansen 2015-09-04 501 * need SRCU to keep them "alive".
7c49b8616460eb Dave Hansen 2015-09-04 502 */
9b93f33105f5f9 Amir Goldstein 2020-07-16 503 if (!sb->s_fsnotify_marks &&
497b0c5a7c0688 Amir Goldstein 2020-07-16 504 (!mnt || !mnt->mnt_fsnotify_marks) &&
9b93f33105f5f9 Amir Goldstein 2020-07-16 @505 (!inode || !inode->i_fsnotify_marks) &&
^^^^^^
Unnecessary check for NULL
fecc4559780d52 Amir Goldstein 2020-12-02 506 (!parent || !parent->i_fsnotify_marks))
7c49b8616460eb Dave Hansen 2015-09-04 507 return 0;
71d734103edfa2 Mel Gorman 2020-07-08 508
9b93f33105f5f9 Amir Goldstein 2020-07-16 509 marks_mask = sb->s_fsnotify_mask;
71d734103edfa2 Mel Gorman 2020-07-08 510 if (mnt)
71d734103edfa2 Mel Gorman 2020-07-08 511 marks_mask |= mnt->mnt_fsnotify_mask;
9b93f33105f5f9 Amir Goldstein 2020-07-16 512 if (inode)
^^^^^^
9b93f33105f5f9 Amir Goldstein 2020-07-16 513 marks_mask |= inode->i_fsnotify_mask;
fecc4559780d52 Amir Goldstein 2020-12-02 514 if (parent)
fecc4559780d52 Amir Goldstein 2020-12-02 515 marks_mask |= parent->i_fsnotify_mask;
497b0c5a7c0688 Amir Goldstein 2020-07-16 516
71d734103edfa2 Mel Gorman 2020-07-08 517
613a807fe7c793 Eric Paris 2010-07-28 518 /*
613a807fe7c793 Eric Paris 2010-07-28 519 * if this is a modify event we may need to clear the ignored masks
497b0c5a7c0688 Amir Goldstein 2020-07-16 520 * otherwise return if none of the marks care about this type of event.
613a807fe7c793 Eric Paris 2010-07-28 521 */
71d734103edfa2 Mel Gorman 2020-07-08 522 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
71d734103edfa2 Mel Gorman 2020-07-08 523 if (!(mask & FS_MODIFY) && !(test_mask & marks_mask))
613a807fe7c793 Eric Paris 2010-07-28 524 return 0;
75c1be487a690d Eric Paris 2010-07-28 525
9385a84d7e1f65 Jan Kara 2016-11-10 526 iter_info.srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
75c1be487a690d Eric Paris 2010-07-28 527
45a9fb3725d886 Amir Goldstein 2019-01-10 528 iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
45a9fb3725d886 Amir Goldstein 2019-01-10 529 fsnotify_first_mark(&sb->s_fsnotify_marks);
9bdda4e9cf2dce Amir Goldstein 2018-09-01 530 if (mnt) {
47d9c7cc457adc Amir Goldstein 2018-04-20 531 iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
3427ce71554123 Miklos Szeredi 2017-10-30 532 fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
7131485a93679f Eric Paris 2009-12-17 533 }
9b93f33105f5f9 Amir Goldstein 2020-07-16 534 if (inode) {
^^^^^
Lots of checking... Maybe this is really NULL?
9b93f33105f5f9 Amir Goldstein 2020-07-16 535 iter_info.marks[FSNOTIFY_OBJ_TYPE_INODE] =
9b93f33105f5f9 Amir Goldstein 2020-07-16 536 fsnotify_first_mark(&inode->i_fsnotify_marks);
9b93f33105f5f9 Amir Goldstein 2020-07-16 537 }
fecc4559780d52 Amir Goldstein 2020-12-02 538 if (parent) {
fecc4559780d52 Amir Goldstein 2020-12-02 539 iter_info.marks[FSNOTIFY_OBJ_TYPE_PARENT] =
fecc4559780d52 Amir Goldstein 2020-12-02 540 fsnotify_first_mark(&parent->i_fsnotify_marks);
497b0c5a7c0688 Amir Goldstein 2020-07-16 541 }
75c1be487a690d Eric Paris 2010-07-28 542
8edc6e1688fc8f Jan Kara 2014-11-13 543 /*
60f7ed8c7c4d06 Amir Goldstein 2018-09-01 544 * We need to merge inode/vfsmount/sb mark lists so that e.g. inode mark
60f7ed8c7c4d06 Amir Goldstein 2018-09-01 545 * ignore masks are properly reflected for mount/sb mark notifications.
8edc6e1688fc8f Jan Kara 2014-11-13 546 * That's why this traversal is so complicated...
8edc6e1688fc8f Jan Kara 2014-11-13 547 */
d9a6f30bb89309 Amir Goldstein 2018-04-20 548 while (fsnotify_iter_select_report_types(&iter_info)) {
dca640915c7b84 Amir Goldstein 2021-06-29 549 ret = send_to_group(mask, event_info, &iter_info);
613a807fe7c793 Eric Paris 2010-07-28 550
ff8bcbd03da881 Eric Paris 2010-10-28 551 if (ret && (mask & ALL_FSNOTIFY_PERM_EVENTS))
ff8bcbd03da881 Eric Paris 2010-10-28 552 goto out;
ff8bcbd03da881 Eric Paris 2010-10-28 553
d9a6f30bb89309 Amir Goldstein 2018-04-20 554 fsnotify_iter_next(&iter_info);
90586523eb4b34 Eric Paris 2009-05-21 555 }
ff8bcbd03da881 Eric Paris 2010-10-28 556 ret = 0;
ff8bcbd03da881 Eric Paris 2010-10-28 557 out:
9385a84d7e1f65 Jan Kara 2016-11-10 558 srcu_read_unlock(&fsnotify_mark_srcu, iter_info.srcu_idx);
c4ec54b40d33f8 Eric Paris 2009-12-17 559
98b5c10d320adf Jean-Christophe Dubois 2010-03-23 560 return ret;
90586523eb4b34 Eric Paris 2009-05-21 561 }
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
On Wed, Jun 30, 2021 at 11:12 AM Dan Carpenter <[email protected]> wrote:
>
> Hi Gabriel,
>
> url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
> base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
> config: x86_64-randconfig-m001-20210628 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <[email protected]>
> Reported-by: Dan Carpenter <[email protected]>
>
> smatch warnings:
> fs/notify/fsnotify.c:505 __fsnotify() warn: variable dereferenced before check 'inode' (see line 494)
>
> vim +/inode +505 fs/notify/fsnotify.c
>
> dca640915c7b84 Amir Goldstein 2021-06-29 470 int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
> 90586523eb4b34 Eric Paris 2009-05-21 471 {
> dca640915c7b84 Amir Goldstein 2021-06-29 472 const struct path *path = fsnotify_event_info_path(event_info);
> dca640915c7b84 Amir Goldstein 2021-06-29 473 struct inode *inode = event_info->inode;
> 3427ce71554123 Miklos Szeredi 2017-10-30 474 struct fsnotify_iter_info iter_info = {};
> 40a100d3adc1ad Amir Goldstein 2020-07-22 475 struct super_block *sb;
> 60f7ed8c7c4d06 Amir Goldstein 2018-09-01 476 struct mount *mnt = NULL;
> fecc4559780d52 Amir Goldstein 2020-12-02 477 struct inode *parent = NULL;
> 9385a84d7e1f65 Jan Kara 2016-11-10 478 int ret = 0;
> 71d734103edfa2 Mel Gorman 2020-07-08 479 __u32 test_mask, marks_mask;
> 90586523eb4b34 Eric Paris 2009-05-21 480
> 71d734103edfa2 Mel Gorman 2020-07-08 481 if (path)
> aa93bdc5500cc9 Amir Goldstein 2020-03-19 482 mnt = real_mount(path->mnt);
> 3a9fb89f4cd04c Eric Paris 2009-12-17 483
> 40a100d3adc1ad Amir Goldstein 2020-07-22 484 if (!inode) {
> 40a100d3adc1ad Amir Goldstein 2020-07-22 485 /* Dirent event - report on TYPE_INODE to dir */
> dca640915c7b84 Amir Goldstein 2021-06-29 486 inode = event_info->dir;
> ^^^^^^^^^^^^^^^^^^^^^^^
> Presumably this is non-NULL
>
> 40a100d3adc1ad Amir Goldstein 2020-07-22 487 } else if (mask & FS_EVENT_ON_CHILD) {
> 40a100d3adc1ad Amir Goldstein 2020-07-22 488 /*
> fecc4559780d52 Amir Goldstein 2020-12-02 489 * Event on child - report on TYPE_PARENT to dir if it is
> fecc4559780d52 Amir Goldstein 2020-12-02 490 * watching children and on TYPE_INODE to child.
> 40a100d3adc1ad Amir Goldstein 2020-07-22 491 */
> dca640915c7b84 Amir Goldstein 2021-06-29 492 parent = event_info->dir;
> 40a100d3adc1ad Amir Goldstein 2020-07-22 493 }
> 40a100d3adc1ad Amir Goldstein 2020-07-22 @494 sb = inode->i_sb;
> ^^^^^^^^^^^^
> Dereference
>
> 497b0c5a7c0688 Amir Goldstein 2020-07-16 495
> 7c49b8616460eb Dave Hansen 2015-09-04 496 /*
> 7c49b8616460eb Dave Hansen 2015-09-04 497 * Optimization: srcu_read_lock() has a memory barrier which can
> 7c49b8616460eb Dave Hansen 2015-09-04 498 * be expensive. It protects walking the *_fsnotify_marks lists.
> 7c49b8616460eb Dave Hansen 2015-09-04 499 * However, if we do not walk the lists, we do not have to do
> 7c49b8616460eb Dave Hansen 2015-09-04 500 * SRCU because we have no references to any objects and do not
> 7c49b8616460eb Dave Hansen 2015-09-04 501 * need SRCU to keep them "alive".
> 7c49b8616460eb Dave Hansen 2015-09-04 502 */
> 9b93f33105f5f9 Amir Goldstein 2020-07-16 503 if (!sb->s_fsnotify_marks &&
> 497b0c5a7c0688 Amir Goldstein 2020-07-16 504 (!mnt || !mnt->mnt_fsnotify_marks) &&
> 9b93f33105f5f9 Amir Goldstein 2020-07-16 @505 (!inode || !inode->i_fsnotify_marks) &&
> ^^^^^^
> Unnecessary check for NULL
>
> fecc4559780d52 Amir Goldstein 2020-12-02 506 (!parent || !parent->i_fsnotify_marks))
> 7c49b8616460eb Dave Hansen 2015-09-04 507 return 0;
> 71d734103edfa2 Mel Gorman 2020-07-08 508
> 9b93f33105f5f9 Amir Goldstein 2020-07-16 509 marks_mask = sb->s_fsnotify_mask;
> 71d734103edfa2 Mel Gorman 2020-07-08 510 if (mnt)
> 71d734103edfa2 Mel Gorman 2020-07-08 511 marks_mask |= mnt->mnt_fsnotify_mask;
> 9b93f33105f5f9 Amir Goldstein 2020-07-16 512 if (inode)
> ^^^^^^
>
> 9b93f33105f5f9 Amir Goldstein 2020-07-16 513 marks_mask |= inode->i_fsnotify_mask;
> fecc4559780d52 Amir Goldstein 2020-12-02 514 if (parent)
> fecc4559780d52 Amir Goldstein 2020-12-02 515 marks_mask |= parent->i_fsnotify_mask;
> 497b0c5a7c0688 Amir Goldstein 2020-07-16 516
> 71d734103edfa2 Mel Gorman 2020-07-08 517
> 613a807fe7c793 Eric Paris 2010-07-28 518 /*
> 613a807fe7c793 Eric Paris 2010-07-28 519 * if this is a modify event we may need to clear the ignored masks
> 497b0c5a7c0688 Amir Goldstein 2020-07-16 520 * otherwise return if none of the marks care about this type of event.
> 613a807fe7c793 Eric Paris 2010-07-28 521 */
> 71d734103edfa2 Mel Gorman 2020-07-08 522 test_mask = (mask & ALL_FSNOTIFY_EVENTS);
> 71d734103edfa2 Mel Gorman 2020-07-08 523 if (!(mask & FS_MODIFY) && !(test_mask & marks_mask))
> 613a807fe7c793 Eric Paris 2010-07-28 524 return 0;
> 75c1be487a690d Eric Paris 2010-07-28 525
> 9385a84d7e1f65 Jan Kara 2016-11-10 526 iter_info.srcu_idx = srcu_read_lock(&fsnotify_mark_srcu);
> 75c1be487a690d Eric Paris 2010-07-28 527
> 45a9fb3725d886 Amir Goldstein 2019-01-10 528 iter_info.marks[FSNOTIFY_OBJ_TYPE_SB] =
> 45a9fb3725d886 Amir Goldstein 2019-01-10 529 fsnotify_first_mark(&sb->s_fsnotify_marks);
> 9bdda4e9cf2dce Amir Goldstein 2018-09-01 530 if (mnt) {
> 47d9c7cc457adc Amir Goldstein 2018-04-20 531 iter_info.marks[FSNOTIFY_OBJ_TYPE_VFSMOUNT] =
> 3427ce71554123 Miklos Szeredi 2017-10-30 532 fsnotify_first_mark(&mnt->mnt_fsnotify_marks);
> 7131485a93679f Eric Paris 2009-12-17 533 }
> 9b93f33105f5f9 Amir Goldstein 2020-07-16 534 if (inode) {
> ^^^^^
> Lots of checking... Maybe this is really NULL?
Do you have feeling of dejavu? ;-)
https://lore.kernel.org/linux-fsdevel/[email protected]/
We've been through this.
Maybe you silenced the smach warning on fsnotify() and the rename to
__fsnotifty()
caused this warning to refloat?
Thanks,
Amir.
On Wed, Jun 30, 2021 at 11:35:32AM +0300, Amir Goldstein wrote:
>
> Do you have feeling of dejavu? ;-)
> https://lore.kernel.org/linux-fsdevel/[email protected]/
That was a year ago. I have trouble remembering emails I sent
yesterday.
>
> We've been through this.
> Maybe you silenced the smach warning on fsnotify() and the rename to
> __fsnotifty()
> caused this warning to refloat?
Yes. Renaming the function will make it show up as a new warning. Also
this is an email from the kbuild-bot and last years email was from me,
so it's a different tool and a different record of sent messages.
(IMO, you should really just remove the bogus NULL checks because
everyone looking at the warning will think the code is buggy).
regards,
dan carpenter
On Wed, Jun 30, 2021 at 11:46 AM Dan Carpenter <[email protected]> wrote:
>
> On Wed, Jun 30, 2021 at 11:35:32AM +0300, Amir Goldstein wrote:
> >
> > Do you have feeling of dejavu? ;-)
> > https://lore.kernel.org/linux-fsdevel/[email protected]/
>
> That was a year ago. I have trouble remembering emails I sent
> yesterday.
>
> >
> > We've been through this.
> > Maybe you silenced the smach warning on fsnotify() and the rename to
> > __fsnotifty()
> > caused this warning to refloat?
>
> Yes. Renaming the function will make it show up as a new warning. Also
> this is an email from the kbuild-bot and last years email was from me,
> so it's a different tool and a different record of sent messages.
>
> (IMO, you should really just remove the bogus NULL checks because
> everyone looking at the warning will think the code is buggy).
>
I think the warning is really incorrect.
Why does it presume that event_info->dir is non-NULL?
Did smach check all the callers to fsnotify() or something?
What about future callers that will pass NULL, just like this one:
https://lore.kernel.org/linux-fsdevel/[email protected]/
Please fix the tool.
Thanks,
Amir.
On Wed, Jun 30, 2021 at 12:32 PM Amir Goldstein <[email protected]> wrote:
>
> On Wed, Jun 30, 2021 at 11:46 AM Dan Carpenter <[email protected]> wrote:
> >
> > On Wed, Jun 30, 2021 at 11:35:32AM +0300, Amir Goldstein wrote:
> > >
> > > Do you have feeling of dejavu? ;-)
> > > https://lore.kernel.org/linux-fsdevel/[email protected]/
> >
> > That was a year ago. I have trouble remembering emails I sent
> > yesterday.
> >
> > >
> > > We've been through this.
> > > Maybe you silenced the smach warning on fsnotify() and the rename to
> > > __fsnotifty()
> > > caused this warning to refloat?
> >
> > Yes. Renaming the function will make it show up as a new warning. Also
> > this is an email from the kbuild-bot and last years email was from me,
> > so it's a different tool and a different record of sent messages.
> >
> > (IMO, you should really just remove the bogus NULL checks because
> > everyone looking at the warning will think the code is buggy).
> >
>
> I think the warning is really incorrect.
> Why does it presume that event_info->dir is non-NULL?
> Did smach check all the callers to fsnotify() or something?
> What about future callers that will pass NULL, just like this one:
>
> https://lore.kernel.org/linux-fsdevel/[email protected]/
>
FWIW, the caller of this new helper passes NULL as inode:
https://lore.kernel.org/linux-fsdevel/[email protected]/
Thanks,
Amir.
On Tue, Jun 29, 2021 at 10:13 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> The FAN_FS_ERROR event is a new inode event used by filesystem wide
> monitoring tools to receive notifications of type FS_ERROR_EVENT,
> emitted directly by filesystems when a problem is detected. The error
> notification includes a generic error descriptor and a FID identifying
> the file affected.
>
> FID is sent for every FAN_FS_ERROR. Errors not linked to a regular inode
> are reported against the root inode.
>
> An error reporting structure is attached per-mark, and only a single
> error can be stored at a time. This is ok, since once an error occurs,
> it is common for a stream of related errors to be reported. We only log
> accumulate the total of errors occurred since the last notification.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ---
> Changes since v2:
> - Support and equire FID mode (amir)
> - Goto error path instead of early return (amir)
> - Simplify get_one_event (me)
> - Base merging on error_count
> - drop fanotify_queue_error_event
>
> Changes since v1:
> - Pass dentry to fanotify_check_fsid (Amir)
> - FANOTIFY_EVENT_TYPE_ERROR -> FANOTIFY_EVENT_TYPE_FS_ERROR
> - Merge previous patch into it
> - Use a single slot
> - Move fanotify_mark.error_event definition to this commit
> - Rename FAN_ERROR -> FAN_FS_ERROR
> - Restrict FAN_FS_ERROR to FAN_MARK_FILESYSTEM
> ---
> fs/notify/fanotify/fanotify.c | 108 +++++++++++++++++++++++------
> fs/notify/fanotify/fanotify.h | 53 ++++++++++++++
> fs/notify/fanotify/fanotify_user.c | 97 +++++++++++++++++++++++++-
> include/linux/fanotify.h | 11 ++-
> include/uapi/linux/fanotify.h | 8 +++
> 5 files changed, 253 insertions(+), 24 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 769703ef2b9a..854adcbeb95d 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -336,23 +336,6 @@ static u32 fanotify_group_event_mask(
> return test_mask & user_mask;
> }
>
> -/*
> - * Check size needed to encode fanotify_fh.
> - *
> - * Return size of encoded fh without fanotify_fh header.
> - * Return 0 on failure to encode.
> - */
> -static int fanotify_encode_fh_len(struct inode *inode)
> -{
> - int dwords = 0;
> -
> - if (!inode)
> - return 0;
> -
> - exportfs_encode_inode_fh(inode, NULL, &dwords, NULL);
> -
> - return dwords << 2;
> -}
>
> /*
> * Encode fanotify_fh.
> @@ -405,8 +388,12 @@ static int fanotify_encode_fh(struct fanotify_fh *fh, struct inode *inode,
> fh->type = type;
> fh->len = fh_len;
>
> - /* Mix fh into event merge key */
> - *hash ^= fanotify_hash_fh(fh);
> + /*
> + * Mix fh into event merge key. Hash might be NULL in case of
> + * unhashed FID events (i.e. FAN_FS_ERROR).
> + */
> + if (hash)
> + *hash ^= fanotify_hash_fh(fh);
>
> return FANOTIFY_FH_HDR_LEN + fh_len;
>
> @@ -692,6 +679,60 @@ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
> return fsid;
> }
>
> +static int fanotify_merge_error_event(struct fsnotify_group *group,
> + struct fsnotify_event *event)
> +{
> + struct fanotify_event *fae = FANOTIFY_E(event);
> +
> + /*
> + * When err_count > 0, the reporting slot is full. Just account
> + * the additional error and abort the insertion.
> + */
> + if (atomic_fetch_inc(&FANOTIFY_EE(fae)->err_count) != 0)
> + return 1;
> +
I'd like to address the review of this patch in two parts.
First, I will comments on minor technical details.
Stay tuned for another reply regarding this merge() method...
> + return 0;
> +}
> +
> +static void fanotify_insert_error_event(struct fsnotify_group *group,
> + struct fsnotify_event *event,
> + const void *data)
> +{
> + const struct fsnotify_event_info *ei =
> + (struct fsnotify_event_info *) data;
> + struct fanotify_event *fae = FANOTIFY_E(event);
> + const struct fs_error_report *report;
> + struct fanotify_error_event *fee;
> + struct inode *inode;
> + int fh_len;
> +
> + /* This might be an unexpected type of event (i.e. overflow). */
> + if (!fanotify_is_error_event(fae->mask))
> + return;
> +
> + report = (struct fs_error_report *) ei->data;
> + inode = report->inode ?: ei->sb->s_root->d_inode;
> +
> + fee = FANOTIFY_EE(fae);
> + fee->fae.type = FANOTIFY_EVENT_TYPE_FS_ERROR;
> + fee->error = report->error;
> + fee->fsid = fee->mark->connector->fsid;
> +
> + fsnotify_get_mark(fee->mark);
> +
> + /*
> + * Error reporting needs to happen in atomic context. If this
> + * inode's file handler is more than we initially predicted,
> + * there is nothing better we can do than report the error with
> + * a bad FH.
> + */
> + fh_len = fanotify_encode_fh_len(inode);
> + if (WARN_ON(fh_len > fee->max_fh_len))
WARN_ON() is not acceptable for things that can logically happen
if you think this is important you could use pr_warn_ratelimited()
like we do in fanotify_encode_fh(),
but since fs-monitor will observe the lack of FID anyway, I think
there is little point in reporting this to kmsg.
> + return;
> +
> + fanotify_encode_fh(&fee->object_fh, inode, fh_len, NULL, 0);
> +}
> +
> /*
> * Add an event to hash table for faster merge.
> */
> @@ -742,8 +783,9 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
> BUILD_BUG_ON(FAN_ONDIR != FS_ISDIR);
> BUILD_BUG_ON(FAN_OPEN_EXEC != FS_OPEN_EXEC);
> BUILD_BUG_ON(FAN_OPEN_EXEC_PERM != FS_OPEN_EXEC_PERM);
> + BUILD_BUG_ON(FAN_FS_ERROR != FS_ERROR);
>
> - BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
> + BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 20);
>
> mask = fanotify_group_event_mask(group, mask, event_info, iter_info);
> if (!mask)
> @@ -767,6 +809,17 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
> return 0;
> }
>
> + if (fanotify_is_error_event(mask)) {
> + struct fanotify_sb_mark *sb_mark =
> + FANOTIFY_SB_MARK(fsnotify_iter_sb_mark(iter_info));
> +
> + ret = fsnotify_add_event(group, &sb_mark->error_event->fae.fse,
> + fanotify_merge_error_event,
> + fanotify_insert_error_event,
> + event_info);
> + goto finish;
> + }
> +
> event = fanotify_alloc_event(group, mask, event_info, &fsid);
> ret = -ENOMEM;
> if (unlikely(!event)) {
> @@ -834,6 +887,17 @@ static void fanotify_free_name_event(struct fanotify_event *event)
> kfree(FANOTIFY_NE(event));
> }
>
> +static void fanotify_free_error_event(struct fanotify_event *event)
> +{
> + /*
> + * Just drop the reference acquired by
> + * fanotify_insert_error_event.
> + *
> + * The actual memory is freed with the mark.
> + */
> + fsnotify_put_mark(FANOTIFY_EE(event)->mark);
> +}
> +
> static void fanotify_free_event(struct fsnotify_event *fsn_event)
> {
> struct fanotify_event *event;
> @@ -856,6 +920,9 @@ static void fanotify_free_event(struct fsnotify_event *fsn_event)
> case FANOTIFY_EVENT_TYPE_OVERFLOW:
> kfree(event);
> break;
> + case FANOTIFY_EVENT_TYPE_FS_ERROR:
> + fanotify_free_error_event(event);
> + break;
> default:
> WARN_ON_ONCE(1);
> }
> @@ -873,6 +940,7 @@ static void fanotify_free_mark(struct fsnotify_mark *mark)
> if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
> struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);
>
> + kfree(fa_mark->error_event);
> kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
> } else {
> kmem_cache_free(fanotify_mark_cache, mark);
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index 2e005b3a75f2..0259d631b8d8 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -132,6 +132,7 @@ static inline void fanotify_info_copy_name(struct fanotify_info *info,
>
> struct fanotify_sb_mark {
> struct fsnotify_mark fsn_mark;
> + struct fanotify_error_event *error_event;
> };
>
> static inline
> @@ -154,6 +155,7 @@ enum fanotify_event_type {
> FANOTIFY_EVENT_TYPE_PATH,
> FANOTIFY_EVENT_TYPE_PATH_PERM,
> FANOTIFY_EVENT_TYPE_OVERFLOW, /* struct fanotify_event */
> + FANOTIFY_EVENT_TYPE_FS_ERROR, /* struct fanotify_error_event */
> __FANOTIFY_EVENT_TYPE_NUM
> };
>
> @@ -209,12 +211,37 @@ FANOTIFY_NE(struct fanotify_event *event)
> return container_of(event, struct fanotify_name_event, fae);
> }
>
> +struct fanotify_error_event {
> + struct fanotify_event fae;
> + s32 error; /* Error reported by the Filesystem. */
> + atomic_t err_count; /* Suppressed errors count */
> + __kernel_fsid_t fsid; /* FSID this error refers to. */
> +
> + struct fsnotify_mark *mark; /* Back reference to the mark. */
> + int max_fh_len; /* Maximum object_fh buffer size. */
> +
> + /*
> + * object_fh is followed by a variable sized buffer, so it must
> + * be the last element of this structure.
> + */
> + struct fanotify_fh object_fh;
> +};
> +
> +
> +static inline struct fanotify_error_event *
> +FANOTIFY_EE(struct fanotify_event *event)
> +{
> + return container_of(event, struct fanotify_error_event, fae);
> +}
> +
> static inline __kernel_fsid_t *fanotify_event_fsid(struct fanotify_event *event)
> {
> if (event->type == FANOTIFY_EVENT_TYPE_FID)
> return &FANOTIFY_FE(event)->fsid;
> else if (event->type == FANOTIFY_EVENT_TYPE_FID_NAME)
> return &FANOTIFY_NE(event)->fsid;
> + else if (event->type == FANOTIFY_EVENT_TYPE_FS_ERROR)
> + return &FANOTIFY_EE(event)->fsid;
> else
> return NULL;
> }
> @@ -226,6 +253,8 @@ static inline struct fanotify_fh *fanotify_event_object_fh(
> return &FANOTIFY_FE(event)->object_fh;
> else if (event->type == FANOTIFY_EVENT_TYPE_FID_NAME)
> return fanotify_info_file_fh(&FANOTIFY_NE(event)->info);
> + else if (event->type == FANOTIFY_EVENT_TYPE_FS_ERROR)
> + return &FANOTIFY_EE(event)->object_fh;
> else
> return NULL;
> }
> @@ -300,6 +329,11 @@ static inline struct fanotify_event *FANOTIFY_E(struct fsnotify_event *fse)
> return container_of(fse, struct fanotify_event, fse);
> }
>
> +static inline bool fanotify_is_error_event(u32 mask)
> +{
> + return mask & FANOTIFY_ERROR_EVENTS;
> +}
> +
> static inline bool fanotify_event_has_path(struct fanotify_event *event)
> {
> return event->type == FANOTIFY_EVENT_TYPE_PATH ||
> @@ -329,6 +363,7 @@ static inline struct path *fanotify_event_path(struct fanotify_event *event)
> static inline bool fanotify_is_hashed_event(u32 mask)
> {
> return !(fanotify_is_perm_event(mask) ||
> + fanotify_is_error_event(mask) ||
> fsnotify_is_overflow_event(mask));
> }
>
> @@ -338,3 +373,21 @@ static inline unsigned int fanotify_event_hash_bucket(
> {
> return event->hash & FANOTIFY_HTABLE_MASK;
> }
> +
> +/*
> + * Check size needed to encode fanotify_fh.
> + *
> + * Return size of encoded fh without fanotify_fh header.
> + * Return 0 on failure to encode.
> + */
> +static inline int fanotify_encode_fh_len(struct inode *inode)
> +{
> + int dwords = 0;
> +
> + if (!inode)
> + return 0;
> +
> + exportfs_encode_inode_fh(inode, NULL, &dwords, NULL);
> +
> + return dwords << 2;
> +}
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index a42521e140e6..8bcce26e21df 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -107,6 +107,8 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
> #define FANOTIFY_EVENT_ALIGN 4
> #define FANOTIFY_INFO_HDR_LEN \
> (sizeof(struct fanotify_event_info_fid) + sizeof(struct file_handle))
> +#define FANOTIFY_INFO_ERROR_LEN \
> + (sizeof(struct fanotify_event_info_error))
>
> static int fanotify_fid_info_len(int fh_len, int name_len)
> {
> @@ -130,6 +132,9 @@ static size_t fanotify_event_len(struct fanotify_event *event,
> if (!fid_mode)
> return event_len;
>
> + if (fanotify_is_error_event(event->mask))
> + event_len += FANOTIFY_INFO_ERROR_LEN;
> +
> info = fanotify_event_info(event);
> dir_fh_len = fanotify_event_dir_fh_len(event);
> fh_len = fanotify_event_object_fh_len(event);
> @@ -310,6 +315,30 @@ static int process_access_response(struct fsnotify_group *group,
> return -ENOENT;
> }
>
> +static size_t copy_error_info_to_user(struct fanotify_event *event,
> + char __user *buf, int count)
> +{
> + struct fanotify_event_info_error info;
> + struct fanotify_error_event *fee = FANOTIFY_EE(event);
> +
> + info.hdr.info_type = FAN_EVENT_INFO_TYPE_ERROR;
> + info.hdr.pad = 0;
> + info.hdr.len = sizeof(struct fanotify_event_info_error);
> +
> + if (WARN_ON(count < info.hdr.len))
> + return -EFAULT;
> +
> + info.error = fee->error;
> +
> + /* This effectively releases the event for logging another error */
> + info.error_count = atomic_xchg(&fee->err_count, 0);
> +
> + if (copy_to_user(buf, &info, sizeof(info)))
> + return -EFAULT;
> +
> + return info.hdr.len;
> +}
> +
> static int copy_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
> int info_type, const char *name, size_t name_len,
> char __user *buf, size_t count)
> @@ -468,6 +497,14 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> if (f)
> fd_install(fd, f);
>
> + if (fanotify_is_error_event(event->mask)) {
> + ret = copy_error_info_to_user(event, buf, count);
> + if (ret < 0)
> + goto out_close_fd;
> + buf += ret;
> + count -= ret;
> + }
> +
> /* Event info records order is: dir fid + name, child fid */
> if (fanotify_event_dir_fh_len(event)) {
> info_type = info->name_len ? FAN_EVENT_INFO_TYPE_DFID_NAME :
> @@ -896,6 +933,43 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
> flags, umask);
> }
>
> +static int fanotify_create_fs_error_event(struct fsnotify_mark *fsn_mark,
> + fsnotify_connp_t *connp)
> +{
> + struct fanotify_sb_mark *sb_mark = FANOTIFY_SB_MARK(fsn_mark);
> + struct super_block *sb =
> + container_of(connp, struct super_block, s_fsnotify_marks);
> + struct fanotify_error_event *fee;
> + int fh_len;
> +
> + /*
> + * Since the allocation is done holding group->mark_mutex, the
> + * error event allocation is guaranteed not to race with itself.
If this is protected by a mutex then READ_ONCE/WRITE_ONCE are not need
and the comment above is confusing.
You should fire your code reviewer ;-)
> + */
> + if (READ_ONCE(sb_mark->error_event))
> + return 0;
> +
> + /* Since, for error events, every memory must be preallocated,
> + * the FH buffer size is predicted to be the same as the root
> + * inode file handler size. This should work for file systems
> + * without variable sized FH.
> + */
> + fh_len = fanotify_encode_fh_len(sb->s_root->d_inode);
> +
> + fee = kzalloc(sizeof(*fee) + fh_len, GFP_KERNEL);
GFP_KERNEL_ACCOUNT
> + if (!fee)
> + return -ENOMEM;
> +
> + fanotify_init_event(&fee->fae, 0, FS_ERROR);
> + fee->mark = fsn_mark;
> + fee->max_fh_len = fh_len;
> +
> + WRITE_ONCE(sb_mark->error_event, fee);
> +
> + return 0;
> +
> +}
> +
> static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
> __u32 mask,
> unsigned int flags)
> @@ -994,6 +1068,7 @@ static int fanotify_add_mark(struct fsnotify_group *group,
> {
> struct fsnotify_mark *fsn_mark;
> __u32 added;
> + int ret = 0;
>
> mutex_lock(&group->mark_mutex);
> fsn_mark = fsnotify_find_mark(connp, group);
> @@ -1004,13 +1079,28 @@ static int fanotify_add_mark(struct fsnotify_group *group,
> return PTR_ERR(fsn_mark);
> }
> }
> +
> + /*
> + * Error events are allocated per super-block mark, but only if
> + * strictly needed (i.e. FAN_FS_ERROR was requested).
> + */
> + if (type == FSNOTIFY_OBJ_TYPE_SB && !(flags & FAN_MARK_IGNORED_MASK) &&
> + (mask & FAN_FS_ERROR)) {
> + if (fanotify_create_fs_error_event(fsn_mark, connp)) {
> + ret = -ENOMEM;
> + goto out;
> + }
> + }
> +
> added = fanotify_mark_add_to_mask(fsn_mark, mask, flags);
> if (added & ~fsnotify_conn_mask(fsn_mark->connector))
> fsnotify_recalc_mask(fsn_mark->connector);
> +
> +out:
> mutex_unlock(&group->mark_mutex);
>
> fsnotify_put_mark(fsn_mark);
> - return 0;
> + return ret;
> }
>
> static int fanotify_add_vfsmount_mark(struct fsnotify_group *group,
> @@ -1427,6 +1517,11 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> fsid = &__fsid;
> }
>
> + if (mask & FAN_FS_ERROR && mark_type != FAN_MARK_FILESYSTEM) {
> + ret = -EINVAL;
> + goto path_put_and_out;
> + }
> +
> /* inode held in place by reference to path; group by fget on fd */
> if (mark_type == FAN_MARK_INODE)
> inode = path.dentry->d_inode;
> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> index a16dbeced152..d086a19aff63 100644
> --- a/include/linux/fanotify.h
> +++ b/include/linux/fanotify.h
> @@ -81,13 +81,17 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
> */
> #define FANOTIFY_DIRENT_EVENTS (FAN_MOVE | FAN_CREATE | FAN_DELETE)
>
> -/* Events that can only be reported with data type FSNOTIFY_EVENT_INODE */
> +#define FANOTIFY_ERROR_EVENTS (FAN_FS_ERROR)
> +
> +/* Events that can only be reported to groups that support FID mode */
Let's not do that.
How about the opposite:
/* Events that can be reported with event->fd */
#define FANOTIFY_FD_EVENTS (FANOTIFY_PATH_EVENTS | FANOTIFY_PERM_EVENTS)
/*
* Events that do not carry enough information to report event->fd
* require a group that supports reporting fid.
* Those events are not supported on a mount mark, because they do
* not carry enough information (i.e. path) to be filtered by
mount point.
*/
fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
if (!(mask & FANOTIFY_FD_EVENTS) &&
(!fid_mode || mark_type == FAN_MARK_MOUNT))
Thanks,
Amir.
I think my bug report was not clear... :/ The code looks like this:
sb = inode->i_sb;
if (inode) ...
The NULL check cannot be false because if "inode" is NULL we would have
already crashed when we dereference it on the line before.
In this case, based on last years discussion, the "inode" pointer can't
be NULL. The debate is only whether the unnecessary NULL checks help
readability or hurt readability.
> Why does it presume that event_info->dir is non-NULL?
That was my commentary, just from reading the code. Smatch says that
"event->dir" is unknown.
> Did smach check all the callers to fsnotify() or something?
The kbuild-bot doesn't build the cross function database but if you did
use the cross function database then, yes, it does track all the
callers. There are two pointers that we care about, the "inode" and
the parent inode (dir). Smatch can figure out when "inode" is NULL vs
non-NULL but where it gets stuck is on the some of the parent inodes
like this call from fsnotify_dirent():
fsnotify_name(dir, mask, d_inode(dentry), &dentry->d_name, 0);
^^^^^^^^^^^^^^^
Smatch doesn't know that d_inode() is always non-NULL at this point.
regards,
dan carpenter
On Wed, Jun 30, 2021 at 1:49 PM Dan Carpenter <[email protected]> wrote:
>
> I think my bug report was not clear... :/ The code looks like this:
>
> sb = inode->i_sb;
>
> if (inode) ...
>
> The NULL check cannot be false because if "inode" is NULL we would have
> already crashed when we dereference it on the line before.
>
> In this case, based on last years discussion, the "inode" pointer can't
> be NULL. The debate is only whether the unnecessary NULL checks help
> readability or hurt readability.
>
Right. Sorry, I forgot.
Anyway, patch 11/15 of the same series changes this code to:
sb = event_info->sb ?: inode->i_sb;
So inode can and will be NULL coming from the caller of
fsnotify_sb_error(sb, NULL).
I think that should make smach happy?
You can try to run it after patch 11/15 of this series.
Thanks,
Amir.
On Tue, Jun 29, 2021 at 10:13 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> The FAN_FS_ERROR event is a new inode event used by filesystem wide
> monitoring tools to receive notifications of type FS_ERROR_EVENT,
> emitted directly by filesystems when a problem is detected. The error
> notification includes a generic error descriptor and a FID identifying
> the file affected.
>
> FID is sent for every FAN_FS_ERROR. Errors not linked to a regular inode
> are reported against the root inode.
>
> An error reporting structure is attached per-mark, and only a single
> error can be stored at a time. This is ok, since once an error occurs,
> it is common for a stream of related errors to be reported. We only log
> accumulate the total of errors occurred since the last notification.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ---
> Changes since v2:
> - Support and equire FID mode (amir)
> - Goto error path instead of early return (amir)
> - Simplify get_one_event (me)
About that...
> - Base merging on error_count
> - drop fanotify_queue_error_event
[...]
> +static int fanotify_merge_error_event(struct fsnotify_group *group,
> + struct fsnotify_event *event)
> +{
> + struct fanotify_event *fae = FANOTIFY_E(event);
> +
> + /*
> + * When err_count > 0, the reporting slot is full. Just account
> + * the additional error and abort the insertion.
> + */
> + if (atomic_fetch_inc(&FANOTIFY_EE(fae)->err_count) != 0)
> + return 1;
It feels a bit unsafe to bump err_count to 1 in merge() and set the error
info in insert(). feels like looking for trouble.
Maybe atomic_inc_not_zero() would have been better,
but since I am going to argue against modifying err_count outside
the notification_lock, it does not need to be atomic at all.
> +
> + return 0;
> +}
> +
> +static void fanotify_insert_error_event(struct fsnotify_group *group,
> + struct fsnotify_event *event,
> + const void *data)
> +{
> + const struct fsnotify_event_info *ei =
> + (struct fsnotify_event_info *) data;
> + struct fanotify_event *fae = FANOTIFY_E(event);
> + const struct fs_error_report *report;
> + struct fanotify_error_event *fee;
> + struct inode *inode;
> + int fh_len;
> +
> + /* This might be an unexpected type of event (i.e. overflow). */
> + if (!fanotify_is_error_event(fae->mask))
> + return;
> +
> + report = (struct fs_error_report *) ei->data;
> + inode = report->inode ?: ei->sb->s_root->d_inode;
As I commented on the cover letter, I think it would be better to
encode a NULL-FID in case of NULL inode, e.g.:
static int fanotify_encode_null_fh(struct fanotify_fh *fh)
{
fh->type = FILEID_ROOT;
fh->len = 8;
fh->flags = 0;
memset(fh->buf, 0, 8);
}
But that API may be controversial, so wait for other voices
before changing that.
> +
> + fee = FANOTIFY_EE(fae);
> + fee->fae.type = FANOTIFY_EVENT_TYPE_FS_ERROR;
> + fee->error = report->error;
> + fee->fsid = fee->mark->connector->fsid;
> +
> + fsnotify_get_mark(fee->mark);
> +
> + /*
> + * Error reporting needs to happen in atomic context. If this
> + * inode's file handler is more than we initially predicted,
> + * there is nothing better we can do than report the error with
> + * a bad FH.
> + */
> + fh_len = fanotify_encode_fh_len(inode);
> + if (WARN_ON(fh_len > fee->max_fh_len))
> + return;
> +
> + fanotify_encode_fh(&fee->object_fh, inode, fh_len, NULL, 0);
> +}
> +
[...]
> +static size_t copy_error_info_to_user(struct fanotify_event *event,
> + char __user *buf, int count)
> +{
> + struct fanotify_event_info_error info;
> + struct fanotify_error_event *fee = FANOTIFY_EE(event);
> +
> + info.hdr.info_type = FAN_EVENT_INFO_TYPE_ERROR;
> + info.hdr.pad = 0;
> + info.hdr.len = sizeof(struct fanotify_event_info_error);
> +
> + if (WARN_ON(count < info.hdr.len))
> + return -EFAULT;
> +
> + info.error = fee->error;
> +
> + /* This effectively releases the event for logging another error */
> + info.error_count = atomic_xchg(&fee->err_count, 0);
> +
it's a shame because the code is really simpler, but
I am afraid it may not be correct, even without mentioning breaking
the queue abstraction.
While the read/insert of err_count is "atomic", read of FID is done
after this point, not to mention that read of fee->error could be reordered
without barriers. FID and error could be set by insert event after reading
fee->err_count.
You can either go back to copying the error report to stack on dequeue
(though it was a bit ugly) under the group notification_lock
or you can allocate a new initialized error event on dequeue and
exchange it with the mark's event, without having to change the
calling semantics of get_one_event(), e.g.:
fsnotify_remove_first_event(group);
if (fanotify_is_perm_event(event->mask))
FANOTIFY_PERM(event)->state = FAN_EVENT_REPORTED;
if (fanotify_is_hashed_event(event->mask))
fanotify_unhash_event(group, event);
+ if (fanotify_is_error_event(event->mask))
+ event = fanotify_recreate_fs_error_event(event);
If allocation fails, this helper returns ERR_PTR(-EINVAL)
and resets the info and err_count of the event in the mark.
Thanks,
Amir.
Amir Goldstein <[email protected]> writes:
>> + fee->fsid = fee->mark->connector->fsid;
>> +
>> + fsnotify_get_mark(fee->mark);
>> +
>> + /*
>> + * Error reporting needs to happen in atomic context. If this
>> + * inode's file handler is more than we initially predicted,
>> + * there is nothing better we can do than report the error with
>> + * a bad FH.
>> + */
>> + fh_len = fanotify_encode_fh_len(inode);
>> + if (WARN_ON(fh_len > fee->max_fh_len))
>
> WARN_ON() is not acceptable for things that can logically happen
> if you think this is important you could use pr_warn_ratelimited()
> like we do in fanotify_encode_fh(),
> but since fs-monitor will observe the lack of FID anyway, I think
> there is little point in reporting this to kmsg.
Hi Amir,
Thanks for all the review so far.
Consider that fh_len > max_fh_len can happen only if the filesystem
requires a longer handler for the failed inode than it requires for the
root inode. Looking at the FH types, I don't think this would be
possible to happen currently, but this WARN_ON is trying to catch future
problems.
Notice this would not be a fs-monitor misuse of the uAPI, but an actual
kernel bug. The FH size we predicted when allocating the static error
slot is not large enough for at least one FH of this filesystem. So I
think a WARN_ON or a pr_warn is desired. I will change it to a
pr_warn_ratelimited as you suggested.
>> @@ -896,6 +933,43 @@ static int fanotify_remove_inode_mark(struct fsnotify_group *group,
>> flags, umask);
>> }
>>
>> +static int fanotify_create_fs_error_event(struct fsnotify_mark *fsn_mark,
>> + fsnotify_connp_t *connp)
>> +{
>> + struct fanotify_sb_mark *sb_mark = FANOTIFY_SB_MARK(fsn_mark);
>> + struct super_block *sb =
>> + container_of(connp, struct super_block, s_fsnotify_marks);
>> + struct fanotify_error_event *fee;
>> + int fh_len;
>> +
>> + /*
>> + * Since the allocation is done holding group->mark_mutex, the
>> + * error event allocation is guaranteed not to race with itself.
>
> If this is protected by a mutex then READ_ONCE/WRITE_ONCE are not need
> and the comment above is confusing.
> You should fire your code reviewer ;-)
okay :)
>
>> + */
>> + if (READ_ONCE(sb_mark->error_event))
>> + return 0;
>> +
>> + /* Since, for error events, every memory must be preallocated,
>> + * the FH buffer size is predicted to be the same as the root
>> + * inode file handler size. This should work for file systems
>> + * without variable sized FH.
>> + */
>> + fh_len = fanotify_encode_fh_len(sb->s_root->d_inode);
>> +
>> + fee = kzalloc(sizeof(*fee) + fh_len, GFP_KERNEL);
>
> GFP_KERNEL_ACCOUNT
will do.
>> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
>> index a16dbeced152..d086a19aff63 100644
>> --- a/include/linux/fanotify.h
>> +++ b/include/linux/fanotify.h
>> @@ -81,13 +81,17 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
>> */
>> #define FANOTIFY_DIRENT_EVENTS (FAN_MOVE | FAN_CREATE | FAN_DELETE)
>>
>> -/* Events that can only be reported with data type FSNOTIFY_EVENT_INODE */
>> +#define FANOTIFY_ERROR_EVENTS (FAN_FS_ERROR)
>> +
>> +/* Events that can only be reported to groups that support FID mode */
>
> Let's not do that.
> How about the opposite:
>
> /* Events that can be reported with event->fd */
> #define FANOTIFY_FD_EVENTS (FANOTIFY_PATH_EVENTS | FANOTIFY_PERM_EVENTS)
>
> /*
> * Events that do not carry enough information to report event->fd
> * require a group that supports reporting fid.
> * Those events are not supported on a mount mark, because they do
> * not carry enough information (i.e. path) to be filtered by
> mount point.
> */
> fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> if (!(mask & FANOTIFY_FD_EVENTS) &&
> (!fid_mode || mark_type == FAN_MARK_MOUNT))
>
Will do.
Thanks,
--
Gabriel Krisman Bertazi
On Wed, Jun 30, 2021 at 8:43 PM Gabriel Krisman Bertazi
<[email protected]> wrote:
>
> Amir Goldstein <[email protected]> writes:
>
> >> + fee->fsid = fee->mark->connector->fsid;
> >> +
> >> + fsnotify_get_mark(fee->mark);
> >> +
> >> + /*
> >> + * Error reporting needs to happen in atomic context. If this
> >> + * inode's file handler is more than we initially predicted,
> >> + * there is nothing better we can do than report the error with
> >> + * a bad FH.
> >> + */
> >> + fh_len = fanotify_encode_fh_len(inode);
> >> + if (WARN_ON(fh_len > fee->max_fh_len))
> >
> > WARN_ON() is not acceptable for things that can logically happen
> > if you think this is important you could use pr_warn_ratelimited()
> > like we do in fanotify_encode_fh(),
> > but since fs-monitor will observe the lack of FID anyway, I think
> > there is little point in reporting this to kmsg.
>
> Hi Amir,
>
> Thanks for all the review so far.
>
> Consider that fh_len > max_fh_len can happen only if the filesystem
> requires a longer handler for the failed inode than it requires for the
> root inode. Looking at the FH types, I don't think this would be
> possible to happen currently, but this WARN_ON is trying to catch future
> problems.
>
Don't get confused by FH types. A filesystem is not obliged to
return a uniform and single handle_type nor uniform handle_size.
Overlayfs FH size depends on the FH size of the fs in the layer
the file is on, which may be different for different files.
> Notice this would not be a fs-monitor misuse of the uAPI, but an actual
> kernel bug. The FH size we predicted when allocating the static error
> slot is not large enough for at least one FH of this filesystem. So I
> think a WARN_ON or a pr_warn is desired. I will change it to a
> pr_warn_ratelimited as you suggested.
>
It would be a very minor kernel bug.
It would mean that there is a filesystem that matters in practice
for error reporting with different sizes of FH which you did not
take into account.
There is also a solution, but I think it is an overkill -
If you follow my suggestion to recreate the mark error event
on dequeue, you can update max_fh_len and re-created the
next event with larger buffer size.
In that case, admin will only see a few pr_warn_ratelimited()
messages until fs-monitors reads the overflowed error event.
Also, I think it would be wise to use the NULL-FID convention
with different handle_types to report the different cases of:
- Failed encode (FILEID_INVALID)
- No inode (FILEID_ROOT)
Also, better use FANOTIFY_INLINE_FH_LEN as mimimum
for error event buffer size.
Thanks,
Amir.
On Tue 29-06-21 15:10:22, Gabriel Krisman Bertazi wrote:
> Every time this function is invoked, it is immediately added to
> FAN_EVENT_METADATA_LEN, since there is no need to just calculate the
> length of info records. This minor clean up folds the rest of the
> calculation into the function, which now operates in terms of events,
> returning the size of the entire event, including metadata.
>
> Reviewed-by: Amir Goldstein <[email protected]>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
>
> ---
> Changes since v1:
> - rebased on top of hashing patches
> ---
> fs/notify/fanotify/fanotify_user.c | 33 +++++++++++++++++-------------
> 1 file changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 64864fb40b40..68a53d3534f8 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -117,17 +117,24 @@ static int fanotify_fid_info_len(int fh_len, int name_len)
> return roundup(FANOTIFY_INFO_HDR_LEN + info_len, FANOTIFY_EVENT_ALIGN);
> }
>
> -static int fanotify_event_info_len(unsigned int fid_mode,
> - struct fanotify_event *event)
> +static size_t fanotify_event_len(struct fanotify_event *event,
> + unsigned int fid_mode)
> {
> - struct fanotify_info *info = fanotify_event_info(event);
> - int dir_fh_len = fanotify_event_dir_fh_len(event);
> - int fh_len = fanotify_event_object_fh_len(event);
> - int info_len = 0;
> + size_t event_len = FAN_EVENT_METADATA_LEN;
> + struct fanotify_info *info;
> + int dir_fh_len;
> + int fh_len;
> int dot_len = 0;
>
> + if (!fid_mode)
> + return event_len;
> +
> + info = fanotify_event_info(event);
> + dir_fh_len = fanotify_event_dir_fh_len(event);
> + fh_len = fanotify_event_object_fh_len(event);
> +
> if (dir_fh_len) {
> - info_len += fanotify_fid_info_len(dir_fh_len, info->name_len);
> + event_len += fanotify_fid_info_len(dir_fh_len, info->name_len);
> } else if ((fid_mode & FAN_REPORT_NAME) && (event->mask & FAN_ONDIR)) {
> /*
> * With group flag FAN_REPORT_NAME, if name was not recorded in
> @@ -137,9 +144,9 @@ static int fanotify_event_info_len(unsigned int fid_mode,
> }
>
> if (fh_len)
> - info_len += fanotify_fid_info_len(fh_len, dot_len);
> + event_len += fanotify_fid_info_len(fh_len, dot_len);
>
> - return info_len;
> + return event_len;
> }
>
> /*
> @@ -168,7 +175,7 @@ static void fanotify_unhash_event(struct fsnotify_group *group,
> static struct fanotify_event *get_one_event(struct fsnotify_group *group,
> size_t count)
> {
> - size_t event_size = FAN_EVENT_METADATA_LEN;
> + size_t event_size;
> struct fanotify_event *event = NULL;
> struct fsnotify_event *fsn_event;
> unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
> @@ -181,8 +188,7 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
> goto out;
>
> event = FANOTIFY_E(fsn_event);
> - if (fid_mode)
> - event_size += fanotify_event_info_len(fid_mode, event);
> + event_size = fanotify_event_len(event, fid_mode);
>
> if (event_size > count) {
> event = ERR_PTR(-EINVAL);
> @@ -412,8 +418,7 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>
> pr_debug("%s: group=%p event=%p\n", __func__, group, event);
>
> - metadata.event_len = FAN_EVENT_METADATA_LEN +
> - fanotify_event_info_len(fid_mode, event);
> + metadata.event_len = fanotify_event_len(event, fid_mode);
> metadata.metadata_len = FAN_EVENT_METADATA_LEN;
> metadata.vers = FANOTIFY_METADATA_VERSION;
> metadata.reserved = 0;
> --
> 2.32.0
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue 29-06-21 15:10:21, Gabriel Krisman Bertazi wrote:
> Some events, like the overflow event, are not mergeable, so they are not
> hashed. But, when failing inside fsnotify_add_event for lack of space,
> fsnotify_add_event() still calls the insert hook, which adds the
> overflow event to the merge list. Add a check to prevent any kind of
> unmergeable event to be inserted in the hashtable.
>
> Fixes: 94e00d28a680 ("fsnotify: use hash table for faster events merge")
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
>
> ---
> Changes since v2:
> - Do check for hashed events inside the insert hook (Amir)
> ---
> fs/notify/fanotify/fanotify.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 057abd2cf887..310246f8d3f1 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -702,6 +702,9 @@ static void fanotify_insert_event(struct fsnotify_group *group,
>
> assert_spin_locked(&group->notification_lock);
>
> + if (!fanotify_is_hashed_event(event->mask))
> + return;
> +
> pr_debug("%s: group=%p event=%p bucket=%u\n", __func__,
> group, event, bucket);
>
> @@ -779,8 +782,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>
> fsn_event = &event->fse;
> ret = fsnotify_add_event(group, fsn_event, fanotify_merge,
> - fanotify_is_hashed_event(mask) ?
> - fanotify_insert_event : NULL);
> + fanotify_insert_event);
> if (ret) {
> /* Permission events shouldn't be merged */
> BUG_ON(ret == 1 && mask & FANOTIFY_PERM_EVENTS);
> --
> 2.32.0
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue 29-06-21 15:10:23, Gabriel Krisman Bertazi wrote:
> FAN_FS_ERROR will require fsid, but not necessarily require the
> filesystem to expose a file handle. Split those checks into different
> functions, so they can be used separately when setting up an event.
>
> While there, update a comment about tmpfs having 0 fsid, which is no
> longer true.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
>
> ---
> Changes since v2:
> - FAN_ERROR -> FAN_FS_ERROR (Amir)
> - Update comment (Amir)
>
> Changes since v1:
> (Amir)
> - Sort hunks to simplify diff.
> Changes since RFC:
> (Amir)
> - Rename fanotify_check_path_fsid -> fanotify_test_fsid.
> - Use dentry directly instead of path.
> ---
> fs/notify/fanotify/fanotify_user.c | 27 ++++++++++++++++++---------
> 1 file changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 68a53d3534f8..67b18dfe0025 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -1192,16 +1192,15 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
> return fd;
> }
>
> -/* Check if filesystem can encode a unique fid */
> -static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
> +static int fanotify_test_fsid(struct dentry *dentry, __kernel_fsid_t *fsid)
> {
> __kernel_fsid_t root_fsid;
> int err;
>
> /*
> - * Make sure path is not in filesystem with zero fsid (e.g. tmpfs).
> + * Make sure dentry is not of a filesystem with zero fsid (e.g. fuse).
> */
> - err = vfs_get_fsid(path->dentry, fsid);
> + err = vfs_get_fsid(dentry, fsid);
> if (err)
> return err;
>
> @@ -1209,10 +1208,10 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
> return -ENODEV;
>
> /*
> - * Make sure path is not inside a filesystem subvolume (e.g. btrfs)
> + * Make sure dentry is not of a filesystem subvolume (e.g. btrfs)
> * which uses a different fsid than sb root.
> */
> - err = vfs_get_fsid(path->dentry->d_sb->s_root, &root_fsid);
> + err = vfs_get_fsid(dentry->d_sb->s_root, &root_fsid);
> if (err)
> return err;
>
> @@ -1220,6 +1219,12 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
> root_fsid.val[1] != fsid->val[1])
> return -EXDEV;
>
> + return 0;
> +}
> +
> +/* Check if filesystem can encode a unique fid */
> +static int fanotify_test_fid(struct dentry *dentry)
> +{
> /*
> * We need to make sure that the file system supports at least
> * encoding a file handle so user can use name_to_handle_at() to
> @@ -1227,8 +1232,8 @@ static int fanotify_test_fid(struct path *path, __kernel_fsid_t *fsid)
> * objects. However, name_to_handle_at() requires that the
> * filesystem also supports decoding file handles.
> */
> - if (!path->dentry->d_sb->s_export_op ||
> - !path->dentry->d_sb->s_export_op->fh_to_dentry)
> + if (!dentry->d_sb->s_export_op ||
> + !dentry->d_sb->s_export_op->fh_to_dentry)
> return -EOPNOTSUPP;
>
> return 0;
> @@ -1379,7 +1384,11 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> }
>
> if (fid_mode) {
> - ret = fanotify_test_fid(&path, &__fsid);
> + ret = fanotify_test_fsid(path.dentry, &__fsid);
> + if (ret)
> + goto path_put_and_out;
> +
> + ret = fanotify_test_fid(path.dentry);
> if (ret)
> goto path_put_and_out;
>
> --
> 2.32.0
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue 29-06-21 15:10:25, Gabriel Krisman Bertazi wrote:
> According to Amir:
>
> "FS_IN_IGNORED is completely internal to inotify and there is no need
> to set it in i_fsnotify_mask at all, so if we remove the bit from the
> output of inotify_arg_to_mask() no functionality will change and we will
> be able to overload the event bit for FS_ERROR."
>
> This is done in preparation to overload FS_ERROR with the notification
> mechanism in fanotify.
>
> Suggested-by: Amir Goldstein <[email protected]>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> Reviewed-by: Amir Goldstein <[email protected]>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
> ---
> fs/notify/inotify/inotify_user.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
> index 98f61b31745a..4d17be6dd58d 100644
> --- a/fs/notify/inotify/inotify_user.c
> +++ b/fs/notify/inotify/inotify_user.c
> @@ -89,10 +89,10 @@ static inline __u32 inotify_arg_to_mask(struct inode *inode, u32 arg)
> __u32 mask;
>
> /*
> - * Everything should accept their own ignored and should receive events
> - * when the inode is unmounted. All directories care about children.
> + * Everything should receive events when the inode is unmounted.
> + * All directories care about children.
> */
> - mask = (FS_IN_IGNORED | FS_UNMOUNT);
> + mask = (FS_UNMOUNT);
> if (S_ISDIR(inode->i_mode))
> mask |= FS_EVENT_ON_CHILD;
>
> --
> 2.32.0
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue 29-06-21 15:10:24, Gabriel Krisman Bertazi wrote:
> FAN_ERROR will require an error structure to be stored per mark. But,
> since FAN_ERROR doesn't apply to inode/mount marks, it should suffice to
> only expose this information for superblock marks. Therefore, wrap this
> kind of marks into a container and plumb it for the future.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
...
> -static void fanotify_free_mark(struct fsnotify_mark *fsn_mark)
> +static void fanotify_free_mark(struct fsnotify_mark *mark)
> {
> - kmem_cache_free(fanotify_mark_cache, fsn_mark);
> + if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
> + struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);
> +
> + kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
> + } else {
> + kmem_cache_free(fanotify_mark_cache, mark);
> + }
> }
Frankly, I find using mark->flags for fanotify internal distinction of mark
type somewhat ugly. Even more so because fsnotify_put_mark() could infer
the mark type information from mark->conn->type and pass it to the freeing
function. But the passing would be somewhat non-trivial so probably we can
leave that for some other day. But the fact that FSNOTIFY_MARK_FLAG_SB is
set inside fsnotify-backend specific code is a landmine waiting just for
another backend to start supporting sb marks, not set
FSNOTIFY_MARK_FLAG_SB, and some generic code depend on checking
FSNOTIFY_MARK_FLAG_SB instead of mark->conn->type.
So I see two sensible solutions:
a) Just admit this is fanotify private flag, carve out some flags from
mark->flags as backend private and have FANOTIFY_MARK_FLAG_SB in that space
(e.g. look how include/linux/buffer_head.h has flags upto BH_PrivateStart,
then e.g. include/linux/jbd2.h starts its flags from BH_PrivateStart
further).
b) Make a rule that mark connector type is also stored in mark->flags. We
have plenty of space there so why not. Then fsnotify_add_mark_locked() has
to store type into the flags.
Pick your poison :)
Honza
> const struct fsnotify_ops fanotify_fsnotify_ops = {
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index 4a5e555dc3d2..fd125a949187 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -6,6 +6,7 @@
> #include <linux/hashtable.h>
>
> extern struct kmem_cache *fanotify_mark_cache;
> +extern struct kmem_cache *fanotify_sb_mark_cache;
> extern struct kmem_cache *fanotify_fid_event_cachep;
> extern struct kmem_cache *fanotify_path_event_cachep;
> extern struct kmem_cache *fanotify_perm_event_cachep;
> @@ -129,6 +130,18 @@ static inline void fanotify_info_copy_name(struct fanotify_info *info,
> name->name);
> }
>
> +struct fanotify_sb_mark {
> + struct fsnotify_mark fsn_mark;
> +};
> +
> +static inline
> +struct fanotify_sb_mark *FANOTIFY_SB_MARK(struct fsnotify_mark *mark)
> +{
> + WARN_ON(!(mark->flags & FSNOTIFY_MARK_FLAG_SB));
> +
> + return container_of(mark, struct fanotify_sb_mark, fsn_mark);
> +}
> +
> /*
> * Common structure for fanotify events. Concrete structs are allocated in
> * fanotify_handle_event() and freed when the information is retrieved by
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 67b18dfe0025..a42521e140e6 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -99,6 +99,7 @@ struct ctl_table fanotify_table[] = {
> extern const struct fsnotify_ops fanotify_fsnotify_ops;
>
> struct kmem_cache *fanotify_mark_cache __read_mostly;
> +struct kmem_cache *fanotify_sb_mark_cache __read_mostly;
> struct kmem_cache *fanotify_fid_event_cachep __read_mostly;
> struct kmem_cache *fanotify_path_event_cachep __read_mostly;
> struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
> @@ -915,6 +916,38 @@ static __u32 fanotify_mark_add_to_mask(struct fsnotify_mark *fsn_mark,
> return mask & ~oldmask;
> }
>
> +static struct fsnotify_mark *fanotify_alloc_mark(struct fsnotify_group *group,
> + unsigned int type)
> +{
> + struct fanotify_sb_mark *sb_mark;
> + struct fsnotify_mark *mark;
> +
> + switch (type) {
> + case FSNOTIFY_OBJ_TYPE_SB:
> + sb_mark = kmem_cache_zalloc(fanotify_sb_mark_cache, GFP_KERNEL);
> + if (!sb_mark)
> + return NULL;
> + mark = &sb_mark->fsn_mark;
> + break;
> +
> + case FSNOTIFY_OBJ_TYPE_INODE:
> + case FSNOTIFY_OBJ_TYPE_PARENT:
> + case FSNOTIFY_OBJ_TYPE_VFSMOUNT:
> + mark = kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
> + break;
> + default:
> + WARN_ON(1);
> + return NULL;
> + }
> +
> + fsnotify_init_mark(mark, group);
> +
> + if (type == FSNOTIFY_OBJ_TYPE_SB)
> + mark->flags |= FSNOTIFY_MARK_FLAG_SB;
> +
> + return mark;
> +}
> +
> static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
> fsnotify_connp_t *connp,
> unsigned int type,
> @@ -933,13 +966,12 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
> !inc_ucount(ucounts->ns, ucounts->uid, UCOUNT_FANOTIFY_MARKS))
> return ERR_PTR(-ENOSPC);
>
> - mark = kmem_cache_alloc(fanotify_mark_cache, GFP_KERNEL);
> + mark = fanotify_alloc_mark(group, type);
> if (!mark) {
> ret = -ENOMEM;
> goto out_dec_ucounts;
> }
>
> - fsnotify_init_mark(mark, group);
> ret = fsnotify_add_mark_locked(mark, connp, type, 0, fsid);
> if (ret) {
> fsnotify_put_mark(mark);
> @@ -1497,6 +1529,8 @@ static int __init fanotify_user_setup(void)
>
> fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
> SLAB_PANIC|SLAB_ACCOUNT);
> + fanotify_sb_mark_cache = KMEM_CACHE(fanotify_sb_mark,
> + SLAB_PANIC|SLAB_ACCOUNT);
> fanotify_fid_event_cachep = KMEM_CACHE(fanotify_fid_event,
> SLAB_PANIC);
> fanotify_path_event_cachep = KMEM_CACHE(fanotify_path_event,
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index 1ce66748a2d2..c4473b467c28 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -401,6 +401,7 @@ struct fsnotify_mark {
> #define FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY 0x01
> #define FSNOTIFY_MARK_FLAG_ALIVE 0x02
> #define FSNOTIFY_MARK_FLAG_ATTACHED 0x04
> +#define FSNOTIFY_MARK_FLAG_SB 0x08
> unsigned int flags; /* flags [mark->lock] */
> };
>
> --
> 2.32.0
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue 29-06-21 15:10:26, Gabriel Krisman Bertazi wrote:
> Similarly to fanotify_is_perm_event and friends, provide a helper
> predicate to say whether a mask is of an overflow event.
>
> Suggested-by: Amir Goldstein <[email protected]>
> Reviewed-by: Amir Goldstein <[email protected]>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> fs/notify/fanotify/fanotify.h | 3 ++-
> include/linux/fsnotify_backend.h | 5 +++++
> 2 files changed, 7 insertions(+), 1 deletion(-)
Looks good. Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
>
> diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
> index fd125a949187..2e005b3a75f2 100644
> --- a/fs/notify/fanotify/fanotify.h
> +++ b/fs/notify/fanotify/fanotify.h
> @@ -328,7 +328,8 @@ static inline struct path *fanotify_event_path(struct fanotify_event *event)
> */
> static inline bool fanotify_is_hashed_event(u32 mask)
> {
> - return !fanotify_is_perm_event(mask) && !(mask & FS_Q_OVERFLOW);
> + return !(fanotify_is_perm_event(mask) ||
> + fsnotify_is_overflow_event(mask));
> }
>
> static inline unsigned int fanotify_event_hash_bucket(
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index c4473b467c28..f9e2c6cd0f7d 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -495,6 +495,11 @@ static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
> fsnotify_add_event(group, group->overflow_event, NULL, NULL);
> }
>
> +static inline bool fsnotify_is_overflow_event(u32 mask)
> +{
> + return mask & FS_Q_OVERFLOW;
> +}
> +
> static inline bool fsnotify_notify_queue_is_empty(struct fsnotify_group *group)
> {
> assert_spin_locked(&group->notification_lock);
> --
> 2.32.0
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Wed, Jul 7, 2021 at 11:13 PM Jan Kara <[email protected]> wrote:
>
> On Tue 29-06-21 15:10:24, Gabriel Krisman Bertazi wrote:
> > FAN_ERROR will require an error structure to be stored per mark. But,
> > since FAN_ERROR doesn't apply to inode/mount marks, it should suffice to
> > only expose this information for superblock marks. Therefore, wrap this
> > kind of marks into a container and plumb it for the future.
> >
> > Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
>
> ...
>
> > -static void fanotify_free_mark(struct fsnotify_mark *fsn_mark)
> > +static void fanotify_free_mark(struct fsnotify_mark *mark)
> > {
> > - kmem_cache_free(fanotify_mark_cache, fsn_mark);
> > + if (mark->flags & FSNOTIFY_MARK_FLAG_SB) {
> > + struct fanotify_sb_mark *fa_mark = FANOTIFY_SB_MARK(mark);
> > +
> > + kmem_cache_free(fanotify_sb_mark_cache, fa_mark);
> > + } else {
> > + kmem_cache_free(fanotify_mark_cache, mark);
> > + }
> > }
>
> Frankly, I find using mark->flags for fanotify internal distinction of mark
> type somewhat ugly. Even more so because fsnotify_put_mark() could infer
> the mark type information from mark->conn->type and pass it to the freeing
> function. But the passing would be somewhat non-trivial so probably we can
> leave that for some other day. But the fact that FSNOTIFY_MARK_FLAG_SB is
> set inside fsnotify-backend specific code is a landmine waiting just for
> another backend to start supporting sb marks, not set
> FSNOTIFY_MARK_FLAG_SB, and some generic code depend on checking
> FSNOTIFY_MARK_FLAG_SB instead of mark->conn->type.
>
> So I see two sensible solutions:
>
> a) Just admit this is fanotify private flag, carve out some flags from
> mark->flags as backend private and have FANOTIFY_MARK_FLAG_SB in that space
> (e.g. look how include/linux/buffer_head.h has flags upto BH_PrivateStart,
> then e.g. include/linux/jbd2.h starts its flags from BH_PrivateStart
> further).
>
> b) Make a rule that mark connector type is also stored in mark->flags. We
> have plenty of space there so why not. Then fsnotify_add_mark_locked() has
> to store type into the flags.
>
> Pick your poison :)
I find option a) more flexible for future expansion.
This way fanotify could potentially allocate "normal" sb marks
(e.g. if not FAN_REPORT_FID) and "private" sb marks otherwise
(not that I recommend it - this was just an example)
Also, it is far less fsnotify code changes:
#define FSNOTIFY_MARK_FLAG_ATTACHED 0x04
+/* Backend private flags */
+#define FSNOTIFY_MARK_FLAG_PRIVATE(x) (0x100 + (x))
[...]
+#define FANOTIFY_MARK_FLAG_SB FSNOTIFY_MARK_FLAG_PRIVATE(0x01)
+
+struct fanotify_sb_mark {
+ struct fsnotify_mark fsn_mark;
+};
+
Thanks,
Amir.
On Tue 29-06-21 15:10:27, Gabriel Krisman Bertazi wrote:
> From: Amir Goldstein <[email protected]>
>
> There are a lot of arguments to fsnotify() and the handle_event() method.
> Pass them in a const struct instead of on the argument list.
>
> Apart from being more tidy, this helps with passing error reports to the
> backend. __fsnotify_parent() argument list was intentionally left
> untouched, because its argument list is still short enough and because
> most of the event info arguments are initialized inside
> __fsnotify_parent().
>
> Signed-off-by: Amir Goldstein <[email protected]>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> ---
> fs/notify/fanotify/fanotify.c | 59 +++++++++++------------
> fs/notify/fsnotify.c | 83 +++++++++++++++++---------------
> include/linux/fsnotify.h | 15 ++++--
> include/linux/fsnotify_backend.h | 73 +++++++++++++++++++++-------
> 4 files changed, 140 insertions(+), 90 deletions(-)
Besides the noop function issue Amir has already pointed out I have just a
few nits:
> @@ -229,7 +229,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> }
>
> notify:
> - ret = fsnotify(mask, data, data_type, p_inode, file_name, inode, 0);
> + ret = __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = data, .data_type = data_type,
> + .dir = p_inode, .name = file_name,
> + .inode = inode,
> + });
What's the advantage of using __fsnotify() here instead of fsnotify()? In
terms of readability the fewer places with these initializers the better
I'd say...
> static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
> - const void *data, int data_type,
> - struct inode *dir, const struct qstr *name,
> - u32 cookie, struct fsnotify_iter_info *iter_info)
> + const struct fsnotify_event_info *event_info,
> + struct fsnotify_iter_info *iter_info)
> {
> struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
> struct fsnotify_mark *parent_mark = fsnotify_iter_parent_mark(iter_info);
> + struct fsnotify_event_info child_event_info = { };
> int ret;
No need to init child_event_info. It is fully rewritten if it gets used...
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index f8acddcf54fb..8c2c681b4495 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -30,7 +30,10 @@ static inline void fsnotify_name(struct inode *dir, __u32 mask,
> struct inode *child,
> const struct qstr *name, u32 cookie)
> {
> - fsnotify(mask, child, FSNOTIFY_EVENT_INODE, dir, name, NULL, cookie);
> + __fsnotify(mask, &(struct fsnotify_event_info) {
> + .data = child, .data_type = FSNOTIFY_EVENT_INODE,
> + .dir = dir, .name = name, .cookie = cookie,
> + });
> }
Hmm, maybe we could have a macro initializer like:
#define FSNOTIFY_EVENT_INFO(data, data_type, dir, name, inode, cookie) \
(struct fsnotify_event_info) { \
.data = (data), .data_type = (data_type), .dir = (dir), \
.name = (name), .inode = (inode), .cookie = (cookie)}
Then we'd have:
__fsnotify(mask, &FSNOTIFY_EVENT_INFO(child, FSNOTIFY_EVENT_INODE,
dir, name, NULL, cookie));
Which looks a bit nicer to me. What do you think guys?
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue 29-06-21 15:10:28, Gabriel Krisman Bertazi wrote:
> FAN_FS_ERROR requires some initialization to happen from inside the
> insert hook. This allows a user of fanotify_add_event to pass an
> argument to be sent to the insert callback.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
> ---
> fs/notify/fanotify/fanotify.c | 5 +++--
> fs/notify/inotify/inotify_fsnotify.c | 2 +-
> fs/notify/notification.c | 6 ++++--
> include/linux/fsnotify_backend.h | 7 +++++--
> 4 files changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 4f2febb15e94..aba06b84da91 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -695,7 +695,8 @@ static __kernel_fsid_t fanotify_get_fsid(struct fsnotify_iter_info *iter_info)
> * Add an event to hash table for faster merge.
> */
> static void fanotify_insert_event(struct fsnotify_group *group,
> - struct fsnotify_event *fsn_event)
> + struct fsnotify_event *fsn_event,
> + const void *data)
> {
> struct fanotify_event *event = FANOTIFY_E(fsn_event);
> unsigned int bucket = fanotify_event_hash_bucket(group, event);
> @@ -779,7 +780,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
>
> fsn_event = &event->fse;
> ret = fsnotify_add_event(group, fsn_event, fanotify_merge,
> - fanotify_insert_event);
> + fanotify_insert_event, NULL);
> if (ret) {
> /* Permission events shouldn't be merged */
> BUG_ON(ret == 1 && mask & FANOTIFY_PERM_EVENTS);
> diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
> index d1a64daa0171..a003a64ff8ee 100644
> --- a/fs/notify/inotify/inotify_fsnotify.c
> +++ b/fs/notify/inotify/inotify_fsnotify.c
> @@ -116,7 +116,7 @@ int inotify_handle_inode_event(struct fsnotify_mark *inode_mark, u32 mask,
> if (len)
> strcpy(event->name, name->name);
>
> - ret = fsnotify_add_event(group, fsn_event, inotify_merge, NULL);
> + ret = fsnotify_add_event(group, fsn_event, inotify_merge, NULL, NULL);
> if (ret) {
> /* Our event wasn't used in the end. Free it. */
> fsnotify_destroy_event(group, fsn_event);
> diff --git a/fs/notify/notification.c b/fs/notify/notification.c
> index 32f45543b9c6..0d9ba592d725 100644
> --- a/fs/notify/notification.c
> +++ b/fs/notify/notification.c
> @@ -83,7 +83,9 @@ int fsnotify_add_event(struct fsnotify_group *group,
> int (*merge)(struct fsnotify_group *,
> struct fsnotify_event *),
> void (*insert)(struct fsnotify_group *,
> - struct fsnotify_event *))
> + struct fsnotify_event *,
> + const void *),
> + const void *insert_data)
> {
> int ret = 0;
> struct list_head *list = &group->notification_list;
> @@ -121,7 +123,7 @@ int fsnotify_add_event(struct fsnotify_group *group,
> group->q_len++;
> list_add_tail(&event->list, list);
> if (insert)
> - insert(group, event);
> + insert(group, event, insert_data);
> spin_unlock(&group->notification_lock);
>
> wake_up(&group->notification_waitq);
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index b1590f654ade..8222fe12a6c9 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -526,11 +526,14 @@ extern int fsnotify_add_event(struct fsnotify_group *group,
> int (*merge)(struct fsnotify_group *,
> struct fsnotify_event *),
> void (*insert)(struct fsnotify_group *,
> - struct fsnotify_event *));
> + struct fsnotify_event *,
> + const void *),
> + const void *insert_data);
> +
> /* Queue overflow event to a notification group */
> static inline void fsnotify_queue_overflow(struct fsnotify_group *group)
> {
> - fsnotify_add_event(group, group->overflow_event, NULL, NULL);
> + fsnotify_add_event(group, group->overflow_event, NULL, NULL, NULL);
> }
>
> static inline bool fsnotify_is_overflow_event(u32 mask)
> --
> 2.32.0
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue 29-06-21 15:10:30, Gabriel Krisman Bertazi wrote:
> Expose a new type of fsnotify event for filesystems to report errors for
> userspace monitoring tools. fanotify will send this type of
> notification for FAN_FS_ERROR events.
>
> Reviewed-by: Amir Goldstein <[email protected]>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Looks sensible. Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
>
> ---
> Changes since v2:
> - FAN_ERROR->FAN_FS_ERROR (Amir)
>
> Changes since v1:
> - Overload FS_ERROR with FS_IN_IGNORED
> - Implement support for this type on fsnotify_data_inode (Amir)
> ---
> include/linux/fsnotify_backend.h | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index 8222fe12a6c9..ea5f5c7cc381 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -42,6 +42,12 @@
>
> #define FS_UNMOUNT 0x00002000 /* inode on umount fs */
> #define FS_Q_OVERFLOW 0x00004000 /* Event queued overflowed */
> +#define FS_ERROR 0x00008000 /* Filesystem Error (fanotify) */
> +
> +/*
> + * FS_IN_IGNORED overloads FS_ERROR. It is only used internally by inotify
> + * which does not support FS_ERROR.
> + */
> #define FS_IN_IGNORED 0x00008000 /* last inotify event here */
>
> #define FS_OPEN_PERM 0x00010000 /* open event in an permission hook */
> @@ -95,7 +101,8 @@
> #define ALL_FSNOTIFY_EVENTS (ALL_FSNOTIFY_DIRENT_EVENTS | \
> FS_EVENTS_POSS_ON_CHILD | \
> FS_DELETE_SELF | FS_MOVE_SELF | FS_DN_RENAME | \
> - FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED)
> + FS_UNMOUNT | FS_Q_OVERFLOW | FS_IN_IGNORED | \
> + FS_ERROR)
>
> /* Extra flags that may be reported with event or control handling of events */
> #define ALL_FSNOTIFY_FLAGS (FS_EXCL_UNLINK | FS_ISDIR | FS_IN_ONESHOT | \
> @@ -263,6 +270,12 @@ enum fsnotify_data_type {
> FSNOTIFY_EVENT_NONE,
> FSNOTIFY_EVENT_PATH,
> FSNOTIFY_EVENT_INODE,
> + FSNOTIFY_EVENT_ERROR,
> +};
> +
> +struct fs_error_report {
> + int error;
> + struct inode *inode;
> };
>
> static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
> @@ -272,6 +285,8 @@ static inline struct inode *fsnotify_data_inode(const void *data, int data_type)
> return (struct inode *)data;
> case FSNOTIFY_EVENT_PATH:
> return d_inode(((const struct path *)data)->dentry);
> + case FSNOTIFY_EVENT_ERROR:
> + return ((struct fs_error_report *)data)->inode;
> default:
> return NULL;
> }
> --
> 2.32.0
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue 29-06-21 15:10:31, Gabriel Krisman Bertazi wrote:
> Introduce helpers for filesystems interested in reporting FS_ERROR
> events. When notifying errors, the file system might not have an inode
> to report on the error. To support this, allow the caller to specify
> the superblock to which the error applies.
>
> Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Looks OK. Feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
>
> ---
> Changes since v2:
> - Drop reference to s_fnotify_marks and guards (Amir)
>
> Changes since v1:
> - Use the inode argument (Amir)
> - Protect s_fsnotify_marks with ifdef guard
> ---
> fs/notify/fsnotify.c | 2 +-
> include/linux/fsnotify.h | 13 +++++++++++++
> include/linux/fsnotify_backend.h | 1 +
> 3 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 36205a769dde..ac05eb3fb368 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -491,7 +491,7 @@ int __fsnotify(__u32 mask, const struct fsnotify_event_info *event_info)
> */
> parent = event_info->dir;
> }
> - sb = inode->i_sb;
> + sb = event_info->sb ?: inode->i_sb;
>
> /*
> * Optimization: srcu_read_lock() has a memory barrier which can
> diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> index 8c2c681b4495..684c79ca01b2 100644
> --- a/include/linux/fsnotify.h
> +++ b/include/linux/fsnotify.h
> @@ -326,4 +326,17 @@ static inline void fsnotify_change(struct dentry *dentry, unsigned int ia_valid)
> fsnotify_dentry(dentry, mask);
> }
>
> +static inline int fsnotify_sb_error(struct super_block *sb, struct inode *inode,
> + int error)
> +{
> + struct fs_error_report report = {
> + .error = error,
> + .inode = inode,
> + };
> +
> + return __fsnotify(FS_ERROR, &(struct fsnotify_event_info) {
> + .data = &report, .data_type = FSNOTIFY_EVENT_ERROR,
> + .sb = sb});
> +}
> +
> #endif /* _LINUX_FS_NOTIFY_H */
> diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
> index ea5f5c7cc381..5a32c5010f45 100644
> --- a/include/linux/fsnotify_backend.h
> +++ b/include/linux/fsnotify_backend.h
> @@ -138,6 +138,7 @@ struct fsnotify_event_info {
> struct inode *dir;
> const struct qstr *name;
> struct inode *inode;
> + struct super_block *sb;
> u32 cookie;
> };
>
> --
> 2.32.0
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Thu, Jul 8, 2021 at 1:43 PM Jan Kara <[email protected]> wrote:
>
> On Tue 29-06-21 15:10:27, Gabriel Krisman Bertazi wrote:
> > From: Amir Goldstein <[email protected]>
> >
> > There are a lot of arguments to fsnotify() and the handle_event() method.
> > Pass them in a const struct instead of on the argument list.
> >
> > Apart from being more tidy, this helps with passing error reports to the
> > backend. __fsnotify_parent() argument list was intentionally left
> > untouched, because its argument list is still short enough and because
> > most of the event info arguments are initialized inside
> > __fsnotify_parent().
> >
> > Signed-off-by: Amir Goldstein <[email protected]>
> > Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
> > ---
> > fs/notify/fanotify/fanotify.c | 59 +++++++++++------------
> > fs/notify/fsnotify.c | 83 +++++++++++++++++---------------
> > include/linux/fsnotify.h | 15 ++++--
> > include/linux/fsnotify_backend.h | 73 +++++++++++++++++++++-------
> > 4 files changed, 140 insertions(+), 90 deletions(-)
>
> Besides the noop function issue Amir has already pointed out I have just a
> few nits:
>
> > @@ -229,7 +229,11 @@ int __fsnotify_parent(struct dentry *dentry, __u32 mask, const void *data,
> > }
> >
> > notify:
> > - ret = fsnotify(mask, data, data_type, p_inode, file_name, inode, 0);
> > + ret = __fsnotify(mask, &(struct fsnotify_event_info) {
> > + .data = data, .data_type = data_type,
> > + .dir = p_inode, .name = file_name,
> > + .inode = inode,
> > + });
>
> What's the advantage of using __fsnotify() here instead of fsnotify()? In
> terms of readability the fewer places with these initializers the better
> I'd say...
>
> > static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
> > - const void *data, int data_type,
> > - struct inode *dir, const struct qstr *name,
> > - u32 cookie, struct fsnotify_iter_info *iter_info)
> > + const struct fsnotify_event_info *event_info,
> > + struct fsnotify_iter_info *iter_info)
> > {
> > struct fsnotify_mark *inode_mark = fsnotify_iter_inode_mark(iter_info);
> > struct fsnotify_mark *parent_mark = fsnotify_iter_parent_mark(iter_info);
> > + struct fsnotify_event_info child_event_info = { };
> > int ret;
>
> No need to init child_event_info. It is fully rewritten if it gets used...
>
> > diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> > index f8acddcf54fb..8c2c681b4495 100644
> > --- a/include/linux/fsnotify.h
> > +++ b/include/linux/fsnotify.h
> > @@ -30,7 +30,10 @@ static inline void fsnotify_name(struct inode *dir, __u32 mask,
> > struct inode *child,
> > const struct qstr *name, u32 cookie)
> > {
> > - fsnotify(mask, child, FSNOTIFY_EVENT_INODE, dir, name, NULL, cookie);
> > + __fsnotify(mask, &(struct fsnotify_event_info) {
> > + .data = child, .data_type = FSNOTIFY_EVENT_INODE,
> > + .dir = dir, .name = name, .cookie = cookie,
> > + });
> > }
>
> Hmm, maybe we could have a macro initializer like:
>
> #define FSNOTIFY_EVENT_INFO(data, data_type, dir, name, inode, cookie) \
> (struct fsnotify_event_info) { \
> .data = (data), .data_type = (data_type), .dir = (dir), \
> .name = (name), .inode = (inode), .cookie = (cookie)}
>
> Then we'd have:
> __fsnotify(mask, &FSNOTIFY_EVENT_INFO(child, FSNOTIFY_EVENT_INODE,
> dir, name, NULL, cookie));
>
> Which looks a bit nicer to me. What do you think guys?
>
Sure, looks good.
But I think it would be even better to have different "wrapper defines" like
FSNOTIFY_NAME_EVENT_INFO() will less irrelevant arguments.
Thanks,
Amir.
On Wed 30-06-21 08:10:39, Amir Goldstein wrote:
> > This change raises the question of how we report non-inode errors. On
> > one hand, we could omit the FID report, but then fsid would also be
> > ommited. I chose to report these kind of errors against the root
> > inode.
>
> There are other option to consider.
Yeah, so reporting against root inode has the disadvantage that in
principle you don't know whether the error really happened on the root
inode or whether the event is in fact without an inode. So some information
is lost here. Maybe the set of errors that can happen without an inode and
the set of errors that can happen with an inode are disjoint, so no
information is actually lost but then does reporting root inode actually
bring any benefit? So I agree reporting root inode is not ideal.
> To avoid special casing error events in fanotify event read code,
> it would is convenient to use a non-zero length FID, but you can
> use a 8 bytes zero buffer as NULL-FID
>
> If I am not mistaken, that amounts to 64 bytes of event_len
> including the event_metadata and both records which is pretty
> nicely aligned.
>
> All 3 handle_type options below are valid options:
> 1. handle_type FILEID_ROOT
> 2. handle_type FILEID_INVALID
> 3. handle_type FILEID_INO32_GEN (i.e. ino=0;gen=0)
>
> The advantage of option #3 is that the monitoring program
> does not need to special case the NULL_FID case when
> parsing the FID to informative user message.
I actually like #2 more. #1 has similar problems as I outlined above for
reporting root dir. The advantage that userspace won't have to special case
FILEID_INO32_GEN FID in #3 is IMHO a dream - if you want a good message,
you should report the problem was on a superblock, not some just some
zeroes instead of proper inode info. Even more if it was on a real inode,
good reporter will e.g. try to resolve it to a path.
Also because we will presumably have more filesystems supporting this in
the future, normal inodes can be reported with other handle types anyway.
So IMO #2 is the most sensible option.
> w.r.t LTP test, I don't think that using a corrupt image will be a good way
> for an LTP test. LTP tests can prepare and mount an ext4 loop image.
> Does ext4 have some debugging method to inject an error?
> Because that would be the best way IMO.
> If it doesn't, you can implement this in ext4 and use it in the test if that
> debug file exists - skip the test otherwise - it's common practice.
Ext4 does not have an error injection facility. Not sure if we want to
force Gabriel into creating one just for these LTP tests. Actually creating
ext4 images with known problems (through mke2fs and debugfs) should be
rather easy and we could then test we get expected error notifications
back...
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Thu 08-07-21 14:09:43, Amir Goldstein wrote:
> On Thu, Jul 8, 2021 at 1:43 PM Jan Kara <[email protected]> wrote:
> > On Tue 29-06-21 15:10:27, Gabriel Krisman Bertazi wrote:
> > > diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
> > > index f8acddcf54fb..8c2c681b4495 100644
> > > --- a/include/linux/fsnotify.h
> > > +++ b/include/linux/fsnotify.h
> > > @@ -30,7 +30,10 @@ static inline void fsnotify_name(struct inode *dir, __u32 mask,
> > > struct inode *child,
> > > const struct qstr *name, u32 cookie)
> > > {
> > > - fsnotify(mask, child, FSNOTIFY_EVENT_INODE, dir, name, NULL, cookie);
> > > + __fsnotify(mask, &(struct fsnotify_event_info) {
> > > + .data = child, .data_type = FSNOTIFY_EVENT_INODE,
> > > + .dir = dir, .name = name, .cookie = cookie,
> > > + });
> > > }
> >
> > Hmm, maybe we could have a macro initializer like:
> >
> > #define FSNOTIFY_EVENT_INFO(data, data_type, dir, name, inode, cookie) \
> > (struct fsnotify_event_info) { \
> > .data = (data), .data_type = (data_type), .dir = (dir), \
> > .name = (name), .inode = (inode), .cookie = (cookie)}
> >
> > Then we'd have:
> > __fsnotify(mask, &FSNOTIFY_EVENT_INFO(child, FSNOTIFY_EVENT_INODE,
> > dir, name, NULL, cookie));
> >
> > Which looks a bit nicer to me. What do you think guys?
> >
>
> Sure, looks good.
> But I think it would be even better to have different "wrapper defines" like
> FSNOTIFY_NAME_EVENT_INFO() will less irrelevant arguments.
If we don't overdo it, I agree :) I mean if we end up with a different
helper for each site creating this structure, I'm not sure it helps much...
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Thu, Jul 8, 2021 at 2:32 PM Jan Kara <[email protected]> wrote:
>
> On Wed 30-06-21 08:10:39, Amir Goldstein wrote:
> > > This change raises the question of how we report non-inode errors. On
> > > one hand, we could omit the FID report, but then fsid would also be
> > > ommited. I chose to report these kind of errors against the root
> > > inode.
> >
> > There are other option to consider.
>
> Yeah, so reporting against root inode has the disadvantage that in
> principle you don't know whether the error really happened on the root
> inode or whether the event is in fact without an inode. So some information
> is lost here. Maybe the set of errors that can happen without an inode and
> the set of errors that can happen with an inode are disjoint, so no
> information is actually lost but then does reporting root inode actually
> bring any benefit? So I agree reporting root inode is not ideal.
>
> > To avoid special casing error events in fanotify event read code,
> > it would is convenient to use a non-zero length FID, but you can
> > use a 8 bytes zero buffer as NULL-FID
> >
> > If I am not mistaken, that amounts to 64 bytes of event_len
> > including the event_metadata and both records which is pretty
> > nicely aligned.
> >
> > All 3 handle_type options below are valid options:
> > 1. handle_type FILEID_ROOT
> > 2. handle_type FILEID_INVALID
> > 3. handle_type FILEID_INO32_GEN (i.e. ino=0;gen=0)
> >
> > The advantage of option #3 is that the monitoring program
> > does not need to special case the NULL_FID case when
> > parsing the FID to informative user message.
>
> I actually like #2 more. #1 has similar problems as I outlined above for
> reporting root dir. The advantage that userspace won't have to special case
> FILEID_INO32_GEN FID in #3 is IMHO a dream - if you want a good message,
> you should report the problem was on a superblock, not some just some
> zeroes instead of proper inode info. Even more if it was on a real inode,
> good reporter will e.g. try to resolve it to a path.
>
> Also because we will presumably have more filesystems supporting this in
> the future, normal inodes can be reported with other handle types anyway.
> So IMO #2 is the most sensible option.
>
I am perfectly fine with #2, but just FYI, there is no ambiguity around using
FILEID_ROOT - it is a special "application level" type used by nfsd to describe
a handle to the root of an export (I'm not in which protocol versions).
The inode itself does not even need to be the root of the filesystem
(it seldom is)
nfsd keeps a dentry with elevated refcount per export in order to "decode"
those export root handles.
The filesystem itself will never return this value, so when reporting
errors on a
specific inode and the specific inode is the filesystem root directory then
the type will be the same as the native filesystem type and not FILEID_ROOT.
For this reason I thought it would be safe to use FILEID_ROOT for reporting
events on sb without inode and FILEID_INVALID for reporting failure to encode
fh, for example when not guessing max_fh_len correctly.
But it's just nice to have.
> > w.r.t LTP test, I don't think that using a corrupt image will be a good way
> > for an LTP test. LTP tests can prepare and mount an ext4 loop image.
> > Does ext4 have some debugging method to inject an error?
> > Because that would be the best way IMO.
> > If it doesn't, you can implement this in ext4 and use it in the test if that
> > debug file exists - skip the test otherwise - it's common practice.
>
> Ext4 does not have an error injection facility. Not sure if we want to
> force Gabriel into creating one just for these LTP tests. Actually creating
> ext4 images with known problems (through mke2fs and debugfs) should be
> rather easy and we could then test we get expected error notifications
> back...
Ok.
Thanks,
Amir.
kernel test robot <[email protected]> writes:
> Hi Gabriel,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on ext3/fsnotify]
> [also build test ERROR on ext4/dev linus/master v5.13 next-20210629]
> [cannot apply to tytso-fscrypt/master]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
> base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
> config: arm64-allyesconfig (attached as .config)
> compiler: aarch64-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/0day-ci/linux/commit/746524d8db08a041fed90e41b15c8e8ca69cb22d
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
> git checkout 746524d8db08a041fed90e41b15c8e8ca69cb22d
> # save the attached .config to linux build tree
> mkdir build_dir
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross O=build_dir ARCH=arm64 SHELL=/bin/bash samples/
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <[email protected]>
>
> All errors (new ones prefixed by >>):
>
>>> samples/fanotify/fs-monitor.c:7:10: fatal error: errno.h: No such file or directory
> 7 | #include <errno.h>
> | ^~~~~~~~~
> compilation terminated.
Hi Dan,
I'm not sure what's the proper fix here. Looks like 0day is not using
cross system libraries when building this user space code. Should I do
something special to silent it?
--
Gabriel Krisman Bertazi
On Mon, Jul 19, 2021 at 10:36:54AM -0400, Gabriel Krisman Bertazi wrote:
> kernel test robot <[email protected]> writes:
>
> > Hi Gabriel,
> >
> > I love your patch! Yet something to improve:
> >
> > [auto build test ERROR on ext3/fsnotify]
> > [also build test ERROR on ext4/dev linus/master v5.13 next-20210629]
> > [cannot apply to tytso-fscrypt/master]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch ]
> >
> > url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
> > base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
> > config: arm64-allyesconfig (attached as .config)
> > compiler: aarch64-linux-gcc (GCC) 9.3.0
> > reproduce (this is a W=1 build):
> > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > # https://github.com/0day-ci/linux/commit/746524d8db08a041fed90e41b15c8e8ca69cb22d
> > git remote add linux-review https://github.com/0day-ci/linux
> > git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
> > git checkout 746524d8db08a041fed90e41b15c8e8ca69cb22d
> > # save the attached .config to linux build tree
> > mkdir build_dir
> > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross O=build_dir ARCH=arm64 SHELL=/bin/bash samples/
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot <[email protected]>
> >
> > All errors (new ones prefixed by >>):
> >
> >>> samples/fanotify/fs-monitor.c:7:10: fatal error: errno.h: No such file or directory
> > 7 | #include <errno.h>
> > | ^~~~~~~~~
> > compilation terminated.
>
> Hi Dan,
>
> I'm not sure what's the proper fix here. Looks like 0day is not using
> cross system libraries when building this user space code. Should I do
> something special to silent it?
I'm not the person to ask, I just look at Smatch warnings. Rong Chen
might know the answer.
regards,
dan carpenter
Hi Gabriel,
On 7/21/2021 3:49 AM, Dan Carpenter wrote:
>
> On Mon, Jul 19, 2021 at 10:36:54AM -0400, Gabriel Krisman Bertazi wrote:
>> kernel test robot <[email protected]> writes:
>>
>>> Hi Gabriel,
>>>
>>> I love your patch! Yet something to improve:
>>>
>>> [auto build test ERROR on ext3/fsnotify]
>>> [also build test ERROR on ext4/dev linus/master v5.13 next-20210629]
>>> [cannot apply to tytso-fscrypt/master]
>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>> And when submitting patch, we suggest to use '--base' as documented in
>>> https://git-scm.com/docs/git-format-patch ]
>>>
>>> url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
>>> base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
>>> config: arm64-allyesconfig (attached as .config)
>>> compiler: aarch64-linux-gcc (GCC) 9.3.0
>>> reproduce (this is a W=1 build):
>>> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>> chmod +x ~/bin/make.cross
>>> # https://github.com/0day-ci/linux/commit/746524d8db08a041fed90e41b15c8e8ca69cb22d
>>> git remote add linux-review https://github.com/0day-ci/linux
>>> git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
>>> git checkout 746524d8db08a041fed90e41b15c8e8ca69cb22d
>>> # save the attached .config to linux build tree
>>> mkdir build_dir
>>> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross O=build_dir ARCH=arm64 SHELL=/bin/bash samples/
>>>
>>> If you fix the issue, kindly add following tag as appropriate
>>> Reported-by: kernel test robot <[email protected]>
>>>
>>> All errors (new ones prefixed by >>):
>>>
>>>>> samples/fanotify/fs-monitor.c:7:10: fatal error: errno.h: No such file or directory
>>> 7 | #include <errno.h>
>>> | ^~~~~~~~~
>>> compilation terminated.
>>
>> Hi Dan,
>>
>> I'm not sure what's the proper fix here. Looks like 0day is not using
>> cross system libraries when building this user space code. Should I do
>> something special to silent it?
It seems need extra libraries for arm64, we'll disable CONFIG_SAMPLES to
avoid reporting this error.
Best Regards,
Rong Chen
>
> I'm not the person to ask, I just look at Smatch warnings. Rong Chen
> might know the answer.
>
> regards,
> dan carpenter
>
"Chen, Rong A" <[email protected]> writes:
> Hi Gabriel,
>
> On 7/21/2021 3:49 AM, Dan Carpenter wrote:
>> On Mon, Jul 19, 2021 at 10:36:54AM -0400, Gabriel Krisman Bertazi
>> wrote:
>>> kernel test robot <[email protected]> writes:
>>>
>>>> Hi Gabriel,
>>>>
>>>> I love your patch! Yet something to improve:
>>>>
>>>> [auto build test ERROR on ext3/fsnotify]
>>>> [also build test ERROR on ext4/dev linus/master v5.13 next-20210629]
>>>> [cannot apply to tytso-fscrypt/master]
>>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>>> And when submitting patch, we suggest to use '--base' as documented in
>>>> https://git-scm.com/docs/git-format-patch ]
>>>>
>>>> url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
>>>> base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
>>>> config: arm64-allyesconfig (attached as .config)
>>>> compiler: aarch64-linux-gcc (GCC) 9.3.0
>>>> reproduce (this is a W=1 build):
>>>> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>>> chmod +x ~/bin/make.cross
>>>> # https://github.com/0day-ci/linux/commit/746524d8db08a041fed90e41b15c8e8ca69cb22d
>>>> git remote add linux-review https://github.com/0day-ci/linux
>>>> git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
>>>> git checkout 746524d8db08a041fed90e41b15c8e8ca69cb22d
>>>> # save the attached .config to linux build tree
>>>> mkdir build_dir
>>>> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross O=build_dir ARCH=arm64 SHELL=/bin/bash samples/
>>>>
>>>> If you fix the issue, kindly add following tag as appropriate
>>>> Reported-by: kernel test robot <[email protected]>
>>>>
>>>> All errors (new ones prefixed by >>):
>>>>
>>>>>> samples/fanotify/fs-monitor.c:7:10: fatal error: errno.h: No such file or directory
>>>> 7 | #include <errno.h>
>>>> | ^~~~~~~~~
>>>> compilation terminated.
>>>
>>> Hi Dan,
>>>
>>> I'm not sure what's the proper fix here. Looks like 0day is not using
>>> cross system libraries when building this user space code. Should I do
>>> something special to silent it?
>
> It seems need extra libraries for arm64, we'll disable CONFIG_SAMPLES to
> avoid reporting this error.
There are kernel space code in samples/ that still benefit from the test
robot. See ftrace/ftrace-direct-too.c for one instance.
Perhaps it can be disabled just for userprogs-* Makefile entries in
samples/ ?
--
Gabriel Krisman Bertazi
On 7/23/2021 12:15 AM, Gabriel Krisman Bertazi wrote:
> "Chen, Rong A" <[email protected]> writes:
>
>> Hi Gabriel,
>>
>> On 7/21/2021 3:49 AM, Dan Carpenter wrote:
>>> On Mon, Jul 19, 2021 at 10:36:54AM -0400, Gabriel Krisman Bertazi
>>> wrote:
>>>> kernel test robot <[email protected]> writes:
>>>>
>>>>> Hi Gabriel,
>>>>>
>>>>> I love your patch! Yet something to improve:
>>>>>
>>>>> [auto build test ERROR on ext3/fsnotify]
>>>>> [also build test ERROR on ext4/dev linus/master v5.13 next-20210629]
>>>>> [cannot apply to tytso-fscrypt/master]
>>>>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>>>>> And when submitting patch, we suggest to use '--base' as documented in
>>>>> https://git-scm.com/docs/git-format-patch ]
>>>>>
>>>>> url: https://github.com/0day-ci/linux/commits/Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
>>>>> base: https://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
>>>>> config: arm64-allyesconfig (attached as .config)
>>>>> compiler: aarch64-linux-gcc (GCC) 9.3.0
>>>>> reproduce (this is a W=1 build):
>>>>> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>>>>> chmod +x ~/bin/make.cross
>>>>> # https://github.com/0day-ci/linux/commit/746524d8db08a041fed90e41b15c8e8ca69cb22d
>>>>> git remote add linux-review https://github.com/0day-ci/linux
>>>>> git fetch --no-tags linux-review Gabriel-Krisman-Bertazi/File-system-wide-monitoring/20210630-031347
>>>>> git checkout 746524d8db08a041fed90e41b15c8e8ca69cb22d
>>>>> # save the attached .config to linux build tree
>>>>> mkdir build_dir
>>>>> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross O=build_dir ARCH=arm64 SHELL=/bin/bash samples/
>>>>>
>>>>> If you fix the issue, kindly add following tag as appropriate
>>>>> Reported-by: kernel test robot <[email protected]>
>>>>>
>>>>> All errors (new ones prefixed by >>):
>>>>>
>>>>>>> samples/fanotify/fs-monitor.c:7:10: fatal error: errno.h: No such file or directory
>>>>> 7 | #include <errno.h>
>>>>> | ^~~~~~~~~
>>>>> compilation terminated.
>>>>
>>>> Hi Dan,
>>>>
>>>> I'm not sure what's the proper fix here. Looks like 0day is not using
>>>> cross system libraries when building this user space code. Should I do
>>>> something special to silent it?
>>
>> It seems need extra libraries for arm64, we'll disable CONFIG_SAMPLES to
>> avoid reporting this error.
>
> There are kernel space code in samples/ that still benefit from the test
> robot. See ftrace/ftrace-direct-too.c for one instance.
>
> Perhaps it can be disabled just for userprogs-* Makefile entries in
> samples/ ?
>
we'll still test samples on arch x86_64, is there a simple way to
disable userprogs-* cases? we don't want to edit kernel code.
Best Regards,
Rong Chen