When fanotify_release() is called, there may still be processes waiting for
access permission. Currently only processes for which an event has already been
queued into the groups access list will be woken up. Processes for which no
event has been queued will continue to sleep and thus cause a deadlock when
fsnotify_put_group() is called.
Furthermore there is a race allowing further processes to be waiting on the
access wait queue after wake_up (if they arrive before clear_marks_by_group()
is called).
This patch corrects this by setting a flag to inform processes that the group
is about to be destroyed and thus not to wait for access permission.
Beside this it removes the unnecessary check for the bypass_perm flag in
prepare_for_access_response(), since this function cant be called any more at
the time release() is called and the flag is set.
Signed-off-by: Lino Sanfilippo <[email protected]>
---
fs/notify/fanotify/fanotify.c | 8 +++++++-
fs/notify/fanotify/fanotify_user.c | 14 +++-----------
include/linux/fsnotify_backend.h | 2 +-
3 files changed, 11 insertions(+), 13 deletions(-)
This patch applies against commit 3aa13e3ff6700929c0e3a1a4cdc51c82139707e4 of
branch 'origin/for-next' from git.infradead.org/users/eparis/notify.git
CC'ed Andreas Gruenbacher since he originally reported this problem.
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index cb576b8..94438db 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -91,8 +91,14 @@ static int fanotify_get_response_from_access(struct fsnotify_group *group,
int ret;
pr_debug("%s: group=%p event=%p\n", __func__, group, event);
+ if (atomic_read(&group->fanotify_data.bypass_perm))
+ return 0;
- wait_event(group->fanotify_data.access_waitq, event->response);
+ wait_event(group->fanotify_data.access_waitq, event->response ||
+ atomic_read(&group->fanotify_data.bypass_perm));
+
+ if (!event->response) /* bypass_perm set */
+ return 0;
/* userspace responded, convert to something usable */
spin_lock(&event->lock);
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index ae36c73..342d22e 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -210,14 +210,6 @@ static int prepare_for_access_response(struct fsnotify_group *group,
re->fd = fd;
mutex_lock(&group->fanotify_data.access_mutex);
-
- if (group->fanotify_data.bypass_perm) {
- mutex_unlock(&group->fanotify_data.access_mutex);
- kmem_cache_free(fanotify_response_event_cache, re);
- event->response = FAN_ALLOW;
- return 0;
- }
-
list_add_tail(&re->list, &group->fanotify_data.access_list);
mutex_unlock(&group->fanotify_data.access_mutex);
@@ -399,10 +391,9 @@ static int fanotify_release(struct inode *ignored, struct file *file)
#ifdef CONFIG_FANOTIFY_ACCESS_PERMISSIONS
struct fanotify_response_event *re, *lre;
- mutex_lock(&group->fanotify_data.access_mutex);
-
- group->fanotify_data.bypass_perm = true;
+ atomic_inc(&group->fanotify_data.bypass_perm);
+ mutex_lock(&group->fanotify_data.access_mutex);
list_for_each_entry_safe(re, lre, &group->fanotify_data.access_list, list) {
pr_debug("%s: found group=%p re=%p event=%p\n", __func__, group,
re, re->event);
@@ -706,6 +697,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
if (IS_ERR(group))
return PTR_ERR(group);
+ atomic_set(&group->fanotify_data.bypass_perm, 0);
group->fanotify_data.user = user;
atomic_inc(&user->fanotify_listeners);
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index d010f70..add1351 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -166,7 +166,7 @@ struct fsnotify_group {
struct mutex access_mutex;
struct list_head access_list;
wait_queue_head_t access_waitq;
- bool bypass_perm; /* protected by access_mutex */
+ atomic_t bypass_perm;
#endif /* CONFIG_FANOTIFY_ACCESS_PERMISSIONS */
bool readonly_fallback;
int f_flags;
--
1.5.6.5
On Fri, 2010-11-19 at 10:58 +0100, Lino Sanfilippo wrote:
> When fanotify_release() is called, there may still be processes waiting for
> access permission. Currently only processes for which an event has already been
> queued into the groups access list will be woken up. Processes for which no
> event has been queued will continue to sleep and thus cause a deadlock when
> fsnotify_put_group() is called.
I'm not sure what you mean by 'processes for which no event has been
queued.' You must mean a process that is about to send a notify event
and is about to put itself on the wait queue...
> Furthermore there is a race allowing further processes to be waiting on the
> access wait queue after wake_up (if they arrive before clear_marks_by_group()
> is called).
This is what I think you meant in the above sentence but I'm not sure.
In any case I think I described all of the possibilities here:
Lets think about the 4 relevant code paths from the PoV of the
'operator' 'listener' 'responder' and 'closer'. Where operator is the
process doing an action (like open/read) which could require permission.
Listener is the task (or in this case thread) slated with reading from
the fanotify file descriptor. The 'responder' is the thread responsible
for responding to access requests. 'Closer' is the thread attempting to
close the fanotify file descriptor.
The 'operator' is going to end up in:
fanotify_handle_event()
get_response_from_access()
(THIS BLOCKS WAITING ON USERSPACE)
The 'listener' interesting code path
fanotify_read()
copy_event_to_user()
prepare_for_access_response()
(THIS CREATES AN fanotify_response_event)
The 'responder' code path:
fanotify_write()
process_access_response()
(REMOVE A fanotify_response_event, SET RESPONSE, WAKE UP 'operator')
The 'closer':
fanotify_release()
(SUPPOSED TO CLEAN UP THE REST OF THIS MESS)
What we have today is that in the closer we remove all of the
fanotify_response_events and set a bit so no more response events are
ever created in prepare_for_access_response().
The bug is that we never wake all of the operators up and tell them to
move along. You fix that in fanotify_get_response_from_access(). You
also fix other operators which haven't gotten there yet. So I agree
that's a good fix.
But then you do:
> Beside this it removes the unnecessary check for the bypass_perm flag in
> prepare_for_access_response(), since this function cant be called any more at
> the time release() is called and the flag is set.
Which I guess is also correct but I don't like it in the same patch.
It's dropping dead code rather than fixing this bug. So it's
distracting to review the patch.
I'm going to split this into two patches, include my analysis in your
changelog and apply them separately. I hope you don't mind. I also
don't like the conversion to an atomic when I think a bool could work
just as well. I might convert it back to a boolean after I put some
thought into it....
-Eric
On Tue, Nov 23, 2010 at 05:13:44PM -0500, Eric Paris wrote:
>
> I'm not sure what you mean by 'processes for which no event has been
> queued.' You must mean a process that is about to send a notify event
> and is about to put itself on the wait queue...
>
Hm, i admit i did not explain very well what i meant.
> In any case I think I described all of the possibilities here:
>
> Lets think about the 4 relevant code paths from the PoV of the
> 'operator' 'listener' 'responder' and 'closer'. Where operator is the
> process doing an action (like open/read) which could require permission.
> Listener is the task (or in this case thread) slated with reading from
> the fanotify file descriptor. The 'responder' is the thread responsible
> for responding to access requests. 'Closer' is the thread attempting to
> close the fanotify file descriptor.
>
> The 'operator' is going to end up in:
> fanotify_handle_event()
> get_response_from_access()
> (THIS BLOCKS WAITING ON USERSPACE)
>
> The 'listener' interesting code path
> fanotify_read()
> copy_event_to_user()
> prepare_for_access_response()
> (THIS CREATES AN fanotify_response_event)
>
> The 'responder' code path:
> fanotify_write()
> process_access_response()
> (REMOVE A fanotify_response_event, SET RESPONSE, WAKE UP 'operator')
>
> The 'closer':
> fanotify_release()
> (SUPPOSED TO CLEAN UP THE REST OF THIS MESS)
>
> What we have today is that in the closer we remove all of the
> fanotify_response_events and set a bit so no more response events are
> ever created in prepare_for_access_response().
>
> The bug is that we never wake all of the operators up and tell them to
> move along.
Right, we did not wake up the operators that generated events which have not
been moved to the access_list yet, but are still on the access_waitq (because
the listener never read these events).
>
> > Beside this it removes the unnecessary check for the bypass_perm flag in
> > prepare_for_access_response(), since this function cant be called any more at
> > the time release() is called and the flag is set.
>
> Which I guess is also correct but I don't like it in the same patch.
> It's dropping dead code rather than fixing this bug. So it's
> distracting to review the patch.
Yes right, i should have split that.
>
> I'm going to split this into two patches, include my analysis in your
> changelog and apply them separately. I hope you don't mind.
Absolutely ok :)