2023-05-23 07:10:03

by Alok Tiagi

Subject: [RFC v6 1/2] epoll: Implement eventpoll_replace_file()

Introduce a mechanism to replace a file linked in the epoll interface with a new
file.

eventpoll_replace() finds all instances of the file to be replaced and replaces
them with the new file and the interested events.

Signed-off-by: aloktiagi <[email protected]>
---
Changes in v6:
- incorporate latest changes that get rid of the global epmutex lock.

Changes in v5:
- address review comments and move the call to replace old file in each
subsystem (epoll, io_uring, etc.) outside the fdtable helpers like
replace_fd().

Changes in v4:
- address review comment to remove the redundant eventpoll_replace() function.
- removed an extra empty line introduced in include/linux/file.h

Changes in v3:
- address review comment and iterate over the file table while holding the
spin_lock(&files->file_lock).
- address review comment and call filp_close() outside the
spin_lock(&files->file_lock).
---
fs/eventpoll.c | 76 +++++++++++++++++++++++++++++++++++++++
include/linux/eventpoll.h | 8 +++++
2 files changed, 84 insertions(+)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 980483455cc0..9c7bffa8401b 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -973,6 +973,82 @@ void eventpoll_release_file(struct file *file)
spin_unlock(&file->f_lock);
}

+static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
+ struct file *tfile, int fd, int full_check);
+
+/*
+ * This is called from eventpoll_replace() to replace a linked file in the epoll
+ * interface with a new file received from another process. This is useful in
+ * cases where a process is trying to install a new file for an existing one
+ * that is linked in the epoll interface
+ */
+int eventpoll_replace_file(struct file *toreplace, struct file *file, int tfd)
+{
+ struct file *to_remove = toreplace;
+ struct epoll_event event;
+ struct hlist_node *next;
+ struct eventpoll *ep;
+ struct epitem *epi;
+ int error = 0;
+ bool dispose;
+ int fd;
+
+ if (!file_can_poll(file))
+ return 0;
+
+ spin_lock(&toreplace->f_lock);
+ if (unlikely(!toreplace->f_ep)) {
+ spin_unlock(&toreplace->f_lock);
+ return 0;
+ }
+ hlist_for_each_entry_safe(epi, next, toreplace->f_ep, fllink) {
+ ep = epi->ep;
+ mutex_lock(&ep->mtx);
+ fd = epi->ffd.fd;
+ if (fd != tfd) {
+ mutex_unlock(&ep->mtx);
+ continue;
+ }
+ event = epi->event;
+ error = ep_insert(ep, &event, file, fd, 1);
+ mutex_unlock(&ep->mtx);
+ if (error != 0) {
+ break;
+ }
+ }
+ spin_unlock(&toreplace->f_lock);
+ /*
+ * In case of an error remove all instances of the new file in the epoll
+ * interface. If no error, remove all instances of the original file.
+ */
+ if (error != 0)
+ to_remove = file;
+
+again:
+ spin_lock(&to_remove->f_lock);
+ if (to_remove->f_ep && to_remove->f_ep->first) {
+ epi = hlist_entry(to_remove->f_ep->first, struct epitem, fllink);
+ fd = epi->ffd.fd;
+ if (fd != tfd) {
+ spin_unlock(&to_remove->f_lock);
+ goto again;
+ }
+ epi->dying = true;
+ spin_unlock(&to_remove->f_lock);
+
+ ep = epi->ep;
+ mutex_lock(&ep->mtx);
+ dispose = __ep_remove(ep, epi, true);
+ mutex_unlock(&ep->mtx);
+
+ if (dispose)
+ ep_free(ep);
+ goto again;
+ }
+ spin_unlock(&to_remove->f_lock);
+ return error;
+}
+
static int ep_alloc(struct eventpoll **pep)
{
int error;
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 3337745d81bd..2a6c8f52f272 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -25,6 +25,14 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long t
/* Used to release the epoll bits inside the "struct file" */
void eventpoll_release_file(struct file *file);

+/*
+ * This is called from fs/file.c:do_replace() to replace a linked file in the
+ * epoll interface with a new file received from another process. This is useful
+ * in cases where a process is trying to install a new file for an existing one
+ * that is linked in the epoll interface
+ */
+int eventpoll_replace_file(struct file *toreplace, struct file *file, int tfd);
+
/*
* This is called from inside fs/file_table.c:__fput() to unlink files
* from the eventpoll interface. We need to have this facility to cleanup
--
2.34.1



2023-05-23 12:46:16

by Christian Brauner

Subject: Re: [RFC v6 1/2] epoll: Implement eventpoll_replace_file()

On Tue, May 23, 2023 at 06:58:01AM +0000, aloktiagi wrote:
> Introduce a mechanism to replace a file linked in the epoll interface with a new
> file.
>
> eventpoll_replace() finds all instances of the file to be replaced and replaces
> them with the new file and the interested events.
>
> Signed-off-by: aloktiagi <[email protected]>
> ---
> Changes in v6:
> - incorporate latest changes that get rid of the global epmutex lock.
>
> Changes in v5:
> - address review comments and move the call to replace old file in each
> subsystem (epoll, io_uring, etc.) outside the fdtable helpers like
> replace_fd().
>
> Changes in v4:
> - address review comment to remove the redundant eventpoll_replace() function.
> - removed an extra empty line introduced in include/linux/file.h
>
> Changes in v3:
> - address review comment and iterate over the file table while holding the
> spin_lock(&files->file_lock).
> - address review comment and call filp_close() outside the
> spin_lock(&files->file_lock).
> ---
> fs/eventpoll.c | 76 +++++++++++++++++++++++++++++++++++++++
> include/linux/eventpoll.h | 8 +++++
> 2 files changed, 84 insertions(+)
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 980483455cc0..9c7bffa8401b 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -973,6 +973,82 @@ void eventpoll_release_file(struct file *file)
> spin_unlock(&file->f_lock);
> }
>
> +static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
> + struct file *tfile, int fd, int full_check);
> +
> +/*
> + * This is called from eventpoll_replace() to replace a linked file in the epoll
> + * interface with a new file received from another process. This is useful in
> + * cases where a process is trying to install a new file for an existing one
> + * that is linked in the epoll interface
> + */
> +int eventpoll_replace_file(struct file *toreplace, struct file *file, int tfd)
> +{
> + struct file *to_remove = toreplace;
> + struct epoll_event event;
> + struct hlist_node *next;
> + struct eventpoll *ep;
> + struct epitem *epi;
> + int error = 0;
> + bool dispose;
> + int fd;
> +
> + if (!file_can_poll(file))
> + return 0;
> +
> + spin_lock(&toreplace->f_lock);
> + if (unlikely(!toreplace->f_ep)) {
> + spin_unlock(&toreplace->f_lock);
> + return 0;
> + }
> + hlist_for_each_entry_safe(epi, next, toreplace->f_ep, fllink) {
> + ep = epi->ep;
> + mutex_lock(&ep->mtx);

Afaict, you're under a spinlock and you're acquiring a mutex. The
spinlock can't sleep (on non-rt kernels at least) but the mutex can.
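
To illustrate the point above outside the kernel, here is a minimal userspace sketch, assuming POSIX threads on Linux. It is an analogy only, not the fix that was eventually adopted: pthread_spinlock_t stands in for file->f_lock, pthread_mutex_t stands in for ep->mtx, and struct owner, struct item, struct item_list and replace_matching() are hypothetical names invented for this example. The idea is to do only non-sleeping work while the spinlock is held (collect the matching entries), drop the spinlock, and only then take the lock that may sleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_ITEMS 8

/* Hypothetical stand-in for struct eventpoll: owns a lock that may sleep. */
struct owner {
	pthread_mutex_t mtx;
	int updates;
};

/* Hypothetical stand-in for struct epitem. */
struct item {
	int fd;
	struct owner *owner;
};

/* Hypothetical stand-in for the per-file list protected by f_lock. */
struct item_list {
	pthread_spinlock_t lock;
	struct item *items[MAX_ITEMS];
	size_t count;
};

/*
 * Update every entry whose fd equals tfd.
 * Returns the number of entries that were updated.
 */
static int replace_matching(struct item_list *list, int tfd)
{
	struct item *snapshot[MAX_ITEMS];
	size_t n = 0;
	size_t i;

	/* Phase 1: under the spinlock, only collect pointers. Nothing here sleeps. */
	pthread_spin_lock(&list->lock);
	for (i = 0; i < list->count; i++) {
		if (list->items[i]->fd == tfd)
			snapshot[n++] = list->items[i];
	}
	pthread_spin_unlock(&list->lock);

	/* Phase 2: the spinlock is dropped, so taking a mutex is now allowed. */
	for (i = 0; i < n; i++) {
		struct owner *o = snapshot[i]->owner;

		pthread_mutex_lock(&o->mtx);
		o->updates++;	/* placeholder for the work done under the mutex */
		pthread_mutex_unlock(&o->mtx);
	}

	return (int)n;
}
```

In real kernel code the collected entries would also need to be pinned (for example with a reference count) so that they cannot be freed between the two phases; the sketch leaves that out because its objects outlive the call.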

> + fd = epi->ffd.fd;
> + if (fd != tfd) {
> + mutex_unlock(&ep->mtx);
> + continue;
> + }
> + event = epi->event;
> + error = ep_insert(ep, &event, file, fd, 1);
> + mutex_unlock(&ep->mtx);
> + if (error != 0) {
> + break;
> + }

nit: we don't do { } around single lines.

2023-05-23 14:08:53

by Matthew Wilcox

Subject: Re: [RFC v6 1/2] epoll: Implement eventpoll_replace_file()

On Tue, May 23, 2023 at 06:58:01AM +0000, aloktiagi wrote:
> +/*
> + * This is called from eventpoll_replace() to replace a linked file in the epoll
> + * interface with a new file received from another process. This is useful in
> + * cases where a process is trying to install a new file for an existing one
> + * that is linked in the epoll interface
> + */
> +int eventpoll_replace_file(struct file *toreplace, struct file *file, int tfd)

Functions do not control where they are called from. Just take that
clause out:

/*
* Replace a linked file in the epoll interface with a new file received
* from another process. This allows a process to
* install a new file for an existing one that is linked in the epoll
* interface
*/

But, erm, aren't those two sentences basically saying the same thing?
So simplify again:

/*
* Replace a linked file in the epoll interface with a new file
*/

> diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> index 3337745d81bd..2a6c8f52f272 100644
> --- a/include/linux/eventpoll.h
> +++ b/include/linux/eventpoll.h
> @@ -25,6 +25,14 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long t
> /* Used to release the epoll bits inside the "struct file" */
> void eventpoll_release_file(struct file *file);
>
> +/*
> + * This is called from fs/file.c:do_replace() to replace a linked file in the
> + * epoll interface with a new file received from another process. This is useful
> + * in cases where a process is trying to install a new file for an existing one
> + * that is linked in the epoll interface
> + */
> +int eventpoll_replace_file(struct file *toreplace, struct file *file, int tfd);

No need to repeat the comment again. Just delete it here.

2023-05-24 06:40:21

by Alok Tiagi

Subject: Re: [RFC v6 1/2] epoll: Implement eventpoll_replace_file()

On Tue, May 23, 2023 at 02:32:06PM +0200, Christian Brauner wrote:
> On Tue, May 23, 2023 at 06:58:01AM +0000, aloktiagi wrote:
> > Introduce a mechanism to replace a file linked in the epoll interface with a new
> > file.
> >
> > eventpoll_replace() finds all instances of the file to be replaced and replaces
> > them with the new file and the interested events.
> >
> > Signed-off-by: aloktiagi <[email protected]>
> > ---
> > Changes in v6:
> > - incorporate latest changes that get rid of the global epmutex lock.
> >
> > Changes in v5:
> > - address review comments and move the call to replace old file in each
> > subsystem (epoll, io_uring, etc.) outside the fdtable helpers like
> > replace_fd().
> >
> > Changes in v4:
> > - address review comment to remove the redundant eventpoll_replace() function.
> > - removed an extra empty line introduced in include/linux/file.h
> >
> > Changes in v3:
> > - address review comment and iterate over the file table while holding the
> > spin_lock(&files->file_lock).
> > - address review comment and call filp_close() outside the
> > spin_lock(&files->file_lock).
> > ---
> > fs/eventpoll.c | 76 +++++++++++++++++++++++++++++++++++++++
> > include/linux/eventpoll.h | 8 +++++
> > 2 files changed, 84 insertions(+)
> >
> > diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> > index 980483455cc0..9c7bffa8401b 100644
> > --- a/fs/eventpoll.c
> > +++ b/fs/eventpoll.c
> > @@ -973,6 +973,82 @@ void eventpoll_release_file(struct file *file)
> > spin_unlock(&file->f_lock);
> > }
> >
> > +static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
> > + struct file *tfile, int fd, int full_check);
> > +
> > +/*
> > + * This is called from eventpoll_replace() to replace a linked file in the epoll
> > + * interface with a new file received from another process. This is useful in
> > + * cases where a process is trying to install a new file for an existing one
> > + * that is linked in the epoll interface
> > + */
> > +int eventpoll_replace_file(struct file *toreplace, struct file *file, int tfd)
> > +{
> > + struct file *to_remove = toreplace;
> > + struct epoll_event event;
> > + struct hlist_node *next;
> > + struct eventpoll *ep;
> > + struct epitem *epi;
> > + int error = 0;
> > + bool dispose;
> > + int fd;
> > +
> > + if (!file_can_poll(file))
> > + return 0;
> > +
> > + spin_lock(&toreplace->f_lock);
> > + if (unlikely(!toreplace->f_ep)) {
> > + spin_unlock(&toreplace->f_lock);
> > + return 0;
> > + }
> > + hlist_for_each_entry_safe(epi, next, toreplace->f_ep, fllink) {
> > + ep = epi->ep;
> > + mutex_lock(&ep->mtx);
>
> Afaict, you're under a spinlock and you're acquiring a mutex. The
> spinlock can't sleep (on non-rt kernels at least) but the mutex can.
>

Thank you. I'll address this differently in the next patch series. Please
let me know if you have suggestions on how it could be done.

> > + fd = epi->ffd.fd;
> > + if (fd != tfd) {
> > + mutex_unlock(&ep->mtx);
> > + continue;
> > + }
> > + event = epi->event;
> > + error = ep_insert(ep, &event, file, fd, 1);
> > + mutex_unlock(&ep->mtx);
> > + if (error != 0) {
> > + break;
> > + }
>
> nit: we don't do { } around single lines.

Will fix this in the next series.

2023-05-24 06:47:22

by Alok Tiagi

Subject: Re: [RFC v6 1/2] epoll: Implement eventpoll_replace_file()

On Tue, May 23, 2023 at 02:31:58PM +0100, Matthew Wilcox wrote:
> On Tue, May 23, 2023 at 06:58:01AM +0000, aloktiagi wrote:
> > +/*
> > + * This is called from eventpoll_replace() to replace a linked file in the epoll
> > + * interface with a new file received from another process. This is useful in
> > + * cases where a process is trying to install a new file for an existing one
> > + * that is linked in the epoll interface
> > + */
> > +int eventpoll_replace_file(struct file *toreplace, struct file *file, int tfd)
>
> Functions do not control where they are called from. Just take that
> clause out:
>
> /*
> * Replace a linked file in the epoll interface with a new file received
> * from another process. This allows a process to
> * install a new file for an existing one that is linked in the epoll
> * interface
> */
>
> But, erm, aren't those two sentences basically saying the same thing?
> So simplify again:
>
> /*
> * Replace a linked file in the epoll interface with a new file
> */
>

Thank you for pointing this out. I'll address this in the next version.

> > diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> > index 3337745d81bd..2a6c8f52f272 100644
> > --- a/include/linux/eventpoll.h
> > +++ b/include/linux/eventpoll.h
> > @@ -25,6 +25,14 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long t
> > /* Used to release the epoll bits inside the "struct file" */
> > void eventpoll_release_file(struct file *file);
> >
> > +/*
> > + * This is called from fs/file.c:do_replace() to replace a linked file in the
> > + * epoll interface with a new file received from another process. This is useful
> > + * in cases where a process is trying to install a new file for an existing one
> > + * that is linked in the epoll interface
> > + */
> > +int eventpoll_replace_file(struct file *toreplace, struct file *file, int tfd);
>
> No need to repeat the comment again. Just delete it here.

Thank you. I'll update this in the next version.