2023-05-30 18:45:53

by Benjamin Segall

[permalink] [raw]
Subject: [PATCH RESEND] epoll: ep_autoremove_wake_function should use list_del_init_careful

autoremove_wake_function uses list_del_init_careful, so should epoll's
more aggressive variant. It only doesn't because it was copied from an
older wait.c rather than the most recent.

Fixes: a16ceb139610 ("epoll: autoremove wakers even more aggressively")
Signed-off-by: Ben Segall <[email protected]>
Cc: [email protected]
---
fs/eventpoll.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 52954d4637b5..081df056398a 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1756,11 +1756,11 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
unsigned int mode, int sync, void *key)
{
int ret = default_wake_function(wq_entry, mode, sync, key);

- list_del_init(&wq_entry->entry);
+ list_del_init_careful(&wq_entry->entry);
return ret;
}

/**
* ep_poll - Retrieves ready events, and delivers them to the caller-supplied
--
2.39.0.rc0.267.gcb52ba06e7-goog



2023-05-31 02:19:55

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH RESEND] epoll: ep_autoremove_wake_function should use list_del_init_careful

On Tue, May 30, 2023 at 11:32:28AM -0700, Benjamin Segall wrote:
> autoremove_wake_function uses list_del_init_careful, so should epoll's
> more aggressive variant. It only doesn't because it was copied from an
> older wait.c rather than the most recent.
>
> Fixes: a16ceb139610 ("epoll: autoremove wakers even more aggressively")
> Signed-off-by: Ben Segall <[email protected]>
> Cc: [email protected]
> ---
> fs/eventpoll.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 52954d4637b5..081df056398a 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -1756,11 +1756,11 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
> static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
> unsigned int mode, int sync, void *key)
> {
> int ret = default_wake_function(wq_entry, mode, sync, key);
>
> - list_del_init(&wq_entry->entry);
> + list_del_init_careful(&wq_entry->entry);
> return ret;
> }

Can you please provide a more detailed explanation about why
list_del_init_careful() is needed here?

- Eric

2023-05-31 08:21:29

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH RESEND] epoll: ep_autoremove_wake_function should use list_del_init_careful

On Tue, May 30, 2023 at 06:57:48PM -0700, Eric Biggers wrote:
> On Tue, May 30, 2023 at 11:32:28AM -0700, Benjamin Segall wrote:
> > autoremove_wake_function uses list_del_init_careful, so should epoll's
> > more aggressive variant. It only doesn't because it was copied from an
> > older wait.c rather than the most recent.
> >
> > Fixes: a16ceb139610 ("epoll: autoremove wakers even more aggressively")
> > Signed-off-by: Ben Segall <[email protected]>
> > Cc: [email protected]
> > ---
> > fs/eventpoll.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> > index 52954d4637b5..081df056398a 100644
> > --- a/fs/eventpoll.c
> > +++ b/fs/eventpoll.c
> > @@ -1756,11 +1756,11 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
> > static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
> > unsigned int mode, int sync, void *key)
> > {
> > int ret = default_wake_function(wq_entry, mode, sync, key);
> >
> > - list_del_init(&wq_entry->entry);
> > + list_del_init_careful(&wq_entry->entry);
> > return ret;
> > }
>
> Can you please provide a more detailed explanation about why
> list_del_init_careful() is needed here?

Yeah, this needs more explanation... Next time someone looks at this
code and there's a *_careful() added they'll want to know why.

2023-05-31 22:27:20

by Benjamin Segall

[permalink] [raw]
Subject: Re: [PATCH RESEND] epoll: ep_autoremove_wake_function should use list_del_init_careful

Christian Brauner <[email protected]> writes:

> On Tue, May 30, 2023 at 06:57:48PM -0700, Eric Biggers wrote:
>> On Tue, May 30, 2023 at 11:32:28AM -0700, Benjamin Segall wrote:
>> > autoremove_wake_function uses list_del_init_careful, so should epoll's
>> > more aggressive variant. It only doesn't because it was copied from an
>> > older wait.c rather than the most recent.
>> >
>> > Fixes: a16ceb139610 ("epoll: autoremove wakers even more aggressively")
>> > Signed-off-by: Ben Segall <[email protected]>
>> > Cc: [email protected]
>> > ---
>> > fs/eventpoll.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/fs/eventpoll.c b/fs/eventpoll.c
>> > index 52954d4637b5..081df056398a 100644
>> > --- a/fs/eventpoll.c
>> > +++ b/fs/eventpoll.c
>> > @@ -1756,11 +1756,11 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
>> > static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
>> > unsigned int mode, int sync, void *key)
>> > {
>> > int ret = default_wake_function(wq_entry, mode, sync, key);
>> >
>> > - list_del_init(&wq_entry->entry);
>> > + list_del_init_careful(&wq_entry->entry);
>> > return ret;
>> > }
>>
>> Can you please provide a more detailed explanation about why
>> list_del_init_careful() is needed here?
>
> Yeah, this needs more explanation... Next time someone looks at this
> code and there's a *_careful() added they'll want to know why.

So the general reason is the same as with autoremove_wake_function, it
pairs with the list_entry_careful in ep_poll (which is epoll's modified
copy of finish_wait).

I think the original actual _problem_ was a -stable issue that was fixed
instead by doing additional backports, so this may just avoid potential
extra loops and avoid potential compiler shenanigans from the data race.

2023-05-31 22:38:33

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH RESEND] epoll: ep_autoremove_wake_function should use list_del_init_careful

On Wed, 31 May 2023 15:15:41 -0700 Benjamin Segall <[email protected]> wrote:

> >> Can you please provide a more detailed explanation about why
> >> list_del_init_careful() is needed here?
> >
> > Yeah, this needs more explanation... Next time someone looks at this
> > code and there's a *_careful() added they'll want to know why.
>
> So the general reason is the same as with autoremove_wake_function, it
> pairs with the list_entry_careful in ep_poll (which is epoll's modified
> copy of finish_wait).
>
> I think the original actual _problem_ was a -stable issue that was fixed
> instead by doing additional backports, so this may just avoid potential
> extra loops and avoid potential compiler shenanigans from the data race.

The point is that the foo_careful() callsites should be commented, please.

2023-05-31 23:49:37

by Benjamin Segall

[permalink] [raw]
Subject: [PATCH v2] epoll: ep_autoremove_wake_function should use list_del_init_careful

autoremove_wake_function uses list_del_init_careful, so should epoll's
more aggressive variant. It only doesn't because it was copied from an
older wait.c rather than the most recent.

Fixes: a16ceb139610 ("epoll: autoremove wakers even more aggressively")
Signed-off-by: Ben Segall <[email protected]>
Cc: [email protected]
Change-Id: Icca05359250297f091779c9dcf4fefea92ee8c93
---
fs/eventpoll.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 980483455cc09..266d45c7685b4 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1803,11 +1803,15 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
unsigned int mode, int sync, void *key)
{
int ret = default_wake_function(wq_entry, mode, sync, key);

- list_del_init(&wq_entry->entry);
+ /*
+ * Pairs with list_empty_careful in ep_poll, and ensures future loop
+ * iterations see the cause of this wakeup.
+ */
+ list_del_init_careful(&wq_entry->entry);
return ret;
}

/**
* ep_poll - Retrieves ready events, and delivers them to the caller-supplied
--
2.41.0.rc0.172.g3f132b7071-goog