2022-08-23 12:19:14

by Greg KH

Subject: [PATCH 5.4 051/389] epoll: autoremove wakers even more aggressively

From: Benjamin Segall <[email protected]>

commit a16ceb13961068f7209e34d7984f8e42d2c06159 upstream.

If a process is killed or otherwise exits while having active network
connections and many threads waiting on epoll_wait, the threads will all
be woken immediately, but not removed from ep->wq. Then when network
traffic scans ep->wq in wake_up, every wakeup attempt will fail, and will
not remove the entries from the list.

This means that the cost of the wakeup attempt is far higher than usual,
does not decrease, and this also competes with the dying threads trying to
actually make progress and remove themselves from the wq.

Handle this by removing visited epoll wq entries unconditionally, rather
than only when the wakeup succeeds - the structure of ep_poll means that
the only potential loss is the timed_out->eavail heuristic, which now can
race and result in a redundant ep_send_events attempt. (But only when
incoming data and a timeout actually race, not on every timeout)

Shakeel added:

: We are seeing this issue in production with real workloads and it has
: caused hard lockups. Particularly network heavy workloads with a lot
: of threads in epoll_wait() can easily trigger this issue if they get
: killed (oom-killed in our case).

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ben Segall <[email protected]>
Tested-by: Shakeel Butt <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Roman Penyaev <[email protected]>
Cc: Jason Baron <[email protected]>
Cc: Khazhismel Kumykov <[email protected]>
Cc: Heiher <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
 fs/eventpoll.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1803,6 +1803,21 @@ static inline struct timespec64 ep_set_m
 	return timespec64_add_safe(now, ts);
 }
 
+/*
+ * autoremove_wake_function, but remove even on failure to wake up, because we
+ * know that default_wake_function/ttwu will only fail if the thread is already
+ * woken, and in that case the ep_poll loop will remove the entry anyways, not
+ * try to reuse it.
+ */
+static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
+				       unsigned int mode, int sync, void *key)
+{
+	int ret = default_wake_function(wq_entry, mode, sync, key);
+
+	list_del_init(&wq_entry->entry);
+	return ret;
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller supplied
  *           event buffer.
@@ -1880,8 +1895,15 @@ fetch_events:
 		 * normal wakeup path no need to call __remove_wait_queue()
 		 * explicitly, thus ep->lock is not taken, which halts the
 		 * event delivery.
+		 *
+		 * In fact, we now use an even more aggressive function that
+		 * unconditionally removes, because we don't reuse the wait
+		 * entry between loop iterations. This lets us also avoid the
+		 * performance issue if a process is killed, causing all of its
+		 * threads to wake up without being removed normally.
 		 */
 		init_wait(&wait);
+		wait.func = ep_autoremove_wake_function;
 		write_lock_irq(&ep->lock);
 		__add_wait_queue_exclusive(&ep->wq, &wait);
 		write_unlock_irq(&ep->lock);



2022-10-26 16:48:59

by mdecandia

Subject: [PATCH 5.4 051/389] epoll: autoremove wakers even more aggressively



Hi all,

I'm seeing the runc command hang during container startup on Ubuntu 20.04
after adding just this patch to my up-to-date Linux kernel 5.4.210.

The hung runc process exits if I run strace on it. The strace_runc_hangup.log
file, with more details, can be found here:

https://github.com/opencontainers/runc/issues/3641

With previous docker-ce/containerd.io releases, the runc process simply hangs
and stays stuck even when I attach strace to it.

Any ideas, or further tests I can run?

Thanks,
Michele

2022-10-26 17:21:18

by Greg KH

Subject: Re: [PATCH 5.4 051/389] epoll: autoremove wakers even more aggressively

On Wed, Oct 26, 2022 at 06:00:51PM +0200, [email protected] wrote:
>
> Subject: [PATCH 5.4 051/389] epoll: autoremove wakers even more aggressively
>
> Hi all,
> I'm facing an hangup of runc command during startup of containers on Ubuntu 20.04,
> just adding this patch to my updated linux kernel 5.4.210.

I do not understand what you mean by this, sorry.

What kernel causes problems?

What commit causes issues?

What commit fixed the issue?

confused,

greg k-h

2022-10-26 19:01:50

by Shakeel Butt

Subject: Re: [PATCH 5.4 051/389] epoll: autoremove wakers even more aggressively

On Wed, Oct 26, 2022 at 11:44 AM Michele Jr De Candia
<[email protected]> wrote:
>
> Hi Greg,
> sorry for the confusion.
>
> I'm running a container-based app on top of Ubuntu Linux 20.04 with a Linux 5.4 kernel that is always kept up to date with the latest patches.
>
> After updating from 5.4.210 to 5.4.211 we hit the hang, and while searching for the cause we found that
> the hang occurs only with this patch:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.4.y&id=cf2db24ec4b8e9d399005ececd6f6336916ab6fc
>
> While we work out the root cause, we have reverted it for the moment and the hang no longer occurs (we are actually running 5.4.219 without that patch).
>
> Michele
>

Hi Michele, can you try the latest upstream kernel and see if the
issue reproduces there? Also, is it possible to provide a simplified
reproducer for the issue?

Shakeel

2022-10-27 10:25:02

by Greg KH

Subject: Re: [PATCH 5.4 051/389] epoll: autoremove wakers even more aggressively

On Wed, Oct 26, 2022 at 11:48:01AM -0700, Shakeel Butt wrote:
> On Wed, Oct 26, 2022 at 11:44 AM Michele Jr De Candia
> <[email protected]> wrote:
> >
> > Hi Greg,
> > sorry for the confusion.
> >
> > I'm running a container-based app on top of Ubuntu Linux 20.04 with a Linux 5.4 kernel that is always kept up to date with the latest patches.
> >
> > After updating from 5.4.210 to 5.4.211 we hit the hang, and while searching for the cause we found that
> > the hang occurs only with this patch:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.4.y&id=cf2db24ec4b8e9d399005ececd6f6336916ab6fc
> >
> > While we work out the root cause, we have reverted it for the moment and the hang no longer occurs (we are actually running 5.4.219 without that patch).
> >
> > Michele
> >
>
> Hi Michele, can you try the latest upstream kernel and see if the
> issue repro ther? Also is it possible to provide a simplified repro of
> the issue?

Also is this issue on 5.10.y and 5.15.y?

thanks,

greg k-h

2022-11-30 01:19:25

by Samuel Mendoza-Jonas

Subject: Re: [PATCH 5.4 051/389] epoll: autoremove wakers even more aggressively

On Thu, Oct 27, 2022 at 12:09:30PM +0200, Greg KH wrote:
> On Wed, Oct 26, 2022 at 11:48:01AM -0700, Shakeel Butt wrote:
> > On Wed, Oct 26, 2022 at 11:44 AM Michele Jr De Candia
> > <[email protected]> wrote:
> > >
> > > Hi Greg,
> > > sorry for the confusion.
> > >
> > > I'm running a container-based app on top of Ubuntu Linux 20.04 with a Linux 5.4 kernel that is always kept up to date with the latest patches.
> > >
> > > After updating from 5.4.210 to 5.4.211 we hit the hang, and while searching for the cause we found that
> > > the hang occurs only with this patch:
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.4.y&id=cf2db24ec4b8e9d399005ececd6f6336916ab6fc
> > >
> > > While we work out the root cause, we have reverted it for the moment and the hang no longer occurs (we are actually running 5.4.219 without that patch).
> > >
> > > Michele
> > >
> >
> > Hi Michele, can you try the latest upstream kernel and see if the
> > issue repro ther? Also is it possible to provide a simplified repro of
> > the issue?
>
> Also is this issue on 5.10.y and 5.15.y?
>
> thanks,
>
> greg k-h

Following up on this email thread for those of us who were watching it: it
looks like this was only an issue on 5.4.y, due to some missing
backports that this change depended on. See this thread for details:
https://lore.kernel.org/lkml/[email protected]/T/

Cheers,
Sam