2020-03-28 18:14:18

by Omar Kilani

Subject: Weird issue with epoll and kernel >= 5.0

Hi there,

I've observed an issue with epoll and kernels 5.0 and above when a
system is generating a lot of epoll events.

I see this issue with nginx and jvm / netty based apps (using the
jvm's native epoll support as well as netty's own optimized epoll
support) but *not* with haproxy (?).

I'm not really sure what the actual problem is (nginx complains about
epoll_wait with a generic error), but it doesn't happen on 4.19.x and
lower.

I thought it was a netty problem at first and opened this ticket:

https://github.com/netty/netty/issues/8999

But then saw the same issue in nginx.

I haven't debugged a kernel issue in something like 20 years so I'm
not really sure where to start myself.

I'd be more than happy to provide my test case that has a very quick
repro to anyone who needs it.

Also happy to provide a VM/machine with enough CPUs to trigger it
easily (it seems to happen quicker with more CPUs present) to test
with.

Thanks!

Regards,
Omar


2020-03-28 19:24:16

by Randy Dunlap

Subject: Re: Weird issue with epoll and kernel >= 5.0

On 3/28/20 11:10 AM, Omar Kilani wrote:
> Hi there,
>
> I've observed an issue with epoll and kernels 5.0 and above when a
> system is generating a lot of epoll events.
>
> I see this issue with nginx and jvm / netty based apps (using the
> jvm's native epoll support as well as netty's own optimized epoll
> support) but *not* with haproxy (?).
>
> I'm not really sure what the actual problem is (nginx complains about
> epoll_wait with a generic error), but it doesn't happen on 4.19.x and
> lower.
>
> I thought it was a netty problem at first and opened this ticket:
>
> https://github.com/netty/netty/issues/8999
>
> But then saw the same issue in nginx.
>
> I haven't debugged a kernel issue in something like 20 years so I'm
> not really sure where to start myself.
>
> I'd be more than happy to provide my test case that has a very quick
> repro to anyone who needs it.

Hi,
Please do.

> Also happy to provide a VM/machine with enough CPUs to trigger it
> easily (it seems to happen quicker with more CPUs present) to test
> with.


There have been around 10 changes in fs/eventpoll.c since v5.0 was
released in March, 2019, so it would be helpful if you could test
the latest mainline kernel to see if the problem is still present.

Hm, it looks like you have identified this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.1-rc5&id=c5a282e9635e9c7382821565083db5d260085e3e
as the/a problem.

I have Cc-ed the patch author also.

--
~Randy

2020-03-28 19:34:49

by Kris Karas

Subject: Re: Weird issue with epoll and kernel >= 5.0

Hi Omar -

Omar Kilani wrote:
> I've observed an issue with epoll and kernels 5.0 and above when a
> system is generating a lot of epoll events.

It's tough getting an audience here on LK, unless you happen to
cross-post to one of the very specific mailing lists (or directly to
individuals) who maintain a particular subsystem.  I just rejoined LK
after a 15 year hiatus; it's a rather different neighborhood. *cough* 
But I digress.

My first question on your issue is, after booting, how long does it take
to show up?  Can you reproduce it fairly easily?  If the answer is
"Yes," and you seem to be able to afford the resources for a multi-CPU
VM, then making a bisecting run would likely narrow it down to something
you could then email directly to the relevant maintainer.

If you are not already familiar with bisecting, the git documentation on
bisecting is the definitive reference:
https://git-scm.com/docs/git-bisect
But it is quite long.  You can usually find a more abbreviated version
by googling for "<yourdistroname> git bisect"; the ones for Arch and
Ubuntu are pretty decent.

Once you know which file is relevant, you can run:

$ ./scripts/get_maintainer.pl --file drivers/foo/bar.c

... or whatever "git bisect" says is the culprit file, and use the
script output as the To: for your email message.

Good luck!
Kris
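
For reference, the bisection suggested above can be sketched roughly as
follows. This is only a sketch under assumptions: it expects to run inside
a git checkout, and it expects a test command that exits 0 on a good tree
and non-zero on a bad one. bisect_with_script is a hypothetical helper
name, not part of git.

```shell
# Hypothetical helper: bisect between a known-good and a known-bad ref,
# letting "git bisect run" drive a test command.
# Usage: bisect_with_script <good-ref> <bad-ref> <test command...>
bisect_with_script() {
    good=$1
    bad=$2
    shift 2

    # "git bisect start <bad> <good>" marks both endpoints in one step.
    git bisect start "$bad" "$good" || return 1

    # The test command must exit 0 for good and 1-127 (except 125) for bad;
    # exit code 125 tells git to skip a commit that cannot be tested.
    git bisect run "$@"
    status=$?

    # Return the working tree to where it was before bisecting.
    git bisect reset
    return "$status"
}
```

For the report in this thread the call might look like
"bisect_with_script v4.19 v5.0 ./run-repro.sh", where run-repro.sh is a
hypothetical script and the two tags are assumed from "fine on 4.19.x,
broken on 5.0 and above". For a kernel, each step normally means a build,
a boot and a run of the reproducer, so the steps are often driven by hand
with "git bisect good" and "git bisect bad" instead of "git bisect run".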

2020-03-29 12:17:35

by David Laight

Subject: RE: Weird issue with epoll and kernel >= 5.0

From: Randy Dunlap
> Sent: 28 March 2020 19:22
...
> There have been around 10 changes in fs/eventpoll.c since v5.0 was
> released in March, 2019, so it would be helpful if you could test
> the latest mainline kernel to see if the problem is still present.

Is there any info about the scenarios that the fixes affect?
We've an application that can use epoll() or poll() and I wonder
if I should not default to epoll() on 5.0+ kernels that might be dodgy.

It rather depends whether wakeups just get lost - but the next
rx data will wake things up, or whether the linked lists get
completely hosed and 'all hell' breaks out (or doesn't).

In our case there is only one reader and the fd are all
UDP sockets (added and removed when the socket is created/closed).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
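
If an application (or the script that launches it) did want to avoid
epoll on suspect kernels, one rough approach is to compare the release
string from "uname -r" against a threshold. A minimal sketch follows;
kernel_at_least is a hypothetical helper, and the 5.0 threshold is only
the range named in this thread, not a confirmed list of affected or fixed
versions.

```shell
# Hypothetical helper: succeed if a kernel release string such as
# "5.4.0-42-generic" is at least <major>.<minor>.
# Usage: kernel_at_least <release> <major> <minor>
kernel_at_least() {
    release=$1
    want_major=$2
    want_minor=$3

    # Text before the first dot is the major version.
    major=${release%%.*}
    # Drop "<major>." and keep only the leading digits of what follows.
    rest=${release#*.}
    minor=${rest%%[!0-9]*}

    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Example decision; the threshold is an assumption taken from the thread.
if kernel_at_least "$(uname -r)" 5 0; then
    echo "kernel is 5.0 or later: consider defaulting to poll()"
else
    echo "kernel is older than 5.0: epoll is not implicated by this report"
fi
```

A version check like this cannot tell whether a distribution kernel has
backported a fix, so it is at best a coarse default that an administrator
should be able to override.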

2020-03-29 16:02:21

by Randy Dunlap

Subject: Re: Weird issue with epoll and kernel >= 5.0

On 3/29/20 5:09 AM, David Laight wrote:
> From: Randy Dunlap
>> Sent: 28 March 2020 19:22
> ...
>> There have been around 10 changes in fs/eventpoll.c since v5.0 was
>> released in March, 2019, so it would be helpful if you could test
>> the latest mainline kernel to see if the problem is still present.
>
> Is there any info about the scenarios that the fixes affect?
> We've an application that can use epoll() or poll() and I wonder
> if I should not default to epoll() on 5.0+ kernels that might be dodgy.

5.0 was released on 2019-03-03. The following patches have been merged
since then.

$ git log --oneline fs/eventpoll.c | more   ### latest patches first
1b53734bd0b2 epoll: fix possible lost wakeup on epoll_ctl() path
39220e8d4a2a eventpoll: support non-blocking do_epoll_ctl() calls
58e41a44c488 eventpoll: abstract out epoll_ctl() handler
339ddb53d373 fs/epoll: remove unnecessary wakeups of nested epoll
f6520c520842 epoll: simplify ep_poll_safewake() for CONFIG_DEBUG_LOCK_ALLOC
c8377adfa781 PM / wakeup: Show wakeup sources stats in sysfs
eec4844fae7c proc/sysctl: add shared variables for range check
b772434be089 signal: simplify set_user_sigmask/restore_user_sigmask
97abc889ee29 signal: remove the wrong signal_pending() check in restore_user_sigmask()
2874c5fd2842 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
a218cc491420 epoll: use rwlock in order to reduce ep_poll_callback() contention
c3e320b61581 epoll: unify awaking of wakeup source on ep_poll_callback() path
c141175d011f epoll: make sure all elements in ready list are in FIFO order
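
To limit such a listing to commits made after a given release, a revision
range can be passed to git log. A minimal sketch, assuming it is run
inside a kernel git tree that has the release tag; commits_since is a
hypothetical helper name:

```shell
# Hypothetical helper: list commits that touched <path> after <tag>,
# newest first, one line per commit.
# Usage: commits_since <tag> <path>
commits_since() {
    tag=$1
    path=$2
    # "<tag>.." means "reachable from HEAD but not from <tag>".
    git log --oneline "$tag".. -- "$path"
}
```

In a mainline checkout this would be invoked as
"commits_since v5.0 fs/eventpoll.c".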


> It rather depends whether wakeups just get lost - but the next
> rx data will wake things up, or whether the linked lists get
> completely hosed and 'all hell' breaks out (or doesn't).
>
> In our case there is only one reader and the fd are all
> UDP sockets (added and removed when the socket is created/closed).


--
~Randy

2020-03-31 18:13:45

by Davidlohr Bueso

Subject: Re: Weird issue with epoll and kernel >= 5.0

On Sat, 28 Mar 2020, Randy Dunlap wrote:

>On 3/28/20 11:10 AM, Omar Kilani wrote:
>> Hi there,
>>
>> I've observed an issue with epoll and kernels 5.0 and above when a
>> system is generating a lot of epoll events.
>>
>> I see this issue with nginx and jvm / netty based apps (using the
>> jvm's native epoll support as well as netty's own optimized epoll
>> support) but *not* with haproxy (?).
>>
>> I'm not really sure what the actual problem is (nginx complains about
>> epoll_wait with a generic error), but it doesn't happen on 4.19.x and
>> lower.
>>
>> I thought it was a netty problem at first and opened this ticket:
>>
>> https://github.com/netty/netty/issues/8999
>>
>> But then saw the same issue in nginx.
>>
>> I haven't debugged a kernel issue in something like 20 years so I'm
>> not really sure where to start myself.
>>
>> I'd be more than happy to provide my test case that has a very quick
>> repro to anyone who needs it.
>
>Hi,
>Please do.
>
>> Also happy to provide a VM/machine with enough CPUs to trigger it
>> easily (it seems to happen quicker with more CPUs present) to test
>> with.

Yeah, more than a VM, an actual reproducer would be much welcome here.

>
>
>There have been around 10 changes in fs/eventpoll.c since v5.0 was
>released in March, 2019, so it would be helpful if you could test
>the latest mainline kernel to see if the problem is still present.
>
>Hm, it looks like you have identified this commit:
>https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.1-rc5&id=c5a282e9635e9c7382821565083db5d260085e3e
>as the/a problem.

Has this been bisected down to this commit? As you mention, there are more
commits in there that are dependent on each other, so I'd like
to be certain this is actually the broken change.

Thanks,
Davidlohr