2019-01-08 10:26:51

by Roman Penyaev

[permalink] [raw]
Subject: [PATCH 1/1] epoll: remove wrong assert that ep_poll_callback is always called with irqs off

That was wrong assumption that all drivers disable irqs before waking up
a wait queue. Even assert line is removed the whole logic stays correct:
epoll always locks rwlock with irqs disabled and by itself does not call
from interrupts, thus it is up to driver how to call wake_up_locked(),
because if driver does not handle any interrupts (like fuse in the the
report) of course it is safe on its side to take a simple spin_lock.

Signed-off-by: Roman Penyaev <[email protected]>
Reported-by: [email protected]
Cc: Davidlohr Bueso <[email protected]>
Cc: Jason Baron <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
fs/eventpoll.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index f307c8679027..f5f88250cdf2 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1217,12 +1217,6 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
__poll_t pollflags = key_to_poll(key);
int ewake = 0;

- /*
- * Called by irq context or interrupts are disabled by the wake_up_*poll
- * callers.
- */
- lockdep_assert_irqs_disabled();
-
read_lock(&ep->lock);

ep_set_busy_poll_napi_id(epi);
--
2.19.1



2019-01-08 12:48:00

by Roman Penyaev

[permalink] [raw]
Subject: Re: [PATCH 1/1] epoll: remove wrong assert that ep_poll_callback is always called with irqs off

On 2019-01-08 11:01, Roman Penyaev wrote:
> That was wrong assumption that all drivers disable irqs before waking
> up
> a wait queue. Even assert line is removed the whole logic stays
> correct:
> epoll always locks rwlock with irqs disabled and by itself does not
> call
> from interrupts, thus it is up to driver how to call wake_up_locked(),
> because if driver does not handle any interrupts (like fuse in the the
> report) of course it is safe on its side to take a simple spin_lock.

This is wrong and can lead to dead lock: we always call read_lock(),
caller
can call us with irqs enabled. Another driver, which also calls
ep_poll_callback(), can be called from interrupt context (irqs disabled)
thus it can interrupt the one who does not disable irqs. Even we take
a read_lock() (which should be fine to interrupt), write_lock(), which
comes in the middle, can cause a dead lock:

#CPU0 #CPU1

task: task: irq:

spin_lock(&wq1->lock);
ep_poll_callback():
read_lock(&ep->lock)
....
write_lock_irq(&ep->lock) ....
#waits reads .... >>>>>>>>>>>>>> IRQ CPU1

spin_lock_irqsave(&wq2->lock)

ep_poll_callback():

read_lock(&ep->lock);
# to avoid write
starve should
# wait writer to
finish, thus
# dead lock


What we can do:

a) disable irqs if we are not in interrupt.
b) revert the patch completely.

David, is it really crucial in terms of performance to avoid double
local_irq_save() on Xen on this ep_poll_callback() hot path?

For example why not to do the following:

if (!in_interrupt())
local_irq_save(flags);
read_lock(ep->lock);

with huge comment explaining performance number.

Or just give up and simply revert the original patch completely
and always call read_lock_irqsave().

--
Roman


2019-01-08 15:19:35

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [PATCH 1/1] epoll: remove wrong assert that ep_poll_callback is always called with irqs off

On 2019-01-08 04:42, Roman Penyaev wrote:
> What we can do:
>
> a) disable irqs if we are not in interrupt.
> b) revert the patch completely.
>
> David, is it really crucial in terms of performance to avoid double
> local_irq_save() on Xen on this ep_poll_callback() hot path?

Note that such optimizations are also relevant for baremetal, ie: x86
PUSHF + POPF can be pretty expensive because of insn dependencies.

>
> For example why not to do the following:
>
> if (!in_interrupt())
> local_irq_save(flags);
> read_lock(ep->lock);
>
> with huge comment explaining performance number.
>
> Or just give up and simply revert the original patch completely
> and always call read_lock_irqsave().

Yeah so the reason why I had done the other epoll lock irq optimizations
was because they were painfully obvious. ep_poll_callback(), however is
a different beast, as you've encountered. I vote for not shooting
ourselves in the foot and just dropping this patch -- most large
performance benefits will come from microbenches anyway.

Thanks,
Davidlohr

2019-01-08 16:11:25

by Roman Penyaev

[permalink] [raw]
Subject: Re: [PATCH 1/1] epoll: remove wrong assert that ep_poll_callback is always called with irqs off

On 2019-01-08 16:16, Davidlohr Bueso wrote:
> On 2019-01-08 04:42, Roman Penyaev wrote:
>> What we can do:
>>
>> a) disable irqs if we are not in interrupt.
>> b) revert the patch completely.
>>
>> David, is it really crucial in terms of performance to avoid double
>> local_irq_save() on Xen on this ep_poll_callback() hot path?
>
> Note that such optimizations are also relevant for baremetal, ie: x86
> PUSHF + POPF can be pretty expensive because of insn dependencies.
>
>>
>> For example why not to do the following:
>>
>> if (!in_interrupt())
>> local_irq_save(flags);
>> read_lock(ep->lock);
>>
>> with huge comment explaining performance number.
>>
>> Or just give up and simply revert the original patch completely
>> and always call read_lock_irqsave().
>
> Yeah so the reason why I had done the other epoll lock irq
> optimizations was because they were painfully obvious.
> ep_poll_callback(), however is a different beast, as you've
> encountered. I vote for not shooting ourselves in the foot and just
> dropping this patch -- most large performance benefits will come from
> microbenches anyway.

Then I will send another patch which changes read_lock() on
read_lock_irqsave(),
since simple revert of the original patch won't work.

Thanks.

--
Roman