2020-10-28 22:01:59

by Soheil Hassas Yeganeh

[permalink] [raw]
Subject: [PATCH 1/2] epoll: check ep_events_available() upon timeout

From: Soheil Hassas Yeganeh <[email protected]>

After abc610e01c66, we break out of the ep_poll loop upon timeout,
without checking whether there is any new events available. Prior to
that patch-series we always called ep_events_available() after
exiting the loop.

This can cause races and missed wakeups. For example, consider
the following scenario reported by Guantao Liu:

Suppose we have an eventfd added using EPOLLET to an epollfd.

Thread 1: Sleeps for just below 5ms and then writes to an eventfd.
Thread 2: Calls epoll_wait with a timeout of 5 ms. If it sees an
event of the eventfd, it will write back on that fd.
Thread 3: Calls epoll_wait with a negative timeout.

Prior to abc610e01c66, it is guaranteed that Thread 3 will wake up
either by Thread 1 or Thread 2. After abc610e01c66, Thread 3 can
be blocked indefinitely if Thread 2 sees a timeout right before
the write to the eventfd by Thread 1. Thread 2 will be woken up from
schedule_hrtimeout_range and, with evail 0, it will not call
ep_send_events().

To fix this issue, while holding the lock, try to remove the thread that
timed out the wait queue and check whether it was woken up or not.

Fixes: abc610e01c66 ("fs/epoll: avoid barrier after an epoll_wait(2) timeout")
Reported-by: Guantao Liu <[email protected]>
Tested-by: Guantao Liu <[email protected]>
Signed-off-by: Soheil Hassas Yeganeh <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Reviewed-by: Khazhismel Kumykov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
---
fs/eventpoll.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 4df61129566d..11388436b85a 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1907,7 +1907,21 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,

if (!schedule_hrtimeout_range(to, slack, HRTIMER_MODE_ABS)) {
timed_out = 1;
- break;
+ __set_current_state(TASK_RUNNING);
+ /*
+ * Acquire the lock and try to remove this thread from
+ * the wait queue. If this thread is not on the wait
+ * queue, it has woken up after its timeout ended
+ * before it could re-acquire the lock. In that case,
+ * try to harvest some events.
+ */
+ write_lock_irq(&ep->lock);
+ if (!list_empty(&wait.entry))
+ __remove_wait_queue(&ep->wq, &wait);
+ else
+ eavail = 1;
+ write_unlock_irq(&ep->lock);
+ goto send_events;
}

/* We were woken up, thus go and try to harvest some events */
--
2.29.0.rc2.309.g374f81d7ae-goog


2020-10-29 08:44:32

by Soheil Hassas Yeganeh

[permalink] [raw]
Subject: [PATCH 2/2] epoll: add a selftest for epoll timeout race

From: Soheil Hassas Yeganeh <[email protected]>

Add a test case to ensure an event is observed by at least one
poller when an epoll timeout is used.

Signed-off-by: Guantao Liu <[email protected]>
Signed-off-by: Soheil Hassas Yeganeh <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Reviewed-by: Khazhismel Kumykov <[email protected]>
---
.../filesystems/epoll/epoll_wakeup_test.c | 95 +++++++++++++++++++
1 file changed, 95 insertions(+)

diff --git a/tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c b/tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
index d979ff14775a..8f82f99f7748 100644
--- a/tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
+++ b/tools/testing/selftests/filesystems/epoll/epoll_wakeup_test.c
@@ -3282,4 +3282,99 @@ TEST(epoll60)
close(ctx.epfd);
}

+struct epoll61_ctx {
+ int epfd;
+ int evfd;
+};
+
+static void *epoll61_write_eventfd(void *ctx_)
+{
+ struct epoll61_ctx *ctx = ctx_;
+ int64_t l = 1;
+
+ usleep(10950);
+ write(ctx->evfd, &l, sizeof(l));
+ return NULL;
+}
+
+static void *epoll61_epoll_with_timeout(void *ctx_)
+{
+ struct epoll61_ctx *ctx = ctx_;
+ struct epoll_event events[1];
+ int n;
+
+ n = epoll_wait(ctx->epfd, events, 1, 11);
+ /*
+ * If epoll returned the eventfd, write on the eventfd to wake up the
+ * blocking poller.
+ */
+ if (n == 1) {
+ int64_t l = 1;
+
+ write(ctx->evfd, &l, sizeof(l));
+ }
+ return NULL;
+}
+
+static void *epoll61_blocking_epoll(void *ctx_)
+{
+ struct epoll61_ctx *ctx = ctx_;
+ struct epoll_event events[1];
+
+ epoll_wait(ctx->epfd, events, 1, -1);
+ return NULL;
+}
+
+TEST(epoll61)
+{
+ struct epoll61_ctx ctx;
+ struct epoll_event ev;
+ int i, r;
+
+ ctx.epfd = epoll_create1(0);
+ ASSERT_GE(ctx.epfd, 0);
+ ctx.evfd = eventfd(0, EFD_NONBLOCK);
+ ASSERT_GE(ctx.evfd, 0);
+
+ ev.events = EPOLLIN | EPOLLET | EPOLLERR | EPOLLHUP;
+ ev.data.ptr = NULL;
+ r = epoll_ctl(ctx.epfd, EPOLL_CTL_ADD, ctx.evfd, &ev);
+ ASSERT_EQ(r, 0);
+
+ /*
+ * We are testing a race. Repeat the test case 1000 times to make it
+ * more likely to fail in case of a bug.
+ */
+ for (i = 0; i < 1000; i++) {
+ pthread_t threads[3];
+ int n;
+
+ /*
+ * Start 3 threads:
+ * Thread 1 sleeps for 10.9ms and writes to the evenfd.
+ * Thread 2 calls epoll with a timeout of 11ms.
+ * Thread 3 calls epoll with a timeout of -1.
+ *
+ * The eventfd write by Thread 1 should either wakeup Thread 2
+ * or Thread 3. If it wakes up Thread 2, Thread 2 writes on the
+ * eventfd to wake up Thread 3.
+ *
+ * If no events are missed, all three threads should eventually
+ * be joinable.
+ */
+ ASSERT_EQ(pthread_create(&threads[0], NULL,
+ epoll61_write_eventfd, &ctx), 0);
+ ASSERT_EQ(pthread_create(&threads[1], NULL,
+ epoll61_epoll_with_timeout, &ctx), 0);
+ ASSERT_EQ(pthread_create(&threads[2], NULL,
+ epoll61_blocking_epoll, &ctx), 0);
+
+ for (n = 0; n < ARRAY_SIZE(threads); ++n)
+ ASSERT_EQ(pthread_join(threads[n], NULL), 0);
+ }
+
+ close(ctx.epfd);
+ close(ctx.evfd);
+}
+
TEST_HARNESS_MAIN
--
2.29.0.rc2.309.g374f81d7ae-goog