From: hev
To: linux-fsdevel@vger.kernel.org
Cc: Heiher, Al Viro, Andrew Morton, Davide Libenzi, Davidlohr Bueso,
    Dominik Brodowski, Eric Wong, Jason Baron, Linus Torvalds,
    Roman Penyaev, Sridhar Samudrala, linux-kernel@vger.kernel.org
Subject: [PATCH RESEND v3] fs/epoll: Remove unnecessary wakeups of nested epoll that in ET mode
Date: Tue, 24 Sep 2019 22:26:54 +0800
Message-Id: <20190924142654.5742-1-r@hev.cc>

From: Heiher

Take the case where we have:

        t0
         | (ew)
        e0
         | (et)
        e1
         | (lt)
        s0

t0: thread 0
e0: epoll fd 0
e1: epoll fd 1
s0: socket fd 0
ew: epoll_wait
et: edge-trigger
lt: level-trigger

When s0 fires an event, e1 catches the event, and then e0 catches an event
from e1. After this, if thread t0 calls epoll_wait() on e0 many times, it
should get only one event in total, because e1 was added to e0 in
edge-triggered mode.

This patch only allows the wakeup(&ep->poll_wait) in ep_scan_ready_list()
if one of the following conditions is met:

 1. depth == 0.
 2. An event was added to ep->ovflist during processing.

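For reference, the expectation above mirrors what already happens in the
non-nested case. The following is a minimal userspace sketch, assuming a
single epoll fd watching a socket directly; the helper name
count_ready_reports() and its parameters are illustrative only and are not
part of this patch.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: register one end of a
 * socketpair on a single epoll fd, make it readable once, then call
 * epoll_wait() `rounds` times without ever reading the data.
 * Returns the total number of events reported, or -1 on failure.
 */
static int count_ready_reports(int edge_triggered, int rounds)
{
	struct epoll_event e = { 0 };
	struct epoll_event out;
	uint32_t events = EPOLLIN;
	int sfd[2];
	int efd;
	int reports = -1;
	int i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sfd) < 0)
		return -1;

	efd = epoll_create1(0);
	if (efd < 0)
		goto out_sock;

	if (edge_triggered)
		events |= EPOLLET;
	e.events = events;
	e.data.fd = sfd[0];
	if (epoll_ctl(efd, EPOLL_CTL_ADD, sfd[0], &e) < 0)
		goto out_epoll;

	/* One byte makes sfd[0] readable; it is never drained. */
	if (write(sfd[1], "w", 1) != 1)
		goto out_epoll;

	reports = 0;
	for (i = 0; i < rounds; i++) {
		/* Timeout 0: return immediately with whatever is ready. */
		int n = epoll_wait(efd, &out, 1, 0);

		if (n < 0) {
			reports = -1;
			break;
		}
		reports += n;
	}

out_epoll:
	close(efd);
out_sock:
	close(sfd[0]);
	close(sfd[1]);
	return reports;
}
```

With rounds = 3, the edge-triggered registration reports the event once,
while the level-triggered registration reports it on every call. The nested
topology described above is expected to behave like the edge-triggered case
when polled through e0.
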
Test code:

    #include <unistd.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>

    int main(int argc, char *argv[])
    {
    	int sfd[2];
    	int efd[2];
    	struct epoll_event e;

    	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sfd) < 0)
    		goto out;

    	efd[0] = epoll_create(1);
    	if (efd[0] < 0)
    		goto out;

    	efd[1] = epoll_create(1);
    	if (efd[1] < 0)
    		goto out;

    	/* s0 is added to e1 in level-triggered mode */
    	e.events = EPOLLIN;
    	if (epoll_ctl(efd[1], EPOLL_CTL_ADD, sfd[0], &e) < 0)
    		goto out;

    	/* e1 is added to e0 in edge-triggered mode */
    	e.events = EPOLLIN | EPOLLET;
    	if (epoll_ctl(efd[0], EPOLL_CTL_ADD, efd[1], &e) < 0)
    		goto out;

    	/* Fire a single event on s0 */
    	if (write(sfd[1], "w", 1) != 1)
    		goto out;

    	/* The first wait on e0 must report the event ... */
    	if (epoll_wait(efd[0], &e, 1, 0) != 1)
    		goto out;

    	/* ... and the second wait must report nothing */
    	if (epoll_wait(efd[0], &e, 1, 0) != 0)
    		goto out;

    	close(efd[0]);
    	close(efd[1]);
    	close(sfd[0]);
    	close(sfd[1]);

    	return 0;

    out:
    	return -1;
    }

More tests:
 https://github.com/heiher/epoll-wakeup

Cc: Al Viro
Cc: Andrew Morton
Cc: Davide Libenzi
Cc: Davidlohr Bueso
Cc: Dominik Brodowski
Cc: Eric Wong
Cc: Jason Baron
Cc: Linus Torvalds
Cc: Roman Penyaev
Cc: Sridhar Samudrala
Cc: linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: hev
---
 fs/eventpoll.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index c4159bcc05d9..a05249400901 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -685,6 +685,9 @@ static __poll_t ep_scan_ready_list(struct eventpoll *ep,
 	if (!ep_locked)
 		mutex_lock_nested(&ep->mtx, depth);
 
+	if (!depth || list_empty_careful(&ep->rdllist))
+		pwake++;
+
 	/*
 	 * Steal the ready list, and re-init the original one to the
 	 * empty list. Also, set ep->ovflist to NULL so that events
@@ -704,12 +707,21 @@ static __poll_t ep_scan_ready_list(struct eventpoll *ep,
 	res = (*sproc)(ep, &txlist, priv);
 
 	write_lock_irq(&ep->lock);
+	nepi = READ_ONCE(ep->ovflist);
+	/*
+	 * We only need to wakeup nested epoll fds if something has been queued
+	 * to the overflow list, since the ep_poll() traverses the rdllist
+	 * during recursive poll and thus events on the overflow list may not be
+	 * visible yet.
+	 */
+	if (nepi != NULL)
+		pwake++;
 	/*
 	 * During the time we spent inside the "sproc" callback, some
 	 * other events might have been queued by the poll callback.
 	 * We re-insert them inside the main ready-list here.
 	 */
-	for (nepi = READ_ONCE(ep->ovflist); (epi = nepi) != NULL;
+	for (; (epi = nepi) != NULL;
 	     nepi = epi->next, epi->next = EP_UNACTIVE_PTR) {
 		/*
 		 * We need to check if the item is already in the list.
@@ -755,7 +767,7 @@ static __poll_t ep_scan_ready_list(struct eventpoll *ep,
 	mutex_unlock(&ep->mtx);
 
 	/* We have to call this outside the lock */
-	if (pwake)
+	if (pwake == 2)
 		ep_poll_safewake(&ep->poll_wait);
 
 	return res;
-- 
2.23.0
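
Read together, the hunks in this patch gate ep_poll_safewake() on a counter
that is bumped in two places. The sketch below models only that counter
logic in userspace, as it appears in the diff; the function name
model_should_wake() and its boolean parameters are hypothetical stand-ins
for kernel state, not kernel APIs.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the pwake counter in
 * ep_scan_ready_list() after this patch. Not kernel code.
 *
 * depth:                  nesting depth passed to ep_scan_ready_list()
 * rdllist_empty_at_entry: models list_empty_careful(&ep->rdllist)
 *                         evaluated before the ready list is stolen
 * ovflist_queued:         models ep->ovflist being non-NULL after the
 *                         sproc callback has run
 */
static bool model_should_wake(int depth, bool rdllist_empty_at_entry,
			      bool ovflist_queued)
{
	int pwake = 0;

	/* First hunk: top-level scan, or nothing was ready at entry. */
	if (depth == 0 || rdllist_empty_at_entry)
		pwake++;

	/* Second hunk: something was queued to the overflow list. */
	if (ovflist_queued)
		pwake++;

	/* Third hunk: wake poll_wait only when both checks passed. */
	return pwake == 2;
}
```

In this model, a nested scan (depth > 0) that entered with a non-empty ready
list never wakes ep->poll_wait, and a scan during which nothing was queued
to the overflow list does not wake it either.
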