From: Roman Penyaev
Cc: Roman Penyaev, Andrew Morton, Davidlohr Bueso, Jason Baron, Al Viro, "Paul E. McKenney", Linus Torvalds, Andrea Parri, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v2 07/13] epoll: call ep_add_event_to_uring() from ep_poll_callback()
Date: Mon, 21 Jan 2019 21:14:50 +0100
Message-Id: <20190121201456.28338-8-rpenyaev@suse.de>
In-Reply-To: <20190121201456.28338-1-rpenyaev@suse.de>
References: <20190121201456.28338-1-rpenyaev@suse.de>
X-Mailing-List: linux-kernel@vger.kernel.org

ep_poll_callback() is called each time an fd calls wakeup() on epfd,
so account the new event in the user ring.

The tricky part here is EPOLLONESHOT.
Since we are lockless, we have to deal with ep_poll_callback() calls
running in parallel, so use cmpxchg to clear the public event bits and
filter out a concurrent call from another CPU.

Signed-off-by: Roman Penyaev
Cc: Andrew Morton
Cc: Davidlohr Bueso
Cc: Jason Baron
Cc: Al Viro
Cc: "Paul E. McKenney"
Cc: Linus Torvalds
Cc: Andrea Parri
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 fs/eventpoll.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 26d837252ba4..1d0039b334b8 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1406,6 +1406,29 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
 }
 #endif /* CONFIG_CHECKPOINT_RESTORE */
 
+/**
+ * Atomically clear public event bits and return %true if the old value has
+ * public event bits set.
+ */
+static inline bool ep_clear_public_event_bits(struct epitem *epi)
+{
+	__poll_t old, flags;
+
+	/*
+	 * Here we race with ourselves and with ep_modify(), which can
+	 * change the event bits.  In order not to override events updated
+	 * by ep_modify() we have to do cmpxchg.
+	 */
+
+	old = epi->event.events;
+	do {
+		flags = old;
+	} while ((old = cmpxchg(&epi->event.events, flags,
+				flags & EP_PRIVATE_BITS)) != flags);
+
+	return flags & ~EP_PRIVATE_BITS;
+}
+
 /**
  * Adds a new entry to the tail of the list in a lockless way, i.e.
  * multiple CPUs are allowed to call this function concurrently.
@@ -1525,6 +1548,20 @@ static int ep_poll_callback(struct epitem *epi, __poll_t pollflags)
 	if (pollflags && !(pollflags & epi->event.events))
 		goto out_unlock;
 
+	if (ep_polled_by_user(ep)) {
+		/*
+		 * For polled descriptor from user we have to disable events on
+		 * callback path in case of one-shot.
+		 */
+		if ((epi->event.events & EPOLLONESHOT) &&
+		    !ep_clear_public_event_bits(epi))
+			/* Race is lost, another callback has cleared events */
+			goto out_unlock;
+
+		ep_add_event_to_uring(epi, pollflags);
+		goto wakeup;
+	}
+
 	/*
 	 * If we are transferring events to userspace, we can hold no locks
 	 * (because we're accessing user memory, and because of linux f_op->poll()
@@ -1544,6 +1581,7 @@ static int ep_poll_callback(struct epitem *epi, __poll_t pollflags)
 		ep_pm_stay_awake_rcu(epi);
 	}
 
+wakeup:
 	/*
 	 * Wake up ( if active ) both the eventpoll wait list and the ->poll()
 	 * wait list.
-- 
2.19.1
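
For readers who want to try the one-shot filtering idea outside the kernel, here is a minimal userspace sketch of the same compare-and-swap loop. It is not kernel code and not part of the patch: it swaps the kernel's cmpxchg() for C11 atomic_compare_exchange_weak(), and every name in it (EV_IN, EV_OUT, EV_ONESHOT, EV_ET, PRIVATE_BITS, clear_public_bits, count_oneshot_winners) as well as the bit values are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for this sketch; not the kernel's values. */
#define EV_IN        0x001u      /* "public" event bit */
#define EV_OUT       0x004u      /* "public" event bit */
#define EV_ONESHOT   (1u << 30)  /* "private" control bit */
#define EV_ET        (1u << 31)  /* "private" control bit */
#define PRIVATE_BITS (EV_ONESHOT | EV_ET)

/*
 * Atomically clear every public bit in *events and report whether the
 * value that was replaced had any public bit set.
 */
static bool clear_public_bits(_Atomic uint32_t *events)
{
	uint32_t old = atomic_load(events);

	/* On failure the call reloads `old` with the current value; retry. */
	while (!atomic_compare_exchange_weak(events, &old, old & PRIVATE_BITS))
		;

	return (old & ~PRIVATE_BITS) != 0;
}

/*
 * Models two callbacks hitting the same one-shot item one after the
 * other and returns how many of them would go on to report the event.
 */
static int count_oneshot_winners(void)
{
	_Atomic uint32_t events = EV_IN | EV_ONESHOT;
	int winners = 0;

	winners += clear_public_bits(&events);	/* sees EV_IN, wins */
	winners += clear_public_bits(&events);	/* sees no public bits */

	/* Only the private bit survives the clearing. */
	assert(atomic_load(&events) == EV_ONESHOT);
	return winners;
}
```

Only the caller whose swap replaced a value with public bits still set gets true back, which is the property the callback path relies on to drop the losing side of the race. Because the loop retries with the freshly observed value, a concurrent writer that changes the bits between the load and the swap is not silently overwritten.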