Subject: Re: [PATCH] epoll: add exclusive wakeups flag
To: Jason Baron, Andrew Morton
References: <56A9C03B.7020104@gmail.com> <56AA56A2.3000700@akamai.com>
 <56AB1F6C.7000609@gmail.com> <56E1C2B5.2040905@akamai.com>
 <56E1D1D7.8040000@gmail.com> <56E1DBC2.6040109@akamai.com>
 <56E32FC5.4030902@akamai.com> <56E353CF.6050503@gmail.com>
 <56E6D0ED.20609@akamai.com> <56E6F941.9040307@gmail.com>
 <56E711C3.8020008@akamai.com>
Cc: mtk.manpages@gmail.com, mingo@kernel.org, peterz@infradead.org,
 viro@ftp.linux.org.uk, normalperson@yhbt.net, m@silodev.com,
 corbet@lwn.net, luto@amacapital.net, torvalds@linux-foundation.org,
 hagen@jauu.net, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org
From: "Michael Kerrisk (man-pages)"
Message-ID: <56E71894.4090607@gmail.com>
Date: Tue, 15 Mar 2016 09:01:24 +1300
In-Reply-To: <56E711C3.8020008@akamai.com>

Hi Jason,

On 03/15/2016 08:32 AM, Jason Baron wrote:
>
>
> On 03/14/2016 01:47 PM, Michael Kerrisk (man-pages) wrote:
>> [Restoring CC, which I see I accidentally dropped, one iteration back.]

[...]

>>>>     values in events yield an error.  EPOLLEXCLUSIVE may be
>>>>     used only in an EPOLL_CTL_ADD operation; attempts to
>>>>     employ it with EPOLL_CTL_MOD yield an error.
>>>>     If EPOLLEXCLUSIVE has been set using epoll_ctl(2), then a
>>>>     subsequent EPOLL_CTL_MOD on the same epfd, fd pair yields an
>>>>     error.  An epoll_ctl(2) that specifies EPOLLEXCLUSIVE in
>>>>     events and specifies the target file descriptor fd as an
>>>>     epoll instance will likewise fail.  The error in all of
>>>>     these cases is EINVAL.
>>>>
>>>> ERRORS
>>>>     EINVAL An invalid event type was specified along with
>>>>            EPOLLEXCLUSIVE in events.
>>>>
>>>>     EINVAL op was EPOLL_CTL_MOD and events included EPOLLEXCLUSIVE.
>>>>
>>>>     EINVAL op was EPOLL_CTL_MOD and the EPOLLEXCLUSIVE flag has
>>>>            previously been applied to this epfd, fd pair.
>>>>
>>>>     EINVAL EPOLLEXCLUSIVE was specified in event and fd refers
>>>>            to an epoll instance.
>>
>> Returning to the second sentence in this description:
>>
>>     When a wakeup event occurs and multiple epoll file
>>     descriptors are attached to the same target file using
>>     EPOLLEXCLUSIVE, one or more of the epoll file descriptors
>>     will receive an event with epoll_wait(2).
>>
>> There is a point that is unclear to me: what does "target file" refer to?
>> Is it an open file description (aka open file table entry) or an inode?
>> I suspect the former, but it was not clear in your original text.
>>
>
> So from epoll's perspective, the wakeups are associated with a 'wait
> queue'. So if the open() and subsequent EPOLL_CTL_ADD (which is done via
> file->poll()) results in adding to the same 'wait queue' then we will
> get 'exclusive' wakeup behavior.
>
> So in general, I think the answer here is that it's associated with the
> inode (I couldn't say with 100% certainty without really looking at all
> file->poll() implementations). Certainly, with the 'FIFO' example below,
> the two scenarios will have the same behavior with respect to
> EPOLLEXCLUSIVE.

So, in both scenarios, *one or more* processes will get a wakeup?
(I'll try to add something to the text to clarify the detail we're
discussing.)
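
One way to probe the "one or more" question empirically is a small test
program covering the two FIFO scenarios quoted at the end of this mail.
What follows is only a sketch, written in Python on top of select.epoll
rather than C; the names count_wakeups() and _child(), the timeout, and
the 0.2 second settle delay are assumptions made for illustration, and
the counts it reports may vary with kernel version and timing.

```python
import os
import select
import shutil
import tempfile
import time

# select.EPOLLEXCLUSIVE exists in Python 3.6+; 1 << 28 is the Linux value.
EPOLLEXCLUSIVE = getattr(select, "EPOLLEXCLUSIVE", 1 << 28)


def _child(fifo, inherited_fd, ready_w, result_w, timeout):
    """Child body (hypothetical helper): register, wait once, report, exit."""
    answer = b"0"
    try:
        ep = None
        try:
            fd = inherited_fd
            if fd is None:
                # First scenario: every process opens the read end itself.
                fd = os.open(fifo, os.O_RDONLY | os.O_NONBLOCK)
            ep = select.epoll()
            ep.register(fd, select.EPOLLIN | EPOLLEXCLUSIVE)
        except OSError:
            ep = None               # e.g. kernel without EPOLLEXCLUSIVE
        os.write(ready_w, b"r")     # tell the parent we are set up
        if ep is not None and ep.poll(timeout):
            answer = b"1"
        os.write(result_w, answer)
    finally:
        os._exit(0)                 # never return into the parent's code


def count_wakeups(inherit_open_file, nchildren=3, timeout=1.0):
    """Return how many children saw EPOLLIN after a single write to the FIFO.

    inherit_open_file=False: separate open() per child (first scenario).
    inherit_open_file=True:  one open file description shared via fork()
                             (second scenario).
    """
    tmpdir = tempfile.mkdtemp()
    fifo = os.path.join(tmpdir, "fifo")
    os.mkfifo(fifo)
    # The parent keeps a reader open so that opening the write end
    # below does not block.
    rfd = os.open(fifo, os.O_RDONLY | os.O_NONBLOCK)
    wfd = os.open(fifo, os.O_WRONLY)
    ready_r, ready_w = os.pipe()     # children -> parent: setup done
    result_r, result_w = os.pipe()   # children -> parent: b"1" or b"0"
    pids = []
    try:
        for _ in range(nchildren):
            pid = os.fork()
            if pid == 0:
                _child(fifo, rfd if inherit_open_file else None,
                       ready_w, result_w, timeout)
            pids.append(pid)
        for _ in range(nchildren):
            os.read(ready_r, 1)
        time.sleep(0.2)              # let the children block in epoll_wait()
        os.write(wfd, b"x")          # input becomes available on the FIFO
        woken = sum(os.read(result_r, 1) == b"1" for _ in range(nchildren))
        for pid in pids:
            os.waitpid(pid, 0)
        return woken
    finally:
        for fd in (rfd, wfd, ready_r, ready_w, result_r, result_w):
            os.close(fd)
        shutil.rmtree(tmpdir)


if __name__ == "__main__":
    print("separate opens:", count_wakeups(False))
    print("inherited open:", count_wakeups(True))
```

The children deliberately do not read from the FIFO, so a child that was
not picked for the wakeup simply times out in poll(). Comparing the two
counts shows whether sharing an open file description makes any
observable difference on a given kernel.
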
> Also, the 'non-exclusive' mode would be subject to the same question of
> which wait queue the epfd is associated with...

I'm not sure of the point you are trying to make here?

Cheers,

Michael

>> To make this point even clearer, here are two scenarios I'm thinking of.
>> In each case, we're talking of monitoring the read end of a FIFO.
>>
>> ===
>>
>> Scenario 1:
>>
>> We have three processes each of which
>> 1. Creates an epoll instance
>> 2. Opens the read end of the FIFO
>> 3. Adds the read end of the FIFO to the epoll instance, specifying
>>    EPOLLEXCLUSIVE
>>
>> When input becomes available on the FIFO, how many processes
>> get a wakeup?
>>
>> ===
>>
>> Scenario 2:
>>
>> A parent process opens the read end of a FIFO and then calls
>> fork() three times to create three children. Each child then:
>>
>> 1. Creates an epoll instance
>> 2. Adds the read end of the FIFO to the epoll instance, specifying
>>    EPOLLEXCLUSIVE
>>
>> When input becomes available on the FIFO, how many processes
>> get a wakeup?
>>
>> ===
>>
>> Cheers,
>>
>> Michael
>>
>

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
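
P.S. The EINVAL cases listed in the quoted ERRORS text earlier in this
mail can be exercised with a short script. This is a minimal sketch in
Python using select.epoll, where register() issues EPOLL_CTL_ADD and
modify() issues EPOLL_CTL_MOD. The helper names ctl_errno() and
exclusive_error_cases() are invented for illustration, and the choice of
EPOLLONESHOT as the "invalid event type" is an assumption based on the
set of flags the man-page text allows alongside EPOLLEXCLUSIVE.

```python
import errno
import os
import select

# select.EPOLLEXCLUSIVE exists in Python 3.6+; 1 << 28 is the Linux value.
EPOLLEXCLUSIVE = getattr(select, "EPOLLEXCLUSIVE", 1 << 28)


def ctl_errno(call, fd, events):
    """Run epoll.register or epoll.modify; return 0 on success, else errno."""
    try:
        call(fd, events)
        return 0
    except OSError as exc:
        return exc.errno


def exclusive_error_cases():
    """Try each case from the quoted ERRORS text; return {label: errno}."""
    r, w = os.pipe()
    ep = select.epoll()
    other = select.epoll()
    results = {}
    try:
        # A flag outside the permitted set, combined with EPOLLEXCLUSIVE
        # (EPOLLONESHOT is the assumed example of such a flag).
        results["add oneshot+exclusive"] = ctl_errno(
            ep.register, r,
            select.EPOLLIN | select.EPOLLONESHOT | EPOLLEXCLUSIVE)
        # The target fd is itself an epoll instance.
        results["add epoll fd exclusive"] = ctl_errno(
            ep.register, other.fileno(), select.EPOLLIN | EPOLLEXCLUSIVE)
        # A plain exclusive add on a pipe; this one should be accepted.
        results["add exclusive"] = ctl_errno(
            ep.register, r, select.EPOLLIN | EPOLLEXCLUSIVE)
        # EPOLL_CTL_MOD on an epfd, fd pair that was added exclusively.
        results["mod after exclusive add"] = ctl_errno(
            ep.modify, r, select.EPOLLIN)
        # EPOLL_CTL_MOD whose events include EPOLLEXCLUSIVE, on a pair
        # that was added without the flag.
        ep.register(w, select.EPOLLOUT)
        results["mod with exclusive"] = ctl_errno(
            ep.modify, w, select.EPOLLOUT | EPOLLEXCLUSIVE)
    finally:
        ep.close()
        other.close()
        os.close(r)
        os.close(w)
    return results


if __name__ == "__main__":
    for label, err in exclusive_error_cases().items():
        print(label, errno.errorcode.get(err, "OK"))
```

On a kernel that implements the flag as described in the quoted text, I
would expect every entry except "add exclusive" to come back as EINVAL.
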