Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp1072082ybt; Wed, 24 Jun 2020 19:19:19 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwQ29uAnj4GmjYKHdDJCj7BRXHGF7SnHG6VnrXaW26KJkQlzBDLhtHYWGflcqvpfJ4ClgIH X-Received: by 2002:a50:b941:: with SMTP id m59mr326001ede.321.1593051559350; Wed, 24 Jun 2020 19:19:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1593051559; cv=none; d=google.com; s=arc-20160816; b=SX12KodI/saxuviL36J4Uul2f7QDC29Htf90BYaDemWqL/8jSxzpHbbRyg4f4v1O+t YLd1V1WqoBYAtuYctxTIpI9Urh03cf4Q2aiq/dNv2FUmvZkGVhoqm1vGPiYT58sdvJnC AaMK0Oj0REk2v3Iig9rSNbkjnEH10wjGfxc9J9UNy7h2jI56vlvgl9v53SEFggGWSx7A E7SlBj3GsbcOG3sq6yG2tJ/o6oOrdfAL3LWrssvJ9JZHBx6U1ru8wW3DTyRtgMsQ7kGj RjghwI1VY2KUf14eU88B5me/i/A+q02OPaM2cLGSqFk3LXJCfApU/IahHO+zfEl+y6WX pfrA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=x+A1GfOyhLOLIaYyceDMa/EgzzLrl7zCGC77vOTpZ5Y=; b=dY9nl9C2TMf7/O+4nMhhm/yJKklZA8kHrLqlvxxo+nqI3I+MJ7R+IEFgcEZOCO0Mm3 xi2xjlcGLDP3P771vEloBDuEFLkSrQFXyCtmCSPhyYkmXbPcs9yW7jI81/cvFe7eaqPl sJgCVxbtvnxDrn9ARKlcjKOQCT8l624R5X6dF80jiR7iVy09cosGPp3FEv1N/WDw3bdV j2aY4U2fMwBTpDL+8/bxC3ic00CpdTeBLQdF0/P4zWvPxNTi01vBL1cMQLLdDJc1dH/7 bAlHyqgjA2Br+DK/W4C8XIZy+DQXR/vifI9Pwoi6zgndKIg0iqgN0ZxDxPOgq8KvVIhd nTGw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass (test mode) header.i=@ideasonboard.com header.s=mail header.b="h/bjotn1"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id x14si510311edi.539.2020.06.24.19.18.42; Wed, 24 Jun 2020 19:19:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass (test mode) header.i=@ideasonboard.com header.s=mail header.b="h/bjotn1"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389263AbgFYCSf (ORCPT + 99 others); Wed, 24 Jun 2020 22:18:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47408 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2389144AbgFYCSe (ORCPT ); Wed, 24 Jun 2020 22:18:34 -0400 Received: from perceval.ideasonboard.com (perceval.ideasonboard.com [IPv6:2001:4b98:dc2:55:216:3eff:fef7:d647]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 23AB0C061573; Wed, 24 Jun 2020 19:18:34 -0700 (PDT) Received: from pendragon.ideasonboard.com (81-175-216-236.bb.dnainternet.fi [81.175.216.236]) by perceval.ideasonboard.com (Postfix) with ESMTPSA id 54714521; Thu, 25 Jun 2020 04:18:32 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com; s=mail; t=1593051512; bh=O7UUb91wU/kWVLyFpWSCzPMXCvYTvdJmc4CB0fVicT8=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=h/bjotn1Latpi6z5/HUEmcVn2oIjPOyzCmKh3ZurBSdc7BINsUYXPMhEiIeXlxH5l T/IPDrjlTeIC0VasqPD1e/JC0jV4JAYesUOHYIuGNSWQbZ5bejYsDYQwP7PusnUpPs AdChf01pqEnqhbpop6H3mC3KhCZaWwuqGSB7fY64= Date: Thu, 25 Jun 2020 05:18:31 +0300 From: Laurent Pinchart To: Damian Hobson-Garcia Cc: Paul Elder , viro@zeniv.linux.org.uk, sustrik@250bpm.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org, netdev@vger.kernel.org, David.Laight@aculab.com Subject: Re: [PATCH v3 1/1] eventfd: implementation of EFD_MASK flag Message-ID: <20200625021831.GZ5980@pendragon.ideasonboard.com> References: <1444873328-8466-1-git-send-email-dhobsong@igel.co.jp> <1444873328-8466-2-git-send-email-dhobsong@igel.co.jp> <20200619101619.GD2073@jade.amanokami.net> <8b9b1b31-99fc-0877-cbd3-0f52de52419c@igel.co.jp> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <8b9b1b31-99fc-0877-cbd3-0f52de52419c@igel.co.jp> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Damian, On Tue, Jun 23, 2020 at 06:21:28PM +0900, Damian Hobson-Garcia wrote: > On 2020-06-19 7:16 p.m., Paul Elder wrote: > > Hello Damian, Martin, and all, > > > > I came across this (quite old by now) patch to extend eventfd's polling > > functionality. I was wondering what happened to it (why it hasn't been > > merged yet) and if we could, or what is needed to, move it forward. > > I think there was an open question about whether it was > best to move the definitions of EFD_SEMAPHORE, etc out of > /include/linux/eventfd.h and into a newly created > /include/uapi/linux/eventfd.h as this patch does. I would have thought that defining EFD_SEMAPHORE in a public API header would be best, but it seems that glibc has its own definition in bits/eventfd.h. I don't know what is usually preferred in these cases. > I don't know if the maintainers have any concerns on this matter, or the > patch in general, that would prevent this from moving forward. Thanks for your reply. It seems a good way forward would be to resubmit the patch then. > > I was thinking to use it for V4L2 events support via polling in the V4L2 > > compatibility layer for libcamera [1]. We can signal V4L2 buffer > > availability POLLOUT via write(), but there is no way to signal V4L2 > > events, as they are signaled via POLLPRI. > > > > [1] https://libcamera.org/docs.html#id1 > > > > On Thu, Oct 15, 2015 at 10:42:08AM +0900, Damian Hobson-Garcia wrote: > >> From: Martin Sustrik > >> > >> When implementing network protocols in user space, one has to implement > >> fake file descriptors to represent the sockets for the protocol. > >> > >> Polling on such fake file descriptors is a problem (poll/select/epoll > >> accept only true file descriptors) and forces protocol implementers to use > >> various workarounds resulting in complex, non-standard and convoluted APIs. > >> > >> More generally, ability to create full-blown file descriptors for > >> userspace-to-userspace signalling is missing. While eventfd(2) goes half > >> the way towards this goal it has follwoing shorcomings: > >> > >> I. There's no way to signal POLLPRI, POLLHUP etc. > >> II. There's no way to signal arbitrary combination of POLL* flags. Most > >> notably, simultaneous !POLLIN and !POLLOUT, which is a perfectly valid > >> combination for a network protocol (rx buffer is empty and tx buffer is > >> full), cannot be signaled using eventfd. > >> > >> This patch implements new EFD_MASK flag which solves the above problems. > >> > >> The semantics of EFD_MASK are as follows: > >> > >> eventfd(2): > >> > >> If eventfd is created with EFD_MASK flag set, it is initialised in such a > >> way as to signal no events on the file descriptor when it is polled on. > >> The 'initval' argument is ignored. > >> > >> write(2): > >> > >> User is allowed to write only buffers containing a 32-bit value > >> representing any combination of event flags as defined by the poll(2) > >> function (POLLIN, POLLOUT, POLLERR, POLLHUP etc.). Specified events > >> will be signaled when polling (select, poll, epoll) on the eventfd is > >> done later on. > >> > >> read(2): > >> > >> read is not supported and will fail with EINVAL. > >> > >> select(2), poll(2) and similar: > >> > >> When polling on the eventfd marked by EFD_MASK flag, all the events > >> specified in last written event flags shall be signaled. > >> > >> Signed-off-by: Martin Sustrik > >> > >> [dhobsong@igel.co.jp: Rebased, and resubmitted for Linux 4.3] > >> Signed-off-by: Damian Hobson-Garcia > >> --- > >> fs/eventfd.c | 102 ++++++++++++++++++++++++++++++++++++++----- > >> include/linux/eventfd.h | 16 +------ > >> include/uapi/linux/eventfd.h | 33 ++++++++++++++ > >> 3 files changed, 126 insertions(+), 25 deletions(-) > >> create mode 100644 include/uapi/linux/eventfd.h > >> > >> diff --git a/fs/eventfd.c b/fs/eventfd.c > >> index 8d0c0df..1310779 100644 > >> --- a/fs/eventfd.c > >> +++ b/fs/eventfd.c > >> @@ -2,6 +2,7 @@ > >> * fs/eventfd.c > >> * > >> * Copyright (C) 2007 Davide Libenzi > >> + * Copyright (C) 2013 Martin Sustrik > >> * > >> */ > >> > >> @@ -22,18 +23,31 @@ > >> #include > >> #include > >> > >> +#define EFD_SHARED_FCNTL_FLAGS (O_CLOEXEC | O_NONBLOCK) > >> +#define EFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS | EFD_SEMAPHORE | EFD_MASK) > >> +#define EFD_MASK_VALID_EVENTS (POLLIN | POLLPRI | POLLOUT | POLLERR | POLLHUP) > >> + > >> struct eventfd_ctx { > >> struct kref kref; > >> wait_queue_head_t wqh; > >> - /* > >> - * Every time that a write(2) is performed on an eventfd, the > >> - * value of the __u64 being written is added to "count" and a > >> - * wakeup is performed on "wqh". A read(2) will return the "count" > >> - * value to userspace, and will reset "count" to zero. The kernel > >> - * side eventfd_signal() also, adds to the "count" counter and > >> - * issue a wakeup. > >> - */ > >> - __u64 count; > >> + union { > >> + /* > >> + * Every time that a write(2) is performed on an eventfd, the > >> + * value of the __u64 being written is added to "count" and a > >> + * wakeup is performed on "wqh". A read(2) will return the > >> + * "count" value to userspace, and will reset "count" to zero. > >> + * The kernel side eventfd_signal() also, adds to the "count" > >> + * counter and issue a wakeup. > >> + */ > >> + __u64 count; > >> + > >> + /* > >> + * When using eventfd in EFD_MASK mode this stracture stores the > >> + * current events to be signaled on the eventfd (events member) > >> + * along with opaque user-defined data (data member). > >> + */ > >> + __u32 events; > >> + }; > >> unsigned int flags; > >> }; > >> > >> @@ -134,6 +148,14 @@ static unsigned int eventfd_poll(struct file *file, poll_table *wait) > >> return events; > >> } > >> > >> +static unsigned int eventfd_mask_poll(struct file *file, poll_table *wait) > >> +{ > >> + struct eventfd_ctx *ctx = file->private_data; > >> + > >> + poll_wait(file, &ctx->wqh, wait); > >> + return ctx->events; > >> +} > >> + > >> static void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt) > >> { > >> *cnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count; > >> @@ -239,6 +261,14 @@ static ssize_t eventfd_read(struct file *file, char __user *buf, size_t count, > >> return put_user(cnt, (__u64 __user *) buf) ? -EFAULT : sizeof(cnt); > >> } > >> > >> +static ssize_t eventfd_mask_read(struct file *file, char __user *buf, > >> + size_t count, > >> + loff_t *ppos) > >> +{ > >> + return -EINVAL; > >> +} > >> + > >> + > >> static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t count, > >> loff_t *ppos) > >> { > >> @@ -286,6 +316,28 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c > >> return res; > >> } > >> > >> +static ssize_t eventfd_mask_write(struct file *file, const char __user *buf, > >> + size_t count, > >> + loff_t *ppos) > >> +{ > >> + struct eventfd_ctx *ctx = file->private_data; > >> + __u32 events; > >> + > >> + if (count < sizeof(events)) > >> + return -EINVAL; > >> + if (copy_from_user(&events, buf, sizeof(events))) > >> + return -EFAULT; > >> + if (events & ~EFD_MASK_VALID_EVENTS) > >> + return -EINVAL; > >> + spin_lock_irq(&ctx->wqh.lock); > >> + memcpy(&ctx->events, &events, sizeof(ctx->events)); > >> + if (waitqueue_active(&ctx->wqh)) > >> + wake_up_locked_poll(&ctx->wqh, > >> + (unsigned long)ctx->events); > >> + spin_unlock_irq(&ctx->wqh.lock); > >> + return sizeof(ctx->events); > >> +} > >> + > >> #ifdef CONFIG_PROC_FS > >> static void eventfd_show_fdinfo(struct seq_file *m, struct file *f) > >> { > >> @@ -296,6 +348,16 @@ static void eventfd_show_fdinfo(struct seq_file *m, struct file *f) > >> (unsigned long long)ctx->count); > >> spin_unlock_irq(&ctx->wqh.lock); > >> } > >> + > >> +static void eventfd_mask_show_fdinfo(struct seq_file *m, struct file *f) > >> +{ > >> + struct eventfd_ctx *ctx = f->private_data; > >> + > >> + spin_lock_irq(&ctx->wqh.lock); > >> + seq_printf(m, "eventfd-mask: %x\n", > >> + ctx->events); > >> + spin_unlock_irq(&ctx->wqh.lock); > >> +} > >> #endif > >> > >> static const struct file_operations eventfd_fops = { > >> @@ -309,6 +371,17 @@ static const struct file_operations eventfd_fops = { > >> .llseek = noop_llseek, > >> }; > >> > >> +static const struct file_operations eventfd_mask_fops = { > >> +#ifdef CONFIG_PROC_FS > >> + .show_fdinfo = eventfd_mask_show_fdinfo, > >> +#endif > >> + .release = eventfd_release, > >> + .poll = eventfd_mask_poll, > >> + .read = eventfd_mask_read, > >> + .write = eventfd_mask_write, > >> + .llseek = noop_llseek, > >> +}; > >> + > >> /** > >> * eventfd_fget - Acquire a reference of an eventfd file descriptor. > >> * @fd: [in] Eventfd file descriptor. > >> @@ -392,6 +465,7 @@ struct file *eventfd_file_create(unsigned int count, int flags) > >> { > >> struct file *file; > >> struct eventfd_ctx *ctx; > >> + const struct file_operations *fops; > >> > >> /* Check the EFD_* constants for consistency. */ > >> BUILD_BUG_ON(EFD_CLOEXEC != O_CLOEXEC); > >> @@ -406,10 +480,16 @@ struct file *eventfd_file_create(unsigned int count, int flags) > >> > >> kref_init(&ctx->kref); > >> init_waitqueue_head(&ctx->wqh); > >> - ctx->count = count; > >> + if (flags & EFD_MASK) { > >> + ctx->events = 0; > >> + fops = &eventfd_mask_fops; > >> + } else { > >> + ctx->count = count; > >> + fops = &eventfd_fops; > >> + } > >> ctx->flags = flags; > >> > >> - file = anon_inode_getfile("[eventfd]", &eventfd_fops, ctx, > >> + file = anon_inode_getfile("[eventfd]", fops, ctx, > >> O_RDWR | (flags & EFD_SHARED_FCNTL_FLAGS)); > >> if (IS_ERR(file)) > >> eventfd_free_ctx(ctx); > >> diff --git a/include/linux/eventfd.h b/include/linux/eventfd.h > >> index ff0b981..87de343 100644 > >> --- a/include/linux/eventfd.h > >> +++ b/include/linux/eventfd.h > >> @@ -8,23 +8,11 @@ > >> #ifndef _LINUX_EVENTFD_H > >> #define _LINUX_EVENTFD_H > >> > >> +#include > >> + > >> #include > >> #include > >> > >> -/* > >> - * CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining > >> - * new flags, since they might collide with O_* ones. We want > >> - * to re-use O_* flags that couldn't possibly have a meaning > >> - * from eventfd, in order to leave a free define-space for > >> - * shared O_* flags. > >> - */ > >> -#define EFD_SEMAPHORE (1 << 0) > >> -#define EFD_CLOEXEC O_CLOEXEC > >> -#define EFD_NONBLOCK O_NONBLOCK > >> - > >> -#define EFD_SHARED_FCNTL_FLAGS (O_CLOEXEC | O_NONBLOCK) > >> -#define EFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS | EFD_SEMAPHORE) > >> - > >> struct file; > >> > >> #ifdef CONFIG_EVENTFD > >> diff --git a/include/uapi/linux/eventfd.h b/include/uapi/linux/eventfd.h > >> new file mode 100644 > >> index 0000000..097dcad > >> --- /dev/null > >> +++ b/include/uapi/linux/eventfd.h > >> @@ -0,0 +1,33 @@ > >> +/* > >> + * Copyright (C) 2013 Martin Sustrik > >> + * > >> + * This program is free software; you can redistribute it and/or modify > >> + * it under the terms of the GNU General Public License as published by > >> + * the Free Software Foundation; either version 2 of the License, or > >> + * (at your option) any later version. > >> + */ > >> + > >> +#ifndef _UAPI_LINUX_EVENTFD_H > >> +#define _UAPI_LINUX_EVENTFD_H > >> + > >> +/* For O_CLOEXEC */ > >> +#include > >> +#include > >> + > >> +/* > >> + * CAREFUL: Check include/asm-generic/fcntl.h when defining > >> + * new flags, since they might collide with O_* ones. We want > >> + * to re-use O_* flags that couldn't possibly have a meaning > >> + * from eventfd, in order to leave a free define-space for > >> + * shared O_* flags. > >> + */ > >> + > >> +/* Provide semaphore-like semantics for reads from the eventfd. */ > >> +#define EFD_SEMAPHORE (1 << 0) > >> +/* Provide event mask semantics for the eventfd. */ > >> +#define EFD_MASK (1 << 1) > >> +/* Set the close-on-exec (FD_CLOEXEC) flag on the eventfd. */ > >> +#define EFD_CLOEXEC O_CLOEXEC > >> +/* Create the eventfd in non-blocking mode. */ > >> +#define EFD_NONBLOCK O_NONBLOCK > >> +#endif /* _UAPI_LINUX_EVENTFD_H */ -- Regards, Laurent Pinchart