2009-04-23 15:14:50

by Gregory Haskins

[permalink] [raw]
Subject: [KVM PATCH 0/3] irqfd

(Applies to kvm.git b59cd3560111)

This series implements a mechanism called "irqfd". It lets you create
an eventfd based file-desriptor to inject interrupts to the guest. We
associate one gsi per fd for proper routing granularity.

We do not have a user of this interface in this series, though note
future version of virtual-bus (v4 and above) will be based on this.

The first two patches will require mainline buy-in, particularly from Davide
(cc'd). The last patch is kvm specific.

kvm-userspace.git patch to follow.

-Greg

---

Gregory Haskins (3):
kvm: add support for irqfd via eventfd-notification interface
eventfd: add a notifier mechanism
eventfd: export fget and signal interfaces for module use


arch/x86/kvm/Makefile | 2 -
arch/x86/kvm/x86.c | 1
fs/eventfd.c | 54 +++++++++++++++++++
include/linux/eventfd.h | 8 +++
include/linux/kvm.h | 7 ++
include/linux/kvm_host.h | 7 ++
virt/kvm/irqfd.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++
virt/kvm/kvm_main.c | 12 ++++
8 files changed, 223 insertions(+), 1 deletions(-)
create mode 100644 virt/kvm/irqfd.c

--
Signature


2009-04-23 15:15:19

by Gregory Haskins

[permalink] [raw]
Subject: [KVM PATCH 1/3] eventfd: export fget and signal interfaces for module use

We will use this later in the series

Signed-off-by: Gregory Haskins <[email protected]>
---

fs/eventfd.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 2a701d5..3f0e197 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -16,6 +16,7 @@
#include <linux/anon_inodes.h>
#include <linux/eventfd.h>
#include <linux/syscalls.h>
+#include <linux/module.h>

struct eventfd_ctx {
wait_queue_head_t wqh;
@@ -56,6 +57,7 @@ int eventfd_signal(struct file *file, int n)

return n;
}
+EXPORT_SYMBOL_GPL(eventfd_signal);

static int eventfd_release(struct inode *inode, struct file *file)
{
@@ -197,6 +199,7 @@ struct file *eventfd_fget(int fd)

return file;
}
+EXPORT_SYMBOL_GPL(eventfd_fget);

SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
{

2009-04-23 15:15:43

by Gregory Haskins

[permalink] [raw]
Subject: [KVM PATCH 2/3] eventfd: add a notifier mechanism

This allows synchronous notifications to register with the eventfd
infrastructure. Unlike traditional vfs based eventfd readers, notifiees
do not implictly clear the counter on reception. However, the clearing
is primarily important to allowing threads to block waiting for events
anyway, so its an acceptable trade-off since blocking doesn't apply to
notifiers.

Signed-off-by: Gregory Haskins <[email protected]>
CC: Davide Libenzi <[email protected]>
---

fs/eventfd.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/eventfd.h | 8 +++++++
2 files changed, 59 insertions(+), 0 deletions(-)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index 3f0e197..1a54bd9 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -17,6 +17,7 @@
#include <linux/eventfd.h>
#include <linux/syscalls.h>
#include <linux/module.h>
+#include <linux/notifier.h>

struct eventfd_ctx {
wait_queue_head_t wqh;
@@ -30,6 +31,7 @@ struct eventfd_ctx {
*/
__u64 count;
unsigned int flags;
+ struct raw_notifier_head notifier;
};

/*
@@ -48,6 +50,9 @@ int eventfd_signal(struct file *file, int n)
if (n < 0)
return -EINVAL;
spin_lock_irqsave(&ctx->wqh.lock, flags);
+
+ raw_notifier_call_chain(&ctx->notifier, 0, 0);
+
if (ULLONG_MAX - ctx->count < n)
n = (int) (ULLONG_MAX - ctx->count);
ctx->count += n;
@@ -169,6 +174,7 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
__set_current_state(TASK_RUNNING);
}
if (likely(res > 0)) {
+ raw_notifier_call_chain(&ctx->notifier, 0, 0);
ctx->count += ucnt;
if (waitqueue_active(&ctx->wqh))
wake_up_locked_poll(&ctx->wqh, POLLIN);
@@ -201,6 +207,50 @@ struct file *eventfd_fget(int fd)
}
EXPORT_SYMBOL_GPL(eventfd_fget);

+struct file *eventfd_notifier_register(int fd, struct notifier_block *nb)
+{
+ struct file *file = eventfd_fget(fd);
+ struct eventfd_ctx *ctx;
+ unsigned long flags;
+ int ret;
+
+ if (IS_ERR(file))
+ return file;
+
+ ctx = file->private_data;
+
+ spin_lock_irqsave(&ctx->wqh.lock, flags);
+ ret = raw_notifier_chain_register(&ctx->notifier, nb);
+ spin_unlock_irqrestore(&ctx->wqh.lock, flags);
+
+ if (ret < 0) {
+ fput(file);
+ return ERR_PTR(ret);
+ }
+
+ return file;
+}
+EXPORT_SYMBOL_GPL(eventfd_notifier_register);
+
+int eventfd_notifier_unregister(struct file *file, struct notifier_block *nb)
+{
+ struct eventfd_ctx *ctx;
+ unsigned long flags;
+ int ret;
+
+ if (file->f_op != &eventfd_fops)
+ return -EINVAL;
+
+ ctx = file->private_data;
+
+ spin_lock_irqsave(&ctx->wqh.lock, flags);
+ ret = raw_notifier_chain_unregister(&ctx->notifier, nb);
+ spin_unlock_irqrestore(&ctx->wqh.lock, flags);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(eventfd_notifier_unregister);
+
SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
{
int fd;
@@ -220,6 +270,7 @@ SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
init_waitqueue_head(&ctx->wqh);
ctx->count = count;
ctx->flags = flags;
+ RAW_INIT_NOTIFIER_HEAD(&ctx->notifier);

/*
* When we call this, the initialization must be complete, since
diff --git a/include/linux/eventfd.h b/include/linux/eventfd.h
index f45a8ae..e13d1c5 100644
--- a/include/linux/eventfd.h
+++ b/include/linux/eventfd.h
@@ -8,6 +8,8 @@
#ifndef _LINUX_EVENTFD_H
#define _LINUX_EVENTFD_H

+#include <linux/notifier.h>
+
#ifdef CONFIG_EVENTFD

/* For O_CLOEXEC and O_NONBLOCK */
@@ -29,12 +31,18 @@

struct file *eventfd_fget(int fd);
int eventfd_signal(struct file *file, int n);
+struct file *eventfd_notifier_register(int fd, struct notifier_block *nb);
+int eventfd_notifier_unregister(struct file *file, struct notifier_block *nb);

#else /* CONFIG_EVENTFD */

#define eventfd_fget(fd) ERR_PTR(-ENOSYS)
static inline int eventfd_signal(struct file *file, int n)
{ return 0; }
+static inline int eventfd_notifier_register(int fd, struct notifier_block *nb)
+{ return -ENOENT; }
+static inline int eventfd_notifier_unregister(int fd, struct notifier_block *nb)
+{ return -ENOENT; }

#endif /* CONFIG_EVENTFD */

2009-04-23 15:15:59

by Gregory Haskins

[permalink] [raw]
Subject: [KVM PATCH 3/3] kvm: add support for irqfd via eventfd-notification interface

This allows an eventfd to be registered as an irq source with a guest. Any
signaling operation on the eventfd (via userspace or kernel) will inject
the registered GSI at the next available window.

Signed-off-by: Gregory Haskins <[email protected]>
---

arch/x86/kvm/Makefile | 2 -
arch/x86/kvm/x86.c | 1
include/linux/kvm.h | 7 ++
include/linux/kvm_host.h | 7 ++
virt/kvm/irqfd.c | 133 ++++++++++++++++++++++++++++++++++++++++++++++
virt/kvm/kvm_main.c | 12 ++++
6 files changed, 161 insertions(+), 1 deletions(-)
create mode 100644 virt/kvm/irqfd.c

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index b43c4ef..d5fff51 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -3,7 +3,7 @@
#

common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o \
- coalesced_mmio.o irq_comm.o)
+ coalesced_mmio.o irq_comm.o irqfd.o)
ifeq ($(CONFIG_KVM_TRACE),y)
common-objs += $(addprefix ../../../virt/kvm/, kvm_trace.o)
endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8cb8542..88c45e6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1027,6 +1027,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_REINJECT_CONTROL:
case KVM_CAP_IRQ_INJECT_STATUS:
case KVM_CAP_ASSIGN_DEV_IRQ:
+ case KVM_CAP_IRQFD:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 3db5d8d..2404775 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -415,6 +415,7 @@ struct kvm_trace_rec {
#define KVM_CAP_ASSIGN_DEV_IRQ 29
/* Another bug in KVM_SET_USER_MEMORY_REGION fixed: */
#define KVM_CAP_JOIN_MEMORY_REGIONS_WORKS 30
+#define KVM_CAP_IRQFD 31

#ifdef KVM_CAP_IRQ_ROUTING

@@ -454,6 +455,11 @@ struct kvm_irq_routing {

#endif

+struct kvm_irqfd {
+ __u32 fd;
+ __u32 gsi;
+};
+
/*
* ioctls for VM fds
*/
@@ -498,6 +504,7 @@ struct kvm_irq_routing {
#define KVM_ASSIGN_SET_MSIX_ENTRY \
_IOW(KVMIO, 0x74, struct kvm_assigned_msix_entry)
#define KVM_DEASSIGN_DEV_IRQ _IOW(KVMIO, 0x75, struct kvm_assigned_irq)
+#define KVM_ASSIGN_IRQFD _IOW(KVMIO, 0x76, struct kvm_irqfd)

/*
* ioctls for vcpu fds
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 095ebb6..57f8dfe 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -134,6 +134,10 @@ struct kvm {
struct list_head vm_list;
struct kvm_io_bus mmio_bus;
struct kvm_io_bus pio_bus;
+ struct {
+ struct list_head items;
+ int src;
+ } irqfd;
struct kvm_vm_stat stat;
struct kvm_arch arch;
atomic_t users_count;
@@ -524,4 +528,7 @@ static inline void kvm_free_irq_routing(struct kvm *kvm) {}

#endif

+int kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi);
+void kvm_irqfd_release(struct kvm *kvm);
+
#endif
diff --git a/virt/kvm/irqfd.c b/virt/kvm/irqfd.c
new file mode 100644
index 0000000..d6829b3
--- /dev/null
+++ b/virt/kvm/irqfd.c
@@ -0,0 +1,133 @@
+/*
+ * irqfd: Allows an eventfd to be used to inject an interrupt to the guest
+ *
+ * Credit goes to Avi Kivity for the original idea.
+ *
+ * Copyright 2009 Novell. All Rights Reserved.
+ *
+ * Author:
+ * Gregory Haskins <[email protected]>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/eventfd.h>
+#include <linux/workqueue.h>
+#include <linux/notifier.h>
+#include <linux/file.h>
+#include <linux/list.h>
+
+struct _irqfd {
+ struct kvm *kvm;
+ int gsi;
+ struct file *file;
+ struct list_head list;
+ struct notifier_block nb;
+ struct work_struct work;
+};
+
+static void
+irqfd_deferred_inject(struct work_struct *work)
+{
+ struct _irqfd *irqfd = container_of(work, struct _irqfd, work);
+ struct kvm *kvm = irqfd->kvm;
+
+ mutex_lock(&kvm->lock);
+ kvm_set_irq(kvm, kvm->irqfd.src, irqfd->gsi, 1);
+ mutex_unlock(&kvm->lock);
+}
+
+static int
+irqfd_inject(struct notifier_block *nb, unsigned long nr, void *val)
+{
+ struct _irqfd *irqfd = container_of(nb, struct _irqfd, nb);
+
+ /*
+ * The eventfd calls its notifier chain with interrupts disabled,
+ * so we need to defer the IRQ injection until later since we need
+ * to acquire the kvm->lock to do so.
+ */
+ schedule_work(&irqfd->work);
+
+ return 0;
+}
+
+int
+kvm_irqfd_assign(struct kvm *kvm, int fd, int gsi)
+{
+ struct _irqfd *irqfd;
+ int ret;
+
+ irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL);
+ if (!irqfd)
+ return -ENOMEM;
+
+ irqfd->kvm = kvm;
+ irqfd->gsi = gsi;
+ INIT_LIST_HEAD(&irqfd->list);
+ irqfd->nb.notifier_call = &irqfd_inject;
+ irqfd->nb.priority = 0;
+ INIT_WORK(&irqfd->work, irqfd_deferred_inject);
+
+ /*
+ * note: this acquires a reference to the fd that we are responsible
+ * for dropping later
+ */
+ irqfd->file = eventfd_notifier_register(fd, &irqfd->nb);
+ if (IS_ERR(irqfd->file)) {
+ ret = PTR_ERR(irqfd->file);
+ goto fail;
+ }
+
+ mutex_lock(&kvm->lock);
+ if (kvm->irqfd.src == -1) {
+ ret = kvm_request_irq_source_id(kvm);
+ BUG_ON(ret < 0);
+
+ kvm->irqfd.src = ret;
+ }
+
+ list_add_tail(&irqfd->list, &kvm->irqfd.items);
+
+ mutex_unlock(&kvm->lock);
+
+ return 0;
+
+fail:
+ kfree(irqfd);
+ return ret;
+}
+
+void
+kvm_irqfd_release(struct kvm *kvm)
+{
+ struct _irqfd *irqfd, *tmp;
+ int ret;
+
+ list_for_each_entry_safe(irqfd, tmp, &kvm->irqfd.items, list) {
+ ret = eventfd_notifier_unregister(irqfd->file, &irqfd->nb);
+ BUG_ON(ret < 0);
+
+ flush_work(&irqfd->work);
+ fput(irqfd->file);
+
+ list_del(&irqfd->list);
+ kfree(irqfd);
+ }
+
+ if (kvm->irqfd.src != -1)
+ kvm_free_irq_source_id(kvm, kvm->irqfd.src);
+
+}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3265566..c822d79 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -972,6 +972,8 @@ static struct kvm *kvm_create_vm(void)
atomic_inc(&kvm->mm->mm_count);
spin_lock_init(&kvm->mmu_lock);
kvm_io_bus_init(&kvm->pio_bus);
+ INIT_LIST_HEAD(&kvm->irqfd.items);
+ kvm->irqfd.src = -1;
mutex_init(&kvm->lock);
kvm_io_bus_init(&kvm->mmio_bus);
init_rwsem(&kvm->slots_lock);
@@ -1023,6 +1025,7 @@ static void kvm_destroy_vm(struct kvm *kvm)
spin_lock(&kvm_lock);
list_del(&kvm->vm_list);
spin_unlock(&kvm_lock);
+ kvm_irqfd_release(kvm);
kvm_free_irq_routing(kvm);
kvm_io_bus_destroy(&kvm->pio_bus);
kvm_io_bus_destroy(&kvm->mmio_bus);
@@ -2197,6 +2200,15 @@ static long kvm_vm_ioctl(struct file *filp,
}
#endif
#endif /* KVM_CAP_IRQ_ROUTING */
+ case KVM_ASSIGN_IRQFD: {
+ struct kvm_irqfd data;
+
+ r = -EFAULT;
+ if (copy_from_user(&data, argp, sizeof data))
+ goto out;
+ r = kvm_irqfd_assign(kvm, data.fd, data.gsi);
+ break;
+ }
default:
r = kvm_arch_vm_ioctl(filp, ioctl, arg);
}

2009-04-23 16:27:48

by Davide Libenzi

[permalink] [raw]
Subject: Re: [KVM PATCH 2/3] eventfd: add a notifier mechanism

On Thu, 23 Apr 2009, Gregory Haskins wrote:

> This allows synchronous notifications to register with the eventfd
> infrastructure. Unlike traditional vfs based eventfd readers, notifiees
> do not implictly clear the counter on reception. However, the clearing
> is primarily important to allowing threads to block waiting for events
> anyway, so its an acceptable trade-off since blocking doesn't apply to
> notifiers.

Do you really need to add a notifier? Eventfd already has a wait queue,
and we support callback-based wakeups, so is there any reason we shouldn't
use those and rely on the already existing wakeups?



- Davide

2009-04-23 16:47:16

by Gregory Haskins

[permalink] [raw]
Subject: Re: [KVM PATCH 2/3] eventfd: add a notifier mechanism

Davide Libenzi wrote:
> On Thu, 23 Apr 2009, Gregory Haskins wrote:
>
>
>> This allows synchronous notifications to register with the eventfd
>> infrastructure. Unlike traditional vfs based eventfd readers, notifiees
>> do not implictly clear the counter on reception. However, the clearing
>> is primarily important to allowing threads to block waiting for events
>> anyway, so its an acceptable trade-off since blocking doesn't apply to
>> notifiers.
>>
>
> Do you really need to add a notifier? Eventfd already has a wait queue,
> and we support callback-based wakeups, so is there any reason we shouldn't
> use those and rely on the already existing wakeups?
>
Well, IIUC the issue is that a wait queue implies that you are in fact
waiting...which we may not. :)

The target in this particular application with kvm-irqfd is a vcpu
context, which *may* be sleeping in something like a HLT, but it also
could be in a number of other states such as non-root (guest) mode, it
could be running in the kernel, it could be up in userspace, etc.

That said: I am not married to the concept that this has to be a
notifier callback, but I do want to be able to meet the target
application. So if there is some way to do that within the existing
wait-queue contstruct, I am open to suggestions.

Thanks Davide,
-Greg

>
>
> - Davide
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>



Attachments:
signature.asc (266.00 B)
OpenPGP digital signature

2009-04-23 22:07:38

by Davide Libenzi

[permalink] [raw]
Subject: Re: [KVM PATCH 2/3] eventfd: add a notifier mechanism

On Thu, 23 Apr 2009, Gregory Haskins wrote:

> Davide Libenzi wrote:
> > On Thu, 23 Apr 2009, Gregory Haskins wrote:
> >
> >
> >> This allows synchronous notifications to register with the eventfd
> >> infrastructure. Unlike traditional vfs based eventfd readers, notifiees
> >> do not implictly clear the counter on reception. However, the clearing
> >> is primarily important to allowing threads to block waiting for events
> >> anyway, so its an acceptable trade-off since blocking doesn't apply to
> >> notifiers.
> >>
> >
> > Do you really need to add a notifier? Eventfd already has a wait queue,
> > and we support callback-based wakeups, so is there any reason we shouldn't
> > use those and rely on the already existing wakeups?
> >
> Well, IIUC the issue is that a wait queue implies that you are in fact
> waiting...which we may not. :)
>
> The target in this particular application with kvm-irqfd is a vcpu
> context, which *may* be sleeping in something like a HLT, but it also
> could be in a number of other states such as non-root (guest) mode, it
> could be running in the kernel, it could be up in userspace, etc.
>
> That said: I am not married to the concept that this has to be a
> notifier callback, but I do want to be able to meet the target
> application. So if there is some way to do that within the existing
> wait-queue contstruct, I am open to suggestions.

Take a look at init_waitqueue_func_entry(), in particula at that "func"
parameter. Then look at how __wake_up_common() does its thing.
You don't need to be "waiting" for our wakeup system to work. Callbacks
works just fine, otherwise things like epoll could not work at all.



- Davide

2009-04-23 22:12:33

by Gregory Haskins

[permalink] [raw]
Subject: Re: [KVM PATCH 2/3] eventfd: add a notifier mechanism

Davide Libenzi wrote:
> On Thu, 23 Apr 2009, Gregory Haskins wrote:
>
>
>> Davide Libenzi wrote:
>>
>>> On Thu, 23 Apr 2009, Gregory Haskins wrote:
>>>
>>>
>>>
>>>> This allows synchronous notifications to register with the eventfd
>>>> infrastructure. Unlike traditional vfs based eventfd readers, notifiees
>>>> do not implictly clear the counter on reception. However, the clearing
>>>> is primarily important to allowing threads to block waiting for events
>>>> anyway, so its an acceptable trade-off since blocking doesn't apply to
>>>> notifiers.
>>>>
>>>>
>>> Do you really need to add a notifier? Eventfd already has a wait queue,
>>> and we support callback-based wakeups, so is there any reason we shouldn't
>>> use those and rely on the already existing wakeups?
>>>
>>>
>> Well, IIUC the issue is that a wait queue implies that you are in fact
>> waiting...which we may not. :)
>>
>> The target in this particular application with kvm-irqfd is a vcpu
>> context, which *may* be sleeping in something like a HLT, but it also
>> could be in a number of other states such as non-root (guest) mode, it
>> could be running in the kernel, it could be up in userspace, etc.
>>
>> That said: I am not married to the concept that this has to be a
>> notifier callback, but I do want to be able to meet the target
>> application. So if there is some way to do that within the existing
>> wait-queue contstruct, I am open to suggestions.
>>
>
> Take a look at init_waitqueue_func_entry(), in particula at that "func"
> parameter. Then look at how __wake_up_common() does its thing.
> You don't need to be "waiting" for our wakeup system to work. Callbacks
> works just fine, otherwise things like epoll could not work at all.
>

Yeah, I was looking at that this afternoon after you mentioned it. That
makes sense.

As of right now the wqh is embedded in the eventfd, accessible only by
the .read() vtable entry. In order to do this as you suggest, I imagine
I need to slightly modify the eventfd interface to allow waiters other
than the embedded readers to join the wait-queue. How would you like to
see that interface look?

Thanks Davide,
-Greg




Attachments:
signature.asc (266.00 B)
OpenPGP digital signature

2009-04-24 02:47:53

by Davide Libenzi

[permalink] [raw]
Subject: Re: [KVM PATCH 2/3] eventfd: add a notifier mechanism

On Thu, 23 Apr 2009, Gregory Haskins wrote:

> > Take a look at init_waitqueue_func_entry(), in particula at that "func"
> > parameter. Then look at how __wake_up_common() does its thing.
> > You don't need to be "waiting" for our wakeup system to work. Callbacks
> > works just fine, otherwise things like epoll could not work at all.
> >
>
> Yeah, I was looking at that this afternoon after you mentioned it. That
> makes sense.
>
> As of right now the wqh is embedded in the eventfd, accessible only by
> the .read() vtable entry. In order to do this as you suggest, I imagine
> I need to slightly modify the eventfd interface to allow waiters other
> than the embedded readers to join the wait-queue. How would you like to
> see that interface look?

Actually, you need no changes in eventfd, since you can use its poll()
directly, by dropping a callback entry.
Take a look at how epoll does it in ep_insert()...


- Davide

2009-04-24 04:12:27

by Gregory Haskins

[permalink] [raw]
Subject: Re: [KVM PATCH 2/3] eventfd: add a notifier mechanism

Davide Libenzi wrote:
> On Thu, 23 Apr 2009, Gregory Haskins wrote:
>
>
>>> Take a look at init_waitqueue_func_entry(), in particula at that "func"
>>> parameter. Then look at how __wake_up_common() does its thing.
>>> You don't need to be "waiting" for our wakeup system to work. Callbacks
>>> works just fine, otherwise things like epoll could not work at all.
>>>
>>>
>> Yeah, I was looking at that this afternoon after you mentioned it. That
>> makes sense.
>>
>> As of right now the wqh is embedded in the eventfd, accessible only by
>> the .read() vtable entry. In order to do this as you suggest, I imagine
>> I need to slightly modify the eventfd interface to allow waiters other
>> than the embedded readers to join the wait-queue. How would you like to
>> see that interface look?
>>
>
> Actually, you need no changes in eventfd, since you can use its poll()
> directly, by dropping a callback entry.
> Take a look at how epoll does it in ep_insert()...
>
Ah! I just tried your suggestion and it was simple and it works. That
is brilliant.

I will post a follow-up v2 that drops patch 2/3. Thanks, Davide.

-Greg




Attachments:
signature.asc (266.00 B)
OpenPGP digital signature

2009-04-24 16:17:26

by Davide Libenzi

[permalink] [raw]
Subject: Re: [KVM PATCH 2/3] eventfd: add a notifier mechanism

On Fri, 24 Apr 2009, Gregory Haskins wrote:

> Davide Libenzi wrote:
> > On Thu, 23 Apr 2009, Gregory Haskins wrote:
> >
> >
> >>> Take a look at init_waitqueue_func_entry(), in particula at that "func"
> >>> parameter. Then look at how __wake_up_common() does its thing.
> >>> You don't need to be "waiting" for our wakeup system to work. Callbacks
> >>> works just fine, otherwise things like epoll could not work at all.
> >>>
> >>>
> >> Yeah, I was looking at that this afternoon after you mentioned it. That
> >> makes sense.
> >>
> >> As of right now the wqh is embedded in the eventfd, accessible only by
> >> the .read() vtable entry. In order to do this as you suggest, I imagine
> >> I need to slightly modify the eventfd interface to allow waiters other
> >> than the embedded readers to join the wait-queue. How would you like to
> >> see that interface look?
> >>
> >
> > Actually, you need no changes in eventfd, since you can use its poll()
> > directly, by dropping a callback entry.
> > Take a look at how epoll does it in ep_insert()...
> >
> Ah! I just tried your suggestion and it was simple and it works. That
> is brilliant.
>
> I will post a follow-up v2 that drops patch 2/3. Thanks, Davide.

I assume you're in control of the file/fd lifetime, so that the file*
cannot vanish while you've your wait_queue dropped inside its
wait_queue_head, isn't it?


- Davide

2009-04-24 16:30:10

by Gregory Haskins

[permalink] [raw]
Subject: Re: [KVM PATCH 2/3] eventfd: add a notifier mechanism

Davide Libenzi wrote:
> On Fri, 24 Apr 2009, Gregory Haskins wrote:
>
>
>> Davide Libenzi wrote:
>>
>>> On Thu, 23 Apr 2009, Gregory Haskins wrote:
>>>
>>>
>>>
>>>>> Take a look at init_waitqueue_func_entry(), in particula at that "func"
>>>>> parameter. Then look at how __wake_up_common() does its thing.
>>>>> You don't need to be "waiting" for our wakeup system to work. Callbacks
>>>>> works just fine, otherwise things like epoll could not work at all.
>>>>>
>>>>>
>>>>>
>>>> Yeah, I was looking at that this afternoon after you mentioned it. That
>>>> makes sense.
>>>>
>>>> As of right now the wqh is embedded in the eventfd, accessible only by
>>>> the .read() vtable entry. In order to do this as you suggest, I imagine
>>>> I need to slightly modify the eventfd interface to allow waiters other
>>>> than the embedded readers to join the wait-queue. How would you like to
>>>> see that interface look?
>>>>
>>>>
>>> Actually, you need no changes in eventfd, since you can use its poll()
>>> directly, by dropping a callback entry.
>>> Take a look at how epoll does it in ep_insert()...
>>>
>>>
>> Ah! I just tried your suggestion and it was simple and it works. That
>> is brilliant.
>>
>> I will post a follow-up v2 that drops patch 2/3. Thanks, Davide.
>>
>
> I assume you're in control of the file/fd lifetime, so that the file*
> cannot vanish while you've your wait_queue dropped inside its
> wait_queue_head, isn't it?
>
I've addressed this issue by holding the file via eventfd_fget(), but
your review to make sure I got it right would be appreciated.

-Greg



Attachments:
signature.asc (266.00 B)
OpenPGP digital signature