2015-07-10 17:52:44

by Alex Williamson

[permalink] [raw]
Subject: [PATCH] virt: IRQ bypass manager

When a physical I/O device is assigned to a virtual machine through
facilities like VFIO and KVM, the interrupt for the device generally
bounces through the host system before being injected into the VM.
However, hardware technologies exist that often allow the host to be
bypassed for some of these scenarios. Intel Posted Interrupts allow
the specified physical edge interrupts to be directly injected into a
guest when delivered to a physical processor while the vCPU is
running. ARM IRQ Forwarding allows the hypervisor to handle level
triggered device interrupts as edge interrupts, by giving the guest
control of de-asserting and unmasking the interrupt line.

The IRQ bypass manager here is meant to provide the shim to connect
interrupt producers, generally the host physical device driver, with
interrupt consumers, generally the hypervisor, in order to configure
these bypass mechanism. To do this, we base the connection on a
shared, opaque token. For KVM-VFIO this is expected to be an
eventfd_ctx since this is the connection we already use to connect an
eventfd to an irqfd on the in-kernel path. When a producer and
consumer with matching tokens is found, callbacks via both registered
participants allow the bypass facilities to be automatically enabled.

Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Eric Auger <[email protected]>
---

Changes:
- Moved to virt/lib/
- Dropped update callback
- Filled in missing documentation
- @resume callback renamed to @stop
- Only @start/@stop are optional

One of the difficulties with moving this code to virt/lib is that nobody
builds it by default. Thinking about this for a bit, it really needs a
consumer to be useful and KVM is currently the only consumer, so I tested
with the following:

--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -100,5 +101,6 @@ config KVM_DEVICE_ASSIGNMENT
# the virtualization menu.
source drivers/vhost/Kconfig
source drivers/lguest/Kconfig
+source virt/lib/Kconfig

endif # VIRTUALIZATION
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -20,3 +20,5 @@ kvm-amd-y += svm.o
obj-$(CONFIG_KVM) += kvm.o
obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
obj-$(CONFIG_KVM_AMD) += kvm-amd.o
+
+obj-y += ../../../virt/lib/

Perhaps if a second consumer comes along that would be justification for
tying it elsewhere in the build system. ARM will obviously need to do
similar. Are there better options?

Also, there's no maintainer for the top level virt/ directory. Paolo,
would you feel comfortable taking this, maybe with some additional acks?
That would probably be the most convenient for merging the consumer code.
Thanks,
Alex

include/linux/irqbypass.h | 90 +++++++++++++++++++
virt/lib/Kconfig | 2
virt/lib/Makefile | 1
virt/lib/irqbypass.c | 212 +++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 305 insertions(+)
create mode 100644 include/linux/irqbypass.h
create mode 100644 virt/lib/Kconfig
create mode 100644 virt/lib/Makefile
create mode 100644 virt/lib/irqbypass.c

diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
new file mode 100644
index 0000000..41df18d
--- /dev/null
+++ b/include/linux/irqbypass.h
@@ -0,0 +1,90 @@
+/*
+ * IRQ offload/bypass manager
+ *
+ * Copyright (C) 2015 Red Hat, Inc.
+ * Copyright (c) 2015 Linaro Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef IRQBYPASS_H
+#define IRQBYPASS_H
+
+#include <linux/list.h>
+
+struct irq_bypass_consumer;
+
+/*
+ * Theory of operation
+ *
+ * The IRQ bypass manager is a simple set of lists and callbacks that allows
+ * IRQ producers (ex. physical interrupt sources) to be matched to IRQ
+ * consumers (ex. virtualization hardware that allows IRQ bypass or offload)
+ * via a shared token (ex. eventfd_ctx). Producers and consumers register
+ * independently. When a token match is found, the optional @stop callback
+ * will be called for each participant. The pair will then be connected via
+ * the @add_* callbacks, and finally the optional @start callback will allow
+ * any final coordination. When either participant is unregistered, the
+ * process is repeated using the @del_* callbacks in place of the @add_*
+ * callbacks. Match tokens must be unique per producer/consumer, 1:N parings
+ * are not supported.
+ */
+
+/**
+ * struct irq_bypass_producer - IRQ bypass producer definition
+ * @node: IRQ bypass manager private list management
+ * @token: opaque token to match between producer and consumer
+ * @irq: Linux IRQ number for the producer device
+ * @add_consumer: Connect the IRQ producer to an IRQ consumer
+ * @del_consumer: Disconnect the IRQ producer from an IRQ consumer
+ * @stop: Perform any quiesce operations necessary prior to add/del (optional)
+ * @start: Perform any startup operations necessary after add/del (optional)
+ *
+ * The IRQ bypass producer structure represents an interrupt source for
+ * participation in possible host bypass, for instance an interrupt vector
+ * for a physical device assigned to a VM.
+ */
+struct irq_bypass_producer {
+ struct list_head node;
+ void *token;
+ int irq;
+ void (*add_consumer)(struct irq_bypass_producer *,
+ struct irq_bypass_consumer *);
+ void (*del_consumer)(struct irq_bypass_producer *,
+ struct irq_bypass_consumer *);
+ void (*stop)(struct irq_bypass_producer *);
+ void (*start)(struct irq_bypass_producer *);
+};
+
+/**
+ * struct irq_bypass_consumer - IRQ bypass consumer definition
+ * @node: IRQ bypass manager private list management
+ * @token: opaque token to match between producer and consumer
+ * @add_producer: Connect the IRQ consumer to an IRQ producer
+ * @del_producer: Disconnect the IRQ consumer from an IRQ producer
+ * @stop: Perform any quiesce operations necessary prior to add/del (optional)
+ * @start: Perform any startup operations necessary after add/del (optional)
+ *
+ * The IRQ bypass consumer structure represents an interrupt sink for
+ * participation in possible host bypass, for instance a hypervisor may
+ * support offloads to allow bypassing the host entirely or offload
+ * portions of the interrupt handling to the VM.
+ */
+struct irq_bypass_consumer {
+ struct list_head node;
+ void *token;
+ void (*add_producer)(struct irq_bypass_consumer *,
+ struct irq_bypass_producer *);
+ void (*del_producer)(struct irq_bypass_consumer *,
+ struct irq_bypass_producer *);
+ void (*stop)(struct irq_bypass_consumer *);
+ void (*start)(struct irq_bypass_consumer *);
+};
+
+int irq_bypass_register_producer(struct irq_bypass_producer *);
+void irq_bypass_unregister_producer(struct irq_bypass_producer *);
+int irq_bypass_register_consumer(struct irq_bypass_consumer *);
+void irq_bypass_unregister_consumer(struct irq_bypass_consumer *);
+
+#endif /* IRQBYPASS_H */
diff --git a/virt/lib/Kconfig b/virt/lib/Kconfig
new file mode 100644
index 0000000..89a414f
--- /dev/null
+++ b/virt/lib/Kconfig
@@ -0,0 +1,2 @@
+config IRQ_BYPASS_MANAGER
+ tristate
diff --git a/virt/lib/Makefile b/virt/lib/Makefile
new file mode 100644
index 0000000..901228d
--- /dev/null
+++ b/virt/lib/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_IRQ_BYPASS_MANAGER) += irqbypass.o
diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c
new file mode 100644
index 0000000..f1091e6
--- /dev/null
+++ b/virt/lib/irqbypass.c
@@ -0,0 +1,212 @@
+/*
+ * IRQ offload/bypass manager
+ *
+ * Copyright (C) 2015 Red Hat, Inc.
+ * Copyright (c) 2015 Linaro Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Various virtualization hardware acceleration techniques allow bypassing
+ * or offloading interrupts received from devices around the host kernel.
+ * Posted Interrupts on Intel VT-d systems can allow interrupts to be
+ * received directly by a virtual machine. ARM IRQ Forwarding can allow
+ * level triggered device interrupts to be de-asserted directly by the VM.
+ * This manager allows interrupt producers and consumers to find each other
+ * to enable this sort of bypass.
+ */
+
+#include <linux/irqbypass.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("IRQ bypass manager utility module");
+
+static LIST_HEAD(producers);
+static LIST_HEAD(consumers);
+static DEFINE_MUTEX(lock);
+
+/* @lock must be held when calling connect */
+static void __connect(struct irq_bypass_producer *prod,
+ struct irq_bypass_consumer *cons)
+{
+ if (prod->stop)
+ prod->stop(prod);
+ if (cons->stop)
+ cons->stop(cons);
+
+ prod->add_consumer(prod, cons);
+ cons->add_producer(cons, prod);
+
+ if (cons->start)
+ cons->start(cons);
+ if (prod->start)
+ prod->start(prod);
+}
+
+/* @lock must be held when calling disconnect */
+static void __disconnect(struct irq_bypass_producer *prod,
+ struct irq_bypass_consumer *cons)
+{
+ if (prod->stop)
+ prod->stop(prod);
+ if (cons->stop)
+ cons->stop(cons);
+
+ cons->del_producer(cons, prod);
+ prod->del_consumer(prod, cons);
+
+ if (cons->start)
+ cons->start(cons);
+ if (prod->start)
+ prod->start(prod);
+}
+
+/**
+ * irq_bypass_register_producer - register IRQ bypass producer
+ * @producer: pointer to producer structure
+ *
+ * Add the provided IRQ producer to the list of producers and connect
+ * with any matching tokens found on the IRQ consumers list.
+ */
+int irq_bypass_register_producer(struct irq_bypass_producer *producer)
+{
+ struct irq_bypass_producer *tmp;
+ struct irq_bypass_consumer *consumer;
+
+ if (!producer->add_consumer || !producer->del_consumer)
+ return -EINVAL;
+
+ might_sleep();
+
+ if (!try_module_get(THIS_MODULE))
+ return -ENODEV;
+
+ mutex_lock(&lock);
+
+ list_for_each_entry(tmp, &producers, node) {
+ if (tmp->token == producer->token) {
+ mutex_unlock(&lock);
+ module_put(THIS_MODULE);
+ return -EINVAL;
+ }
+ }
+
+ list_add(&producer->node, &producers);
+
+ list_for_each_entry(consumer, &consumers, node) {
+ if (consumer->token == producer->token) {
+ __connect(producer, consumer);
+ break;
+ }
+ }
+
+ mutex_unlock(&lock);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(irq_bypass_register_producer);
+
+/**
+ * irq_bypass_unregister_producer - unregister IRQ bypass producer
+ * @producer: pointer to producer structure
+ *
+ * Remove a previously registered IRQ producer from the list of producers
+ * and disconnected from any connected IRQ consumers.
+ */
+void irq_bypass_unregister_producer(struct irq_bypass_producer *producer)
+{
+ struct irq_bypass_consumer *consumer;
+
+ might_sleep();
+
+ mutex_lock(&lock);
+
+ list_for_each_entry(consumer, &consumers, node) {
+ if (consumer->token == producer->token) {
+ __disconnect(producer, consumer);
+ break;
+ }
+ }
+
+ list_del(&producer->node);
+
+ mutex_unlock(&lock);
+ module_put(THIS_MODULE);
+}
+EXPORT_SYMBOL_GPL(irq_bypass_unregister_producer);
+
+/**
+ * irq_bypass_register_consumer - register IRQ bypass consumer
+ * @consumer: pointer to consumer structure
+ *
+ * Add the provided IRQ consumer to the list of consumers and connect
+ * with any matching tokens found on the IRQ producer list.
+ */
+int irq_bypass_register_consumer(struct irq_bypass_consumer *consumer)
+{
+ struct irq_bypass_consumer *tmp;
+ struct irq_bypass_producer *producer;
+
+ if (!consumer->add_producer || !consumer->del_producer)
+ return -EINVAL;
+
+ might_sleep();
+
+ if (!try_module_get(THIS_MODULE))
+ return -ENODEV;
+
+ mutex_lock(&lock);
+
+ list_for_each_entry(tmp, &consumers, node) {
+ if (tmp->token == consumer->token) {
+ mutex_unlock(&lock);
+ module_put(THIS_MODULE);
+ return -EINVAL;
+ }
+ }
+
+ list_add(&consumer->node, &consumers);
+
+ list_for_each_entry(producer, &producers, node) {
+ if (producer->token == consumer->token) {
+ __connect(producer, consumer);
+ break;
+ }
+ }
+
+ mutex_unlock(&lock);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(irq_bypass_register_consumer);
+
+/**
+ * irq_bypass_unregister_consumer - unregister IRQ bypass consumer
+ * @consumer: pointer to consumer structure
+ *
+ * Remove a previously registered IRQ consumer from the list of consumers
+ * and disconnected from any connected IRQ producers.
+ */
+void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer)
+{
+ struct irq_bypass_producer *producer;
+
+ might_sleep();
+
+ mutex_lock(&lock);
+
+ list_for_each_entry(producer, &producers, node) {
+ if (producer->token == consumer->token) {
+ __disconnect(producer, consumer);
+ break;
+ }
+ }
+
+ list_del(&consumer->node);
+
+ mutex_unlock(&lock);
+ module_put(THIS_MODULE);
+}
+EXPORT_SYMBOL_GPL(irq_bypass_unregister_consumer);


2015-07-10 21:09:49

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH] virt: IRQ bypass manager



On 10/07/2015 19:52, Alex Williamson wrote:
> Perhaps if a second consumer comes along that would be justification for
> tying it elsewhere in the build system. ARM will obviously need to do
> similar. Are there better options?
>
> Also, there's no maintainer for the top level virt/ directory. Paolo,
> would you feel comfortable taking this, maybe with some additional acks?

That's okay; alternatively, we can share it since after all you wrote
most of it.

Paolo

> That would probably be the most convenient for merging the consumer code.
> Thanks,

2015-07-13 15:33:12

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH] virt: IRQ bypass manager

Hi Alex,
On 07/10/2015 07:52 PM, Alex Williamson wrote:
> When a physical I/O device is assigned to a virtual machine through
> facilities like VFIO and KVM, the interrupt for the device generally
> bounces through the host system before being injected into the VM.
> However, hardware technologies exist that often allow the host to be
> bypassed for some of these scenarios. Intel Posted Interrupts allow
> the specified physical edge interrupts to be directly injected into a
> guest when delivered to a physical processor while the vCPU is
> running. ARM IRQ Forwarding allows the hypervisor to handle level
> triggered device interrupts as edge interrupts, by giving the guest
> control of de-asserting and unmasking the interrupt line.
ARM IRQ Forwarding allows forwarded physical interrupts to be directly
deactivated by the guest?
>
> The IRQ bypass manager here is meant to provide the shim to connect
> interrupt producers, generally the host physical device driver, with
> interrupt consumers, generally the hypervisor, in order to configure
> these bypass mechanism. To do this, we base the connection on a
> shared, opaque token. For KVM-VFIO this is expected to be an
> eventfd_ctx since this is the connection we already use to connect an
> eventfd to an irqfd on the in-kernel path. When a producer and
> consumer with matching tokens is found, callbacks via both registered
> participants allow the bypass facilities to be automatically enabled.
>
> Signed-off-by: Alex Williamson <[email protected]>
> Signed-off-by: Eric Auger <[email protected]>
> ---
>
> Changes:
> - Moved to virt/lib/
> - Dropped update callback
> - Filled in missing documentation
> - @resume callback renamed to @stop
> - Only @start/@stop are optional
>
> One of the difficulties with moving this code to virt/lib is that nobody
> builds it by default. Thinking about this for a bit, it really needs a
> consumer to be useful and KVM is currently the only consumer, so I tested
> with the following:
>
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -100,5 +101,6 @@ config KVM_DEVICE_ASSIGNMENT
> # the virtualization menu.
> source drivers/vhost/Kconfig
> source drivers/lguest/Kconfig
> +source virt/lib/Kconfig
>
> endif # VIRTUALIZATION
> --- a/arch/x86/kvm/Makefile
> +++ b/arch/x86/kvm/Makefile
> @@ -20,3 +20,5 @@ kvm-amd-y += svm.o
> obj-$(CONFIG_KVM) += kvm.o
> obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
> obj-$(CONFIG_KVM_AMD) += kvm-amd.o
> +
> +obj-y += ../../../virt/lib/
>
> Perhaps if a second consumer comes along that would be justification for
> tying it elsewhere in the build system. ARM will obviously need to do
> similar. Are there better options?
>
> Also, there's no maintainer for the top level virt/ directory. Paolo,
> would you feel comfortable taking this, maybe with some additional acks?
> That would probably be the most convenient for merging the consumer code.
> Thanks,
> Alex
>
> include/linux/irqbypass.h | 90 +++++++++++++++++++
> virt/lib/Kconfig | 2
> virt/lib/Makefile | 1
> virt/lib/irqbypass.c | 212 +++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 305 insertions(+)
> create mode 100644 include/linux/irqbypass.h
> create mode 100644 virt/lib/Kconfig
> create mode 100644 virt/lib/Makefile
> create mode 100644 virt/lib/irqbypass.c
>
> diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
> new file mode 100644
> index 0000000..41df18d
> --- /dev/null
> +++ b/include/linux/irqbypass.h
> @@ -0,0 +1,90 @@
> +/*
> + * IRQ offload/bypass manager
> + *
> + * Copyright (C) 2015 Red Hat, Inc.
> + * Copyright (c) 2015 Linaro Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +#ifndef IRQBYPASS_H
> +#define IRQBYPASS_H
> +
> +#include <linux/list.h>
> +
> +struct irq_bypass_consumer;
> +
> +/*
> + * Theory of operation
> + *
> + * The IRQ bypass manager is a simple set of lists and callbacks that allows
> + * IRQ producers (ex. physical interrupt sources) to be matched to IRQ
> + * consumers (ex. virtualization hardware that allows IRQ bypass or offload)
> + * via a shared token (ex. eventfd_ctx). Producers and consumers register
> + * independently. When a token match is found, the optional @stop callback
> + * will be called for each participant. The pair will then be connected via
> + * the @add_* callbacks, and finally the optional @start callback will allow
> + * any final coordination. When either participant is unregistered, the
> + * process is repeated using the @del_* callbacks in place of the @add_*
> + * callbacks. Match tokens must be unique per producer/consumer, 1:N parings
pairings?
> + * are not supported.
> + */
> +
> +/**
> + * struct irq_bypass_producer - IRQ bypass producer definition
> + * @node: IRQ bypass manager private list management
> + * @token: opaque token to match between producer and consumer
> + * @irq: Linux IRQ number for the producer device
> + * @add_consumer: Connect the IRQ producer to an IRQ consumer
> + * @del_consumer: Disconnect the IRQ producer from an IRQ consumer
> + * @stop: Perform any quiesce operations necessary prior to add/del (optional)
> + * @start: Perform any startup operations necessary after add/del (optional)
> + *
> + * The IRQ bypass producer structure represents an interrupt source for
> + * participation in possible host bypass, for instance an interrupt vector
> + * for a physical device assigned to a VM.
> + */
> +struct irq_bypass_producer {
> + struct list_head node;
> + void *token;
> + int irq;
active
> + void (*add_consumer)(struct irq_bypass_producer *,
> + struct irq_bypass_consumer *);
> + void (*del_consumer)(struct irq_bypass_producer *,
> + struct irq_bypass_consumer *);
> + void (*stop)(struct irq_bypass_producer *);
> + void (*start)(struct irq_bypass_producer *);
> +};
> +
> +/**
> + * struct irq_bypass_consumer - IRQ bypass consumer definition
> + * @node: IRQ bypass manager private list management
> + * @token: opaque token to match between producer and consumer
> + * @add_producer: Connect the IRQ consumer to an IRQ producer
> + * @del_producer: Disconnect the IRQ consumer from an IRQ producer
> + * @stop: Perform any quiesce operations necessary prior to add/del (optional)
> + * @start: Perform any startup operations necessary after add/del (optional)
> + *
> + * The IRQ bypass consumer structure represents an interrupt sink for
> + * participation in possible host bypass, for instance a hypervisor may
> + * support offloads to allow bypassing the host entirely or offload
> + * portions of the interrupt handling to the VM.
> + */
> +struct irq_bypass_consumer {
> + struct list_head node;
> + void *token;
> + void (*add_producer)(struct irq_bypass_consumer *,
> + struct irq_bypass_producer *);
> + void (*del_producer)(struct irq_bypass_consumer *,
> + struct irq_bypass_producer *);
> + void (*stop)(struct irq_bypass_consumer *);
> + void (*start)(struct irq_bypass_consumer *);
> +};
> +
> +int irq_bypass_register_producer(struct irq_bypass_producer *);
> +void irq_bypass_unregister_producer(struct irq_bypass_producer *);
> +int irq_bypass_register_consumer(struct irq_bypass_consumer *);
> +void irq_bypass_unregister_consumer(struct irq_bypass_consumer *);
> +
> +#endif /* IRQBYPASS_H */
> diff --git a/virt/lib/Kconfig b/virt/lib/Kconfig
> new file mode 100644
> index 0000000..89a414f
> --- /dev/null
> +++ b/virt/lib/Kconfig
> @@ -0,0 +1,2 @@
> +config IRQ_BYPASS_MANAGER
> + tristate
> diff --git a/virt/lib/Makefile b/virt/lib/Makefile
> new file mode 100644
> index 0000000..901228d
> --- /dev/null
> +++ b/virt/lib/Makefile
> @@ -0,0 +1 @@
> +obj-$(CONFIG_IRQ_BYPASS_MANAGER) += irqbypass.o
> diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c
> new file mode 100644
> index 0000000..f1091e6
> --- /dev/null
> +++ b/virt/lib/irqbypass.c
> @@ -0,0 +1,212 @@
> +/*
> + * IRQ offload/bypass manager
> + *
> + * Copyright (C) 2015 Red Hat, Inc.
> + * Copyright (c) 2015 Linaro Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * Various virtualization hardware acceleration techniques allow bypassing
> + * or offloading interrupts received from devices around the host kernel.
> + * Posted Interrupts on Intel VT-d systems can allow interrupts to be
> + * received directly by a virtual machine. ARM IRQ Forwarding can allow
> + * level triggered device interrupts to be de-asserted directly by the VM.
ARM IRQ Forwarding allows forwarded physical interrupts to be directly
deactivated by the guest
> + * This manager allows interrupt producers and consumers to find each other
> + * to enable this sort of bypass.
> + */
> +
> +#include <linux/irqbypass.h>
> +#include <linux/list.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_DESCRIPTION("IRQ bypass manager utility module");
> +
> +static LIST_HEAD(producers);
> +static LIST_HEAD(consumers);
> +static DEFINE_MUTEX(lock);
> +
> +/* @lock must be held when calling connect */
> +static void __connect(struct irq_bypass_producer *prod,
> + struct irq_bypass_consumer *cons)
> +{
> + if (prod->stop)
> + prod->stop(prod);
> + if (cons->stop)
> + cons->stop(cons);
> +
> + prod->add_consumer(prod, cons);
> + cons->add_producer(cons, prod);

In case you are reluctant to add the active boolean - which looks as a
dirty hack I acknowledge -, could we change the proto of add_* so that
they return an error. if any of the add_* fails __connect would restore
the initial state and return an error. list_add would not be done.

I can prototype this and add it in my forwarding series if you prefer.

> +
> + if (cons->start)
> + cons->start(cons);
> + if (prod->start)
> + prod->start(prod);
> +}
> +
> +/* @lock must be held when calling disconnect */
> +static void __disconnect(struct irq_bypass_producer *prod,
> + struct irq_bypass_consumer *cons)
> +{
> + if (prod->stop)
> + prod->stop(prod);
> + if (cons->stop)
> + cons->stop(cons);
> +
> + cons->del_producer(cons, prod);
> + prod->del_consumer(prod, cons);
> +
> + if (cons->start)
> + cons->start(cons);
> + if (prod->start)
> + prod->start(prod);
> +}
> +
> +/**
> + * irq_bypass_register_producer - register IRQ bypass producer
> + * @producer: pointer to producer structure
> + *
> + * Add the provided IRQ producer to the list of producers and connect
> + * with any matching tokens found on the IRQ consumers list.
token? 1-1 pairing only. not sure about the usage of plural in that case
though. Please ignore if this is an english language mistake ;-)

> + */
> +int irq_bypass_register_producer(struct irq_bypass_producer *producer)
> +{
> + struct irq_bypass_producer *tmp;
> + struct irq_bypass_consumer *consumer;
> +
> + if (!producer->add_consumer || !producer->del_consumer)
> + return -EINVAL;
> +
> + might_sleep();
> +
> + if (!try_module_get(THIS_MODULE))
> + return -ENODEV;
> +
> + mutex_lock(&lock);
> +
> + list_for_each_entry(tmp, &producers, node) {
> + if (tmp->token == producer->token) {
> + mutex_unlock(&lock);
> + module_put(THIS_MODULE);
> + return -EINVAL;
> + }
> + }
> +
> + list_add(&producer->node, &producers);
> +
> + list_for_each_entry(consumer, &consumers, node) {
> + if (consumer->token == producer->token) {
> + __connect(producer, consumer);
> + break;
> + }
> + }
> +
> + mutex_unlock(&lock);
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(irq_bypass_register_producer);
> +
> +/**
> + * irq_bypass_unregister_producer - unregister IRQ bypass producer
> + * @producer: pointer to producer structure
> + *
> + * Remove a previously registered IRQ producer from the list of producers
> + * and disconnected from any connected IRQ consumers.
disconnect it from any connected IRQ consumer?
> + */
> +void irq_bypass_unregister_producer(struct irq_bypass_producer *producer)
> +{
> + struct irq_bypass_consumer *consumer;
> +
> + might_sleep();
> +
> + mutex_lock(&lock);
> +
> + list_for_each_entry(consumer, &consumers, node) {
> + if (consumer->token == producer->token) {
> + __disconnect(producer, consumer);
> + break;
> + }
> + }
> +
> + list_del(&producer->node);
> +
> + mutex_unlock(&lock);
> + module_put(THIS_MODULE);
> +}
> +EXPORT_SYMBOL_GPL(irq_bypass_unregister_producer);
> +
> +/**
> + * irq_bypass_register_consumer - register IRQ bypass consumer
> + * @consumer: pointer to consumer structure
> + *
> + * Add the provided IRQ consumer to the list of consumers and connect
> + * with any matching tokens found on the IRQ producer list.
token?
> + */
> +int irq_bypass_register_consumer(struct irq_bypass_consumer *consumer)
> +{
> + struct irq_bypass_consumer *tmp;
> + struct irq_bypass_producer *producer;
> +
> + if (!consumer->add_producer || !consumer->del_producer)
> + return -EINVAL;
> +
> + might_sleep();
> +
> + if (!try_module_get(THIS_MODULE))
> + return -ENODEV;
> +
> + mutex_lock(&lock);
> +
> + list_for_each_entry(tmp, &consumers, node) {
> + if (tmp->token == consumer->token) {
> + mutex_unlock(&lock);
> + module_put(THIS_MODULE);
> + return -EINVAL;
> + }
> + }
> +
> + list_add(&consumer->node, &consumers);
> +
> + list_for_each_entry(producer, &producers, node) {
> + if (producer->token == consumer->token) {
> + __connect(producer, consumer);
> + break;
> + }
> + }
> +
> + mutex_unlock(&lock);
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(irq_bypass_register_consumer);
> +
> +/**
> + * irq_bypass_unregister_consumer - unregister IRQ bypass consumer
> + * @consumer: pointer to consumer structure
> + *
> + * Remove a previously registered IRQ consumer from the list of consumers
> + * and disconnected from any connected IRQ producers.
disconnect it from any connected producer.

Best Regards

Eric
> + */
> +void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer)
> +{
> + struct irq_bypass_producer *producer;
> +
> + might_sleep();
> +
> + mutex_lock(&lock);
> +
> + list_for_each_entry(producer, &producers, node) {
> + if (producer->token == consumer->token) {
> + __disconnect(producer, consumer);
> + break;
> + }
> + }
> +
> + list_del(&consumer->node);
> +
> + mutex_unlock(&lock);
> + module_put(THIS_MODULE);
> +}
> +EXPORT_SYMBOL_GPL(irq_bypass_unregister_consumer);
>

2015-07-13 18:25:26

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH] virt: IRQ bypass manager

On Mon, 2015-07-13 at 17:32 +0200, Eric Auger wrote:
> Hi Alex,
> On 07/10/2015 07:52 PM, Alex Williamson wrote:
> > When a physical I/O device is assigned to a virtual machine through
> > facilities like VFIO and KVM, the interrupt for the device generally
> > bounces through the host system before being injected into the VM.
> > However, hardware technologies exist that often allow the host to be
> > bypassed for some of these scenarios. Intel Posted Interrupts allow
> > the specified physical edge interrupts to be directly injected into a
> > guest when delivered to a physical processor while the vCPU is
> > running. ARM IRQ Forwarding allows the hypervisor to handle level
> > triggered device interrupts as edge interrupts, by giving the guest
> > control of de-asserting and unmasking the interrupt line.
> ARM IRQ Forwarding allows forwarded physical interrupts to be directly
> deactivated by the guest?

Replaced

> > The IRQ bypass manager here is meant to provide the shim to connect
> > interrupt producers, generally the host physical device driver, with
> > interrupt consumers, generally the hypervisor, in order to configure
> > these bypass mechanism. To do this, we base the connection on a
> > shared, opaque token. For KVM-VFIO this is expected to be an
> > eventfd_ctx since this is the connection we already use to connect an
> > eventfd to an irqfd on the in-kernel path. When a producer and
> > consumer with matching tokens is found, callbacks via both registered
> > participants allow the bypass facilities to be automatically enabled.
> >
> > Signed-off-by: Alex Williamson <[email protected]>
> > Signed-off-by: Eric Auger <[email protected]>
> > ---
> >
> > Changes:
> > - Moved to virt/lib/
> > - Dropped update callback
> > - Filled in missing documentation
> > - @resume callback renamed to @stop
> > - Only @start/@stop are optional
> >
> > One of the difficulties with moving this code to virt/lib is that nobody
> > builds it by default. Thinking about this for a bit, it really needs a
> > consumer to be useful and KVM is currently the only consumer, so I tested
> > with the following:
> >
> > --- a/arch/x86/kvm/Kconfig
> > +++ b/arch/x86/kvm/Kconfig
> > @@ -100,5 +101,6 @@ config KVM_DEVICE_ASSIGNMENT
> > # the virtualization menu.
> > source drivers/vhost/Kconfig
> > source drivers/lguest/Kconfig
> > +source virt/lib/Kconfig
> >
> > endif # VIRTUALIZATION
> > --- a/arch/x86/kvm/Makefile
> > +++ b/arch/x86/kvm/Makefile
> > @@ -20,3 +20,5 @@ kvm-amd-y += svm.o
> > obj-$(CONFIG_KVM) += kvm.o
> > obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
> > obj-$(CONFIG_KVM_AMD) += kvm-amd.o
> > +
> > +obj-y += ../../../virt/lib/
> >
> > Perhaps if a second consumer comes along that would be justification for
> > tying it elsewhere in the build system. ARM will obviously need to do
> > similar. Are there better options?
> >
> > Also, there's no maintainer for the top level virt/ directory. Paolo,
> > would you feel comfortable taking this, maybe with some additional acks?
> > That would probably be the most convenient for merging the consumer code.
> > Thanks,
> > Alex
> >
> > include/linux/irqbypass.h | 90 +++++++++++++++++++
> > virt/lib/Kconfig | 2
> > virt/lib/Makefile | 1
> > virt/lib/irqbypass.c | 212 +++++++++++++++++++++++++++++++++++++++++++++
> > 4 files changed, 305 insertions(+)
> > create mode 100644 include/linux/irqbypass.h
> > create mode 100644 virt/lib/Kconfig
> > create mode 100644 virt/lib/Makefile
> > create mode 100644 virt/lib/irqbypass.c
> >
> > diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
> > new file mode 100644
> > index 0000000..41df18d
> > --- /dev/null
> > +++ b/include/linux/irqbypass.h
> > @@ -0,0 +1,90 @@
> > +/*
> > + * IRQ offload/bypass manager
> > + *
> > + * Copyright (C) 2015 Red Hat, Inc.
> > + * Copyright (c) 2015 Linaro Ltd.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + */
> > +#ifndef IRQBYPASS_H
> > +#define IRQBYPASS_H
> > +
> > +#include <linux/list.h>
> > +
> > +struct irq_bypass_consumer;
> > +
> > +/*
> > + * Theory of operation
> > + *
> > + * The IRQ bypass manager is a simple set of lists and callbacks that allows
> > + * IRQ producers (ex. physical interrupt sources) to be matched to IRQ
> > + * consumers (ex. virtualization hardware that allows IRQ bypass or offload)
> > + * via a shared token (ex. eventfd_ctx). Producers and consumers register
> > + * independently. When a token match is found, the optional @stop callback
> > + * will be called for each participant. The pair will then be connected via
> > + * the @add_* callbacks, and finally the optional @start callback will allow
> > + * any final coordination. When either participant is unregistered, the
> > + * process is repeated using the @del_* callbacks in place of the @add_*
> > + * callbacks. Match tokens must be unique per producer/consumer, 1:N parings
> pairings?

Fixed

> > + * are not supported.
> > + */
> > +
> > +/**
> > + * struct irq_bypass_producer - IRQ bypass producer definition
> > + * @node: IRQ bypass manager private list management
> > + * @token: opaque token to match between producer and consumer
> > + * @irq: Linux IRQ number for the producer device
> > + * @add_consumer: Connect the IRQ producer to an IRQ consumer
> > + * @del_consumer: Disconnect the IRQ producer from an IRQ consumer
> > + * @stop: Perform any quiesce operations necessary prior to add/del (optional)
> > + * @start: Perform any startup operations necessary after add/del (optional)
> > + *
> > + * The IRQ bypass producer structure represents an interrupt source for
> > + * participation in possible host bypass, for instance an interrupt vector
> > + * for a physical device assigned to a VM.
> > + */
> > +struct irq_bypass_producer {
> > + struct list_head node;
> > + void *token;
> > + int irq;
> active

See below

> > + void (*add_consumer)(struct irq_bypass_producer *,
> > + struct irq_bypass_consumer *);
> > + void (*del_consumer)(struct irq_bypass_producer *,
> > + struct irq_bypass_consumer *);
> > + void (*stop)(struct irq_bypass_producer *);
> > + void (*start)(struct irq_bypass_producer *);
> > +};
> > +
> > +/**
> > + * struct irq_bypass_consumer - IRQ bypass consumer definition
> > + * @node: IRQ bypass manager private list management
> > + * @token: opaque token to match between producer and consumer
> > + * @add_producer: Connect the IRQ consumer to an IRQ producer
> > + * @del_producer: Disconnect the IRQ consumer from an IRQ producer
> > + * @stop: Perform any quiesce operations necessary prior to add/del (optional)
> > + * @start: Perform any startup operations necessary after add/del (optional)
> > + *
> > + * The IRQ bypass consumer structure represents an interrupt sink for
> > + * participation in possible host bypass, for instance a hypervisor may
> > + * support offloads to allow bypassing the host entirely or offload
> > + * portions of the interrupt handling to the VM.
> > + */
> > +struct irq_bypass_consumer {
> > + struct list_head node;
> > + void *token;
> > + void (*add_producer)(struct irq_bypass_consumer *,
> > + struct irq_bypass_producer *);
> > + void (*del_producer)(struct irq_bypass_consumer *,
> > + struct irq_bypass_producer *);
> > + void (*stop)(struct irq_bypass_consumer *);
> > + void (*start)(struct irq_bypass_consumer *);
> > +};
> > +
> > +int irq_bypass_register_producer(struct irq_bypass_producer *);
> > +void irq_bypass_unregister_producer(struct irq_bypass_producer *);
> > +int irq_bypass_register_consumer(struct irq_bypass_consumer *);
> > +void irq_bypass_unregister_consumer(struct irq_bypass_consumer *);
> > +
> > +#endif /* IRQBYPASS_H */
> > diff --git a/virt/lib/Kconfig b/virt/lib/Kconfig
> > new file mode 100644
> > index 0000000..89a414f
> > --- /dev/null
> > +++ b/virt/lib/Kconfig
> > @@ -0,0 +1,2 @@
> > +config IRQ_BYPASS_MANAGER
> > + tristate
> > diff --git a/virt/lib/Makefile b/virt/lib/Makefile
> > new file mode 100644
> > index 0000000..901228d
> > --- /dev/null
> > +++ b/virt/lib/Makefile
> > @@ -0,0 +1 @@
> > +obj-$(CONFIG_IRQ_BYPASS_MANAGER) += irqbypass.o
> > diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c
> > new file mode 100644
> > index 0000000..f1091e6
> > --- /dev/null
> > +++ b/virt/lib/irqbypass.c
> > @@ -0,0 +1,212 @@
> > +/*
> > + * IRQ offload/bypass manager
> > + *
> > + * Copyright (C) 2015 Red Hat, Inc.
> > + * Copyright (c) 2015 Linaro Ltd.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * Various virtualization hardware acceleration techniques allow bypassing
> > + * or offloading interrupts received from devices around the host kernel.
> > + * Posted Interrupts on Intel VT-d systems can allow interrupts to be
> > + * received directly by a virtual machine. ARM IRQ Forwarding can allow
> > + * level triggered device interrupts to be de-asserted directly by the VM.
> ARM IRQ Forwarding allows forwarded physical interrupts to be directly
> deactivated by the guest

Replaced

> > + * This manager allows interrupt producers and consumers to find each other
> > + * to enable this sort of bypass.
> > + */
> > +
> > +#include <linux/irqbypass.h>
> > +#include <linux/list.h>
> > +#include <linux/module.h>
> > +#include <linux/mutex.h>
> > +
> > +MODULE_LICENSE("GPL v2");
> > +MODULE_DESCRIPTION("IRQ bypass manager utility module");
> > +
> > +static LIST_HEAD(producers);
> > +static LIST_HEAD(consumers);
> > +static DEFINE_MUTEX(lock);
> > +
> > +/* @lock must be held when calling connect */
> > +static void __connect(struct irq_bypass_producer *prod,
> > + struct irq_bypass_consumer *cons)
> > +{
> > + if (prod->stop)
> > + prod->stop(prod);
> > + if (cons->stop)
> > + cons->stop(cons);
> > +
> > + prod->add_consumer(prod, cons);
> > + cons->add_producer(cons, prod);
>
> In case you are reluctant to add the active boolean - which looks as a
> dirty hack I acknowledge -, could we change the proto of add_* so that
> they return an error. if any of the add_* fails __connect would restore
> the initial state and return an error. list_add would not be done.
>
> I can prototype this and add it in my forwarding series if you prefer.

The active boolean does seem like a hack, its state that either
participant should be able to generate or track themselves. In fact, if
we add the error paths you suggest, we can determine on-demand whether a
bypass is active simply by looking for a token on both lists. I'll post
a version with error paths.

> > +
> > + if (cons->start)
> > + cons->start(cons);
> > + if (prod->start)
> > + prod->start(prod);
> > +}
> > +
> > +/* @lock must be held when calling disconnect */
> > +static void __disconnect(struct irq_bypass_producer *prod,
> > + struct irq_bypass_consumer *cons)
> > +{
> > + if (prod->stop)
> > + prod->stop(prod);
> > + if (cons->stop)
> > + cons->stop(cons);
> > +
> > + cons->del_producer(cons, prod);
> > + prod->del_consumer(prod, cons);
> > +
> > + if (cons->start)
> > + cons->start(cons);
> > + if (prod->start)
> > + prod->start(prod);
> > +}
> > +
> > +/**
> > + * irq_bypass_register_producer - register IRQ bypass producer
> > + * @producer: pointer to producer structure
> > + *
> > + * Add the provided IRQ producer to the list of producers and connect
> > + * with any matching tokens found on the IRQ consumers list.
> token? 1-1 pairing only. not sure about the usage of plural in that case
> though. Please ignore if this is an english language mistake ;-)

Yes, singular is correct.

> > + */
> > +int irq_bypass_register_producer(struct irq_bypass_producer *producer)
> > +{
> > + struct irq_bypass_producer *tmp;
> > + struct irq_bypass_consumer *consumer;
> > +
> > + if (!producer->add_consumer || !producer->del_consumer)
> > + return -EINVAL;
> > +
> > + might_sleep();
> > +
> > + if (!try_module_get(THIS_MODULE))
> > + return -ENODEV;
> > +
> > + mutex_lock(&lock);
> > +
> > + list_for_each_entry(tmp, &producers, node) {
> > + if (tmp->token == producer->token) {
> > + mutex_unlock(&lock);
> > + module_put(THIS_MODULE);
> > + return -EINVAL;
> > + }
> > + }
> > +
> > + list_add(&producer->node, &producers);
> > +
> > + list_for_each_entry(consumer, &consumers, node) {
> > + if (consumer->token == producer->token) {
> > + __connect(producer, consumer);
> > + break;
> > + }
> > + }
> > +
> > + mutex_unlock(&lock);
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(irq_bypass_register_producer);
> > +
> > +/**
> > + * irq_bypass_unregister_producer - unregister IRQ bypass producer
> > + * @producer: pointer to producer structure
> > + *
> > + * Remove a previously registered IRQ producer from the list of producers
> > + * and disconnected from any connected IRQ consumers.
> disconnect it from any connected IRQ consumer?

Yes

> > + */
> > +void irq_bypass_unregister_producer(struct irq_bypass_producer *producer)
> > +{
> > + struct irq_bypass_consumer *consumer;
> > +
> > + might_sleep();
> > +
> > + mutex_lock(&lock);
> > +
> > + list_for_each_entry(consumer, &consumers, node) {
> > + if (consumer->token == producer->token) {
> > + __disconnect(producer, consumer);
> > + break;
> > + }
> > + }
> > +
> > + list_del(&producer->node);
> > +
> > + mutex_unlock(&lock);
> > + module_put(THIS_MODULE);
> > +}
> > +EXPORT_SYMBOL_GPL(irq_bypass_unregister_producer);
> > +
> > +/**
> > + * irq_bypass_register_consumer - register IRQ bypass consumer
> > + * @consumer: pointer to consumer structure
> > + *
> > + * Add the provided IRQ consumer to the list of consumers and connect
> > + * with any matching tokens found on the IRQ producer list.
> token?

Yes

> > + */
> > +int irq_bypass_register_consumer(struct irq_bypass_consumer *consumer)
> > +{
> > + struct irq_bypass_consumer *tmp;
> > + struct irq_bypass_producer *producer;
> > +
> > + if (!consumer->add_producer || !consumer->del_producer)
> > + return -EINVAL;
> > +
> > + might_sleep();
> > +
> > + if (!try_module_get(THIS_MODULE))
> > + return -ENODEV;
> > +
> > + mutex_lock(&lock);
> > +
> > + list_for_each_entry(tmp, &consumers, node) {
> > + if (tmp->token == consumer->token) {
> > + mutex_unlock(&lock);
> > + module_put(THIS_MODULE);
> > + return -EINVAL;
> > + }
> > + }
> > +
> > + list_add(&consumer->node, &consumers);
> > +
> > + list_for_each_entry(producer, &producers, node) {
> > + if (producer->token == consumer->token) {
> > + __connect(producer, consumer);
> > + break;
> > + }
> > + }
> > +
> > + mutex_unlock(&lock);
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(irq_bypass_register_consumer);
> > +
> > +/**
> > + * irq_bypass_unregister_consumer - unregister IRQ bypass consumer
> > + * @consumer: pointer to consumer structure
> > + *
> > + * Remove a previously registered IRQ consumer from the list of consumers
> > + * and disconnected from any connected IRQ producers.
> disconnect it from any connected producer.

Yep

Thanks for the review!

Alex

> > + */
> > +void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer)
> > +{
> > + struct irq_bypass_producer *producer;
> > +
> > + might_sleep();
> > +
> > + mutex_lock(&lock);
> > +
> > + list_for_each_entry(producer, &producers, node) {
> > + if (producer->token == consumer->token) {
> > + __disconnect(producer, consumer);
> > + break;
> > + }
> > + }
> > +
> > + list_del(&consumer->node);
> > +
> > + mutex_unlock(&lock);
> > + module_put(THIS_MODULE);
> > +}
> > +EXPORT_SYMBOL_GPL(irq_bypass_unregister_consumer);
> >
>


2015-07-13 18:37:28

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH] virt: IRQ bypass manager

On Fri, 2015-07-10 at 23:09 +0200, Paolo Bonzini wrote:
>
> On 10/07/2015 19:52, Alex Williamson wrote:
> > Perhaps if a second consumer comes along that would be justification for
> > tying it elsewhere in the build system. ARM will obviously need to do
> > similar. Are there better options?
> >
> > Also, there's no maintainer for the top level virt/ directory. Paolo,
> > would you feel comfortable taking this, maybe with some additional acks?
>
> That's okay; alternatively, we can share it since after all you wrote
> most of it.

Sure. Actually my wording was confusing, I was actually just suggesting
the patch be merged through the KVM tree since I think the consumer code
is going to be significantly bulkier than the producer code. I can
certainly add a MAINTAINERS entry for virt/lib/ that we can share if you
like. Thanks,

Alex

2015-07-13 20:58:50

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH] virt: IRQ bypass manager

Hi Alex,
On 07/13/2015 08:25 PM, Alex Williamson wrote:
> On Mon, 2015-07-13 at 17:32 +0200, Eric Auger wrote:
>> Hi Alex,
>> On 07/10/2015 07:52 PM, Alex Williamson wrote:
>>> When a physical I/O device is assigned to a virtual machine through
>>> facilities like VFIO and KVM, the interrupt for the device generally
>>> bounces through the host system before being injected into the VM.
>>> However, hardware technologies exist that often allow the host to be
>>> bypassed for some of these scenarios. Intel Posted Interrupts allow
>>> the specified physical edge interrupts to be directly injected into a
>>> guest when delivered to a physical processor while the vCPU is
>>> running. ARM IRQ Forwarding allows the hypervisor to handle level
>>> triggered device interrupts as edge interrupts, by giving the guest
>>> control of de-asserting and unmasking the interrupt line.
>> ARM IRQ Forwarding allows forwarded physical interrupts to be directly
>> deactivated by the guest?
>
> Replaced
>
>>> The IRQ bypass manager here is meant to provide the shim to connect
>>> interrupt producers, generally the host physical device driver, with
>>> interrupt consumers, generally the hypervisor, in order to configure
>>> these bypass mechanism. To do this, we base the connection on a
>>> shared, opaque token. For KVM-VFIO this is expected to be an
>>> eventfd_ctx since this is the connection we already use to connect an
>>> eventfd to an irqfd on the in-kernel path. When a producer and
>>> consumer with matching tokens is found, callbacks via both registered
>>> participants allow the bypass facilities to be automatically enabled.
>>>
>>> Signed-off-by: Alex Williamson <[email protected]>
>>> Signed-off-by: Eric Auger <[email protected]>
>>> ---
>>>
>>> Changes:
>>> - Moved to virt/lib/
>>> - Dropped update callback
>>> - Filled in missing documentation
>>> - @resume callback renamed to @stop
>>> - Only @start/@stop are optional
>>>
>>> One of the difficulties with moving this code to virt/lib is that nobody
>>> builds it by default. Thinking about this for a bit, it really needs a
>>> consumer to be useful and KVM is currently the only consumer, so I tested
>>> with the following:
>>>
>>> --- a/arch/x86/kvm/Kconfig
>>> +++ b/arch/x86/kvm/Kconfig
>>> @@ -100,5 +101,6 @@ config KVM_DEVICE_ASSIGNMENT
>>> # the virtualization menu.
>>> source drivers/vhost/Kconfig
>>> source drivers/lguest/Kconfig
>>> +source virt/lib/Kconfig
>>>
>>> endif # VIRTUALIZATION
>>> --- a/arch/x86/kvm/Makefile
>>> +++ b/arch/x86/kvm/Makefile
>>> @@ -20,3 +20,5 @@ kvm-amd-y += svm.o
>>> obj-$(CONFIG_KVM) += kvm.o
>>> obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
>>> obj-$(CONFIG_KVM_AMD) += kvm-amd.o
>>> +
>>> +obj-y += ../../../virt/lib/
>>>
>>> Perhaps if a second consumer comes along that would be justification for
>>> tying it elsewhere in the build system. ARM will obviously need to do
>>> similar. Are there better options?
>>>
>>> Also, there's no maintainer for the top level virt/ directory. Paolo,
>>> would you feel comfortable taking this, maybe with some additional acks?
>>> That would probably be the most convenient for merging the consumer code.
>>> Thanks,
>>> Alex
>>>
>>> include/linux/irqbypass.h | 90 +++++++++++++++++++
>>> virt/lib/Kconfig | 2
>>> virt/lib/Makefile | 1
>>> virt/lib/irqbypass.c | 212 +++++++++++++++++++++++++++++++++++++++++++++
>>> 4 files changed, 305 insertions(+)
>>> create mode 100644 include/linux/irqbypass.h
>>> create mode 100644 virt/lib/Kconfig
>>> create mode 100644 virt/lib/Makefile
>>> create mode 100644 virt/lib/irqbypass.c
>>>
>>> diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
>>> new file mode 100644
>>> index 0000000..41df18d
>>> --- /dev/null
>>> +++ b/include/linux/irqbypass.h
>>> @@ -0,0 +1,90 @@
>>> +/*
>>> + * IRQ offload/bypass manager
>>> + *
>>> + * Copyright (C) 2015 Red Hat, Inc.
>>> + * Copyright (c) 2015 Linaro Ltd.
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + */
>>> +#ifndef IRQBYPASS_H
>>> +#define IRQBYPASS_H
>>> +
>>> +#include <linux/list.h>
>>> +
>>> +struct irq_bypass_consumer;
>>> +
>>> +/*
>>> + * Theory of operation
>>> + *
>>> + * The IRQ bypass manager is a simple set of lists and callbacks that allows
>>> + * IRQ producers (ex. physical interrupt sources) to be matched to IRQ
>>> + * consumers (ex. virtualization hardware that allows IRQ bypass or offload)
>>> + * via a shared token (ex. eventfd_ctx). Producers and consumers register
>>> + * independently. When a token match is found, the optional @stop callback
>>> + * will be called for each participant. The pair will then be connected via
>>> + * the @add_* callbacks, and finally the optional @start callback will allow
>>> + * any final coordination. When either participant is unregistered, the
>>> + * process is repeated using the @del_* callbacks in place of the @add_*
>>> + * callbacks. Match tokens must be unique per producer/consumer, 1:N parings
>> pairings?
>
> Fixed
>
>>> + * are not supported.
>>> + */
>>> +
>>> +/**
>>> + * struct irq_bypass_producer - IRQ bypass producer definition
>>> + * @node: IRQ bypass manager private list management
>>> + * @token: opaque token to match between producer and consumer
>>> + * @irq: Linux IRQ number for the producer device
>>> + * @add_consumer: Connect the IRQ producer to an IRQ consumer
>>> + * @del_consumer: Disconnect the IRQ producer from an IRQ consumer
>>> + * @stop: Perform any quiesce operations necessary prior to add/del (optional)
>>> + * @start: Perform any startup operations necessary after add/del (optional)
>>> + *
>>> + * The IRQ bypass producer structure represents an interrupt source for
>>> + * participation in possible host bypass, for instance an interrupt vector
>>> + * for a physical device assigned to a VM.
>>> + */
>>> +struct irq_bypass_producer {
>>> + struct list_head node;
>>> + void *token;
>>> + int irq;
>> active
>
> See below
>
>>> + void (*add_consumer)(struct irq_bypass_producer *,
>>> + struct irq_bypass_consumer *);
>>> + void (*del_consumer)(struct irq_bypass_producer *,
>>> + struct irq_bypass_consumer *);
>>> + void (*stop)(struct irq_bypass_producer *);
>>> + void (*start)(struct irq_bypass_producer *);
>>> +};
>>> +
>>> +/**
>>> + * struct irq_bypass_consumer - IRQ bypass consumer definition
>>> + * @node: IRQ bypass manager private list management
>>> + * @token: opaque token to match between producer and consumer
>>> + * @add_producer: Connect the IRQ consumer to an IRQ producer
>>> + * @del_producer: Disconnect the IRQ consumer from an IRQ producer
>>> + * @stop: Perform any quiesce operations necessary prior to add/del (optional)
>>> + * @start: Perform any startup operations necessary after add/del (optional)
>>> + *
>>> + * The IRQ bypass consumer structure represents an interrupt sink for
>>> + * participation in possible host bypass, for instance a hypervisor may
>>> + * support offloads to allow bypassing the host entirely or offload
>>> + * portions of the interrupt handling to the VM.
>>> + */
>>> +struct irq_bypass_consumer {
>>> + struct list_head node;
>>> + void *token;
>>> + void (*add_producer)(struct irq_bypass_consumer *,
>>> + struct irq_bypass_producer *);
>>> + void (*del_producer)(struct irq_bypass_consumer *,
>>> + struct irq_bypass_producer *);
>>> + void (*stop)(struct irq_bypass_consumer *);
>>> + void (*start)(struct irq_bypass_consumer *);
>>> +};
>>> +
>>> +int irq_bypass_register_producer(struct irq_bypass_producer *);
>>> +void irq_bypass_unregister_producer(struct irq_bypass_producer *);
>>> +int irq_bypass_register_consumer(struct irq_bypass_consumer *);
>>> +void irq_bypass_unregister_consumer(struct irq_bypass_consumer *);
>>> +
>>> +#endif /* IRQBYPASS_H */
>>> diff --git a/virt/lib/Kconfig b/virt/lib/Kconfig
>>> new file mode 100644
>>> index 0000000..89a414f
>>> --- /dev/null
>>> +++ b/virt/lib/Kconfig
>>> @@ -0,0 +1,2 @@
>>> +config IRQ_BYPASS_MANAGER
>>> + tristate
>>> diff --git a/virt/lib/Makefile b/virt/lib/Makefile
>>> new file mode 100644
>>> index 0000000..901228d
>>> --- /dev/null
>>> +++ b/virt/lib/Makefile
>>> @@ -0,0 +1 @@
>>> +obj-$(CONFIG_IRQ_BYPASS_MANAGER) += irqbypass.o
>>> diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c
>>> new file mode 100644
>>> index 0000000..f1091e6
>>> --- /dev/null
>>> +++ b/virt/lib/irqbypass.c
>>> @@ -0,0 +1,212 @@
>>> +/*
>>> + * IRQ offload/bypass manager
>>> + *
>>> + * Copyright (C) 2015 Red Hat, Inc.
>>> + * Copyright (c) 2015 Linaro Ltd.
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + *
>>> + * Various virtualization hardware acceleration techniques allow bypassing
>>> + * or offloading interrupts received from devices around the host kernel.
>>> + * Posted Interrupts on Intel VT-d systems can allow interrupts to be
>>> + * received directly by a virtual machine. ARM IRQ Forwarding can allow
>>> + * level triggered device interrupts to be de-asserted directly by the VM.
>> ARM IRQ Forwarding allows forwarded physical interrupts to be directly
>> deactivated by the guest
>
> Replaced
>
>>> + * This manager allows interrupt producers and consumers to find each other
>>> + * to enable this sort of bypass.
>>> + */
>>> +
>>> +#include <linux/irqbypass.h>
>>> +#include <linux/list.h>
>>> +#include <linux/module.h>
>>> +#include <linux/mutex.h>
>>> +
>>> +MODULE_LICENSE("GPL v2");
>>> +MODULE_DESCRIPTION("IRQ bypass manager utility module");
>>> +
>>> +static LIST_HEAD(producers);
>>> +static LIST_HEAD(consumers);
>>> +static DEFINE_MUTEX(lock);
>>> +
>>> +/* @lock must be held when calling connect */
>>> +static void __connect(struct irq_bypass_producer *prod,
>>> + struct irq_bypass_consumer *cons)
>>> +{
>>> + if (prod->stop)
>>> + prod->stop(prod);
>>> + if (cons->stop)
>>> + cons->stop(cons);
>>> +
>>> + prod->add_consumer(prod, cons);
>>> + cons->add_producer(cons, prod);
>>
>> In case you are reluctant to add the active boolean - which looks as a
>> dirty hack I acknowledge -, could we change the proto of add_* so that
>> they return an error. if any of the add_* fails __connect would restore
>> the initial state and return an error. list_add would not be done.
>>
>> I can prototype this and add it in my forwarding series if you prefer.
>
> The active boolean does seem like a hack, its state that either
> participant should be able to generate or track themselves. In fact, if
> we add the error paths you suggest, we can determine on-demand whether a
> bypass is active simply by looking for a token on both lists. I'll post
> a version with error paths.
- error path makes possible to reject any forward attempt when not safe
which is really useful for forwarding
- however this is not sufficient for the unforward (del) where I really
need to communicate the IRQ active state computed by the consumer to the
producer: KVM detects the physical IRQ was not deactivated by the guest
so VFIO driver must mask it as KVM will deactivate it.

Just to make sure we share the same understanding the active state does
not mean that the link between the producer and the consumer is in place
but really corresponds to the IRQ state computed by the
consumer/producer on add/del and communicated to the producer/consumer.

maybe a more generic solution is to pass a void* data when registering
the producer (stored as a member in the producer) and change the proto
of add/del to use a 3d arg consisting of this data pointer. Then
__connect and __disconnect would call add/del with producer->data;
Data semantic then could be any data struct to be exchanged by fellow
add functions or del functions

in my case data would point to a boolean stored in the vfio_platform_irq.

What do you think?

Thanks

Eric
>
>>> +
>>> + if (cons->start)
>>> + cons->start(cons);
>>> + if (prod->start)
>>> + prod->start(prod);
>>> +}
>>> +
>>> +/* @lock must be held when calling disconnect */
>>> +static void __disconnect(struct irq_bypass_producer *prod,
>>> + struct irq_bypass_consumer *cons)
>>> +{
>>> + if (prod->stop)
>>> + prod->stop(prod);
>>> + if (cons->stop)
>>> + cons->stop(cons);
>>> +
>>> + cons->del_producer(cons, prod);
>>> + prod->del_consumer(prod, cons);
>>> +
>>> + if (cons->start)
>>> + cons->start(cons);
>>> + if (prod->start)
>>> + prod->start(prod);
>>> +}
>>> +
>>> +/**
>>> + * irq_bypass_register_producer - register IRQ bypass producer
>>> + * @producer: pointer to producer structure
>>> + *
>>> + * Add the provided IRQ producer to the list of producers and connect
>>> + * with any matching tokens found on the IRQ consumers list.
>> token? 1-1 pairing only. not sure about the usage of plural in that case
>> though. Please ignore if this is an english language mistake ;-)
>
> Yes, singular is correct.
>
>>> + */
>>> +int irq_bypass_register_producer(struct irq_bypass_producer *producer)
>>> +{
>>> + struct irq_bypass_producer *tmp;
>>> + struct irq_bypass_consumer *consumer;
>>> +
>>> + if (!producer->add_consumer || !producer->del_consumer)
>>> + return -EINVAL;
>>> +
>>> + might_sleep();
>>> +
>>> + if (!try_module_get(THIS_MODULE))
>>> + return -ENODEV;
>>> +
>>> + mutex_lock(&lock);
>>> +
>>> + list_for_each_entry(tmp, &producers, node) {
>>> + if (tmp->token == producer->token) {
>>> + mutex_unlock(&lock);
>>> + module_put(THIS_MODULE);
>>> + return -EINVAL;
>>> + }
>>> + }
>>> +
>>> + list_add(&producer->node, &producers);
>>> +
>>> + list_for_each_entry(consumer, &consumers, node) {
>>> + if (consumer->token == producer->token) {
>>> + __connect(producer, consumer);
>>> + break;
>>> + }
>>> + }
>>> +
>>> + mutex_unlock(&lock);
>>> + return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(irq_bypass_register_producer);
>>> +
>>> +/**
>>> + * irq_bypass_unregister_producer - unregister IRQ bypass producer
>>> + * @producer: pointer to producer structure
>>> + *
>>> + * Remove a previously registered IRQ producer from the list of producers
>>> + * and disconnected from any connected IRQ consumers.
>> disconnect it from any connected IRQ consumer?
>
> Yes
>
>>> + */
>>> +void irq_bypass_unregister_producer(struct irq_bypass_producer *producer)
>>> +{
>>> + struct irq_bypass_consumer *consumer;
>>> +
>>> + might_sleep();
>>> +
>>> + mutex_lock(&lock);
>>> +
>>> + list_for_each_entry(consumer, &consumers, node) {
>>> + if (consumer->token == producer->token) {
>>> + __disconnect(producer, consumer);
>>> + break;
>>> + }
>>> + }
>>> +
>>> + list_del(&producer->node);
>>> +
>>> + mutex_unlock(&lock);
>>> + module_put(THIS_MODULE);
>>> +}
>>> +EXPORT_SYMBOL_GPL(irq_bypass_unregister_producer);
>>> +
>>> +/**
>>> + * irq_bypass_register_consumer - register IRQ bypass consumer
>>> + * @consumer: pointer to consumer structure
>>> + *
>>> + * Add the provided IRQ consumer to the list of consumers and connect
>>> + * with any matching tokens found on the IRQ producer list.
>> token?
>
> Yes
>
>>> + */
>>> +int irq_bypass_register_consumer(struct irq_bypass_consumer *consumer)
>>> +{
>>> + struct irq_bypass_consumer *tmp;
>>> + struct irq_bypass_producer *producer;
>>> +
>>> + if (!consumer->add_producer || !consumer->del_producer)
>>> + return -EINVAL;
>>> +
>>> + might_sleep();
>>> +
>>> + if (!try_module_get(THIS_MODULE))
>>> + return -ENODEV;
>>> +
>>> + mutex_lock(&lock);
>>> +
>>> + list_for_each_entry(tmp, &consumers, node) {
>>> + if (tmp->token == consumer->token) {
>>> + mutex_unlock(&lock);
>>> + module_put(THIS_MODULE);
>>> + return -EINVAL;
>>> + }
>>> + }
>>> +
>>> + list_add(&consumer->node, &consumers);
>>> +
>>> + list_for_each_entry(producer, &producers, node) {
>>> + if (producer->token == consumer->token) {
>>> + __connect(producer, consumer);
>>> + break;
>>> + }
>>> + }
>>> +
>>> + mutex_unlock(&lock);
>>> + return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(irq_bypass_register_consumer);
>>> +
>>> +/**
>>> + * irq_bypass_unregister_consumer - unregister IRQ bypass consumer
>>> + * @consumer: pointer to consumer structure
>>> + *
>>> + * Remove a previously registered IRQ consumer from the list of consumers
>>> + * and disconnected from any connected IRQ producers.
>> disconnect it from any connected producer.
>
> Yep
>
> Thanks for the review!
>
> Alex
>
>>> + */
>>> +void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer)
>>> +{
>>> + struct irq_bypass_producer *producer;
>>> +
>>> + might_sleep();
>>> +
>>> + mutex_lock(&lock);
>>> +
>>> + list_for_each_entry(producer, &producers, node) {
>>> + if (producer->token == consumer->token) {
>>> + __disconnect(producer, consumer);
>>> + break;
>>> + }
>>> + }
>>> +
>>> + list_del(&consumer->node);
>>> +
>>> + mutex_unlock(&lock);
>>> + module_put(THIS_MODULE);
>>> +}
>>> +EXPORT_SYMBOL_GPL(irq_bypass_unregister_consumer);
>>>
>>
>
>
>

2015-07-13 21:54:22

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH] virt: IRQ bypass manager

On Mon, 2015-07-13 at 22:58 +0200, Eric Auger wrote:
> Hi Alex,
> On 07/13/2015 08:25 PM, Alex Williamson wrote:
> > On Mon, 2015-07-13 at 17:32 +0200, Eric Auger wrote:
> >> Hi Alex,
> >> On 07/10/2015 07:52 PM, Alex Williamson wrote:
> >>> When a physical I/O device is assigned to a virtual machine through
> >>> facilities like VFIO and KVM, the interrupt for the device generally
> >>> bounces through the host system before being injected into the VM.
> >>> However, hardware technologies exist that often allow the host to be
> >>> bypassed for some of these scenarios. Intel Posted Interrupts allow
> >>> the specified physical edge interrupts to be directly injected into a
> >>> guest when delivered to a physical processor while the vCPU is
> >>> running. ARM IRQ Forwarding allows the hypervisor to handle level
> >>> triggered device interrupts as edge interrupts, by giving the guest
> >>> control of de-asserting and unmasking the interrupt line.
> >> ARM IRQ Forwarding allows forwarded physical interrupts to be directly
> >> deactivated by the guest?
> >
> > Replaced
> >
> >>> The IRQ bypass manager here is meant to provide the shim to connect
> >>> interrupt producers, generally the host physical device driver, with
> >>> interrupt consumers, generally the hypervisor, in order to configure
> >>> these bypass mechanism. To do this, we base the connection on a
> >>> shared, opaque token. For KVM-VFIO this is expected to be an
> >>> eventfd_ctx since this is the connection we already use to connect an
> >>> eventfd to an irqfd on the in-kernel path. When a producer and
> >>> consumer with matching tokens is found, callbacks via both registered
> >>> participants allow the bypass facilities to be automatically enabled.
> >>>
> >>> Signed-off-by: Alex Williamson <[email protected]>
> >>> Signed-off-by: Eric Auger <[email protected]>
> >>> ---
> >>>
> >>> Changes:
> >>> - Moved to virt/lib/
> >>> - Dropped update callback
> >>> - Filled in missing documentation
> >>> - @resume callback renamed to @stop
> >>> - Only @start/@stop are optional
> >>>
> >>> One of the difficulties with moving this code to virt/lib is that nobody
> >>> builds it by default. Thinking about this for a bit, it really needs a
> >>> consumer to be useful and KVM is currently the only consumer, so I tested
> >>> with the following:
> >>>
> >>> --- a/arch/x86/kvm/Kconfig
> >>> +++ b/arch/x86/kvm/Kconfig
> >>> @@ -100,5 +101,6 @@ config KVM_DEVICE_ASSIGNMENT
> >>> # the virtualization menu.
> >>> source drivers/vhost/Kconfig
> >>> source drivers/lguest/Kconfig
> >>> +source virt/lib/Kconfig
> >>>
> >>> endif # VIRTUALIZATION
> >>> --- a/arch/x86/kvm/Makefile
> >>> +++ b/arch/x86/kvm/Makefile
> >>> @@ -20,3 +20,5 @@ kvm-amd-y += svm.o
> >>> obj-$(CONFIG_KVM) += kvm.o
> >>> obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
> >>> obj-$(CONFIG_KVM_AMD) += kvm-amd.o
> >>> +
> >>> +obj-y += ../../../virt/lib/
> >>>
> >>> Perhaps if a second consumer comes along that would be justification for
> >>> tying it elsewhere in the build system. ARM will obviously need to do
> >>> similar. Are there better options?
> >>>
> >>> Also, there's no maintainer for the top level virt/ directory. Paolo,
> >>> would you feel comfortable taking this, maybe with some additional acks?
> >>> That would probably be the most convenient for merging the consumer code.
> >>> Thanks,
> >>> Alex
> >>>
> >>> include/linux/irqbypass.h | 90 +++++++++++++++++++
> >>> virt/lib/Kconfig | 2
> >>> virt/lib/Makefile | 1
> >>> virt/lib/irqbypass.c | 212 +++++++++++++++++++++++++++++++++++++++++++++
> >>> 4 files changed, 305 insertions(+)
> >>> create mode 100644 include/linux/irqbypass.h
> >>> create mode 100644 virt/lib/Kconfig
> >>> create mode 100644 virt/lib/Makefile
> >>> create mode 100644 virt/lib/irqbypass.c
> >>>
> >>> diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
> >>> new file mode 100644
> >>> index 0000000..41df18d
> >>> --- /dev/null
> >>> +++ b/include/linux/irqbypass.h
> >>> @@ -0,0 +1,90 @@
> >>> +/*
> >>> + * IRQ offload/bypass manager
> >>> + *
> >>> + * Copyright (C) 2015 Red Hat, Inc.
> >>> + * Copyright (c) 2015 Linaro Ltd.
> >>> + *
> >>> + * This program is free software; you can redistribute it and/or modify
> >>> + * it under the terms of the GNU General Public License version 2 as
> >>> + * published by the Free Software Foundation.
> >>> + */
> >>> +#ifndef IRQBYPASS_H
> >>> +#define IRQBYPASS_H
> >>> +
> >>> +#include <linux/list.h>
> >>> +
> >>> +struct irq_bypass_consumer;
> >>> +
> >>> +/*
> >>> + * Theory of operation
> >>> + *
> >>> + * The IRQ bypass manager is a simple set of lists and callbacks that allows
> >>> + * IRQ producers (ex. physical interrupt sources) to be matched to IRQ
> >>> + * consumers (ex. virtualization hardware that allows IRQ bypass or offload)
> >>> + * via a shared token (ex. eventfd_ctx). Producers and consumers register
> >>> + * independently. When a token match is found, the optional @stop callback
> >>> + * will be called for each participant. The pair will then be connected via
> >>> + * the @add_* callbacks, and finally the optional @start callback will allow
> >>> + * any final coordination. When either participant is unregistered, the
> >>> + * process is repeated using the @del_* callbacks in place of the @add_*
> >>> + * callbacks. Match tokens must be unique per producer/consumer, 1:N parings
> >> pairings?
> >
> > Fixed
> >
> >>> + * are not supported.
> >>> + */
> >>> +
> >>> +/**
> >>> + * struct irq_bypass_producer - IRQ bypass producer definition
> >>> + * @node: IRQ bypass manager private list management
> >>> + * @token: opaque token to match between producer and consumer
> >>> + * @irq: Linux IRQ number for the producer device
> >>> + * @add_consumer: Connect the IRQ producer to an IRQ consumer
> >>> + * @del_consumer: Disconnect the IRQ producer from an IRQ consumer
> >>> + * @stop: Perform any quiesce operations necessary prior to add/del (optional)
> >>> + * @start: Perform any startup operations necessary after add/del (optional)
> >>> + *
> >>> + * The IRQ bypass producer structure represents an interrupt source for
> >>> + * participation in possible host bypass, for instance an interrupt vector
> >>> + * for a physical device assigned to a VM.
> >>> + */
> >>> +struct irq_bypass_producer {
> >>> + struct list_head node;
> >>> + void *token;
> >>> + int irq;
> >> active
> >
> > See below
> >
> >>> + void (*add_consumer)(struct irq_bypass_producer *,
> >>> + struct irq_bypass_consumer *);
> >>> + void (*del_consumer)(struct irq_bypass_producer *,
> >>> + struct irq_bypass_consumer *);
> >>> + void (*stop)(struct irq_bypass_producer *);
> >>> + void (*start)(struct irq_bypass_producer *);
> >>> +};
> >>> +
> >>> +/**
> >>> + * struct irq_bypass_consumer - IRQ bypass consumer definition
> >>> + * @node: IRQ bypass manager private list management
> >>> + * @token: opaque token to match between producer and consumer
> >>> + * @add_producer: Connect the IRQ consumer to an IRQ producer
> >>> + * @del_producer: Disconnect the IRQ consumer from an IRQ producer
> >>> + * @stop: Perform any quiesce operations necessary prior to add/del (optional)
> >>> + * @start: Perform any startup operations necessary after add/del (optional)
> >>> + *
> >>> + * The IRQ bypass consumer structure represents an interrupt sink for
> >>> + * participation in possible host bypass, for instance a hypervisor may
> >>> + * support offloads to allow bypassing the host entirely or offload
> >>> + * portions of the interrupt handling to the VM.
> >>> + */
> >>> +struct irq_bypass_consumer {
> >>> + struct list_head node;
> >>> + void *token;
> >>> + void (*add_producer)(struct irq_bypass_consumer *,
> >>> + struct irq_bypass_producer *);
> >>> + void (*del_producer)(struct irq_bypass_consumer *,
> >>> + struct irq_bypass_producer *);
> >>> + void (*stop)(struct irq_bypass_consumer *);
> >>> + void (*start)(struct irq_bypass_consumer *);
> >>> +};
> >>> +
> >>> +int irq_bypass_register_producer(struct irq_bypass_producer *);
> >>> +void irq_bypass_unregister_producer(struct irq_bypass_producer *);
> >>> +int irq_bypass_register_consumer(struct irq_bypass_consumer *);
> >>> +void irq_bypass_unregister_consumer(struct irq_bypass_consumer *);
> >>> +
> >>> +#endif /* IRQBYPASS_H */
> >>> diff --git a/virt/lib/Kconfig b/virt/lib/Kconfig
> >>> new file mode 100644
> >>> index 0000000..89a414f
> >>> --- /dev/null
> >>> +++ b/virt/lib/Kconfig
> >>> @@ -0,0 +1,2 @@
> >>> +config IRQ_BYPASS_MANAGER
> >>> + tristate
> >>> diff --git a/virt/lib/Makefile b/virt/lib/Makefile
> >>> new file mode 100644
> >>> index 0000000..901228d
> >>> --- /dev/null
> >>> +++ b/virt/lib/Makefile
> >>> @@ -0,0 +1 @@
> >>> +obj-$(CONFIG_IRQ_BYPASS_MANAGER) += irqbypass.o
> >>> diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c
> >>> new file mode 100644
> >>> index 0000000..f1091e6
> >>> --- /dev/null
> >>> +++ b/virt/lib/irqbypass.c
> >>> @@ -0,0 +1,212 @@
> >>> +/*
> >>> + * IRQ offload/bypass manager
> >>> + *
> >>> + * Copyright (C) 2015 Red Hat, Inc.
> >>> + * Copyright (c) 2015 Linaro Ltd.
> >>> + *
> >>> + * This program is free software; you can redistribute it and/or modify
> >>> + * it under the terms of the GNU General Public License version 2 as
> >>> + * published by the Free Software Foundation.
> >>> + *
> >>> + * Various virtualization hardware acceleration techniques allow bypassing
> >>> + * or offloading interrupts received from devices around the host kernel.
> >>> + * Posted Interrupts on Intel VT-d systems can allow interrupts to be
> >>> + * received directly by a virtual machine. ARM IRQ Forwarding can allow
> >>> + * level triggered device interrupts to be de-asserted directly by the VM.
> >> ARM IRQ Forwarding allows forwarded physical interrupts to be directly
> >> deactivated by the guest
> >
> > Replaced
> >
> >>> + * This manager allows interrupt producers and consumers to find each other
> >>> + * to enable this sort of bypass.
> >>> + */
> >>> +
> >>> +#include <linux/irqbypass.h>
> >>> +#include <linux/list.h>
> >>> +#include <linux/module.h>
> >>> +#include <linux/mutex.h>
> >>> +
> >>> +MODULE_LICENSE("GPL v2");
> >>> +MODULE_DESCRIPTION("IRQ bypass manager utility module");
> >>> +
> >>> +static LIST_HEAD(producers);
> >>> +static LIST_HEAD(consumers);
> >>> +static DEFINE_MUTEX(lock);
> >>> +
> >>> +/* @lock must be held when calling connect */
> >>> +static void __connect(struct irq_bypass_producer *prod,
> >>> + struct irq_bypass_consumer *cons)
> >>> +{
> >>> + if (prod->stop)
> >>> + prod->stop(prod);
> >>> + if (cons->stop)
> >>> + cons->stop(cons);
> >>> +
> >>> + prod->add_consumer(prod, cons);
> >>> + cons->add_producer(cons, prod);
> >>
> >> In case you are reluctant to add the active boolean - which looks as a
> >> dirty hack I acknowledge -, could we change the proto of add_* so that
> >> they return an error. if any of the add_* fails __connect would restore
> >> the initial state and return an error. list_add would not be done.
> >>
> >> I can prototype this and add it in my forwarding series if you prefer.
> >
> > The active boolean does seem like a hack, its state that either
> > participant should be able to generate or track themselves. In fact, if
> > we add the error paths you suggest, we can determine on-demand whether a
> > bypass is active simply by looking for a token on both lists. I'll post
> > a version with error paths.
> - error path makes possible to reject any forward attempt when not safe
> which is really useful for forwarding
> - however this is not sufficient for the unforward (del) where I really
> need to communicate the IRQ active state computed by the consumer to the
> producer: KVM detects the physical IRQ was not deactivated by the guest
> so VFIO driver must mask it as KVM will deactivate it.

I'm still a little fuzzy here. So forwarding was enabled, vfio
triggered the eventfd, the guest has control of deactivating the
interrupt but hasn't yet done so, and now we disconnect forwarding.
Doesn't this just work itself out? If KVM deactivates the IRQ, the
device re-fires, vfio is in the right mode, guest possibly gets a
spurious interrupt, *shrug*. It sounds like you're wanting to inform
vfio whether the guest deasserted the interrupt to put it in the correct
mask state and wait for the resamplefd, but is it worth it? Can we not
count on hardware to re-assert the interrupt? Thanks,

Alex

> Just to make sure we share the same understanding the active state does
> not mean that the link between the producer and the consumer is in place
> but really corresponds to the IRQ state computed by the
> consumer/producer on add/del and communicated to the producer/consumer.
>
> maybe a more generic solution is to pass a void* data when registering
> the producer (stored as a member in the producer) and change the proto
> of add/del to use a 3d arg consisting of this data pointer. Then
> __connect and __disconnect would call add/del with producer->data;
> Data semantic then could be any data struct to be exchanged by fellow
> add functions or del functions
>
> in my case data would point to a boolean stored in the vfio_platform_irq.
>
> What do you think?