Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754250AbbGCFc2 (ORCPT ); Fri, 3 Jul 2015 01:32:28 -0400 Received: from mail-wi0-f179.google.com ([209.85.212.179]:36777 "EHLO mail-wi0-f179.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751845AbbGCFcV (ORCPT ); Fri, 3 Jul 2015 01:32:21 -0400 Message-ID: <55961E51.6010202@linaro.org> Date: Fri, 03 Jul 2015 07:32:01 +0200 From: Eric Auger User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: "Wu, Feng" , "eric.auger@st.com" , "linux-arm-kernel@lists.infradead.org" , "kvmarm@lists.cs.columbia.edu" , "kvm@vger.kernel.org" , "christoffer.dall@linaro.org" , "marc.zyngier@arm.com" , "alex.williamson@redhat.com" , "pbonzini@redhat.com" , "avi.kivity@gmail.com" , "mtosatti@redhat.com" , "joro@8bytes.org" , "b.reynal@virtualopensystems.com" CC: "linux-kernel@vger.kernel.org" , "patches@linaro.org" Subject: Re: [RFC 09/17] bypass: IRQ bypass manager proto by Alex References: <1435843047-6327-1-git-send-email-eric.auger@linaro.org> <1435843047-6327-10-git-send-email-eric.auger@linaro.org> In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 11606 Lines: 355 Hi Feng, On 07/03/2015 04:16 AM, Wu, Feng wrote: > > >> -----Original Message----- >> From: Eric Auger [mailto:eric.auger@linaro.org] >> Sent: Thursday, July 02, 2015 9:17 PM >> To: eric.auger@st.com; eric.auger@linaro.org; >> linux-arm-kernel@lists.infradead.org; kvmarm@lists.cs.columbia.edu; >> kvm@vger.kernel.org; christoffer.dall@linaro.org; marc.zyngier@arm.com; >> alex.williamson@redhat.com; pbonzini@redhat.com; avi.kivity@gmail.com; >> mtosatti@redhat.com; Wu, Feng; joro@8bytes.org; >> b.reynal@virtualopensystems.com >> Cc: linux-kernel@vger.kernel.org; patches@linaro.org >> Subject: [RFC 09/17] bypass: IRQ bypass manager proto by Alex >> >> From: Alex Williamson >> >> There are plenty of details to be filled in, but I think the basics >> looks something like the code below. The IRQ bypass manager just >> defines a pair of structures, one for interrupt producers and one for >> interrupt consumers. I'm certain that we'll need more callbacks than >> I've defined below, but figuring out what those should be for the best >> abstraction is the hardest part of this idea. The manager provides both >> registration and de-registration interfaces for both types of objects >> and keeps lists for each, protected by a lock. The manager doesn't even >> really need to know what the match token is, but I assume for our >> purposes it will be an eventfd_ctx. >> >> On the vfio side, the producer struct would be embedded in the >> vfio_pci_irq_ctx struct. KVM would probably embed the consumer struct >> in _irqfd. As I've coded below, the IRQ bypass manager calls the >> consumer callbacks, so the producer struct would need fields or >> callbacks to provide the consumer the info it needs. AIUI the Posted >> Interrupt model, VFIO only needs to provide data to the consumer. For >> IRQ Forwarding, I think the producer needs to be informed when bypass is >> active to model the incoming interrupt as edge vs level. >> >> I've prototyped the base IRQ bypass manager here as static, but I don't >> see any reason it couldn't be a module that's loaded by dependency when >> either vfio-pci or kvm-intel is loaded (or other producer/consumer >> objects). >> >> Is this a reasonable starting point to craft the additional fields and >> callbacks and interaction of who calls who that we need to support >> Posted Interrupts and IRQ Forwarding? Is the AMD version of this still >> alive? Thanks, >> >> Alex > > In fact, I also implement a RFC patch for this new framework. I am > thinking, can we discuss all the requirements for irq forwarding and > posted interrupts, and make it a separate patchset as a general > layer? Then we can continue to push arch specific stuff, it is more > clear and easy. Sure. I intend to respin today according to Paolo's directives and I will put common patches in a separate series. Let's see next week with Alex how he prefers things to be handled. Best Regards Eric > > Thanks, > Feng > >> --- >> arch/x86/kvm/Kconfig | 1 + >> drivers/vfio/pci/Kconfig | 1 + >> drivers/vfio/pci/vfio_pci_intrs.c | 6 ++ >> include/linux/irqbypass.h | 23 ++++++++ >> kernel/irq/Kconfig | 3 + >> kernel/irq/Makefile | 1 + >> kernel/irq/bypass.c | 116 >> ++++++++++++++++++++++++++++++++++++++ >> virt/kvm/eventfd.c | 4 ++ >> 8 files changed, 155 insertions(+) >> create mode 100644 include/linux/irqbypass.h >> create mode 100644 kernel/irq/bypass.c >> >> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig >> index d8a1d56..86d0d77 100644 >> --- a/arch/x86/kvm/Kconfig >> +++ b/arch/x86/kvm/Kconfig >> @@ -61,6 +61,7 @@ config KVM_INTEL >> depends on KVM >> # for perf_guest_get_msrs(): >> depends on CPU_SUP_INTEL >> + select IRQ_BYPASS_MANAGER >> ---help--- >> Provides support for KVM on Intel processors equipped with the VT >> extensions. >> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig >> index 579d83b..02912f1 100644 >> --- a/drivers/vfio/pci/Kconfig >> +++ b/drivers/vfio/pci/Kconfig >> @@ -2,6 +2,7 @@ config VFIO_PCI >> tristate "VFIO support for PCI devices" >> depends on VFIO && PCI && EVENTFD >> select VFIO_VIRQFD >> + select IRQ_BYPASS_MANAGER >> help >> Support for the PCI VFIO bus driver. This is required to make >> use of PCI drivers using the VFIO framework. >> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c >> index 1f577b4..4e053be 100644 >> --- a/drivers/vfio/pci/vfio_pci_intrs.c >> +++ b/drivers/vfio/pci/vfio_pci_intrs.c >> @@ -181,6 +181,7 @@ static int vfio_intx_set_signal(struct vfio_pci_device >> *vdev, int fd) >> >> if (vdev->ctx[0].trigger) { >> free_irq(pdev->irq, vdev); >> + /* irq_bypass_unregister_producer(); */ >> kfree(vdev->ctx[0].name); >> eventfd_ctx_put(vdev->ctx[0].trigger); >> vdev->ctx[0].trigger = NULL; >> @@ -214,6 +215,8 @@ static int vfio_intx_set_signal(struct vfio_pci_device >> *vdev, int fd) >> return ret; >> } >> >> + /* irq_bypass_register_producer(); */ >> + >> /* >> * INTx disable will stick across the new irq setup, >> * disable_irq won't. >> @@ -319,6 +322,7 @@ static int vfio_msi_set_vector_signal(struct >> vfio_pci_device *vdev, >> >> if (vdev->ctx[vector].trigger) { >> free_irq(irq, vdev->ctx[vector].trigger); >> + /* irq_bypass_unregister_producer(); */ >> kfree(vdev->ctx[vector].name); >> eventfd_ctx_put(vdev->ctx[vector].trigger); >> vdev->ctx[vector].trigger = NULL; >> @@ -360,6 +364,8 @@ static int vfio_msi_set_vector_signal(struct >> vfio_pci_device *vdev, >> return ret; >> } >> >> + /* irq_bypass_register_producer(); */ >> + >> vdev->ctx[vector].trigger = trigger; >> >> return 0; >> diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h >> new file mode 100644 >> index 0000000..718508e >> --- /dev/null >> +++ b/include/linux/irqbypass.h >> @@ -0,0 +1,23 @@ >> +#ifndef IRQBYPASS_H >> +#define IRQBYPASS_H >> + >> +#include >> + >> +struct irq_bypass_producer { >> + struct list_head node; >> + void *token; >> + /* TBD */ >> +}; >> + >> +struct irq_bypass_consumer { >> + struct list_head node; >> + void *token; >> + void (*add_producer)(struct irq_bypass_producer *); >> + void (*del_producer)(struct irq_bypass_producer *); >> +}; >> + >> +int irq_bypass_register_producer(struct irq_bypass_producer *); >> +void irq_bypass_unregister_producer(struct irq_bypass_producer *); >> +int irq_bypass_register_consumer(struct irq_bypass_consumer *); >> +void irq_bypass_unregister_consumer(struct irq_bypass_consumer *); >> +#endif /* IRQBYPASS_H */ >> diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig >> index 9a76e3b..4502cdc 100644 >> --- a/kernel/irq/Kconfig >> +++ b/kernel/irq/Kconfig >> @@ -100,4 +100,7 @@ config SPARSE_IRQ >> >> If you don't know what to do here, say N. >> >> +config IRQ_BYPASS_MANAGER >> + bool >> + >> endmenu >> diff --git a/kernel/irq/Makefile b/kernel/irq/Makefile >> index d121235..a30ed77 100644 >> --- a/kernel/irq/Makefile >> +++ b/kernel/irq/Makefile >> @@ -7,3 +7,4 @@ obj-$(CONFIG_PROC_FS) += proc.o >> obj-$(CONFIG_GENERIC_PENDING_IRQ) += migration.o >> obj-$(CONFIG_PM_SLEEP) += pm.o >> obj-$(CONFIG_GENERIC_MSI_IRQ) += msi.o >> +obj-$(CONFIG_IRQ_BYPASS_MANAGER) += bypass.o >> diff --git a/kernel/irq/bypass.c b/kernel/irq/bypass.c >> new file mode 100644 >> index 0000000..5d0f92b >> --- /dev/null >> +++ b/kernel/irq/bypass.c >> @@ -0,0 +1,116 @@ >> +/* >> + * IRQ offload/bypass manager >> + * >> + * Various virtualization hardware acceleration techniques allow bypassing >> + * or offloading interrupts receieved from devices around the host kernel. >> + * Posted Interrupts on Intel VT-d systems can allow interrupts to be >> + * recieved directly by a virtual machine. ARM IRQ Forwarding can allow >> + * level triggered device interrupts to be de-asserted directly by the VM. >> + * This manager allows interrupt producers and consumers to find each other >> + * to enable this sort of bypass. >> + */ >> + >> +#include >> +#include >> +#include >> +#include >> + >> +static LIST_HEAD(producers); >> +static LIST_HEAD(consumers); >> +static DEFINE_MUTEX(lock); >> + >> +int irq_bypass_register_producer(struct irq_bypass_producer *producer) >> +{ >> + struct irq_bypass_producer *tmp; >> + struct irq_bypass_consumer *consumer; >> + int ret = 0; >> + >> + mutex_lock(&lock); >> + >> + list_for_each_entry(tmp, &producers, node) { >> + if (tmp->token == producer->token) { >> + ret = -EINVAL; >> + goto unlock; >> + } >> + } >> + >> + list_add(&producer->node, &producers); >> + >> + list_for_each_entry(consumer, &consumers, node) { >> + if (consumer->token == producer->token) { >> + consumer->add_producer(producer); >> + break; >> + } >> + } >> +unlock: >> + mutex_unlock(&lock); >> + return ret; >> +} >> +EXPORT_SYMBOL_GPL(irq_bypass_register_producer); >> + >> +void irq_bypass_unregister_producer(struct irq_bypass_producer *producer) >> +{ >> + struct irq_bypass_consumer *consumer; >> + >> + mutex_lock(&lock); >> + >> + list_for_each_entry(consumer, &consumers, node) { >> + if (consumer->token == producer->token) { >> + consumer->del_producer(producer); >> + break; >> + } >> + } >> + >> + list_del(&producer->node); >> + >> + mutex_unlock(&lock); >> +} >> +EXPORT_SYMBOL_GPL(irq_bypass_unregister_producer); >> + >> +int irq_bypass_register_consumer(struct irq_bypass_consumer *consumer) >> +{ >> + struct irq_bypass_consumer *tmp; >> + struct irq_bypass_producer *producer; >> + int ret = 0; >> + >> + mutex_lock(&lock); >> + >> + list_for_each_entry(tmp, &consumers, node) { >> + if (tmp->token == consumer->token) { >> + ret = -EINVAL; >> + goto unlock; >> + } >> + } >> + >> + list_add(&consumer->node, &consumers); >> + >> + list_for_each_entry(producer, &producers, node) { >> + if (producer->token == consumer->token) { >> + consumer->add_producer(producer); >> + break; >> + } >> + } >> +unlock: >> + mutex_unlock(&lock); >> + return ret; >> +} >> +EXPORT_SYMBOL_GPL(irq_bypass_register_consumer); >> + >> +void irq_bypass_unregister_consumer(struct irq_bypass_consumer >> *consumer) >> +{ >> + struct irq_bypass_producer *producer; >> + >> + mutex_lock(&lock); >> + >> + list_for_each_entry(producer, &producers, node) { >> + if (producer->token == consumer->token) { >> + consumer->del_producer(producer); >> + break; >> + } >> + } >> + >> + list_del(&consumer->node); >> + >> + mutex_unlock(&lock); >> +} >> +EXPORT_SYMBOL_GPL(irq_bypass_unregister_consumer); >> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c >> index 9ff4193..f3da161 100644 >> --- a/virt/kvm/eventfd.c >> +++ b/virt/kvm/eventfd.c >> @@ -429,6 +429,8 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd >> *args) >> */ >> fdput(f); >> >> + /* irq_bypass_register_consumer(); */ >> + >> return 0; >> >> fail: >> @@ -528,6 +530,8 @@ kvm_irqfd_deassign(struct kvm *kvm, struct kvm_irqfd >> *args) >> struct _irqfd *irqfd, *tmp; >> struct eventfd_ctx *eventfd; >> >> + /* irq_bypass_unregister_consumer() */ >> + >> eventfd = eventfd_ctx_fdget(args->fd); >> if (IS_ERR(eventfd)) >> return PTR_ERR(eventfd); >> -- >> 1.9.1 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/