2015-04-30 07:14:58

by Wu, Feng

Subject: [v4 0/3] prerequisite changes for VT-d posted-interrupts

VT-d Posted-Interrupts is an enhancement to CPU-side Posted-Interrupts.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when the guest is running in non-root mode.

You can find the VT-d Posted-Interrupts spec at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

This series implements some prerequisite parts for VT-d posted-interrupts. It was part of
http://thread.gmane.org/gmane.linux.kernel.iommu/7708. To make things clear, I will divide
the whole series, which contains multiple components, into three parts:
- prerequisite changes (included in this series)
- IOMMU part (v4 was reviewed, some comments need to be addressed)
- KVM and VFIO parts (will send out this part once the first two parts are accepted)

This series is rebased on the x86/apic branch of the tip tree.

Feng Wu (3):
genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a
VCPU
x86, irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller
x86, irq: Define a global vector for VT-d Posted-Interrupts

arch/x86/include/asm/entry_arch.h | 2 ++
arch/x86/include/asm/hardirq.h | 1 +
arch/x86/include/asm/hw_irq.h | 2 ++
arch/x86/include/asm/irq_vectors.h | 1 +
arch/x86/kernel/apic/msi.c | 1 +
arch/x86/kernel/entry_64.S | 2 ++
arch/x86/kernel/irq.c | 27 +++++++++++++++++++++++++++
arch/x86/kernel/irqinit.c | 2 ++
include/linux/irq.h | 6 ++++++
kernel/irq/chip.c | 14 ++++++++++++++
kernel/irq/manage.c | 20 ++++++++++++++++++++
11 files changed, 78 insertions(+)

--
2.1.0


2015-04-30 07:15:11

by Wu, Feng

Subject: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU

With Posted-Interrupts support in Intel CPUs and IOMMUs, an external
interrupt from an assigned device can be delivered directly to a
virtual CPU in a virtual machine. Instead of hacking KVM and Intel
IOMMU drivers, we propose a platform-independent interface to target
an interrupt to a specific virtual CPU in a virtual machine, or set
virtual CPU affinity for an interrupt.

By adopting this new interface and the hierarchical irqdomain, we can
easily support posted-interrupts on Intel platforms, and also provide
sufficiently flexible interfaces for other platforms to support similar
features.

Here is the usage scenario for this interface:
Guest updates MSI/MSI-X interrupt configuration
-->QEMU and KVM handle this
-->KVM calls this interface (passing the posted-interrupts descriptor
and guest vector)
-->irq core transfers control to the IOMMU
-->IOMMU does the real work of updating the IRTE (the IRTE has a new
format for VT-d Posted-Interrupts)

Signed-off-by: Jiang Liu <[email protected]>
Signed-off-by: Feng Wu <[email protected]>
---
include/linux/irq.h | 4 ++++
kernel/irq/chip.c | 14 ++++++++++++++
kernel/irq/manage.c | 20 ++++++++++++++++++++
3 files changed, 38 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 62c6901..684c35d 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -327,6 +327,7 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
* @irq_write_msi_msg: optional to write message content for MSI
* @irq_get_irqchip_state: return the internal state of an interrupt
* @irq_set_irqchip_state: set the internal state of a interrupt
+ * @irq_set_vcpu_affinity: optional to target a virtual CPU in a virtual machine
* @flags: chip specific flags
*/
struct irq_chip {
@@ -369,6 +370,8 @@ struct irq_chip {
int (*irq_get_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool *state);
int (*irq_set_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool state);

+ int (*irq_set_vcpu_affinity)(struct irq_data *data, void *vcpu_info);
+
unsigned long flags;
};

@@ -422,6 +425,7 @@ extern void irq_cpu_online(void);
extern void irq_cpu_offline(void);
extern int irq_set_affinity_locked(struct irq_data *data,
const struct cpumask *cpumask, bool force);
+extern int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info);

#if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ)
void irq_move_irq(struct irq_data *data);
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index eb9a4ea..55016b2 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -950,6 +950,20 @@ int irq_chip_retrigger_hierarchy(struct irq_data *data)
}

/**
+ * irq_chip_set_vcpu_affinity_parent - Set vcpu affinity on the parent interrupt
+ * @data: Pointer to interrupt specific data
+ * @vcpu_info: The vcpu affinity information
+ */
+int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void *vcpu_info)
+{
+ data = data->parent_data;
+ if (data->chip->irq_set_vcpu_affinity)
+ return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
+
+ return -ENOSYS;
+}
+
+/**
* irq_chip_set_wake_parent - Set/reset wake-up on the parent interrupt
* @data: Pointer to interrupt specific data
* @on: Whether to set or reset the wake-up capability of this irq
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e68932b..5e09bc2 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -256,6 +256,26 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
}
EXPORT_SYMBOL_GPL(irq_set_affinity_hint);

+int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
+{
+ struct irq_desc *desc = irq_to_desc(irq);
+ struct irq_chip *chip;
+ unsigned long flags;
+ int ret = -ENOSYS;
+
+ if (!desc)
+ return -EINVAL;
+
+ raw_spin_lock_irqsave(&desc->lock, flags);
+ chip = desc->irq_data.chip;
+ if (chip && chip->irq_set_vcpu_affinity)
+ ret = chip->irq_set_vcpu_affinity(irq_desc_get_irq_data(desc),
+ vcpu_info);
+ raw_spin_unlock_irqrestore(&desc->lock, flags);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(irq_set_vcpu_affinity);
+
static void irq_affinity_notify(struct work_struct *work)
{
struct irq_affinity_notify *notify =
--
2.1.0
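
The dispatch path added by this patch can be sketched in userspace as follows. This is a minimal model, not kernel code: every `model_*` name is a hypothetical stand-in (for `struct irq_chip`, `struct irq_data`, `irq_set_vcpu_affinity()` and `irq_chip_set_vcpu_affinity_parent()`), and the `irq_desc` lookup and locking are deliberately left out. It only shows how a call on the MSI-level chip is forwarded to the parent (IOMMU) level, and that -ENOSYS comes back when the parent has no callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_irq_data;

/* Hypothetical stand-in for struct irq_chip: only the new callback. */
struct model_irq_chip {
	int (*irq_set_vcpu_affinity)(struct model_irq_data *data,
				     void *vcpu_info);
};

/* Hypothetical stand-in for struct irq_data in a hierarchical irqdomain. */
struct model_irq_data {
	struct model_irq_chip *chip;
	struct model_irq_data *parent_data;
	void *last_vcpu_info;	/* what the bottom-level chip received */
};

/* Models irq_chip_set_vcpu_affinity_parent(): forward one level up. */
static int model_set_vcpu_affinity_parent(struct model_irq_data *data,
					  void *vcpu_info)
{
	data = data->parent_data;
	if (data->chip->irq_set_vcpu_affinity)
		return data->chip->irq_set_vcpu_affinity(data, vcpu_info);

	return -ENOSYS;
}

/* Models irq_set_vcpu_affinity() without descriptor lookup or locking. */
static int model_set_vcpu_affinity(struct model_irq_data *data,
				   void *vcpu_info)
{
	struct model_irq_chip *chip = data->chip;

	if (chip && chip->irq_set_vcpu_affinity)
		return chip->irq_set_vcpu_affinity(data, vcpu_info);

	return -ENOSYS;
}

/* Plays the role of the IOMMU chip that would update the IRTE. */
static int model_iommu_set_vcpu_affinity(struct model_irq_data *data,
					 void *vcpu_info)
{
	data->last_vcpu_info = vcpu_info;
	return 0;
}

/* Builds an MSI -> IOMMU hierarchy and returns the result of one call. */
int model_demo(void *vcpu_info, int parent_has_callback)
{
	struct model_irq_chip iommu_chip = {
		.irq_set_vcpu_affinity = parent_has_callback ?
					 model_iommu_set_vcpu_affinity : NULL,
	};
	struct model_irq_chip msi_chip = {
		.irq_set_vcpu_affinity = model_set_vcpu_affinity_parent,
	};
	struct model_irq_data parent = { .chip = &iommu_chip };
	struct model_irq_data child = {
		.chip = &msi_chip,
		.parent_data = &parent,
	};
	int ret = model_set_vcpu_affinity(&child, vcpu_info);

	/* On success the opaque pointer must have reached the parent chip. */
	if (ret == 0)
		assert(parent.last_vcpu_info == vcpu_info);

	return ret;
}
```

In this model, `model_demo(&token, 1)` returns 0 and `model_demo(&token, 0)` returns -ENOSYS. The `void *vcpu_info` is passed through untouched, which is how the interface stays platform independent: only the bottom-level chip interprets it.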

2015-04-30 07:15:22

by Wu, Feng

Subject: [v4 2/3] x86, irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller

Implement irq_set_vcpu_affinity for pci_msi_ir_controller.

Signed-off-by: Feng Wu <[email protected]>
Reviewed-by: Jiang Liu <[email protected]>
---
arch/x86/kernel/apic/msi.c | 1 +
include/linux/irq.h | 2 ++
2 files changed, 3 insertions(+)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 58fde66..d2d95e2 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -152,6 +152,7 @@ static struct irq_chip pci_msi_ir_controller = {
.irq_mask = pci_msi_mask_irq,
.irq_ack = irq_chip_ack_parent,
.irq_retrigger = irq_chip_retrigger_hierarchy,
+ .irq_set_vcpu_affinity = irq_chip_set_vcpu_affinity_parent,
.flags = IRQCHIP_SKIP_SET_WAKE,
};

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 684c35d..cb688fb 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -471,6 +471,8 @@ extern int irq_chip_set_affinity_parent(struct irq_data *data,
const struct cpumask *dest,
bool force);
extern int irq_chip_set_wake_parent(struct irq_data *data, unsigned int on);
+extern int irq_chip_set_vcpu_affinity_parent(struct irq_data *data,
+ void *vcpu_info);
#endif

/* Handling of unhandled and spurious interrupts: */
--
2.1.0

2015-04-30 07:15:30

by Wu, Feng

Subject: [v4 3/3] x86, irq: Define a global vector for VT-d Posted-Interrupts

Currently, we use a global vector as the Posted-Interrupts
Notification Event for all the vCPUs in the system. We need
to introduce another global vector for VT-d Posted-Interrupts,
which will be used to wake up the sleeping vCPU when an external
interrupt from a direct-assigned device arrives for that vCPU.

Signed-off-by: Feng Wu <[email protected]>
Suggested-by: Yang Zhang <[email protected]>
Acked-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/entry_arch.h | 2 ++
arch/x86/include/asm/hardirq.h | 1 +
arch/x86/include/asm/hw_irq.h | 2 ++
arch/x86/include/asm/irq_vectors.h | 1 +
arch/x86/kernel/entry_64.S | 2 ++
arch/x86/kernel/irq.c | 27 +++++++++++++++++++++++++++
arch/x86/kernel/irqinit.c | 2 ++
7 files changed, 37 insertions(+)

diff --git a/arch/x86/include/asm/entry_arch.h b/arch/x86/include/asm/entry_arch.h
index dc5fa66..27ca0af 100644
--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -23,6 +23,8 @@ BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
#ifdef CONFIG_HAVE_KVM
BUILD_INTERRUPT3(kvm_posted_intr_ipi, POSTED_INTR_VECTOR,
smp_kvm_posted_intr_ipi)
+BUILD_INTERRUPT3(kvm_posted_intr_wakeup_ipi, POSTED_INTR_WAKEUP_VECTOR,
+ smp_kvm_posted_intr_wakeup_ipi)
#endif

/*
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 0f5fb6b..9866065 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -14,6 +14,7 @@ typedef struct {
#endif
#ifdef CONFIG_HAVE_KVM
unsigned int kvm_posted_intr_ipis;
+ unsigned int kvm_posted_intr_wakeup_ipis;
#endif
unsigned int x86_platform_ipis; /* arch dependent */
unsigned int apic_perf_irqs;
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 1f88e71..6ffc847 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -29,6 +29,7 @@
extern asmlinkage void apic_timer_interrupt(void);
extern asmlinkage void x86_platform_ipi(void);
extern asmlinkage void kvm_posted_intr_ipi(void);
+extern asmlinkage void kvm_posted_intr_wakeup_ipi(void);
extern asmlinkage void error_interrupt(void);
extern asmlinkage void irq_work_interrupt(void);

@@ -92,6 +93,7 @@ extern void trace_call_function_single_interrupt(void);
#define trace_irq_move_cleanup_interrupt irq_move_cleanup_interrupt
#define trace_reboot_interrupt reboot_interrupt
#define trace_kvm_posted_intr_ipi kvm_posted_intr_ipi
+#define trace_kvm_posted_intr_wakeup_ipi kvm_posted_intr_wakeup_ipi
#endif /* CONFIG_TRACING */

#ifdef CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index b26cb12..dca94f2 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -105,6 +105,7 @@
/* Vector for KVM to deliver posted interrupt IPI */
#ifdef CONFIG_HAVE_KVM
#define POSTED_INTR_VECTOR 0xf2
+#define POSTED_INTR_WAKEUP_VECTOR 0xf1
#endif

/*
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index c7b2384..177feec 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -919,6 +919,8 @@ apicinterrupt X86_PLATFORM_IPI_VECTOR \
#ifdef CONFIG_HAVE_KVM
apicinterrupt3 POSTED_INTR_VECTOR \
kvm_posted_intr_ipi smp_kvm_posted_intr_ipi
+apicinterrupt3 POSTED_INTR_WAKEUP_VECTOR \
+ kvm_posted_intr_wakeup_ipi smp_kvm_posted_intr_wakeup_ipi
#endif

#ifdef CONFIG_X86_MCE_THRESHOLD
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index e5952c2..81b6bf8 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -237,6 +237,9 @@ __visible void smp_x86_platform_ipi(struct pt_regs *regs)
}

#ifdef CONFIG_HAVE_KVM
+void (*wakeup_handler_callback)(void);
+EXPORT_SYMBOL_GPL(wakeup_handler_callback);
+
/*
* Handler for POSTED_INTERRUPT_VECTOR.
*/
@@ -256,6 +259,30 @@ __visible void smp_kvm_posted_intr_ipi(struct pt_regs *regs)

set_irq_regs(old_regs);
}
+
+/*
+ * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
+ */
+__visible void smp_kvm_posted_intr_wakeup_ipi(struct pt_regs *regs)
+{
+ struct pt_regs *old_regs = set_irq_regs(regs);
+
+ ack_APIC_irq();
+
+ irq_enter();
+
+ exit_idle();
+
+ inc_irq_stat(kvm_posted_intr_wakeup_ipis);
+
+ if (wakeup_handler_callback)
+ wakeup_handler_callback();
+
+ irq_exit();
+
+ set_irq_regs(old_regs);
+}
+
#endif

__visible void smp_trace_x86_platform_ipi(struct pt_regs *regs)
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index cd10a64..895941d 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -144,6 +144,8 @@ static void __init apic_intr_init(void)
#ifdef CONFIG_HAVE_KVM
/* IPI for KVM to deliver posted interrupt */
alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
+ /* IPI for KVM to deliver interrupt to wake up tasks */
+ alloc_intr_gate(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi);
#endif

/* IPI vectors for APIC spurious and error interrupts */
--
2.1.0
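
The callback hook in this patch can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: all `wk_*` names are hypothetical, the counter stands in for the per-CPU `kvm_posted_intr_wakeup_ipis` statistic, and the APIC acknowledgement and `irq_enter()`/`irq_exit()` bookkeeping are omitted. It shows that the handler always counts the IPI but only forwards it once a consumer has registered a callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kvm_posted_intr_wakeup_ipis counter. */
static unsigned int wk_wakeup_ipis;

/* Models wakeup_handler_callback: NULL until a consumer (e.g. KVM) sets it. */
static void (*wk_handler_callback)(void);

/* Models the handler body; APIC ack and irq entry/exit are omitted. */
static void wk_posted_intr_wakeup_ipi(void)
{
	wk_wakeup_ipis++;
	if (wk_handler_callback)
		wk_handler_callback();
}

static unsigned int wk_vcpus_woken;

/* Plays the role of the consumer's wakeup routine. */
static void wk_kvm_wakeup(void)
{
	wk_vcpus_woken++;
}

/* Delivers two IPIs, one before and one after registration. */
unsigned int wk_demo(void)
{
	wk_wakeup_ipis = 0;
	wk_vcpus_woken = 0;

	wk_handler_callback = NULL;
	wk_posted_intr_wakeup_ipi();	/* counted, but nobody to notify */

	wk_handler_callback = wk_kvm_wakeup;
	wk_posted_intr_wakeup_ipi();	/* counted and forwarded */

	assert(wk_wakeup_ipis == 2);
	return wk_vcpus_woken;
}
```

The NULL check is what lets the vector be allocated unconditionally under CONFIG_HAVE_KVM while the consumer is loaded later as a module.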

2015-05-07 02:05:34

by Wu, Feng

Subject: RE: [v4 0/3] prerequisite changes for VT-d posted-interrupts

Hi Thomas,

Ping..

Since Liu Jiang's IRQ work has already been in the tip tree for a while, I think
it is time to send this series again based on the x86/apic branch of the tip tree.
I would very much appreciate it if you could have a look at this series!

Thanks,
Feng

> -----Original Message-----
> From: Wu, Feng
> Sent: Thursday, April 30, 2015 3:07 PM
> To: [email protected]; [email protected]; [email protected]
> Cc: [email protected]; [email protected]; Wu, Feng
> Subject: [v4 0/3] prerequisite changes for VT-d posted-interrupts
>
> VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
> With VT-d Posted-Interrupts enabled, external interrupts from
> direct-assigned devices can be delivered to guests without VMM
> intervention when guest is running in non-root mode.
>
> You can find the VT-d Posted-Interrupts spec at the following URL:
> http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
>
> This series implement some prerequisite parts for VT-d posted-interrupts. It
> was part of
> http://thread.gmane.org/gmane.linux.kernel.iommu/7708. To make things
> clear, I will divide
> the whole series which contain multiple components into three parts:
> - prerequisite changes (included in this series)
> - IOMMU part (v4 was reviewed, some comments need to be addressed)
> - KVM and VFIO parts (will send out this part once the first two parts are
> accepted)
>
> This series is rebased on the x86-apic branch of tip tree.
>
> Feng Wu (3):
> genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a
> VCPU
> x86, irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller
> x86, irq: Define a global vector for VT-d Posted-Interrupts
>
> arch/x86/include/asm/entry_arch.h | 2 ++
> arch/x86/include/asm/hardirq.h | 1 +
> arch/x86/include/asm/hw_irq.h | 2 ++
> arch/x86/include/asm/irq_vectors.h | 1 +
> arch/x86/kernel/apic/msi.c | 1 +
> arch/x86/kernel/entry_64.S | 2 ++
> arch/x86/kernel/irq.c | 27 +++++++++++++++++++++++++++
> arch/x86/kernel/irqinit.c | 2 ++
> include/linux/irq.h | 6 ++++++
> kernel/irq/chip.c | 14 ++++++++++++++
> kernel/irq/manage.c | 20 ++++++++++++++++++++
> 11 files changed, 78 insertions(+)
>
> --
> 2.1.0

2015-05-13 08:10:17

by Jiang Liu

Subject: Re: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU

On 2015/4/30 15:06, Feng Wu wrote:
> With Posted-Interrupts support in Intel CPU and IOMMU, an external
> interrupt from assigned-devices could be directly delivered to a
> virtual CPU in a virtual machine. Instead of hacking KVM and Intel
> IOMMU drivers, we propose a platform independent interface to target
> an interrupt to a specific virtual CPU in a virtual machine, or set
> virtual CPU affinity for an interrupt.
>
> By adopting this new interface and the hierarchy irqdomain, we could
> easily support posted-interrupts on Intel platforms, and also provide
> flexible enough interfaces for other platforms to support similar
> features.
>
> Here is the usage scenario for this interface:
> Guest update MSI/MSI-X interrupt configuration
> -->QEMU and KVM handle this
> -->KVM call this interface (passing posted interrupts descriptor
> and guest vector)
> -->irq core will transfer the control to IOMMU
> -->IOMMU will do the real work of updating IRTE (IRTE has new
> format for VT-d Posted-Interrupts)

Hi Thomas,
Any comments or suggestions about this abstraction interface?
Thanks!
Gerry

>
> Signed-off-by: Jiang Liu <[email protected]>
> Signed-off-by: Feng Wu <[email protected]>
> ---
> include/linux/irq.h | 4 ++++
> kernel/irq/chip.c | 14 ++++++++++++++
> kernel/irq/manage.c | 20 ++++++++++++++++++++
> 3 files changed, 38 insertions(+)
>
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 62c6901..684c35d 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -327,6 +327,7 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
> * @irq_write_msi_msg: optional to write message content for MSI
> * @irq_get_irqchip_state: return the internal state of an interrupt
> * @irq_set_irqchip_state: set the internal state of a interrupt
> + * @irq_set_vcpu_affinity: optional to target a virtual CPU in a virtual
> * @flags: chip specific flags
> */
> struct irq_chip {
> @@ -369,6 +370,8 @@ struct irq_chip {
> int (*irq_get_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool *state);
> int (*irq_set_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool state);
>
> + int (*irq_set_vcpu_affinity)(struct irq_data *data, void *vcpu_info);
> +
> unsigned long flags;
> };
>
> @@ -422,6 +425,7 @@ extern void irq_cpu_online(void);
> extern void irq_cpu_offline(void);
> extern int irq_set_affinity_locked(struct irq_data *data,
> const struct cpumask *cpumask, bool force);
> +extern int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info);
>
> #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ)
> void irq_move_irq(struct irq_data *data);
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index eb9a4ea..55016b2 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -950,6 +950,20 @@ int irq_chip_retrigger_hierarchy(struct irq_data *data)
> }
>
> /**
> + * irq_chip_set_vcpu_affinity_parent - Set vcpu affinity on the parent interrupt
> + * @data: Pointer to interrupt specific data
> + * @dest: The vcpu affinity information
> + */
> +int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void *vcpu_info)
> +{
> + data = data->parent_data;
> + if (data->chip->irq_set_vcpu_affinity)
> + return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
> +
> + return -ENOSYS;
> +}
> +
> +/**
> * irq_chip_set_wake_parent - Set/reset wake-up on the parent interrupt
> * @data: Pointer to interrupt specific data
> * @on: Whether to set or reset the wake-up capability of this irq
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index e68932b..5e09bc2 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -256,6 +256,26 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> }
> EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
>
> +int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
> +{
> + struct irq_desc *desc = irq_to_desc(irq);
> + struct irq_chip *chip;
> + unsigned long flags;
> + int ret = -ENOSYS;
> +
> + if (!desc)
> + return -EINVAL;
> +
> + raw_spin_lock_irqsave(&desc->lock, flags);
> + chip = desc->irq_data.chip;
> + if (chip && chip->irq_set_vcpu_affinity)
> + ret = chip->irq_set_vcpu_affinity(irq_desc_get_irq_data(desc),
> + vcpu_info);
> + raw_spin_unlock_irqrestore(&desc->lock, flags);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(irq_set_vcpu_affinity);
> +
> static void irq_affinity_notify(struct work_struct *work)
> {
> struct irq_affinity_notify *notify =
>

2015-05-13 21:03:43

by Thomas Gleixner

Subject: Re: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU


On Wed, 13 May 2015, Jiang Liu wrote:

> On 2015/4/30 15:06, Feng Wu wrote:
> > With Posted-Interrupts support in Intel CPU and IOMMU, an external
> > interrupt from assigned-devices could be directly delivered to a
> > virtual CPU in a virtual machine. Instead of hacking KVM and Intel
> > IOMMU drivers, we propose a platform independent interface to target
> > an interrupt to a specific virtual CPU in a virtual machine, or set
> > virtual CPU affinity for an interrupt.
> >
> > By adopting this new interface and the hierarchy irqdomain, we could
> > easily support posted-interrupts on Intel platforms, and also provide
> > flexible enough interfaces for other platforms to support similar
> > features.
> >
> > Here is the usage scenario for this interface:
> > Guest update MSI/MSI-X interrupt configuration
> > -->QEMU and KVM handle this
> > -->KVM call this interface (passing posted interrupts descriptor
> > and guest vector)
> > -->irq core will transfer the control to IOMMU
> > -->IOMMU will do the real work of updating IRTE (IRTE has new
> > format for VT-d Posted-Interrupts)
>
> Hi Thomas,
> Any comments or suggestions about this abstraction interface?

It's on my review list...

2015-05-14 01:04:40

by Wu, Feng

Subject: RE: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU



> -----Original Message-----
> From: Thomas Gleixner [mailto:[email protected]]
> Sent: Thursday, May 14, 2015 5:04 AM
> To: Jiang Liu
> Cc: Wu, Feng; [email protected]; [email protected];
> [email protected]
> Subject: Re: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an
> interrupt to a VCPU
>
>
> On Wed, 13 May 2015, Jiang Liu wrote:
>
> > On 2015/4/30 15:06, Feng Wu wrote:
> > > With Posted-Interrupts support in Intel CPU and IOMMU, an external
> > > interrupt from assigned-devices could be directly delivered to a
> > > virtual CPU in a virtual machine. Instead of hacking KVM and Intel
> > > IOMMU drivers, we propose a platform independent interface to target
> > > an interrupt to a specific virtual CPU in a virtual machine, or set
> > > virtual CPU affinity for an interrupt.
> > >
> > > By adopting this new interface and the hierarchy irqdomain, we could
> > > easily support posted-interrupts on Intel platforms, and also provide
> > > flexible enough interfaces for other platforms to support similar
> > > features.
> > >
> > > Here is the usage scenario for this interface:
> > > Guest update MSI/MSI-X interrupt configuration
> > > -->QEMU and KVM handle this
> > > -->KVM call this interface (passing posted interrupts descriptor
> > > and guest vector)
> > > -->irq core will transfer the control to IOMMU
> > > -->IOMMU will do the real work of updating IRTE (IRTE has new
> > > format for VT-d Posted-Interrupts)
> >
> > Hi Thomas,
> > Any comments or suggestions about this abstraction interface?
>
> It's on my review list...

Thanks a lot, Thomas!

Thanks,
Feng

2015-05-15 13:18:00

by Thomas Gleixner

Subject: Re: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU

On Thu, 30 Apr 2015, Feng Wu wrote:
>
> Signed-off-by: Jiang Liu <[email protected]>

So I assume Jiang is the author, right?

> Signed-off-by: Feng Wu <[email protected]>

> /**
> + * irq_chip_set_vcpu_affinity_parent - Set vcpu affinity on the parent interrupt
> + * @data: Pointer to interrupt specific data
> + * @dest: The vcpu affinity information
> + */
> +int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void *vcpu_info)
> +{
> + data = data->parent_data;
> + if (data->chip->irq_set_vcpu_affinity)
> + return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
> +
> + return -ENOSYS;
> +}

That needs a prototype in irq.h, methinks

> +int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
> +{
> + struct irq_desc *desc = irq_to_desc(irq);

irq_get_desc_lock() please

> + struct irq_chip *chip;
> + unsigned long flags;
> + int ret = -ENOSYS;
> +
> + if (!desc)
> + return -EINVAL;
> +
> + raw_spin_lock_irqsave(&desc->lock, flags);
> + chip = desc->irq_data.chip;
> + if (chip && chip->irq_set_vcpu_affinity)
> + ret = chip->irq_set_vcpu_affinity(irq_desc_get_irq_data(desc),

Above you fiddle with desc->irq_data directly. Why using the accessor here?

> + vcpu_info);
> + raw_spin_unlock_irqrestore(&desc->lock, flags);

Otherwise this looks good.

Thanks,

tglx
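
One shape the function could take after the two comments above (combined lookup and lock, and a single way of reaching the irq_data) is modelled below. This is a userspace sketch and not the actual follow-up revision: every `rev_*` name is hypothetical, `rev_get_desc_lock()` and `rev_put_desc_unlock()` merely stand in for the kernel's `irq_get_desc_lock()` and its unlock counterpart (whose saved-flags handling is omitted), and the lock is reduced to a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct rev_irq_data;

/* Hypothetical stand-ins for the kernel structures involved. */
struct rev_irq_chip {
	int (*irq_set_vcpu_affinity)(struct rev_irq_data *data,
				     void *vcpu_info);
};

struct rev_irq_data {
	struct rev_irq_chip *chip;
};

struct rev_irq_desc {
	struct rev_irq_data irq_data;
	int locked;		/* models desc->lock */
};

#define REV_NR_IRQS 4
static struct rev_irq_desc *rev_desc_table[REV_NR_IRQS];

/* Stand-in for a combined "look up descriptor and take its lock" helper. */
static struct rev_irq_desc *rev_get_desc_lock(unsigned int irq)
{
	struct rev_irq_desc *desc =
		irq < REV_NR_IRQS ? rev_desc_table[irq] : NULL;

	if (desc)
		desc->locked = 1;
	return desc;
}

static void rev_put_desc_unlock(struct rev_irq_desc *desc)
{
	desc->locked = 0;
}

int rev_irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
{
	struct rev_irq_desc *desc = rev_get_desc_lock(irq);
	struct rev_irq_data *data;
	int ret = -ENOSYS;

	if (!desc)
		return -EINVAL;

	/* Fetch irq_data once and use it for both the chip and the call. */
	data = &desc->irq_data;
	if (data->chip && data->chip->irq_set_vcpu_affinity)
		ret = data->chip->irq_set_vcpu_affinity(data, vcpu_info);
	rev_put_desc_unlock(desc);

	return ret;
}

/* Trivial chip callback used only by the demo below. */
static int rev_accept(struct rev_irq_data *data, void *vcpu_info)
{
	(void)data;
	return vcpu_info ? 0 : -EINVAL;
}

/* Registers a descriptor for irq 1 and calls the function for @irq. */
int rev_demo(unsigned int irq)
{
	static struct rev_irq_chip chip = {
		.irq_set_vcpu_affinity = rev_accept,
	};
	static struct rev_irq_desc desc = { .irq_data = { .chip = &chip } };
	int token = 1;
	int ret;

	rev_desc_table[1] = &desc;
	ret = rev_irq_set_vcpu_affinity(irq, &token);
	assert(!desc.locked);	/* the lock is always dropped on return */

	return ret;
}
```

Here `rev_demo(1)` returns 0, while an interrupt number with no descriptor, such as `rev_demo(3)`, returns -EINVAL without touching any lock.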

2015-05-15 13:19:14

by Thomas Gleixner

Subject: Re: [v4 2/3] x86, irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller

On Thu, 30 Apr 2015, Feng Wu wrote:
> Implement irq_set_vcpu_affinity for pci_msi_ir_controller.
>
> Signed-off-by: Feng Wu <[email protected]>
> Reviewed-by: Jiang Liu <[email protected]>
> ---
> arch/x86/kernel/apic/msi.c | 1 +
> include/linux/irq.h | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
> index 58fde66..d2d95e2 100644
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -152,6 +152,7 @@ static struct irq_chip pci_msi_ir_controller = {
> .irq_mask = pci_msi_mask_irq,
> .irq_ack = irq_chip_ack_parent,
> .irq_retrigger = irq_chip_retrigger_hierarchy,
> + .irq_set_vcpu_affinity = irq_chip_set_vcpu_affinity_parent,
> .flags = IRQCHIP_SKIP_SET_WAKE,
> };
>
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 684c35d..cb688fb 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -471,6 +471,8 @@ extern int irq_chip_set_affinity_parent(struct irq_data *data,
> const struct cpumask *dest,
> bool force);
> extern int irq_chip_set_wake_parent(struct irq_data *data, unsigned int on);
> +extern int irq_chip_set_vcpu_affinity_parent(struct irq_data *data,
> + void *vcpu_info);

Ah, here it is. Just in the wrong patch ....


Thanks,

tglx

2015-05-15 13:27:06

by Thomas Gleixner

Subject: Re: [v4 3/3] x86, irq: Define a global vector for VT-d Posted-Interrupts

On Thu, 30 Apr 2015, Feng Wu wrote:
> #ifdef CONFIG_HAVE_KVM
> +void (*wakeup_handler_callback)(void);
> +EXPORT_SYMBOL_GPL(wakeup_handler_callback);

The matching entry in a header file is going to come later again?

> /*
> * Handler for POSTED_INTERRUPT_VECTOR.
> */
> @@ -256,6 +259,30 @@ __visible void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
>
> set_irq_regs(old_regs);
> }
> +
> +/*
> + * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
> + */
> +__visible void smp_kvm_posted_intr_wakeup_ipi(struct pt_regs *regs)
> +{
> + struct pt_regs *old_regs = set_irq_regs(regs);
> +
> + ack_APIC_irq();
> +
> + irq_enter();
> +
> + exit_idle();

entering_ack_irq() please

> + inc_irq_stat(kvm_posted_intr_wakeup_ipis);
> +
> + if (wakeup_handler_callback)
> + wakeup_handler_callback();
> +
> + irq_exit();
> +
> + set_irq_regs(old_regs);
> +}

Thanks,

tglx

2015-05-18 02:42:33

by Wu, Feng

Subject: RE: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU

Thanks for the review!

> -----Original Message-----
> From: Thomas Gleixner [mailto:[email protected]]
> Sent: Friday, May 15, 2015 9:18 PM
> To: Wu, Feng
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [v4 1/3] genirq: Introduce irq_set_vcpu_affinity() to target an
> interrupt to a VCPU
>
> On Thu, 30 Apr 2015, Feng Wu wrote:
> >
> > Signed-off-by: Jiang Liu <[email protected]>
>
> So I assume Jiang is the author, right?

Oh, yes, I think I made some mistakes while applying the patches. Thanks
for pointing this out!

>
> > Signed-off-by: Feng Wu <[email protected]>
>
> > /**
> > + * irq_chip_set_vcpu_affinity_parent - Set vcpu affinity on the parent interrupt
> > + * @data: Pointer to interrupt specific data
> > + * @dest: The vcpu affinity information
> > + */
> > +int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void *vcpu_info)
> > +{
> > + data = data->parent_data;
> > + if (data->chip->irq_set_vcpu_affinity)
> > + return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
> > +
> > + return -ENOSYS;
> > +}
>
> That needs a prototype in irq.h, methinks
>
> > +int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
> > +{
> > + struct irq_desc *desc = irq_to_desc(irq);
>
> irq_get_desc_lock() please
>
> > + struct irq_chip *chip;
> > + unsigned long flags;
> > + int ret = -ENOSYS;
> > +
> > + if (!desc)
> > + return -EINVAL;
> > +
> > + raw_spin_lock_irqsave(&desc->lock, flags);
> > + chip = desc->irq_data.chip;
> > + if (chip && chip->irq_set_vcpu_affinity)
> > + ret = chip->irq_set_vcpu_affinity(irq_desc_get_irq_data(desc),
>
> Above you fiddle with desc->irq_data directly. Why using the accessor here?

I will only use one style here.

Thanks,
Feng

>
> > + vcpu_info);
> > + raw_spin_unlock_irqrestore(&desc->lock, flags);
>
> Otherwise this looks good.
>
> Thanks,
>
> tglx

2015-05-18 02:45:22

by Wu, Feng

Subject: RE: [v4 3/3] x86, irq: Define a global vector for VT-d Posted-Interrupts



> -----Original Message-----
> From: Thomas Gleixner [mailto:[email protected]]
> Sent: Friday, May 15, 2015 9:27 PM
> To: Wu, Feng
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [v4 3/3] x86, irq: Define a global vector for VT-d Posted-Interrupts
>
> On Thu, 30 Apr 2015, Feng Wu wrote:
> > #ifdef CONFIG_HAVE_KVM
> > +void (*wakeup_handler_callback)(void);
> > +EXPORT_SYMBOL_GPL(wakeup_handler_callback);
>
> The matching entry in a header file is going to come later again?

I will add the declaration in a header file in this patch.

>
> > /*
> > * Handler for POSTED_INTERRUPT_VECTOR.
> > */
> > @@ -256,6 +259,30 @@ __visible void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
> >
> > set_irq_regs(old_regs);
> > }
> > +
> > +/*
> > + * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
> > + */
> > +__visible void smp_kvm_posted_intr_wakeup_ipi(struct pt_regs *regs)
> > +{
> > + struct pt_regs *old_regs = set_irq_regs(regs);
> > +
> > + ack_APIC_irq();
> > +
> > + irq_enter();
> > +
> > + exit_idle();
>
> entering_ack_irq() please

Good idea!

Thanks,
Feng

>
> > + inc_irq_stat(kvm_posted_intr_wakeup_ipis);
> > +
> > + if (wakeup_handler_callback)
> > + wakeup_handler_callback();
> > +
> > + irq_exit();
> > +
> > + set_irq_regs(old_regs);
> > +}
>
> Thanks,
>
> tglx