This series allows setting up ARM IRQ forwarding between a VFIO platform
device physical IRQ and a guest virtual IRQ. The link is coordinated
by the IRQ bypass manager.
The principle is that the VFIO platform driver registers an IRQ bypass
producer struct on VFIO_IRQ_SET_ACTION_TRIGGER, while KVM irqfd registers
a consumer struct on irqfd assignment. This leads to a handshake based on
a match of the eventfd context (used as token). When either the producer
or the consumer disappears, an unregistration occurs and the link is
disconnected.
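For reference, the consumer half of that handshake lives in the KVM irqfd
code brought in by the IRQ bypass manager dependency, not in this series.
Below is only a minimal illustrative sketch of what the irqfd side is
assumed to do (names mirror the producer registration in patch 1 and the
kvm_arch_* callbacks in the last patch; this is not a patch hunk):

	/* sketch: consumer registration on irqfd assignment */
	irqfd->consumer.token = (void *)irqfd->eventfd;	/* same eventfd ctx as the producer token */
	irqfd->consumer.add_producer = kvm_arch_irq_bypass_add_producer;
	irqfd->consumer.del_producer = kvm_arch_irq_bypass_del_producer;
	irqfd->consumer.stop = kvm_arch_irq_bypass_stop;
	irqfd->consumer.start = kvm_arch_irq_bypass_start;
	ret = irq_bypass_register_consumer(&irqfd->consumer);
	if (ret)
		pr_info("irq bypass consumer (token %p) registration fails: %d\n",
			irqfd->consumer.token, ret);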
This kernel integration deprecates the former kvm-vfio approach:
https://lkml.org/lkml/2015/4/13/353. Some rationale about that change can
be found in the IRQ bypass manager thread: https://lkml.org/lkml/2015/6/29/268
Dependencies:
1- [PATCH v4 00/11] arm/arm64: KVM: Active interrupt state switching
for shared devices (http://www.spinics.net/lists/arm-kernel/msg437884.html)
except PATCH 11
2- [PATCH 0/6] irqchip: GICv2/v3: Add support for irq_vcpu_affinity
3- [PATCH v4] virt: IRQ bypass manager (https://lkml.org/lkml/2015/8/6/526)
4- [PATCH 0/2] KVM: arm/arm64: Guest synchronous halt/resume
(https://www.mail-archive.com/[email protected]/msg950942.html)
All those pieces can be found at:
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.2-rc6-irq-forward-v3
More background on ARM IRQ forwarding can be found in the text below and at
http://www.linux-kvm.org/images/a/a8/01x04-ARMdevice.pdf.
A forwarded IRQ is deactivated by the guest and not by the host.
When the guest deactivates the associated virtual IRQ, the interrupt
controller automatically completes the physical IRQ. Obviously
this requires some HW support in the interrupt controller. This is
the case for ARM GICv2.
The direct benefit is that, for a level sensitive IRQ, a VM exit
can be avoided on forwarded IRQ completion.
When the IRQ is forwarded, the VFIO platform driver does not need to
mask the physical IRQ anymore before signaling the eventfd. Indeed
genirq lowers the running priority (priority drop), which allows other
physical IRQs to hit, but not that one.
Besides, injection is still based on irqfd triggering. The only impact
on the irqfd flow is that the resamplefd is no longer signaled on
virtual IRQ completion, since the deactivation is not trapped by KVM.
This was tested on Calxeda Midway, assigning the xgmac main IRQ.
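For completeness, the user-space sequence that exercises this path is the
existing VFIO/KVM ioctl flow. The fragment below is only an illustrative
sketch (device_fd, vm_fd and guest_irq are assumptions, error handling is
omitted), not part of this series:

	/* needs <sys/eventfd.h>, <sys/ioctl.h>, <linux/vfio.h>, <linux/kvm.h> */

	/* 1) have VFIO signal an eventfd on the physical IRQ:
	 *    this is where the driver registers the bypass producer */
	int efd = eventfd(0, 0);
	size_t sz = sizeof(struct vfio_irq_set) + sizeof(__s32);
	struct vfio_irq_set *set = malloc(sz);

	set->argsz = sz;
	set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	set->index = 0;		/* the device's main IRQ index */
	set->start = 0;
	set->count = 1;
	memcpy(&set->data, &efd, sizeof(__s32));
	ioctl(device_fd, VFIO_DEVICE_SET_IRQS, set);

	/* 2) attach the same eventfd to the guest SPI through irqfd:
	 *    KVM registers the bypass consumer and the token (eventfd ctx)
	 *    match triggers the forwarding setup */
	struct kvm_irqfd irqfd = { .fd = efd, .gsi = guest_irq };
	ioctl(vm_fd, KVM_IRQFD, &irqfd);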
History:
v2 (RFC) -> v3(PATCH):
- all dependencies now have a patch status
- we dropped the producer active boolean exchanged between the
VFIO producer and the irqfd arm consumer. As a consequence, on
unforward, if the IRQ is active, it is deactivated without
VFIO-masking it. So we do not come back to the exact state we
would have had in the unforwarded case: a new physical IRQ can
hit while the previous virtual IRQ is still being handled. Its
injection into the guest may be rejected by the VGIC state
machine. That IRQ will be lost, but I don't think this is a
severe issue. If no new IRQ hits, the guest deactivation of the
virtual IRQ will trigger the resamplefd, which will VFIO-unmask
an IRQ that was never masked. This also has no consequence.
- the VFIO platform driver consumer_add now can fail. It rejects the
transition to the forwarded state in case the IRQ is active
- the series is rebased on new irq_vcpu_affinity series
- no dependency anymore on "chip/vgic adaptations for forwarded irq",
which was partially integrated into Marc's series. A fix is still
needed though.
- Guest synchronous halt/resume patch re-integrated into this series
- integrate a new patch mixing
[PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
non-shared devices &
[RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ
v1 -> v2:
- irq bypass manager and irqfd consumer moved in a separate patch
- kvm_arm_[halt,resume]_guest moved in a separate patch
- remove VFIO external functions since we do not need them anymore
- apply container_of strategy advised by Paolo. Only active field
remains and discussions will tell whether we get rid of it.
- renamed kvm_arch functions
- kvm-vfio v6 -> RFC v1 based on IRQ bypass manager
(see previous history in https://lkml.org/lkml/2015/4/13/353)
Best Regards
Eric
Eric Auger (9):
VFIO: platform: registration of a dummy IRQ bypass producer
VFIO: platform: test forwarded state when selecting the IRQ handler
VFIO: platform: single handler using function pointer
VFIO: platform: add vfio_platform_set_automasked
VFIO: platform: add vfio_platform_is_active
VFIO: platform: add irq bypass producer management
KVM: arm/arm64: vgic: support irqfd injection of a forwarded IRQ
KVM: arm/arm64: vgic: forwarding control
KVM: arm/arm64: implement IRQ bypass consumer functions
Marc Zyngier (1):
KVM: arm/arm64: vgic: Allow HW interrupts for non-shared devices
arch/arm/kvm/Kconfig | 1 +
arch/arm/kvm/arm.c | 35 +++++
arch/arm64/kvm/Kconfig | 1 +
drivers/vfio/platform/vfio_platform_irq.c | 113 +++++++++++++-
drivers/vfio/platform/vfio_platform_private.h | 4 +
include/kvm/arm_vgic.h | 12 +-
virt/kvm/arm/arch_timer.c | 3 +-
virt/kvm/arm/vgic.c | 215 ++++++++++++++++++++++++--
8 files changed, 358 insertions(+), 26 deletions(-)
--
1.9.1
Register a dummy producer with void callbacks
Signed-off-by: Eric Auger <[email protected]>
---
v2 -> v3:
- rename vfio_platform_irq_bypass_resume into *_start
---
drivers/vfio/platform/vfio_platform_irq.c | 32 +++++++++++++++++++++++++++
drivers/vfio/platform/vfio_platform_private.h | 2 ++
2 files changed, 34 insertions(+)
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 88bba57..b5cb8c7 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -20,6 +20,7 @@
#include <linux/types.h>
#include <linux/vfio.h>
#include <linux/irq.h>
+#include <linux/irqbypass.h>
#include "vfio_platform_private.h"
@@ -177,6 +178,27 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
return IRQ_HANDLED;
}
+static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
+{
+}
+
+static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
+{
+}
+
+static int vfio_platform_irq_bypass_add_consumer(
+ struct irq_bypass_producer *prod,
+ struct irq_bypass_consumer *cons)
+{
+ return 0;
+}
+
+static void vfio_platform_irq_bypass_del_consumer(
+ struct irq_bypass_producer *prod,
+ struct irq_bypass_consumer *cons)
+{
+}
+
static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
int fd, irq_handler_t handler)
{
@@ -186,6 +208,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
if (irq->trigger) {
free_irq(irq->hwirq, irq);
+ irq_bypass_unregister_producer(&irq->producer);
kfree(irq->name);
eventfd_ctx_put(irq->trigger);
irq->trigger = NULL;
@@ -216,6 +239,15 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
return ret;
}
+ irq->producer.token = (void *)trigger;
+ irq->producer.irq = irq->hwirq;
+ irq->producer.add_consumer = vfio_platform_irq_bypass_add_consumer;
+ irq->producer.del_consumer = vfio_platform_irq_bypass_del_consumer;
+ irq->producer.stop = vfio_platform_irq_bypass_stop;
+ irq->producer.start = vfio_platform_irq_bypass_start;
+ ret = irq_bypass_register_producer(&irq->producer);
+ WARN_ON(ret);
+
if (!irq->masked)
enable_irq(irq->hwirq);
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 1c9b3d5..1d2d4d6 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -17,6 +17,7 @@
#include <linux/types.h>
#include <linux/interrupt.h>
+#include <linux/irqbypass.h>
#define VFIO_PLATFORM_OFFSET_SHIFT 40
#define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
@@ -37,6 +38,7 @@ struct vfio_platform_irq {
spinlock_t lock;
struct virqfd *unmask;
struct virqfd *mask;
+ struct irq_bypass_producer producer;
};
struct vfio_platform_region {
--
1.9.1
Add a new forwarded flag in vfio_platform_irq. In case the IRQ
is forwarded, the VFIO platform IRQ handler does not need to
disable the IRQ anymore.
When setting the IRQ handler we now also test the forwarded state. In
case the IRQ is forwarded we select the vfio_irq_handler.
Signed-off-by: Eric Auger <[email protected]>
---
v2:
- add a new forwarded flag and do not use irqd_irq_forwarded anymore
---
drivers/vfio/platform/vfio_platform_irq.c | 3 ++-
drivers/vfio/platform/vfio_platform_private.h | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index b5cb8c7..40f057a 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -262,7 +262,8 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
struct vfio_platform_irq *irq = &vdev->irqs[index];
irq_handler_t handler;
- if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED)
+ if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED &&
+ !irq->forwarded)
handler = vfio_automasked_irq_handler;
else
handler = vfio_irq_handler;
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 1d2d4d6..8b4f814 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -39,6 +39,7 @@ struct vfio_platform_irq {
struct virqfd *unmask;
struct virqfd *mask;
struct irq_bypass_producer producer;
+ bool forwarded;
};
struct vfio_platform_region {
--
1.9.1
A single handler is now registered whatever the use case, automasked
or not. A function pointer is set according to the desired behavior
and the handler calls this function.
The irq lock is taken/released in the root handler. eventfd_signal can
be called in regions that are not allowed to sleep.
Signed-off-by: Eric Auger <[email protected]>
---
v4: creation
---
drivers/vfio/platform/vfio_platform_irq.c | 21 +++++++++++++++------
drivers/vfio/platform/vfio_platform_private.h | 1 +
2 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 40f057a..b31b1f0 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -148,11 +148,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
{
struct vfio_platform_irq *irq_ctx = dev_id;
- unsigned long flags;
int ret = IRQ_NONE;
- spin_lock_irqsave(&irq_ctx->lock, flags);
-
if (!irq_ctx->masked) {
ret = IRQ_HANDLED;
@@ -161,8 +158,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
irq_ctx->masked = true;
}
- spin_unlock_irqrestore(&irq_ctx->lock, flags);
-
if (ret == IRQ_HANDLED)
eventfd_signal(irq_ctx->trigger, 1);
@@ -178,6 +173,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
return IRQ_HANDLED;
}
+static irqreturn_t vfio_handler(int irq, void *dev_id)
+{
+ struct vfio_platform_irq *irq_ctx = dev_id;
+ unsigned long flags;
+ irqreturn_t ret;
+
+ spin_lock_irqsave(&irq_ctx->lock, flags);
+ ret = irq_ctx->handler(irq, dev_id);
+ spin_unlock_irqrestore(&irq_ctx->lock, flags);
+
+ return ret;
+}
+
static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
{
}
@@ -229,9 +237,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
}
irq->trigger = trigger;
+ irq->handler = handler;
irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN);
- ret = request_irq(irq->hwirq, handler, 0, irq->name, irq);
+ ret = request_irq(irq->hwirq, vfio_handler, 0, irq->name, irq);
if (ret) {
kfree(irq->name);
eventfd_ctx_put(trigger);
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 8b4f814..f848a6b 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -40,6 +40,7 @@ struct vfio_platform_irq {
struct virqfd *mask;
struct irq_bypass_producer producer;
bool forwarded;
+ irqreturn_t (*handler)(int irq, void *dev_id);
};
struct vfio_platform_region {
--
1.9.1
This function makes it possible to change the automasked mode.
Signed-off-by: Eric Auger <[email protected]>
---
v1 -> v2:
- set forwarded flag
---
drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index b31b1f0..a285384 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
return ret;
}
+static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
+ bool automasked)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&irq->lock, flags);
+ if (automasked) {
+ irq->forwarded = true;
+ irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
+ irq->handler = vfio_automasked_irq_handler;
+ } else {
+ irq->forwarded = false;
+ irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
+ irq->handler = vfio_irq_handler;
+ }
+ spin_unlock_irqrestore(&irq->lock, flags);
+ return 0;
+}
+
static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
{
}
--
1.9.1
This function returns whether the IRQ is active at the irqchip level or
VFIO-masked. If either is true, the IRQ is considered active.
Currently there is no way to differentiate a userspace-masked IRQ from
an automasked one, so activity may be falsely detected. However such
false detection is currently acceptable.
Signed-off-by: Eric Auger <[email protected]>
---
---
drivers/vfio/platform/vfio_platform_irq.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index a285384..efaee58 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -205,6 +205,23 @@ static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
return 0;
}
+static int vfio_platform_is_active(struct vfio_platform_irq *irq)
+{
+ unsigned long flags;
+ bool active, masked, outstanding;
+ int ret;
+
+ spin_lock_irqsave(&irq->lock, flags);
+
+ ret = irq_get_irqchip_state(irq->hwirq, IRQCHIP_STATE_ACTIVE, &active);
+ BUG_ON(ret);
+ masked = irq->masked;
+ outstanding = active || masked;
+
+ spin_unlock_irqrestore(&irq->lock, flags);
+ return outstanding;
+}
+
static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
{
}
--
1.9.1
This patch populates the IRQ bypass callbacks:
- stop/start producer: simply disable/enable the host IRQ
- add/del consumer: basically set the automasked flag to false/true
Signed-off-by: Eric Auger <[email protected]>
---
v2 -> v3:
- vfio_platform_irq_bypass_add_consumer now returns an error in case
the IRQ is recognized as active
- active boolean not set anymore
- do not VFIO mask the IRQ anymore on unforward
v1 -> v2:
- device handle caching in vfio_platform_device is introduced in a
separate patch
- use container_of
---
drivers/vfio/platform/vfio_platform_irq.c | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index efaee58..400a188 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -224,23 +224,44 @@ static int vfio_platform_is_active(struct vfio_platform_irq *irq)
static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
{
+ disable_irq(prod->irq);
}
static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
{
+ enable_irq(prod->irq);
}
static int vfio_platform_irq_bypass_add_consumer(
struct irq_bypass_producer *prod,
struct irq_bypass_consumer *cons)
{
- return 0;
+ struct vfio_platform_irq *irq =
+ container_of(prod, struct vfio_platform_irq, producer);
+
+ /*
+ * if the IRQ is active at irqchip level or VFIO (auto)masked
+ * this means the host IRQ is already under injection in the
+ * guest and this not safe to change the forwarding state at
+ * that stage.
+ * It is not possible to differentiate user-space masking
+ * from auto-masking, leading to possible false detection of
+ * active state.
+ */
+ if (vfio_platform_is_active(irq))
+ return -EAGAIN;
+
+ return vfio_platform_set_automasked(irq, false);
}
static void vfio_platform_irq_bypass_del_consumer(
struct irq_bypass_producer *prod,
struct irq_bypass_consumer *cons)
{
+ struct vfio_platform_irq *irq =
+ container_of(prod, struct vfio_platform_irq, producer);
+
+ vfio_platform_set_automasked(irq, true);
}
static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
--
1.9.1
From: Marc Zyngier <[email protected]>
So far, the only use of the HW interrupt facility was the timer,
implying that the active state is context-switched for each vcpu,
as the device is shared across all vcpus.
This does not work for a device that has been assigned to a VM,
as the guest is entirely in control of that device (the HW is
not shared). In that case, it makes sense to bypass the whole
active state switching.
The VGIC state machine is also adapted to support those assigned
(non-shared) HW IRQs:
- the IRQ can only be sampled when it is pending
- when queueing the IRQ (programming the LR), the pending state is
removed, as for edge sensitive IRQs
- the queued state is not modelled, nor is the level state
- its injection is always valid since it stems from the HW.
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Eric Auger <[email protected]>
---
- a mix of
[PATCH v4 11/11] KVM: arm/arm64: vgic: Allow HW interrupts for
non-shared devices
[RFC v2 2/4] KVM: arm: vgic: fix state machine for forwarded IRQ
---
include/kvm/arm_vgic.h | 6 +++--
virt/kvm/arm/arch_timer.c | 3 ++-
virt/kvm/arm/vgic.c | 58 +++++++++++++++++++++++++++++++++++------------
3 files changed, 49 insertions(+), 18 deletions(-)
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index d901f1a..7ef9ce0 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -163,7 +163,8 @@ struct irq_phys_map {
u32 virt_irq;
u32 phys_irq;
u32 irq;
- bool active;
+ bool shared;
+ bool active; /* Only valid if shared */
};
struct irq_phys_map_entry {
@@ -356,7 +357,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
- int virt_irq, int irq);
+ int virt_irq, int irq,
+ bool shared);
int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 76e38d2..db21d8f 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -203,7 +203,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
* Tell the VGIC that the virtual interrupt is tied to a
* physical interrupt. We do that once per VCPU.
*/
- map = kvm_vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
+ map = kvm_vgic_map_phys_irq(vcpu, irq->irq,
+ host_vtimer_irq, true);
if (WARN_ON(IS_ERR(map)))
return PTR_ERR(map);
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 9eb489a..fbd5ba5 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -400,7 +400,11 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
{
- return !vgic_irq_is_queued(vcpu, irq);
+ struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
+ bool shared_hw = map && !map->shared;
+
+ return !vgic_irq_is_queued(vcpu, irq) ||
+ (shared_hw && vgic_dist_irq_is_pending(vcpu, irq));
}
/**
@@ -1150,19 +1154,26 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
* active in the physical world. Otherwise the
* physical interrupt will fire and the guest will
* exit before processing the virtual interrupt.
+ *
+ * This is of course only valid for a shared
+ * interrupt. A non shared interrupt should already be
+ * active.
*/
if (map) {
- int ret;
-
- BUG_ON(!map->active);
vlr.hwirq = map->phys_irq;
vlr.state |= LR_HW;
vlr.state &= ~LR_EOI_INT;
- ret = irq_set_irqchip_state(map->irq,
- IRQCHIP_STATE_ACTIVE,
- true);
- WARN_ON(ret);
+ if (map->shared) {
+ int ret;
+
+ BUG_ON(!map->active);
+ ret = irq_set_irqchip_state(
+ map->irq,
+ IRQCHIP_STATE_ACTIVE,
+ true);
+ WARN_ON(ret);
+ }
/*
* Make sure we're not going to sample this
@@ -1229,10 +1240,13 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
static bool vgic_queue_hwirq(struct kvm_vcpu *vcpu, int irq)
{
+ struct irq_phys_map *map = vgic_irq_map_search(vcpu, irq);
+ bool shared_hw = map && !map->shared;
+
if (!vgic_can_sample_irq(vcpu, irq))
return true; /* level interrupt, already queued */
- if (vgic_queue_irq(vcpu, 0, irq)) {
+ if (vgic_queue_irq(vcpu, 0, irq) || shared_hw) {
if (vgic_irq_is_edge(vcpu, irq)) {
vgic_dist_irq_clear_pending(vcpu, irq);
vgic_cpu_irq_clear(vcpu, irq);
@@ -1411,7 +1425,12 @@ static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
return 0;
map = vgic_irq_map_search(vcpu, vlr.irq);
- BUG_ON(!map || !map->active);
+ BUG_ON(!map);
+
+ if (!map->shared)
+ return 0;
+
+ BUG_ON(map->shared && !map->active);
ret = irq_get_irqchip_state(map->irq,
IRQCHIP_STATE_ACTIVE,
@@ -1563,6 +1582,7 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
int edge_triggered, level_triggered;
int enabled;
bool ret = true, can_inject = true;
+ bool shared_hw = map && !map->shared;
if (irq_num >= min(kvm->arch.vgic.nr_irqs, 1020))
return -EINVAL;
@@ -1573,7 +1593,8 @@ static int vgic_update_irq_pending(struct kvm *kvm, int cpuid,
edge_triggered = vgic_irq_is_edge(vcpu, irq_num);
level_triggered = !edge_triggered;
- if (!vgic_validate_injection(vcpu, irq_num, level)) {
+ if (!vgic_validate_injection(vcpu, irq_num, level) &&
+ !shared_hw) {
ret = false;
goto out;
}
@@ -1742,16 +1763,21 @@ static struct list_head *vgic_get_irq_phys_map_list(struct kvm_vcpu *vcpu,
* @vcpu: The VCPU pointer
* @virt_irq: The virtual irq number
* @irq: The Linux IRQ number
+ * @shared: Indicates if the interrupt has to be context-switched or
+ * if it is private to a VM
*
* Establish a mapping between a guest visible irq (@virt_irq) and a
* Linux irq (@irq). On injection, @virt_irq will be associated with
* the physical interrupt represented by @irq. This mapping can be
* established multiple times as long as the parameters are the same.
+ * If @shared is true, the active state of the interrupt will be
+ * context-switched.
*
* Returns a valid pointer on success, and an error pointer otherwise
*/
struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
- int virt_irq, int irq)
+ int virt_irq, int irq,
+ bool shared)
{
struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
struct list_head *root = vgic_get_irq_phys_map_list(vcpu, virt_irq);
@@ -1785,7 +1811,8 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
if (map) {
/* Make sure this mapping matches */
if (map->phys_irq != phys_irq ||
- map->irq != irq)
+ map->irq != irq ||
+ map->shared != shared)
map = ERR_PTR(-EINVAL);
/* Found an existing, valid mapping */
@@ -1796,6 +1823,7 @@ struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
map->virt_irq = virt_irq;
map->phys_irq = phys_irq;
map->irq = irq;
+ map->shared = shared;
list_add_tail_rcu(&entry->entry, root);
@@ -1846,7 +1874,7 @@ static void vgic_free_phys_irq_map_rcu(struct rcu_head *rcu)
*/
bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
{
- BUG_ON(!map);
+ BUG_ON(!map || !map->shared);
return map->active;
}
@@ -1858,7 +1886,7 @@ bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map)
*/
void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
{
- BUG_ON(!map);
+ BUG_ON(!map || !map->shared);
map->active = active;
}
--
1.9.1
Currently irqfd injection relies on the kvm_vgic_inject_irq function.
However this function cannot be used anymore for mapped IRQs, so
let's change the implementation to use kvm_vgic_inject_mapped_irq
when the IRQ is forwarded.
Signed-off-by: Eric Auger <[email protected]>
---
virt/kvm/arm/vgic.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index fbd5ba5..03a85b3 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2529,13 +2529,19 @@ int kvm_irq_map_chip_pin(struct kvm *kvm, unsigned irqchip, unsigned pin)
int kvm_set_irq(struct kvm *kvm, int irq_source_id,
u32 irq, int level, bool line_status)
{
+ struct irq_phys_map *map;
unsigned int spi = irq + VGIC_NR_PRIVATE_IRQS;
trace_kvm_set_irq(irq, level, irq_source_id);
BUG_ON(!vgic_initialized(kvm));
- return kvm_vgic_inject_irq(kvm, 0, spi, level);
+ map = vgic_irq_map_search(kvm_get_vcpu(kvm, 0), spi);
+
+ if (!map)
+ return kvm_vgic_inject_irq(kvm, 0, spi, level);
+ else
+ return kvm_vgic_inject_mapped_irq(kvm, 0, map, level);
}
/* MSI not implemented yet */
--
1.9.1
Implement kvm_vgic_[set|unset]_forward.
These functions handle the low-level VGIC programming: physical
IRQ/guest IRQ mapping, list register cleanup, VGIC state machine.
They also interact with the irqchip.
Signed-off-by: Eric Auger <[email protected]>
---
v2 -> v3:
- on unforward, we do not compute & output the active state anymore.
This means that if the unforward happens while the physical IRQ is
active, we will not VFIO-mask the IRQ while deactivating it. If a
new physical IRQ hits, the corresponding virtual IRQ might not
be injected (hence lost) due to the VGIC state machine.
bypass rfc v2:
- use irq_set_vcpu_affinity API
- use irq_set_irqchip_state instead of chip->irq_eoi
bypass rfc:
- rename kvm_arch_{set|unset}_forward into
kvm_vgic_{set|unset}_forward. Remove __KVM_HAVE_ARCH_HALT_GUEST.
The function is bound to be called by ARM code only.
v4 -> v5:
- fix arm64 compilation issues, ie. also defines
__KVM_HAVE_ARCH_HALT_GUEST for arm64
v3 -> v4:
- code originally located in kvm_vfio_arm.c
- kvm_arch_vfio_{set|unset}_forward renamed into
kvm_arch_{set|unset}_forward
- split into 2 functions (set/unset) since unset does not fail anymore
- unset can be invoked at whatever time. Extra care is taken to handle
transition in VGIC state machine, LR cleanup, ...
v2 -> v3:
- renaming of kvm_arch_set_fwd_state into kvm_arch_vfio_set_forward
- takes a bool arg instead of kvm_fwd_irq_action enum
- removal of KVM_VFIO_IRQ_CLEANUP
- platform device check now happens here
- more precise errors returned
- irq_eoi handled externally to this patch (VGIC)
- fix a bug where enable_irq was called twice
- reword the commit message
- correct check of platform_bus_type
- use raw_spin_lock_irqsave and check the validity of the handler
---
include/kvm/arm_vgic.h | 6 ++
virt/kvm/arm/vgic.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 155 insertions(+)
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7ef9ce0..409ac0f 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -363,6 +363,12 @@ int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
+int kvm_vgic_set_forward(struct kvm *kvm,
+ unsigned int host_irq, unsigned int guest_irq);
+
+void kvm_vgic_unset_forward(struct kvm *kvm,
+ unsigned int host_irq, unsigned int guest_irq);
+
#define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel))
#define vgic_initialized(k) (!!((k)->arch.vgic.nr_cpus))
#define vgic_ready(k) ((k)->arch.vgic.ready)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 03a85b3..b15999a 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2551,3 +2551,152 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
{
return 0;
}
+
+/**
+ * kvm_vgic_set_forward - Set IRQ forwarding
+ *
+ * @kvm: handle to the VM
+ * @host_irq: physical IRQ number
+ * @guest_irq: virtual IRQ number
+ *
+ * This function is supposed to be called only if the IRQ
+ * is not in progress: ie. not active at GIC level and not
+ * currently under injection in the guest. The physical IRQ must
+ * also be disabled and the guest must have been exited and
+ * prevented from being re-entered.
+ */
+int kvm_vgic_set_forward(struct kvm *kvm,
+ unsigned int host_irq,
+ unsigned int guest_irq)
+{
+ struct irq_phys_map *map = NULL;
+ struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
+ int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
+
+ kvm_debug("%s host_irq=%d guest_irq=%d\n",
+ __func__, host_irq, guest_irq);
+
+ if (!vcpu)
+ return 0;
+
+ irq_set_vcpu_affinity(host_irq, vcpu);
+ /*
+ * next physical IRQ will be handled as forwarded
+ * by the host (priority drop only)
+ */
+
+ map = kvm_vgic_map_phys_irq(vcpu, spi_id, host_irq, false);
+ /*
+ * next guest_irq injection will be considered as
+ * forwarded and next flush will program LR
+ * without maintenance IRQ but with HW bit set
+ */
+ return !map;
+}
+
+/**
+ * kvm_vgic_unset_forward - Unset IRQ forwarding
+ *
+ * @kvm: handle to the VM
+ * @host_irq: physical IRQ number
+ * @guest_irq: virtual IRQ number
+ *
+ * This function must be called when the host_irq is disabled
+ * and guest has been exited and prevented from being re-entered.
+ *
+ */
+void kvm_vgic_unset_forward(struct kvm *kvm,
+ unsigned int host_irq,
+ unsigned int guest_irq)
+{
+ struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
+ struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+ struct vgic_dist *dist = &kvm->arch.vgic;
+ int ret, lr;
+ struct vgic_lr vlr;
+ int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
+ bool queued = false, needs_deactivate = true;
+ struct irq_phys_map *map;
+ bool active;
+
+ kvm_debug("%s host_irq=%d guest_irq=%d\n",
+ __func__, host_irq, guest_irq);
+
+ spin_lock(&dist->lock);
+
+ irq_get_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, &active);
+
+ if (!vcpu)
+ goto out;
+
+ map = vgic_irq_map_search(vcpu, spi_id);
+ BUG_ON(!map);
+ ret = kvm_vgic_unmap_phys_irq(vcpu, map);
+ BUG_ON(ret);
+ /*
+ * subsequent update_irq_pending or flush will handle this
+ * irq as not forwarded
+ */
+ if (likely(!(active))) {
+ /*
+ * the IRQ was not active. let's simply prepare the states
+ * for subsequent non forwarded injection.
+ */
+ vgic_dist_irq_clear_level(vcpu, spi_id);
+ vgic_dist_irq_clear_pending(vcpu, spi_id);
+ vgic_irq_clear_queued(vcpu, spi_id);
+ needs_deactivate = false;
+ goto out;
+ }
+
+ /* is there any list register with valid state? */
+ lr = vgic_cpu->vgic_irq_lr_map[spi_id];
+ if (lr != LR_EMPTY) {
+ vlr = vgic_get_lr(vcpu, lr);
+ if (vlr.state & LR_STATE_MASK)
+ queued = true;
+ }
+
+ if (!queued) {
+ vgic_irq_clear_queued(vcpu, spi_id);
+ if (vgic_dist_irq_is_pending(vcpu, spi_id)) {
+ /*
+ * IRQ is injected but not yet queued. LR will be
+ * written with EOI_INT and process_maintenance will
+ * reset the states: queued, level(resampler). Pending
+ * will be reset on flush.
+ */
+ vgic_dist_irq_set_level(vcpu, spi_id);
+ } else {
+ /*
+ * We are somewhere before the update_irq_pending.
+ * we can't be sure the virtual IRQ will ever be
+ * injected (due to previous disable_irq).
+ * Let's simply clear the level which was not correctly
+ * modelled in forwarded state.
+ */
+ vgic_dist_irq_clear_level(vcpu, spi_id);
+ }
+ goto out;
+ }
+
+ /*
+ * the virtual IRQ is queued and a valid LR exists, let's patch it so
+ * that when EOI happens a maintenance IRQ gets triggered
+ */
+ vlr.state |= LR_EOI_INT;
+ vgic_set_lr(vcpu, lr, vlr);
+
+ vgic_dist_irq_set_level(vcpu, spi_id);
+ vgic_dist_irq_set_pending(vcpu, spi_id);
+ vgic_irq_set_queued(vcpu, spi_id);
+ /* The maintenance IRQ will reset all states above */
+
+out:
+ irq_set_irqchip_state(host_irq, IRQCHIP_STATE_ACTIVE, false);
+ irq_set_vcpu_affinity(host_irq, NULL);
+ /* next occurrence will be deactivated by the host */
+
+ spin_unlock(&dist->lock);
+}
+
--
1.9.1
Implement the IRQ bypass callbacks for arm/arm64 IRQ forwarding:
- kvm_arch_irq_bypass_add_producer: perform the VGIC/irqchip
settings for forwarding
- kvm_arch_irq_bypass_del_producer: same for the inverse operation
- kvm_arch_irq_bypass_stop: halt guest execution
- kvm_arch_irq_bypass_start: resume guest execution
Also set CONFIG_HAVE_KVM_IRQ_BYPASS for arm/arm64.
Signed-off-by: Eric Auger <[email protected]>
---
v2 -> v3:
- kvm_arch_irq_bypass_resume renamed into kvm_arch_irq_bypass_start
- kvm_vgic_unset_forward does not take the active bool param anymore
- kvm_arch_irq_bypass_add_producer now returns an error value
- remove kvm_arch_irq_bypass_update
v1 -> v2:
- struct kvm_kernel_irqfd is retrieved with container_of
- function names changed
---
arch/arm/kvm/Kconfig | 1 +
arch/arm/kvm/arm.c | 35 +++++++++++++++++++++++++++++++++++
arch/arm64/kvm/Kconfig | 1 +
3 files changed, 37 insertions(+)
diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 3c565b9..655d277 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -33,6 +33,7 @@ config KVM
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
select IRQ_BYPASS_MANAGER
+ select HAVE_KVM_IRQ_BYPASS
depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
---help---
Support hosting virtualized guest machines.
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 0529b38..7cfc5dc 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -27,6 +27,8 @@
#include <linux/mman.h>
#include <linux/sched.h>
#include <linux/kvm.h>
+#include <linux/kvm_irqfd.h>
+#include <linux/irqbypass.h>
#include <trace/events/kvm.h>
#define CREATE_TRACE_POINTS
@@ -1149,6 +1151,39 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr)
return NULL;
}
+int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons,
+ struct irq_bypass_producer *prod)
+{
+ struct kvm_kernel_irqfd *irqfd =
+ container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+ return kvm_vgic_set_forward(irqfd->kvm, prod->irq, irqfd->gsi);
+}
+void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
+ struct irq_bypass_producer *prod)
+{
+ struct kvm_kernel_irqfd *irqfd =
+ container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+ kvm_vgic_unset_forward(irqfd->kvm, prod->irq, irqfd->gsi);
+}
+
+void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *cons)
+{
+ struct kvm_kernel_irqfd *irqfd =
+ container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+ kvm_arm_halt_guest(irqfd->kvm);
+}
+
+void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *cons)
+{
+ struct kvm_kernel_irqfd *irqfd =
+ container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+ kvm_arm_resume_guest(irqfd->kvm);
+}
+
/**
* Initialize Hyp-mode and memory mappings on all CPUs.
*/
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 2509539..6f6e7a5 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -33,6 +33,7 @@ config KVM
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
select IRQ_BYPASS_MANAGER
+ select HAVE_KVM_IRQ_BYPASS
---help---
Support hosting virtualized guest machines.
--
1.9.1
On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> Register a dummy producer with void callbacks
>
> Signed-off-by: Eric Auger <[email protected]>
>
> ---
>
> v2 -> v3:
> - rename vfio_platform_irq_bypass_resume into *_start
> ---
> drivers/vfio/platform/vfio_platform_irq.c | 32 +++++++++++++++++++++++++++
> drivers/vfio/platform/vfio_platform_private.h | 2 ++
> 2 files changed, 34 insertions(+)
>
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index 88bba57..b5cb8c7 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -20,6 +20,7 @@
> #include <linux/types.h>
> #include <linux/vfio.h>
> #include <linux/irq.h>
> +#include <linux/irqbypass.h>
>
> #include "vfio_platform_private.h"
>
> @@ -177,6 +178,27 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
> return IRQ_HANDLED;
> }
>
> +static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
> +{
> +}
> +
> +static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
> +{
> +}
> +
> +static int vfio_platform_irq_bypass_add_consumer(
> + struct irq_bypass_producer *prod,
> + struct irq_bypass_consumer *cons)
> +{
> + return 0;
> +}
> +
> +static void vfio_platform_irq_bypass_del_consumer(
> + struct irq_bypass_producer *prod,
> + struct irq_bypass_consumer *cons)
> +{
> +}
> +
> static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
> int fd, irq_handler_t handler)
> {
> @@ -186,6 +208,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>
> if (irq->trigger) {
> free_irq(irq->hwirq, irq);
> + irq_bypass_unregister_producer(&irq->producer);
> kfree(irq->name);
> eventfd_ctx_put(irq->trigger);
> irq->trigger = NULL;
> @@ -216,6 +239,15 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
> return ret;
> }
>
> + irq->producer.token = (void *)trigger;
> + irq->producer.irq = irq->hwirq;
> + irq->producer.add_consumer = vfio_platform_irq_bypass_add_consumer;
> + irq->producer.del_consumer = vfio_platform_irq_bypass_del_consumer;
> + irq->producer.stop = vfio_platform_irq_bypass_stop;
> + irq->producer.start = vfio_platform_irq_bypass_start;
> + ret = irq_bypass_register_producer(&irq->producer);
> + WARN_ON(ret);
For what purpose?
> +
> if (!irq->masked)
> enable_irq(irq->hwirq);
>
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index 1c9b3d5..1d2d4d6 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -17,6 +17,7 @@
>
> #include <linux/types.h>
> #include <linux/interrupt.h>
> +#include <linux/irqbypass.h>
>
> #define VFIO_PLATFORM_OFFSET_SHIFT 40
> #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
> @@ -37,6 +38,7 @@ struct vfio_platform_irq {
> spinlock_t lock;
> struct virqfd *unmask;
> struct virqfd *mask;
> + struct irq_bypass_producer producer;
> };
>
> struct vfio_platform_region {
On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> A single handler now is registered whatever the use case: automasked
> or not. A function pointer is set according to the wished behavior
> and the handler calls this function.
>
> The irq lock is taken/released in the root handler. eventfd_signal can
> be called in regions not allowed to sleep.
>
> Signed-off-by: Eric Auger <[email protected]>
>
> ---
>
> v4: creation
> ---
> drivers/vfio/platform/vfio_platform_irq.c | 21 +++++++++++++++------
> drivers/vfio/platform/vfio_platform_private.h | 1 +
> 2 files changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index 40f057a..b31b1f0 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -148,11 +148,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
> static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
> {
> struct vfio_platform_irq *irq_ctx = dev_id;
> - unsigned long flags;
> int ret = IRQ_NONE;
>
> - spin_lock_irqsave(&irq_ctx->lock, flags);
> -
> if (!irq_ctx->masked) {
> ret = IRQ_HANDLED;
>
> @@ -161,8 +158,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
> irq_ctx->masked = true;
> }
>
> - spin_unlock_irqrestore(&irq_ctx->lock, flags);
> -
> if (ret == IRQ_HANDLED)
> eventfd_signal(irq_ctx->trigger, 1);
Has this been run with lockdep to check whether this is safe to call
with spinlock_irqsave held?
>
> @@ -178,6 +173,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
> return IRQ_HANDLED;
> }
>
> +static irqreturn_t vfio_handler(int irq, void *dev_id)
> +{
> + struct vfio_platform_irq *irq_ctx = dev_id;
> + unsigned long flags;
> + irqreturn_t ret;
> +
> + spin_lock_irqsave(&irq_ctx->lock, flags);
> + ret = irq_ctx->handler(irq, dev_id);
> + spin_unlock_irqrestore(&irq_ctx->lock, flags);
> +
> + return ret;
> +}
> +
> static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
> {
> }
> @@ -229,9 +237,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
> }
>
> irq->trigger = trigger;
> + irq->handler = handler;
>
> irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN);
> - ret = request_irq(irq->hwirq, handler, 0, irq->name, irq);
> + ret = request_irq(irq->hwirq, vfio_handler, 0, irq->name, irq);
> if (ret) {
> kfree(irq->name);
> eventfd_ctx_put(trigger);
> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
> index 8b4f814..f848a6b 100644
> --- a/drivers/vfio/platform/vfio_platform_private.h
> +++ b/drivers/vfio/platform/vfio_platform_private.h
> @@ -40,6 +40,7 @@ struct vfio_platform_irq {
> struct virqfd *mask;
> struct irq_bypass_producer producer;
> bool forwarded;
> + irqreturn_t (*handler)(int irq, void *dev_id);
> };
>
> struct vfio_platform_region {
On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> This function makes possible to change the automasked mode.
>
> Signed-off-by: Eric Auger <[email protected]>
>
> ---
>
> v1 -> v2:
> - set forwarded flag
> ---
> drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index b31b1f0..a285384 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
> return ret;
> }
>
> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
> + bool automasked)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&irq->lock, flags);
> + if (automasked) {
> + irq->forwarded = true;
> + irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
> + irq->handler = vfio_automasked_irq_handler;
> + } else {
> + irq->forwarded = false;
> + irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
> + irq->handler = vfio_irq_handler;
> + }
> + spin_unlock_irqrestore(&irq->lock, flags);
> + return 0;
In vfio-speak, automasked means level and we're not magically changing
the IRQ from level to edge, we're simply able to handle level
differently based on a hardware optimization. Should the user visible
flags therefore change based on this? Aren't we really setting the
forwarded state rather than the automasked state?
> +}
> +
> static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
> {
> }
On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> This function returns whether the IRQ is active at irqchip level or
> VFIO masked. If either is true, it is considered the IRQ is active.
> Currently there is no way to differentiate userspace masked IRQ from
> automasked IRQ. There might be false detection of activity. However
> it is currently acceptable to have false detection.
>
> Signed-off-by: Eric Auger <[email protected]>
>
> ---
> ---
> drivers/vfio/platform/vfio_platform_irq.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index a285384..efaee58 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -205,6 +205,23 @@ static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
> return 0;
> }
>
> +static int vfio_platform_is_active(struct vfio_platform_irq *irq)
vfio_platform_irq_is_active()?
> +{
> + unsigned long flags;
> + bool active, masked, outstanding;
> + int ret;
> +
> + spin_lock_irqsave(&irq->lock, flags);
> +
> + ret = irq_get_irqchip_state(irq->hwirq, IRQCHIP_STATE_ACTIVE, &active);
> + BUG_ON(ret);
Why can't we propagate this error to the caller and let them decide?
> + masked = irq->masked;
> + outstanding = active || masked;
> +
> + spin_unlock_irqrestore(&irq->lock, flags);
> + return outstanding;
> +}
> +
> static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
> {
> }
On Mon, 2015-08-10 at 15:21 +0200, Eric Auger wrote:
> This patch populates the IRQ bypass callacks:
> - stop/start producer simply consist in disabling/enabling the host irq
> - add/del consumer: basically set the automasked flag to false/true
>
> Signed-off-by: Eric Auger <[email protected]>
>
> ---
> v2 -> v3:
> - vfio_platform_irq_bypass_add_consumer now returns an error in case
> the IRQ is recognized as active
> - active boolean not set anymore
> - do not VFIO mask the IRQ anymore on unforward
>
> v1 -> v2:
> - device handle caching in vfio_platform_device is introduced in a
> separate patch
> - use container_of
> ---
> drivers/vfio/platform/vfio_platform_irq.c | 23 ++++++++++++++++++++++-
> 1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> index efaee58..400a188 100644
> --- a/drivers/vfio/platform/vfio_platform_irq.c
> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> @@ -224,23 +224,44 @@ static int vfio_platform_is_active(struct vfio_platform_irq *irq)
>
> static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
> {
> + disable_irq(prod->irq);
> }
>
> static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
> {
> + enable_irq(prod->irq);
> }
>
> static int vfio_platform_irq_bypass_add_consumer(
> struct irq_bypass_producer *prod,
> struct irq_bypass_consumer *cons)
> {
> - return 0;
> + struct vfio_platform_irq *irq =
> + container_of(prod, struct vfio_platform_irq, producer);
> +
> + /*
> + * if the IRQ is active at irqchip level or VFIO (auto)masked
> + * this means the host IRQ is already under injection in the
> + * guest and this not safe to change the forwarding state at
> + * that stage.
> + * It is not possible to differentiate user-space masking
> + * from auto-masking, leading to possible false detection of
> + * active state.
> + */
> + if (vfio_platform_is_active(irq))
> + return -EAGAIN;
Here's an example of why we don't want WARN_ON if a registration fails,
this is effectively random. When and how is a re-try going to happen?
> +
> + return vfio_platform_set_automasked(irq, false);
set_forwarded just seems so much more logical here.
> }
>
> static void vfio_platform_irq_bypass_del_consumer(
> struct irq_bypass_producer *prod,
> struct irq_bypass_consumer *cons)
> {
> + struct vfio_platform_irq *irq =
> + container_of(prod, struct vfio_platform_irq, producer);
> +
> + vfio_platform_set_automasked(irq, true);
> }
>
> static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
Hi Alex,
On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>> Register a dummy producer with void callbacks
>>
>> Signed-off-by: Eric Auger <[email protected]>
>>
>> ---
>>
>> v2 -> v3:
>> - rename vfio_platform_irq_bypass_resume into *_start
>> ---
>> drivers/vfio/platform/vfio_platform_irq.c | 32 +++++++++++++++++++++++++++
>> drivers/vfio/platform/vfio_platform_private.h | 2 ++
>> 2 files changed, 34 insertions(+)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index 88bba57..b5cb8c7 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -20,6 +20,7 @@
>> #include <linux/types.h>
>> #include <linux/vfio.h>
>> #include <linux/irq.h>
>> +#include <linux/irqbypass.h>
>>
>> #include "vfio_platform_private.h"
>>
>> @@ -177,6 +178,27 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>> return IRQ_HANDLED;
>> }
>>
>> +static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>> +{
>> +}
>> +
>> +static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
>> +{
>> +}
>> +
>> +static int vfio_platform_irq_bypass_add_consumer(
>> + struct irq_bypass_producer *prod,
>> + struct irq_bypass_consumer *cons)
>> +{
>> + return 0;
>> +}
>> +
>> +static void vfio_platform_irq_bypass_del_consumer(
>> + struct irq_bypass_producer *prod,
>> + struct irq_bypass_consumer *cons)
>> +{
>> +}
>> +
>> static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>> int fd, irq_handler_t handler)
>> {
>> @@ -186,6 +208,7 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>>
>> if (irq->trigger) {
>> free_irq(irq->hwirq, irq);
>> + irq_bypass_unregister_producer(&irq->producer);
>> kfree(irq->name);
>> eventfd_ctx_put(irq->trigger);
>> irq->trigger = NULL;
>> @@ -216,6 +239,15 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>> return ret;
>> }
>>
>> + irq->producer.token = (void *)trigger;
>> + irq->producer.irq = irq->hwirq;
>> + irq->producer.add_consumer = vfio_platform_irq_bypass_add_consumer;
>> + irq->producer.del_consumer = vfio_platform_irq_bypass_del_consumer;
>> + irq->producer.stop = vfio_platform_irq_bypass_stop;
>> + irq->producer.start = vfio_platform_irq_bypass_start;
>> + ret = irq_bypass_register_producer(&irq->producer);
>> + WARN_ON(ret);
>
> For what purpose?
Yes. I will replace it with a simple pr_info as done in eventfd.c.
Best Regards
Eric
>
>> +
>> if (!irq->masked)
>> enable_irq(irq->hwirq);
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
>> index 1c9b3d5..1d2d4d6 100644
>> --- a/drivers/vfio/platform/vfio_platform_private.h
>> +++ b/drivers/vfio/platform/vfio_platform_private.h
>> @@ -17,6 +17,7 @@
>>
>> #include <linux/types.h>
>> #include <linux/interrupt.h>
>> +#include <linux/irqbypass.h>
>>
>> #define VFIO_PLATFORM_OFFSET_SHIFT 40
>> #define VFIO_PLATFORM_OFFSET_MASK (((u64)(1) << VFIO_PLATFORM_OFFSET_SHIFT) - 1)
>> @@ -37,6 +38,7 @@ struct vfio_platform_irq {
>> spinlock_t lock;
>> struct virqfd *unmask;
>> struct virqfd *mask;
>> + struct irq_bypass_producer producer;
>> };
>>
>> struct vfio_platform_region {
>
>
>
Alex,
On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>> A single handler now is registered whatever the use case: automasked
>> or not. A function pointer is set according to the wished behavior
>> and the handler calls this function.
>>
>> The irq lock is taken/released in the root handler. eventfd_signal can
>> be called in regions not allowed to sleep.
>>
>> Signed-off-by: Eric Auger <[email protected]>
>>
>> ---
>>
>> v4: creation
>> ---
>> drivers/vfio/platform/vfio_platform_irq.c | 21 +++++++++++++++------
>> drivers/vfio/platform/vfio_platform_private.h | 1 +
>> 2 files changed, 16 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index 40f057a..b31b1f0 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -148,11 +148,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
>> static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>> {
>> struct vfio_platform_irq *irq_ctx = dev_id;
>> - unsigned long flags;
>> int ret = IRQ_NONE;
>>
>> - spin_lock_irqsave(&irq_ctx->lock, flags);
>> -
>> if (!irq_ctx->masked) {
>> ret = IRQ_HANDLED;
>>
>> @@ -161,8 +158,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
>> irq_ctx->masked = true;
>> }
>>
>> - spin_unlock_irqrestore(&irq_ctx->lock, flags);
>> -
>> if (ret == IRQ_HANDLED)
>> eventfd_signal(irq_ctx->trigger, 1);
>
> Has this been run with lockdep to check whether this is safe to call
> with spinlock_irqsave held?
No, I did not check with lockdep but I will do so. There is a comment in
fs/eventfd.c, on the eventfd_signal function, that says:
"This function is supposed to be called by the kernel in paths that do
not allow sleeping. In this function we allow the counter to reach the
ULLONG_MAX value, and we signal this as overflow condition by returning
a POLLERR to poll(2)."
so I understood from this that it is safe.
Best Regards
Eric
>
>>
>> @@ -178,6 +173,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
>> return IRQ_HANDLED;
>> }
>>
>> +static irqreturn_t vfio_handler(int irq, void *dev_id)
>> +{
>> + struct vfio_platform_irq *irq_ctx = dev_id;
>> + unsigned long flags;
>> + irqreturn_t ret;
>> +
>> + spin_lock_irqsave(&irq_ctx->lock, flags);
>> + ret = irq_ctx->handler(irq, dev_id);
>> + spin_unlock_irqrestore(&irq_ctx->lock, flags);
>> +
>> + return ret;
>> +}
>> +
>> static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>> {
>> }
>> @@ -229,9 +237,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>> }
>>
>> irq->trigger = trigger;
>> + irq->handler = handler;
>>
>> irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN);
>> - ret = request_irq(irq->hwirq, handler, 0, irq->name, irq);
>> + ret = request_irq(irq->hwirq, vfio_handler, 0, irq->name, irq);
>> if (ret) {
>> kfree(irq->name);
>> eventfd_ctx_put(trigger);
>> diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
>> index 8b4f814..f848a6b 100644
>> --- a/drivers/vfio/platform/vfio_platform_private.h
>> +++ b/drivers/vfio/platform/vfio_platform_private.h
>> @@ -40,6 +40,7 @@ struct vfio_platform_irq {
>> struct virqfd *mask;
>> struct irq_bypass_producer producer;
>> bool forwarded;
>> + irqreturn_t (*handler)(int irq, void *dev_id);
>> };
>>
>> struct vfio_platform_region {
>
>
>
On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>> This function makes possible to change the automasked mode.
>>
>> Signed-off-by: Eric Auger <[email protected]>
>>
>> ---
>>
>> v1 -> v2:
>> - set forwarded flag
>> ---
>> drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
>> 1 file changed, 19 insertions(+)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index b31b1f0..a285384 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
>> return ret;
>> }
>>
>> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
>> + bool automasked)
>> +{
>> + unsigned long flags;
>> +
>> + spin_lock_irqsave(&irq->lock, flags);
>> + if (automasked) {
>> + irq->forwarded = true;
>> + irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
>> + irq->handler = vfio_automasked_irq_handler;
>> + } else {
>> + irq->forwarded = false;
>> + irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
>> + irq->handler = vfio_irq_handler;
>> + }
>> + spin_unlock_irqrestore(&irq->lock, flags);
>> + return 0;
>
> In vfio-speak, automasked means level and we're not magically changing
> the IRQ from level to edge, we're simply able to handle level
> differently based on a hardware optimization. Should the user visible
> flags therefore change based on this? Aren't we really setting the
> forwarded state rather than the automasked state?
Well actually this was following the discussion we had a long time ago
about that topic:
http://lkml.iu.edu/hypermail/linux/kernel/1409.1/03659.html
I did not really know how to conclude ...
If it is preferred I can hide this from userspace, no problem.
Eric
>
>> +}
>> +
>> static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>> {
>> }
>
>
>
On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>> This function returns whether the IRQ is active at irqchip level or
>> VFIO masked. If either is true, it is considered the IRQ is active.
>> Currently there is no way to differentiate userspace masked IRQ from
>> automasked IRQ. There might be false detection of activity. However
>> it is currently acceptable to have false detection.
>>
>> Signed-off-by: Eric Auger <[email protected]>
>>
>> ---
>> ---
>> drivers/vfio/platform/vfio_platform_irq.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index a285384..efaee58 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -205,6 +205,23 @@ static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
>> return 0;
>> }
>>
>> +static int vfio_platform_is_active(struct vfio_platform_irq *irq)
>
> vfio_platform_irq_is_active()?
OK
>
>> +{
>> + unsigned long flags;
>> + bool active, masked, outstanding;
>> + int ret;
>> +
>> + spin_lock_irqsave(&irq->lock, flags);
>> +
>> + ret = irq_get_irqchip_state(irq->hwirq, IRQCHIP_STATE_ACTIVE, &active);
>> + BUG_ON(ret);
>
> Why can't we propagate this error to the caller and let them decide?
sure
Eric
>
>> + masked = irq->masked;
>> + outstanding = active || masked;
>> +
>> + spin_unlock_irqrestore(&irq->lock, flags);
>> + return outstanding;
>> +}
>> +
>> static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>> {
>> }
>
>
>
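Taking both remarks into account, the helper might end up looking like the
sketch below: renamed as suggested and propagating the irq_get_irqchip_state()
error to the caller instead of using BUG_ON. This is only an illustration of
the suggested rework, not the code that was actually reposted.

static int vfio_platform_irq_is_active(struct vfio_platform_irq *irq,
				       bool *outstanding)
{
	unsigned long flags;
	bool active;
	int ret;

	spin_lock_irqsave(&irq->lock, flags);

	ret = irq_get_irqchip_state(irq->hwirq, IRQCHIP_STATE_ACTIVE, &active);
	if (!ret)
		/* a VFIO-masked IRQ counts as active: an injection may be in flight */
		*outstanding = active || irq->masked;

	spin_unlock_irqrestore(&irq->lock, flags);

	return ret;
}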
Hi Alex,
On 08/12/2015 08:56 PM, Alex Williamson wrote:
> On Mon, 2015-08-10 at 15:21 +0200, Eric Auger wrote:
>> This patch populates the IRQ bypass callbacks:
>> - stop/start producer simply consist in disabling/enabling the host irq
>> - add/del consumer: basically set the automasked flag to false/true
>>
>> Signed-off-by: Eric Auger <[email protected]>
>>
>> ---
>> v2 -> v3:
>> - vfio_platform_irq_bypass_add_consumer now returns an error in case
>> the IRQ is recognized as active
>> - active boolean not set anymore
>> - do not VFIO mask the IRQ anymore on unforward
>>
>> v1 -> v2:
>> - device handle caching in vfio_platform_device is introduced in a
>> separate patch
>> - use container_of
>> ---
>> drivers/vfio/platform/vfio_platform_irq.c | 23 ++++++++++++++++++++++-
>> 1 file changed, 22 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>> index efaee58..400a188 100644
>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>> @@ -224,23 +224,44 @@ static int vfio_platform_is_active(struct vfio_platform_irq *irq)
>>
>> static void vfio_platform_irq_bypass_stop(struct irq_bypass_producer *prod)
>> {
>> + disable_irq(prod->irq);
>> }
>>
>> static void vfio_platform_irq_bypass_start(struct irq_bypass_producer *prod)
>> {
>> + enable_irq(prod->irq);
>> }
>>
>> static int vfio_platform_irq_bypass_add_consumer(
>> struct irq_bypass_producer *prod,
>> struct irq_bypass_consumer *cons)
>> {
>> - return 0;
>> + struct vfio_platform_irq *irq =
>> + container_of(prod, struct vfio_platform_irq, producer);
>> +
>> + /*
>> + * if the IRQ is active at irqchip level or VFIO (auto)masked
>> + * this means the host IRQ is already under injection in the
>> + * guest and it is not safe to change the forwarding state at
>> + * that stage.
>> + * It is not possible to differentiate user-space masking
>> + * from auto-masking, leading to possible false detection of
>> + * active state.
>> + */
>> + if (vfio_platform_is_active(irq))
>> + return -EAGAIN;
>
> Here's an example of why we don't want WARN_ON if a registration fails,
> this is effectively random. When and how is a re-try going to happen?
To be honest I did not intend to implement any retry mechanism. I rather
intended to change the user side, which currently is not adapted to that
start-up. Typically, with the current QEMU code we have this two-phase IRQ
signaling startup: first set up eventfd signaling, then VFIO mask, then
turn irqfd on. With such a sequence, forwarding setup will fail because the
IRQ is seen as masked (false detection of activity); see the sketch after
this mail.
Do you mandate such a retry mechanism? If forwarding fails we fall back to
standard irqfd ...
I think forwarding setup should be as static as possible (I think
this opinion is also shared by Marc?).
Please let me know your opinion on this.
Best Regards
Eric
>
>> +
>> + return vfio_platform_set_automasked(irq, false);
>
> set_forwarded just seems so much more logical here.
>
>> }
>>
>> static void vfio_platform_irq_bypass_del_consumer(
>> struct irq_bypass_producer *prod,
>> struct irq_bypass_consumer *cons)
>> {
>> + struct vfio_platform_irq *irq =
>> + container_of(prod, struct vfio_platform_irq, producer);
>> +
>> + vfio_platform_set_automasked(irq, true);
>> }
>>
>> static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
>
>
>
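To illustrate the ordering issue Eric describes, here is a hedged user-side
sketch (QEMU-like but simplified, not actual QEMU code): the trigger eventfd
is registered and the irqfd attached immediately afterwards, without the
intermediate VFIO mask step that makes the IRQ look active when the bypass
consumer is added. device_fd, vm_fd, index and gsi are assumed to have been
obtained elsewhere.

#include <sys/ioctl.h>
#include <sys/eventfd.h>
#include <linux/vfio.h>
#include <linux/kvm.h>
#include <string.h>
#include <stdint.h>

static int setup_forwarded_irq(int device_fd, int vm_fd, int index, int gsi)
{
	char buf[sizeof(struct vfio_irq_set) + sizeof(int32_t)];
	struct vfio_irq_set *irq_set = (struct vfio_irq_set *)buf;
	struct kvm_irqfd irqfd;
	int efd = eventfd(0, 0);

	if (efd < 0)
		return -1;

	/* 1) register the trigger eventfd: this is what registers the
	 *    VFIO bypass producer in the kernel */
	irq_set->argsz = sizeof(buf);
	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	irq_set->index = index;
	irq_set->start = 0;
	irq_set->count = 1;
	memcpy(irq_set->data, &efd, sizeof(int32_t));
	if (ioctl(device_fd, VFIO_DEVICE_SET_IRQS, irq_set))
		return -1;

	/* 2) attach the irqfd right away: this registers the KVM consumer,
	 *    letting the bypass manager set up forwarding while the IRQ is
	 *    still unmasked (no intermediate VFIO mask step) */
	memset(&irqfd, 0, sizeof(irqfd));
	irqfd.fd = efd;
	irqfd.gsi = gsi;
	if (ioctl(vm_fd, KVM_IRQFD, &irqfd))
		return -1;

	return efd;
}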
On Mon, 2015-08-17 at 17:38 +0200, Eric Auger wrote:
> On 08/12/2015 08:56 PM, Alex Williamson wrote:
> > On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> >> This function makes possible to change the automasked mode.
> >>
> >> Signed-off-by: Eric Auger <[email protected]>
> >>
> >> ---
> >>
> >> v1 -> v2:
> >> - set forwarded flag
> >> ---
> >> drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
> >> 1 file changed, 19 insertions(+)
> >>
> >> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> >> index b31b1f0..a285384 100644
> >> --- a/drivers/vfio/platform/vfio_platform_irq.c
> >> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> >> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
> >> return ret;
> >> }
> >>
> >> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
> >> + bool automasked)
> >> +{
> >> + unsigned long flags;
> >> +
> >> + spin_lock_irqsave(&irq->lock, flags);
> >> + if (automasked) {
> >> + irq->forwarded = true;
> >> + irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
> >> + irq->handler = vfio_automasked_irq_handler;
> >> + } else {
> >> + irq->forwarded = false;
> >> + irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
> >> + irq->handler = vfio_irq_handler;
> >> + }
> >> + spin_unlock_irqrestore(&irq->lock, flags);
> >> + return 0;
> >
> > In vfio-speak, automasked means level and we're not magically changing
> > the IRQ from level to edge, we're simply able to handle level
> > differently based on a hardware optimization. Should the user visible
> > flags therefore change based on this? Aren't we really setting the
> > forwarded state rather than the automasked state?
>
> Well actually this was following the discussion we had a long time ago
> about that topic:
>
> http://lkml.iu.edu/hypermail/linux/kernel/1409.1/03659.html
>
> I did not really know how to conclude ...
>
> If it is preferred I can hide this to the userspace, no problem.
I think that was based on the user being involved in enabling forwarding,
though; now that it's hidden and automatic, it doesn't make much sense
to me to toggle any of the interrupt info details based on the state of
the forward. The user always needs to handle the interrupt as level
since the bypass can be torn down at any point in time. We're taking
advantage of the in-kernel path to make further optimizations, which
seem like they should be transparent to the user. Thanks,
Alex
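A minimal sketch of what the helper could look like along those lines: it
only sets the forwarded state and swaps the handler, leaving the user-visible
VFIO_IRQ_INFO_AUTOMASKED flag untouched. The function name follows Alex's
suggestion and the handler names follow the quoted patch, but this is an
illustration of the direction discussed, not the code that was actually
merged.

static int vfio_platform_set_forwarded(struct vfio_platform_irq *irq,
				       bool forwarded)
{
	unsigned long flags;

	spin_lock_irqsave(&irq->lock, flags);
	irq->forwarded = forwarded;
	/*
	 * When forwarded, the guest deactivates the physical IRQ, so the
	 * plain handler (no VFIO automasking) can be used; otherwise fall
	 * back to the automasked handler.  irq->flags is deliberately left
	 * unchanged so userspace keeps seeing a level/automasked IRQ.
	 */
	irq->handler = forwarded ? vfio_irq_handler
				 : vfio_automasked_irq_handler;
	spin_unlock_irqrestore(&irq->lock, flags);

	return 0;
}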
On 18-Aug-15 19:44, Alex Williamson wrote:
> On Mon, 2015-08-17 at 17:38 +0200, Eric Auger wrote:
>> On 08/12/2015 08:56 PM, Alex Williamson wrote:
>>> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
>>>> This function makes possible to change the automasked mode.
>>>>
>>>> Signed-off-by: Eric Auger <[email protected]>
>>>>
>>>> ---
>>>>
>>>> v1 -> v2:
>>>> - set forwarded flag
>>>> ---
>>>> drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
>>>> 1 file changed, 19 insertions(+)
>>>>
>>>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
>>>> index b31b1f0..a285384 100644
>>>> --- a/drivers/vfio/platform/vfio_platform_irq.c
>>>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
>>>> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
>>>> return ret;
>>>> }
>>>>
>>>> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
>>>> + bool automasked)
>>>> +{
>>>> + unsigned long flags;
>>>> +
>>>> + spin_lock_irqsave(&irq->lock, flags);
>>>> + if (automasked) {
>>>> + irq->forwarded = true;
>>>> + irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
>>>> + irq->handler = vfio_automasked_irq_handler;
>>>> + } else {
>>>> + irq->forwarded = false;
>>>> + irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
>>>> + irq->handler = vfio_irq_handler;
>>>> + }
>>>> + spin_unlock_irqrestore(&irq->lock, flags);
>>>> + return 0;
>>>
>>> In vfio-speak, automasked means level and we're not magically changing
>>> the IRQ from level to edge, we're simply able to handle level
>>> differently based on a hardware optimization. Should the user visible
>>> flags therefore change based on this? Aren't we really setting the
>>> forwarded state rather than the automasked state?
>>
>> Well actually this was following the discussion we had a long time ago
>> about that topic:
>>
>> http://lkml.iu.edu/hypermail/linux/kernel/1409.1/03659.html
>>
>> I did not really know how to conclude ...
>>
>> If it is preferred I can hide this to the userspace, no problem.
>
> I think that was based on the user being involved in enabling forwarding
> though, now that it's hidden and automatic, it doesn't make much sense
> to me to toggle any of the interrupt info details based on the state of
> the forward. The user always needs to handle the interrupt as level
> since the bypass can be torn down at any point in time. We're taking
> advantage of the in-kernel path to make further optimizations, which
> seems like they should be transparent to the user. Thanks,
I wonder if it makes sense to rename VFIO_IRQ_INFO_AUTOMASKED to
VFIO_IRQ_INFO_LEVEL_TRIGGERED, and reintroduce VFIO_IRQ_INFO_AUTOMASKED as
an alias, so compatibility with user space can be maintained? This way
this semantic misunderstanding could be left behind.
Cheers,
Antonios
>
> Alex
>
> _______________________________________________
> kvmarm mailing list
> [email protected]
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>
--
Antonios Motakis
Virtualization Engineer
Huawei Technologies Duesseldorf GmbH
European Research Center
Riesstrasse 25, 80992 München
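For illustration, Antonios's suggestion could look roughly like the following
in include/uapi/linux/vfio.h; the bit value shown matches the current
AUTOMASKED definition and the alias keeps existing userspace compiling, but
this is only a sketch of the idea, not a proposed patch.

/* level-triggered: an explicit unmask is required to receive the next IRQ */
#define VFIO_IRQ_INFO_LEVEL_TRIGGERED	(1 << 2)
/* legacy name kept as an alias so existing userspace keeps building and
 * the ABI value stays unchanged */
#define VFIO_IRQ_INFO_AUTOMASKED	VFIO_IRQ_INFO_LEVEL_TRIGGERED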
On Mon, 2015-08-31 at 13:43 +0200, Antonios Motakis wrote:
>
> On 18-Aug-15 19:44, Alex Williamson wrote:
> > On Mon, 2015-08-17 at 17:38 +0200, Eric Auger wrote:
> >> On 08/12/2015 08:56 PM, Alex Williamson wrote:
> >>> On Mon, 2015-08-10 at 15:20 +0200, Eric Auger wrote:
> >>>> This function makes possible to change the automasked mode.
> >>>>
> >>>> Signed-off-by: Eric Auger <[email protected]>
> >>>>
> >>>> ---
> >>>>
> >>>> v1 -> v2:
> >>>> - set forwarded flag
> >>>> ---
> >>>> drivers/vfio/platform/vfio_platform_irq.c | 19 +++++++++++++++++++
> >>>> 1 file changed, 19 insertions(+)
> >>>>
> >>>> diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
> >>>> index b31b1f0..a285384 100644
> >>>> --- a/drivers/vfio/platform/vfio_platform_irq.c
> >>>> +++ b/drivers/vfio/platform/vfio_platform_irq.c
> >>>> @@ -186,6 +186,25 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
> >>>> return ret;
> >>>> }
> >>>>
> >>>> +static int vfio_platform_set_automasked(struct vfio_platform_irq *irq,
> >>>> + bool automasked)
> >>>> +{
> >>>> + unsigned long flags;
> >>>> +
> >>>> + spin_lock_irqsave(&irq->lock, flags);
> >>>> + if (automasked) {
> >>>> + irq->forwarded = true;
> >>>> + irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
> >>>> + irq->handler = vfio_automasked_irq_handler;
> >>>> + } else {
> >>>> + irq->forwarded = false;
> >>>> + irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
> >>>> + irq->handler = vfio_irq_handler;
> >>>> + }
> >>>> + spin_unlock_irqrestore(&irq->lock, flags);
> >>>> + return 0;
> >>>
> >>> In vfio-speak, automasked means level and we're not magically changing
> >>> the IRQ from level to edge, we're simply able to handle level
> >>> differently based on a hardware optimization. Should the user visible
> >>> flags therefore change based on this? Aren't we really setting the
> >>> forwarded state rather than the automasked state?
> >>
> >> Well actually this was following the discussion we had a long time ago
> >> about that topic:
> >>
> >> http://lkml.iu.edu/hypermail/linux/kernel/1409.1/03659.html
> >>
> >> I did not really know how to conclude ...
> >>
> >> If it is preferred I can hide this to the userspace, no problem.
> >
> > I think that was based on the user being involved in enabling forwarding
> > though, now that it's hidden and automatic, it doesn't make much sense
> > to me to toggle any of the interrupt info details based on the state of
> > the forward. The user always needs to handle the interrupt as level
> > since the bypass can be torn down at any point in time. We're taking
> > advantage of the in-kernel path to make further optimizations, which
> > seems like they should be transparent to the user. Thanks,
>
> I wonder if it makes sense to rename VFIO_IRQ_INFO_AUTOMASKED to
> VFIO_IRQ_INFO_LEVEL_TRIGGERED, and reintroduce VFIO_IRQ_INFO_AUTOMASKED as
> an alias, so compatibility with user space can be maintained? This way
> this semantic misunderstanding could be left behind.
Is there really a misunderstanding here? I think the change is that the
user was previously involved and updating the flag reinforced that. Now
the user is not involved and the flag changing is just unexpected noise.
"automasked" is intentionally not "level" because being level triggered
may not be the only reason we'd need to mask an interrupt upon receiving
it. We could actually have automasked edge triggered interrupts if we
wanted. We don't do that, so we can equate automasked to level
currently, but technically automasked simply indicates that an unmask is
required to get the next interrupt. In this case the VM is able to
effectively do the unmask itself. Thanks,
Alex
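To make the "unmask is required to get the next interrupt" semantics concrete,
here is a sketch in the spirit of the existing vfio_automasked_irq_handler
(field names follow the quoted vfio_platform_irq structure; treat the details
as illustrative): the host IRQ is masked before the eventfd is signaled, and a
later explicit unmask, typically driven by the resamplefd path, is needed
before the next interrupt can be delivered.

static irqreturn_t automasked_handler_sketch(int irq, void *dev_id)
{
	struct vfio_platform_irq *irq_ctx = dev_id;
	unsigned long flags;
	int ret = IRQ_NONE;

	spin_lock_irqsave(&irq_ctx->lock, flags);

	if (!irq_ctx->masked) {
		/* mask the (level) IRQ at host level before signaling,
		 * otherwise it would immediately refire; it stays masked
		 * until userspace/resamplefd unmasks it */
		disable_irq_nosync(irq_ctx->hwirq);
		irq_ctx->masked = true;
		ret = IRQ_HANDLED;
	}

	spin_unlock_irqrestore(&irq_ctx->lock, flags);

	if (ret == IRQ_HANDLED)
		eventfd_signal(irq_ctx->trigger, 1);

	return ret;
}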