From: Li Liu <[email protected]>
This set of patches try to implemet irqfd support of vhost-net
based on virtio-mmio.
I had posted a mail to talking about the status of vhost-net
on kvm-arm refer to http://www.spinics.net/lists/kvm-arm/msg10804.html.
Some dependent patches are listed in the mail too. Basically the
vhost-net brings great performance improvements, almost 50%+.
It's easy to implement irqfd support with PCI MSI-X. But till
now arm32 do not provide equivalent mechanism to let a device
allocate multiple interrupts. And even the aarch64 provid LPI
but also not available in a short time.
As Gauguey Remy said "Vhost does not emulate a complete virtio
adapter but only manage virtqueue operations". Vhost module
don't update the ISR register, so if with only one irq then it's
no way to get the interrupt reason even we can inject the
irq correctly.
To get the interrupt reason to support such VIRTIO_NET_F_STATUS
features I add a new register offset VIRTIO_MMIO_ISRMEM which
will help to establish a shared memory region between qemu and
virtio-mmio device. Then the interrupt reason can be accessed by
guest driver through this region. At the same time, the virtio-mmio
dirver check this region to see irqfd is supported or not during
the irq handler registration, and different handler will be assigned.
I want to know it's the right direction? Does it comply with the
virtio-mmio spec.? Or anyone have more good ideas to emulate mis-x
based on virtio-mmio? I hope to get feedback and guidance.
Thx for any help.
Li Liu (2):
Add a new register offset let interrupt reason available
Assign a new irq handler while irqfd enabled
drivers/virtio/virtio_mmio.c | 55 +++++++++++++++++++++++++++++++++++++++---
include/linux/virtio_mmio.h | 3 +++
2 files changed, 55 insertions(+), 3 deletions(-)
--
1.7.9.5
From: Li Liu <[email protected]>
This irq handler will get the interrupt reason from a
shared memory. And will be assigned only while irqfd
enabled.
Signed-off-by: Li Liu <[email protected]>
---
drivers/virtio/virtio_mmio.c | 34 ++++++++++++++++++++++++++++++++--
1 file changed, 32 insertions(+), 2 deletions(-)
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 28ddb55..7229605 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -259,7 +259,31 @@ static irqreturn_t vm_interrupt(int irq, void *opaque)
return ret;
}
+/* Notify all virtqueues on an interrupt. */
+static irqreturn_t vm_interrupt_irqfd(int irq, void *opaque)
+{
+ struct virtio_mmio_device *vm_dev = opaque;
+ struct virtio_mmio_vq_info *info;
+ unsigned long status;
+ unsigned long flags;
+ irqreturn_t ret = IRQ_NONE;
+ /* Read the interrupt reason and reset it */
+ status = *vm_dev->isr_mem;
+ *vm_dev->isr_mem = 0x0;
+
+ if (unlikely(status & VIRTIO_MMIO_INT_CONFIG)) {
+ virtio_config_changed(&vm_dev->vdev);
+ ret = IRQ_HANDLED;
+ }
+
+ spin_lock_irqsave(&vm_dev->lock, flags);
+ list_for_each_entry(info, &vm_dev->virtqueues, node)
+ ret |= vring_interrupt(irq, info->vq);
+ spin_unlock_irqrestore(&vm_dev->lock, flags);
+
+ return ret;
+}
static void vm_del_vq(struct virtqueue *vq)
{
@@ -391,6 +415,7 @@ error_available:
return ERR_PTR(err);
}
+#define VIRTIO_MMIO_F_IRQFD (1 << 7)
static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[],
vq_callback_t *callbacks[],
@@ -400,8 +425,13 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
unsigned int irq = platform_get_irq(vm_dev->pdev, 0);
int i, err;
- err = request_irq(irq, vm_interrupt, IRQF_SHARED,
- dev_name(&vdev->dev), vm_dev);
+ if (*vm_dev->isr_mem & VIRTIO_MMIO_F_IRQFD) {
+ err = request_irq(irq, vm_interrupt_irqfd, IRQF_SHARED,
+ dev_name(&vdev->dev), vm_dev);
+ } else {
+ err = request_irq(irq, vm_interrupt, IRQF_SHARED,
+ dev_name(&vdev->dev), vm_dev);
+ }
if (err)
return err;
--
1.7.9.5
From: Li Liu <[email protected]>
Add a new register offset VIRTIO_MMIO_ISRMEM which help to
estblish a shared memory region between virtio-mmio driver
and qemu with two purposes:
1.Guest virtio-mmio driver can get the interrupt reason.
2.Check irqfd enabled or not to register different irq handler.
Signed-off-by: Li Liu <[email protected]>
---
drivers/virtio/virtio_mmio.c | 21 ++++++++++++++++++++-
include/linux/virtio_mmio.h | 3 +++
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index ef9a165..28ddb55 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -122,6 +122,8 @@ struct virtio_mmio_device {
/* a list of queues so we can dispatch IRQs */
spinlock_t lock;
struct list_head virtqueues;
+
+ uint8_t *isr_mem;
};
struct virtio_mmio_vq_info {
@@ -443,6 +445,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
struct virtio_mmio_device *vm_dev;
struct resource *mem;
unsigned long magic;
+ int err;
mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!mem)
@@ -481,6 +484,15 @@ static int virtio_mmio_probe(struct platform_device *pdev)
return -ENXIO;
}
+ vm_dev->isr_mem = alloc_pages_exact(PAGE_SIZE, GFP_KERNEL|__GFP_ZERO);
+ if (vm_dev->isr_mem == NULL) {
+ dev_err(&pdev->dev, "Allocate isr memory failed!\n");
+ return -ENOMEM;
+ }
+
+ writel(virt_to_phys(vm_dev->isr_mem),
+ vm_dev->base + VIRTIO_MMIO_ISRMEM);
+
vm_dev->vdev.id.device = readl(vm_dev->base + VIRTIO_MMIO_DEVICE_ID);
vm_dev->vdev.id.vendor = readl(vm_dev->base + VIRTIO_MMIO_VENDOR_ID);
@@ -488,13 +500,20 @@ static int virtio_mmio_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, vm_dev);
- return register_virtio_device(&vm_dev->vdev);
+ err = register_virtio_device(&vm_dev->vdev);
+ if (err) {
+ free_pages_exact(vm_dev->isr_mem, PAGE_SIZE);
+ vm_dev->isr_mem = NULL;
+ }
+
+ return err;
}
static int virtio_mmio_remove(struct platform_device *pdev)
{
struct virtio_mmio_device *vm_dev = platform_get_drvdata(pdev);
+ free_pages_exact(vm_dev->isr_mem, PAGE_SIZE);
unregister_virtio_device(&vm_dev->vdev);
return 0;
diff --git a/include/linux/virtio_mmio.h b/include/linux/virtio_mmio.h
index 5c7b6f0..b1e3ec7 100644
--- a/include/linux/virtio_mmio.h
+++ b/include/linux/virtio_mmio.h
@@ -95,6 +95,9 @@
/* Device status register - Read Write */
#define VIRTIO_MMIO_STATUS 0x070
+/* Allocate ISRMEM for interrupt reason - Write Only */
+#define VIRTIO_MMIO_ISRMEM 0x080
+
/* The config space is defined by each driver as
* the per-driver configuration space - Read Write */
#define VIRTIO_MMIO_CONFIG 0x100
--
1.7.9.5
On Sat, Oct 25, 2014 at 04:24:52PM +0800, john.liuli wrote:
> From: Li Liu <[email protected]>
>
> This set of patches try to implemet irqfd support of vhost-net
> based on virtio-mmio.
>
> I had posted a mail to talking about the status of vhost-net
> on kvm-arm refer to http://www.spinics.net/lists/kvm-arm/msg10804.html.
> Some dependent patches are listed in the mail too. Basically the
> vhost-net brings great performance improvements, almost 50%+.
>
> It's easy to implement irqfd support with PCI MSI-X. But till
> now arm32 do not provide equivalent mechanism to let a device
> allocate multiple interrupts. And even the aarch64 provid LPI
> but also not available in a short time.
>
> As Gauguey Remy said "Vhost does not emulate a complete virtio
> adapter but only manage virtqueue operations". Vhost module
> don't update the ISR register, so if with only one irq then it's
> no way to get the interrupt reason even we can inject the
> irq correctly.
Well guests don't read ISR in MSI-X mode so why does it help
to set the ISR bit?
> To get the interrupt reason to support such VIRTIO_NET_F_STATUS
> features I add a new register offset VIRTIO_MMIO_ISRMEM which
> will help to establish a shared memory region between qemu and
> virtio-mmio device. Then the interrupt reason can be accessed by
> guest driver through this region. At the same time, the virtio-mmio
> dirver check this region to see irqfd is supported or not during
> the irq handler registration, and different handler will be assigned.
>
> I want to know it's the right direction? Does it comply with the
> virtio-mmio spec.? Or anyone have more good ideas to emulate mis-x
> based on virtio-mmio? I hope to get feedback and guidance.
> Thx for any help.
>
> Li Liu (2):
> Add a new register offset let interrupt reason available
> Assign a new irq handler while irqfd enabled
>
> drivers/virtio/virtio_mmio.c | 55 +++++++++++++++++++++++++++++++++++++++---
> include/linux/virtio_mmio.h | 3 +++
> 2 files changed, 55 insertions(+), 3 deletions(-)
>
> --
> 1.7.9.5
>
On Sat, Oct 25, 2014 at 04:24:54PM +0800, john.liuli wrote:
> From: Li Liu <[email protected]>
>
> This irq handler will get the interrupt reason from a
> shared memory. And will be assigned only while irqfd
> enabled.
>
> Signed-off-by: Li Liu <[email protected]>
> ---
> drivers/virtio/virtio_mmio.c | 34 ++++++++++++++++++++++++++++++++--
> 1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index 28ddb55..7229605 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -259,7 +259,31 @@ static irqreturn_t vm_interrupt(int irq, void *opaque)
> return ret;
> }
>
> +/* Notify all virtqueues on an interrupt. */
> +static irqreturn_t vm_interrupt_irqfd(int irq, void *opaque)
> +{
> + struct virtio_mmio_device *vm_dev = opaque;
> + struct virtio_mmio_vq_info *info;
> + unsigned long status;
> + unsigned long flags;
> + irqreturn_t ret = IRQ_NONE;
>
> + /* Read the interrupt reason and reset it */
> + status = *vm_dev->isr_mem;
> + *vm_dev->isr_mem = 0x0;
you are reading and modifying shared memory
without atomics and any memory barriers.
Why is this safe?
> +
> + if (unlikely(status & VIRTIO_MMIO_INT_CONFIG)) {
> + virtio_config_changed(&vm_dev->vdev);
> + ret = IRQ_HANDLED;
> + }
> +
> + spin_lock_irqsave(&vm_dev->lock, flags);
> + list_for_each_entry(info, &vm_dev->virtqueues, node)
> + ret |= vring_interrupt(irq, info->vq);
> + spin_unlock_irqrestore(&vm_dev->lock, flags);
> +
> + return ret;
> +}
>
> static void vm_del_vq(struct virtqueue *vq)
> {
So you invoke callbacks for all VQs.
This won't scale well as the number of VQs grows, will it?
> @@ -391,6 +415,7 @@ error_available:
> return ERR_PTR(err);
> }
>
> +#define VIRTIO_MMIO_F_IRQFD (1 << 7)
> static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
> struct virtqueue *vqs[],
> vq_callback_t *callbacks[],
> @@ -400,8 +425,13 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
> unsigned int irq = platform_get_irq(vm_dev->pdev, 0);
> int i, err;
>
> - err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> - dev_name(&vdev->dev), vm_dev);
> + if (*vm_dev->isr_mem & VIRTIO_MMIO_F_IRQFD) {
> + err = request_irq(irq, vm_interrupt_irqfd, IRQF_SHARED,
> + dev_name(&vdev->dev), vm_dev);
> + } else {
> + err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> + dev_name(&vdev->dev), vm_dev);
> + }
> if (err)
> return err;
So still a single interrupt for all VQs.
Again this doesn't scale: a single CPU has to handle
interrupts for all of them.
I think you need to find a way to get per-VQ interrupts.
> --
> 1.7.9.5
>
On Sat, Oct 25, 2014 at 04:24:53PM +0800, john.liuli wrote:
> From: Li Liu <[email protected]>
>
> Add a new register offset VIRTIO_MMIO_ISRMEM which help to
> estblish a shared memory region between virtio-mmio driver
> and qemu with two purposes:
>
> 1.Guest virtio-mmio driver can get the interrupt reason.
> 2.Check irqfd enabled or not to register different irq handler.
>
> Signed-off-by: Li Liu <[email protected]>
> ---
> drivers/virtio/virtio_mmio.c | 21 ++++++++++++++++++++-
> include/linux/virtio_mmio.h | 3 +++
> 2 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index ef9a165..28ddb55 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -122,6 +122,8 @@ struct virtio_mmio_device {
> /* a list of queues so we can dispatch IRQs */
> spinlock_t lock;
> struct list_head virtqueues;
> +
> + uint8_t *isr_mem;
> };
>
> struct virtio_mmio_vq_info {
> @@ -443,6 +445,7 @@ static int virtio_mmio_probe(struct platform_device *pdev)
> struct virtio_mmio_device *vm_dev;
> struct resource *mem;
> unsigned long magic;
> + int err;
>
> mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> if (!mem)
> @@ -481,6 +484,15 @@ static int virtio_mmio_probe(struct platform_device *pdev)
> return -ENXIO;
> }
>
> + vm_dev->isr_mem = alloc_pages_exact(PAGE_SIZE, GFP_KERNEL|__GFP_ZERO);
> + if (vm_dev->isr_mem == NULL) {
> + dev_err(&pdev->dev, "Allocate isr memory failed!\n");
> + return -ENOMEM;
> + }
> +
> + writel(virt_to_phys(vm_dev->isr_mem),
> + vm_dev->base + VIRTIO_MMIO_ISRMEM);
> +
What happens to existing devices?
then might not expect writes at this address.
> vm_dev->vdev.id.device = readl(vm_dev->base + VIRTIO_MMIO_DEVICE_ID);
> vm_dev->vdev.id.vendor = readl(vm_dev->base + VIRTIO_MMIO_VENDOR_ID);
>
> @@ -488,13 +500,20 @@ static int virtio_mmio_probe(struct platform_device *pdev)
>
> platform_set_drvdata(pdev, vm_dev);
>
> - return register_virtio_device(&vm_dev->vdev);
> + err = register_virtio_device(&vm_dev->vdev);
> + if (err) {
> + free_pages_exact(vm_dev->isr_mem, PAGE_SIZE);
> + vm_dev->isr_mem = NULL;
> + }
> +
> + return err;
> }
>
> static int virtio_mmio_remove(struct platform_device *pdev)
> {
> struct virtio_mmio_device *vm_dev = platform_get_drvdata(pdev);
>
> + free_pages_exact(vm_dev->isr_mem, PAGE_SIZE);
> unregister_virtio_device(&vm_dev->vdev);
>
> return 0;
> diff --git a/include/linux/virtio_mmio.h b/include/linux/virtio_mmio.h
> index 5c7b6f0..b1e3ec7 100644
> --- a/include/linux/virtio_mmio.h
> +++ b/include/linux/virtio_mmio.h
> @@ -95,6 +95,9 @@
> /* Device status register - Read Write */
> #define VIRTIO_MMIO_STATUS 0x070
>
> +/* Allocate ISRMEM for interrupt reason - Write Only */
> +#define VIRTIO_MMIO_ISRMEM 0x080
> +
> /* The config space is defined by each driver as
> * the per-driver configuration space - Read Write */
> #define VIRTIO_MMIO_CONFIG 0x100
> --
> 1.7.9.5
>
On 2014/10/26 19:52, Michael S. Tsirkin wrote:
> On Sat, Oct 25, 2014 at 04:24:52PM +0800, john.liuli wrote:
>> From: Li Liu <[email protected]>
>>
>> This set of patches try to implemet irqfd support of vhost-net
>> based on virtio-mmio.
>>
>> I had posted a mail to talking about the status of vhost-net
>> on kvm-arm refer to http://www.spinics.net/lists/kvm-arm/msg10804.html.
>> Some dependent patches are listed in the mail too. Basically the
>> vhost-net brings great performance improvements, almost 50%+.
>>
>> It's easy to implement irqfd support with PCI MSI-X. But till
>> now arm32 do not provide equivalent mechanism to let a device
>> allocate multiple interrupts. And even the aarch64 provid LPI
>> but also not available in a short time.
>>
>> As Gauguey Remy said "Vhost does not emulate a complete virtio
>> adapter but only manage virtqueue operations". Vhost module
>> don't update the ISR register, so if with only one irq then it's
>> no way to get the interrupt reason even we can inject the
>> irq correctly.
>
> Well guests don't read ISR in MSI-X mode so why does it help
> to set the ISR bit?
Yeah, vhost don't need to set ISR under MSI-X mode. But for ARM
without MSI-X kind mechanism guest can't get the interrupt reason
through the only one irq hanlder (with one gsi resource).
So I build a shared memory region to provide the interrupt reason
instead of ISR regiser by qemu without bothering vhost. Then even
there's only one irq with only one irq hanlder, it still can
distinguish why irq occur.
Li.
>
>> To get the interrupt reason to support such VIRTIO_NET_F_STATUS
>> features I add a new register offset VIRTIO_MMIO_ISRMEM which
>> will help to establish a shared memory region between qemu and
>> virtio-mmio device. Then the interrupt reason can be accessed by
>> guest driver through this region. At the same time, the virtio-mmio
>> dirver check this region to see irqfd is supported or not during
>> the irq handler registration, and different handler will be assigned.
>>
>> I want to know it's the right direction? Does it comply with the
>> virtio-mmio spec.? Or anyone have more good ideas to emulate mis-x
>> based on virtio-mmio? I hope to get feedback and guidance.
>> Thx for any help.
>>
>> Li Liu (2):
>> Add a new register offset let interrupt reason available
>> Assign a new irq handler while irqfd enabled
>>
>> drivers/virtio/virtio_mmio.c | 55 +++++++++++++++++++++++++++++++++++++++---
>> include/linux/virtio_mmio.h | 3 +++
>> 2 files changed, 55 insertions(+), 3 deletions(-)
>>
>> --
>> 1.7.9.5
>>
>
> .
>
On 25 October 2014 09:24, john.liuli <[email protected]> wrote:
> To get the interrupt reason to support such VIRTIO_NET_F_STATUS
> features I add a new register offset VIRTIO_MMIO_ISRMEM which
> will help to establish a shared memory region between qemu and
> virtio-mmio device. Then the interrupt reason can be accessed by
> guest driver through this region. At the same time, the virtio-mmio
> dirver check this region to see irqfd is supported or not during
> the irq handler registration, and different handler will be assigned.
If you want to add a new register you should probably propose
an update to the virtio spec. However, it seems to me it would
be better to get generic PCI/PCIe working on the ARM virt
board instead; then we can let virtio-mmio quietly fade away.
This has been on the todo list for ages (and there have been
RFC patches posted for plain PCI), it's just nobody's had time
to work on it.
thanks
-- PMM
On Mon, Oct 27, 2014 at 05:19:23PM +0800, Li Liu wrote:
>
>
> On 2014/10/26 19:52, Michael S. Tsirkin wrote:
> > On Sat, Oct 25, 2014 at 04:24:52PM +0800, john.liuli wrote:
> >> From: Li Liu <[email protected]>
> >>
> >> This set of patches try to implemet irqfd support of vhost-net
> >> based on virtio-mmio.
> >>
> >> I had posted a mail to talking about the status of vhost-net
> >> on kvm-arm refer to http://www.spinics.net/lists/kvm-arm/msg10804.html.
> >> Some dependent patches are listed in the mail too. Basically the
> >> vhost-net brings great performance improvements, almost 50%+.
> >>
> >> It's easy to implement irqfd support with PCI MSI-X. But till
> >> now arm32 do not provide equivalent mechanism to let a device
> >> allocate multiple interrupts. And even the aarch64 provid LPI
> >> but also not available in a short time.
> >>
> >> As Gauguey Remy said "Vhost does not emulate a complete virtio
> >> adapter but only manage virtqueue operations". Vhost module
> >> don't update the ISR register, so if with only one irq then it's
> >> no way to get the interrupt reason even we can inject the
> >> irq correctly.
> >
> > Well guests don't read ISR in MSI-X mode so why does it help
> > to set the ISR bit?
>
> Yeah, vhost don't need to set ISR under MSI-X mode. But for ARM
> without MSI-X kind mechanism guest can't get the interrupt reason
> through the only one irq hanlder (with one gsi resource).
>
> So I build a shared memory region to provide the interrupt reason
> instead of ISR regiser by qemu without bothering vhost. Then even
> there's only one irq with only one irq hanlder, it still can
> distinguish why irq occur.
>
> Li.
OK so this might allow sharing IRQs between config and VQs
which might be a handy optimization even for PCI
(at the moment reading ISR causes an exit).
Need to see how all this would work on MP systems though.
>
> >
> >> To get the interrupt reason to support such VIRTIO_NET_F_STATUS
> >> features I add a new register offset VIRTIO_MMIO_ISRMEM which
> >> will help to establish a shared memory region between qemu and
> >> virtio-mmio device. Then the interrupt reason can be accessed by
> >> guest driver through this region. At the same time, the virtio-mmio
> >> dirver check this region to see irqfd is supported or not during
> >> the irq handler registration, and different handler will be assigned.
> >>
> >> I want to know it's the right direction? Does it comply with the
> >> virtio-mmio spec.? Or anyone have more good ideas to emulate mis-x
> >> based on virtio-mmio? I hope to get feedback and guidance.
> >> Thx for any help.
> >>
> >> Li Liu (2):
> >> Add a new register offset let interrupt reason available
> >> Assign a new irq handler while irqfd enabled
> >>
> >> drivers/virtio/virtio_mmio.c | 55 +++++++++++++++++++++++++++++++++++++++---
> >> include/linux/virtio_mmio.h | 3 +++
> >> 2 files changed, 55 insertions(+), 3 deletions(-)
> >>
> >> --
> >> 1.7.9.5
> >>
> >
> > .
> >
On 2014/10/26 19:56, Michael S. Tsirkin wrote:
> On Sat, Oct 25, 2014 at 04:24:54PM +0800, john.liuli wrote:
>> From: Li Liu <[email protected]>
>>
>> This irq handler will get the interrupt reason from a
>> shared memory. And will be assigned only while irqfd
>> enabled.
>>
>> Signed-off-by: Li Liu <[email protected]>
>> ---
>> drivers/virtio/virtio_mmio.c | 34 ++++++++++++++++++++++++++++++++--
>> 1 file changed, 32 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
>> index 28ddb55..7229605 100644
>> --- a/drivers/virtio/virtio_mmio.c
>> +++ b/drivers/virtio/virtio_mmio.c
>> @@ -259,7 +259,31 @@ static irqreturn_t vm_interrupt(int irq, void *opaque)
>> return ret;
>> }
>>
>> +/* Notify all virtqueues on an interrupt. */
>> +static irqreturn_t vm_interrupt_irqfd(int irq, void *opaque)
>> +{
>> + struct virtio_mmio_device *vm_dev = opaque;
>> + struct virtio_mmio_vq_info *info;
>> + unsigned long status;
>> + unsigned long flags;
>> + irqreturn_t ret = IRQ_NONE;
>>
>> + /* Read the interrupt reason and reset it */
>> + status = *vm_dev->isr_mem;
>> + *vm_dev->isr_mem = 0x0;
>
> you are reading and modifying shared memory
> without atomics and any memory barriers.
> Why is this safe?
>
good catch, a stupid mistake.
>> +
>> + if (unlikely(status & VIRTIO_MMIO_INT_CONFIG)) {
>> + virtio_config_changed(&vm_dev->vdev);
>> + ret = IRQ_HANDLED;
>> + }
>> +
>> + spin_lock_irqsave(&vm_dev->lock, flags);
>> + list_for_each_entry(info, &vm_dev->virtqueues, node)
>> + ret |= vring_interrupt(irq, info->vq);
>> + spin_unlock_irqrestore(&vm_dev->lock, flags);
>> +
>> + return ret;
>> +}
>>
>> static void vm_del_vq(struct virtqueue *vq)
>> {
>
> So you invoke callbacks for all VQs.
> This won't scale well as the number of VQs grows, will it?
>
>> @@ -391,6 +415,7 @@ error_available:
>> return ERR_PTR(err);
>> }
>>
>> +#define VIRTIO_MMIO_F_IRQFD (1 << 7)
>> static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>> struct virtqueue *vqs[],
>> vq_callback_t *callbacks[],
>> @@ -400,8 +425,13 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
>> unsigned int irq = platform_get_irq(vm_dev->pdev, 0);
>> int i, err;
>>
>> - err = request_irq(irq, vm_interrupt, IRQF_SHARED,
>> - dev_name(&vdev->dev), vm_dev);
>> + if (*vm_dev->isr_mem & VIRTIO_MMIO_F_IRQFD) {
>> + err = request_irq(irq, vm_interrupt_irqfd, IRQF_SHARED,
>> + dev_name(&vdev->dev), vm_dev);
>> + } else {
>> + err = request_irq(irq, vm_interrupt, IRQF_SHARED,
>> + dev_name(&vdev->dev), vm_dev);
>> + }
>> if (err)
>> return err;
>
>
> So still a single interrupt for all VQs.
> Again this doesn't scale: a single CPU has to handle
> interrupts for all of them.
> I think you need to find a way to get per-VQ interrupts.
Yeah, AFAIK it's impossible to distribute works to different CPUs with
only one irq without MSI-X kind mechanism. Assign multiple gsis to one
device, obviously it's consumptive and not scalable. Any ideas? Thx.
>
>> --
>> 1.7.9.5
>>
>
> .
>
On 2014/10/27 17:37, Peter Maydell wrote:
> On 25 October 2014 09:24, john.liuli <[email protected]> wrote:
>> To get the interrupt reason to support such VIRTIO_NET_F_STATUS
>> features I add a new register offset VIRTIO_MMIO_ISRMEM which
>> will help to establish a shared memory region between qemu and
>> virtio-mmio device. Then the interrupt reason can be accessed by
>> guest driver through this region. At the same time, the virtio-mmio
>> dirver check this region to see irqfd is supported or not during
>> the irq handler registration, and different handler will be assigned.
>
> If you want to add a new register you should probably propose
> an update to the virtio spec. However, it seems to me it would
> be better to get generic PCI/PCIe working on the ARM virt
> board instead; then we can let virtio-mmio quietly fade away.
> This has been on the todo list for ages (and there have been
> RFC patches posted for plain PCI), it's just nobody's had time
> to work on it.
>
> thanks
> -- PMM
>
So you mean virtio-mmio will be replaced by PCI/PCIe on ARM at last?
If so, let this patch go with the wind:). Thx.
Li.
> .
>
On 27 October 2014 11:23, Li Liu <[email protected]> wrote:
> So you mean virtio-mmio will be replaced by PCI/PCIe on ARM at last?
That is the plan, yes. I can't make any promises on
timescales at the moment, though...
-- PMM
On Mon, Oct 27, 2014 at 07:04:11PM +0800, Li Liu wrote:
>
>
> On 2014/10/26 19:56, Michael S. Tsirkin wrote:
> > On Sat, Oct 25, 2014 at 04:24:54PM +0800, john.liuli wrote:
> >> From: Li Liu <[email protected]>
> >>
> >> This irq handler will get the interrupt reason from a
> >> shared memory. And will be assigned only while irqfd
> >> enabled.
> >>
> >> Signed-off-by: Li Liu <[email protected]>
> >> ---
> >> drivers/virtio/virtio_mmio.c | 34 ++++++++++++++++++++++++++++++++--
> >> 1 file changed, 32 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> >> index 28ddb55..7229605 100644
> >> --- a/drivers/virtio/virtio_mmio.c
> >> +++ b/drivers/virtio/virtio_mmio.c
> >> @@ -259,7 +259,31 @@ static irqreturn_t vm_interrupt(int irq, void *opaque)
> >> return ret;
> >> }
> >>
> >> +/* Notify all virtqueues on an interrupt. */
> >> +static irqreturn_t vm_interrupt_irqfd(int irq, void *opaque)
> >> +{
> >> + struct virtio_mmio_device *vm_dev = opaque;
> >> + struct virtio_mmio_vq_info *info;
> >> + unsigned long status;
> >> + unsigned long flags;
> >> + irqreturn_t ret = IRQ_NONE;
> >>
> >> + /* Read the interrupt reason and reset it */
> >> + status = *vm_dev->isr_mem;
> >> + *vm_dev->isr_mem = 0x0;
> >
> > you are reading and modifying shared memory
> > without atomics and any memory barriers.
> > Why is this safe?
> >
>
> good catch, a stupid mistake.
>
> >> +
> >> + if (unlikely(status & VIRTIO_MMIO_INT_CONFIG)) {
> >> + virtio_config_changed(&vm_dev->vdev);
> >> + ret = IRQ_HANDLED;
> >> + }
> >> +
> >> + spin_lock_irqsave(&vm_dev->lock, flags);
> >> + list_for_each_entry(info, &vm_dev->virtqueues, node)
> >> + ret |= vring_interrupt(irq, info->vq);
> >> + spin_unlock_irqrestore(&vm_dev->lock, flags);
> >> +
> >> + return ret;
> >> +}
> >>
> >> static void vm_del_vq(struct virtqueue *vq)
> >> {
> >
> > So you invoke callbacks for all VQs.
> > This won't scale well as the number of VQs grows, will it?
> >
> >> @@ -391,6 +415,7 @@ error_available:
> >> return ERR_PTR(err);
> >> }
> >>
> >> +#define VIRTIO_MMIO_F_IRQFD (1 << 7)
> >> static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
> >> struct virtqueue *vqs[],
> >> vq_callback_t *callbacks[],
> >> @@ -400,8 +425,13 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
> >> unsigned int irq = platform_get_irq(vm_dev->pdev, 0);
> >> int i, err;
> >>
> >> - err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> >> - dev_name(&vdev->dev), vm_dev);
> >> + if (*vm_dev->isr_mem & VIRTIO_MMIO_F_IRQFD) {
> >> + err = request_irq(irq, vm_interrupt_irqfd, IRQF_SHARED,
> >> + dev_name(&vdev->dev), vm_dev);
> >> + } else {
> >> + err = request_irq(irq, vm_interrupt, IRQF_SHARED,
> >> + dev_name(&vdev->dev), vm_dev);
> >> + }
> >> if (err)
> >> return err;
> >
> >
> > So still a single interrupt for all VQs.
> > Again this doesn't scale: a single CPU has to handle
> > interrupts for all of them.
> > I think you need to find a way to get per-VQ interrupts.
>
> Yeah, AFAIK it's impossible to distribute works to different CPUs with
> only one irq without MSI-X kind mechanism. Assign multiple gsis to one
> device, obviously it's consumptive and not scalable.
Why not? How many gsis are there on ARM?
> Any ideas? Thx.
>
> >
> >> --
> >> 1.7.9.5
> >>
> >
> > .
> >
On 10/27/2014 12:23 PM, Li Liu wrote:
>
>
> On 2014/10/27 17:37, Peter Maydell wrote:
>> On 25 October 2014 09:24, john.liuli <[email protected]> wrote:
>>> To get the interrupt reason to support such VIRTIO_NET_F_STATUS
>>> features I add a new register offset VIRTIO_MMIO_ISRMEM which
>>> will help to establish a shared memory region between qemu and
>>> virtio-mmio device. Then the interrupt reason can be accessed by
>>> guest driver through this region. At the same time, the virtio-mmio
>>> dirver check this region to see irqfd is supported or not during
>>> the irq handler registration, and different handler will be assigned.
>>
>> If you want to add a new register you should probably propose
>> an update to the virtio spec. However, it seems to me it would
>> be better to get generic PCI/PCIe working on the ARM virt
>> board instead; then we can let virtio-mmio quietly fade away.
>> This has been on the todo list for ages (and there have been
>> RFC patches posted for plain PCI), it's just nobody's had time
>> to work on it.
>>
>> thanks
>> -- PMM
>>
>
> So you mean virtio-mmio will be replaced by PCI/PCIe on ARM at last?
> If so, let this patch go with the wind:). Thx.
Hi,
As a fix of current situation where ISR is only partially updated when
vhost-irqfd handles standard IRQ and waiting for PCI emuluation,
wouldn't it make sense to store ISR content on vhost driver side and
introduce ioctls to read/write it. When using vhost BE, virtio QEMU
device would use those ioctl to read/update the ISR content. On top of
that we would update the ISR in vhost before triggering the irqfd. If I
do not miss anything this would at least make things functional with irqfd.
As a second step, we could try to introduce in-kernel emulation of
ISR/ACK to fix the performance issue related to going to user-side each
time ISR/ACK accesses are done.
Do you think it is worth investigating this direction?
Thank you in advance
Best Regards
Eric
>
> Li.
>> .
>>
>
>
On Mon, Oct 27, 2014 at 12:58 PM, Peter Maydell
<[email protected]> wrote:
> On 27 October 2014 11:23, Li Liu <[email protected]> wrote:
>> So you mean virtio-mmio will be replaced by PCI/PCIe on ARM at last?
>
> That is the plan, yes. I can't make any promises on
> timescales at the moment, though...
>
Linaro has scheduled resources to work on this (Ard, cc'ed) and we
expect to be able to deliver this within a reasonable time frame.
-Christoffer
On 2014/11/5 16:43, Eric Auger wrote:
> On 10/27/2014 12:23 PM, Li Liu wrote:
>>
>>
>> On 2014/10/27 17:37, Peter Maydell wrote:
>>> On 25 October 2014 09:24, john.liuli <[email protected]> wrote:
>>>> To get the interrupt reason to support such VIRTIO_NET_F_STATUS
>>>> features I add a new register offset VIRTIO_MMIO_ISRMEM which
>>>> will help to establish a shared memory region between qemu and
>>>> virtio-mmio device. Then the interrupt reason can be accessed by
>>>> guest driver through this region. At the same time, the virtio-mmio
>>>> dirver check this region to see irqfd is supported or not during
>>>> the irq handler registration, and different handler will be assigned.
>>>
>>> If you want to add a new register you should probably propose
>>> an update to the virtio spec. However, it seems to me it would
>>> be better to get generic PCI/PCIe working on the ARM virt
>>> board instead; then we can let virtio-mmio quietly fade away.
>>> This has been on the todo list for ages (and there have been
>>> RFC patches posted for plain PCI), it's just nobody's had time
>>> to work on it.
>>>
>>> thanks
>>> -- PMM
>>>
>>
>> So you mean virtio-mmio will be replaced by PCI/PCIe on ARM at last?
>> If so, let this patch go with the wind:). Thx.
>
> Hi,
>
> As a fix of current situation where ISR is only partially updated when
> vhost-irqfd handles standard IRQ and waiting for PCI emuluation,
> wouldn't it make sense to store ISR content on vhost driver side and
> introduce ioctls to read/write it. When using vhost BE, virtio QEMU
> device would use those ioctl to read/update the ISR content. On top of
> that we would update the ISR in vhost before triggering the irqfd. If I
> do not miss anything this would at least make things functional with irqfd.
>
> As a second step, we could try to introduce in-kernel emulation of
> ISR/ACK to fix the performance issue related to going to user-side each
> time ISR/ACK accesses are done.
>
> Do you think it is worth investigating this direction?
>
Hi,
About this problem I have a talk with Li Liu. As MST said, we could use
multiple GSI to support vhost-net with irqfd. And we have figured out a way
to solve this problem. The method is as same as virtio-pci which is to assign
multiple irqs for virtio-mmio. Also it can support multiqueue virtio-net on arm.
Would you have a look at this method? Thank you very much.
- virtio-mmio: support for multiple irqs
http://www.spinics.net/lists/kernel/msg1858860.html
Thanks,
Shannon
> Thank you in advance
>
> Best Regards
>
> Eric
>
>
>>
>> Li.
>>> .
>>>
On 2014/11/6 9:59, Shannon Zhao wrote:
>
>
> On 2014/11/5 16:43, Eric Auger wrote:
>> On 10/27/2014 12:23 PM, Li Liu wrote:
>>>
>>>
>>> On 2014/10/27 17:37, Peter Maydell wrote:
>>>> On 25 October 2014 09:24, john.liuli <[email protected]> wrote:
>>>>> To get the interrupt reason to support such VIRTIO_NET_F_STATUS
>>>>> features I add a new register offset VIRTIO_MMIO_ISRMEM which
>>>>> will help to establish a shared memory region between qemu and
>>>>> virtio-mmio device. Then the interrupt reason can be accessed by
>>>>> guest driver through this region. At the same time, the virtio-mmio
>>>>> dirver check this region to see irqfd is supported or not during
>>>>> the irq handler registration, and different handler will be assigned.
>>>>
>>>> If you want to add a new register you should probably propose
>>>> an update to the virtio spec. However, it seems to me it would
>>>> be better to get generic PCI/PCIe working on the ARM virt
>>>> board instead; then we can let virtio-mmio quietly fade away.
>>>> This has been on the todo list for ages (and there have been
>>>> RFC patches posted for plain PCI), it's just nobody's had time
>>>> to work on it.
>>>>
>>>> thanks
>>>> -- PMM
>>>>
>>>
>>> So you mean virtio-mmio will be replaced by PCI/PCIe on ARM at last?
>>> If so, let this patch go with the wind:). Thx.
>>
>> Hi,
>>
>> As a fix of current situation where ISR is only partially updated when
>> vhost-irqfd handles standard IRQ and waiting for PCI emuluation,
>> wouldn't it make sense to store ISR content on vhost driver side and
>> introduce ioctls to read/write it. When using vhost BE, virtio QEMU
>> device would use those ioctl to read/update the ISR content. On top of
>> that we would update the ISR in vhost before triggering the irqfd. If I
>> do not miss anything this would at least make things functional with irqfd.
>>
>> As a second step, we could try to introduce in-kernel emulation of
>> ISR/ACK to fix the performance issue related to going to user-side each
>> time ISR/ACK accesses are done.
>>
>> Do you think it is worth investigating this direction?
>>
> Hi,
>
> About this problem I have a talk with Li Liu. As MST said, we could use
> multiple GSI to support vhost-net with irqfd. And we have figured out a way
> to solve this problem. The method is as same as virtio-pci which is to assign
> multiple irqs for virtio-mmio. Also it can support multiqueue virtio-net on arm.
>
> Would you have a look at this method? Thank you very much.
>
> - virtio-mmio: support for multiple irqs
> http://www.spinics.net/lists/kernel/msg1858860.html
>
> Thanks,
> Shannon
>
Yeah, I think multiple GSI is more compatible with MSI-X. And even virtio-mmio
will fade away at last. It still make senses for ARM32 which can't support PCI/PCIe.
BTW, this patch is handed over to Shannon and please refer to new patch at
http://www.spinics.net/lists/kernel/msg1858860.html.
Li.
>> Thank you in advance
>>
>> Best Regards
>>
>> Eric
>>
>>
>>>
>>> Li.
>>>> .
>>>>
>
>
> .
>