2023-11-04 10:29:39

by Marc Zyngier

Subject: Re: [RFC PATCH] KVM: arm/arm64: GICv4: Support shared VLPI

On Thu, 02 Nov 2023 14:35:07 +0000,
Kunkun Jiang <[email protected]> wrote:
>
> In some scenarios, the guest virtio-pci driver will request two MSI-X,
> one vector for config, one shared for queues. However, the host driver
> (vDPA or VFIO) will request a vector for each queue.

Well, VFIO will request *all* available MSI-X. It doesn't know what a
queue is.

>
> In the current implementation of GICv4/4.1 direct injection of vLPI,
> pINTID and vINTID have one-to-one correspondence. Therefore, the

This matching is a hard requirement that matches the architecture. You
cannot change it.

> above scenario cannot be handled correctly. The host kernel will
> execute its_map_vlpi multiple times but only execute its_unmap_vlpi
> once. This may cause guest hang[1].

Why does it hang? As far as it is concerned, it has unmapped the
interrupts it cares about. Where are the calls to its_map_vlpi()
coming from? It should only occur if the guest actively programs the
MSI-X registers. What is your VMM? How can I reproduce this issue?

>
> | WARN_ON(!(irq->hw && irq->host_irq == virq));
> | if (irq->hw) {
> | atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count);
> | irq->hw = false;
> | ret = its_unmap_vlpi(virq);
> | }
>
> Add a list to struct vgic_irq to record all host irqs mapped to the vlpi.
> When performing an action on the vlpi, traverse the list and perform this
> action on all host irqs.

This makes no sense. You are blindly associating multiple host
interrupts with a single guest interrupt. This is a blatant violation
of the architecture. When unmapping a VLPI from a guest, only this one
should be turned again into an LPI. Not two, not all, just this one.

Maybe you have found an actual issue, but this patch is absolutely
unacceptable. Please fully describe the problem, provide traces, and
if possible a reproducer.

>
> Link: https://lore.kernel.org/all/[email protected]/#t

I tried to parse this, but it hardly makes sense either. You seem to
imply that the host driver pre-configures the device, which is
completely wrong. The host driver (VFIO) should simply request all
possible physical LPIs, and that's all. It is expected that this
requesting has no other effect on the HW. Also, since your guest
driver only configures a single vLPI, there should be only a single
its_map_vlpi() call.

So it seems to me that your HW and SW are doing things that are not
expected at all.

M.

--
Without deviation from the norm, progress is not possible.


2023-11-06 15:35:51

by Kunkun Jiang

Subject: Re: [RFC PATCH] KVM: arm/arm64: GICv4: Support shared VLPI

Hi Marc,

On 2023/11/4 18:29, Marc Zyngier wrote:
> On Thu, 02 Nov 2023 14:35:07 +0000,
> Kunkun Jiang <[email protected]> wrote:
>> In some scenarios, the guest virtio-pci driver will request two MSI-X,
>> one vector for config, one shared for queues. However, the host driver
>> (vDPA or VFIO) will request a vector for each queue.
> Well, VFIO will request *all* available MSI-X. It doesn't know what a
> queue is.
>
>> In the current implementation of GICv4/4.1 direct injection of vLPI,
>> pINTID and vINTID have one-to-one correspondence. Therefore, the
> This matching is a hard requirement that matches the architecture. You
> cannot change it.
>
>> above scenario cannot be handled correctly. The host kernel will
>> execute its_map_vlpi multiple times but only execute its_unmap_vlpi
>> once. This may cause guest hang[1].
> Why does it hang? As far as it is concerned, it has unmapped the
> interrupts it cares about. Where are the calls to its_map_vlpi()
> coming from? It should only occur if the guest actively programs the
> MSI-X registers. What is your VMM? How can I reproduce this issue?
>
>> | WARN_ON(!(irq->hw && irq->host_irq == virq));
>> | if (irq->hw) {
>> | atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count);
>> | irq->hw = false;
>> | ret = its_unmap_vlpi(virq);
>> | }
>>
>> Add a list to struct vgic_irq to record all host irqs mapped to the vlpi.
>> When performing an action on the vlpi, traverse the list and perform this
>> action on all host irqs.
> This makes no sense. You are blindly associating multiple host
> interrupts with a single guest interrupt. This is a blatant violation
> of the architecture. When unmapping a VLPI from a guest, only this one
> should be turned again into an LPI. Not two, not all, just this one.
>
> Maybe you have found an actual issue, but this patch is absolutely
> unacceptable. Please fully describe the problem, provide traces, and
> if possible a reproducer.
>
>> Link: https://lore.kernel.org/all/[email protected]/#t
> I tried to parse this, but it hardly makes sense either. You seem to
> imply that the host driver pre-configures the device, which is
> completely wrong. The host driver (VFIO) should simply request all
> possible physical LPIs, and that's all. It is expected that this
> requesting has no other effect on the HW. Also, since your guest
> driver only configures a single vLPI, there should be only a single
> its_map_vlpi() call.
Sorry to reply so late.

The virtio-scsi device has seven vectors (entry0-6): one for config,
six for queues. In the guest (e.g. CentOS 7.6 with a 4.19 kernel), the
virtio-pci driver requests only one vLPI, which is shared by all queues.
Entry 0 is used for config. It is not relevant to this issue, so
we will not discuss it. The virtio-pci driver writes message.data for
entry1-6 in the MSI-X table, and each write traps to QEMU for
processing. The message.data values are as follows:
> entry-0 0
> entry-1 1
> entry-2 1
> entry-3 1
> entry-4 1
> entry-5 1
> entry-6 1

The call path in KVM is as follows. its_map_vlpi will be
executed 6 times, so six host irqs are mapped to one vLPI.
> kvm_irqfd_assign
>     irq_bypass_register_consumer
>         ...
>         kvm_arch_irq_bypass_add_producer
>             kvm_vgic_v4_set_forwarding
>                 its_map_vlpi

When the reboot command is executed inside the guest,
kvm_vgic_v4_unset_forwarding will be executed 6 times. WARN_ON
will also be triggered 6 times. But its_unmap_vlpi will only
be executed the first time.
> kvm_arch_irq_bypass_del_producer
>     kvm_vgic_v4_unset_forwarding
>         WARN_ON(!(irq->hw && irq->host_irq == virq));
>         if (irq->hw) {
>             irq->hw = false;
>             its_unmap_vlpi
>         }

Therefore, only the mapping between the first host irq and
vLPI is unmapped. When the guest reboots into the BIOS phase,
the remaining 5 host irqs may still send interrupts. This
causes the guest to hang.
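
To make the imbalance described above concrete, here is a minimal
user-space sketch in C. It is not kernel code: struct model_vlpi,
model_set_forwarding(), model_unset_forwarding() and
model_guest_reboot() are hypothetical names, and the host_mapped[]
array only stands in for its_map_vlpi()/its_unmap_vlpi(). The unset
side follows the snippet quoted earlier in this thread; the set side
(each call overwriting host_irq) is an assumption of the model.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_HOST_IRQS 6	/* entry1-6 in the description above */

/* Simplified stand-in for struct vgic_irq: one vLPI records one host irq. */
struct model_vlpi {
	bool hw;
	int host_irq;
	int vlpi_count;
};

struct model_stats {
	int map_calls;		/* modelled its_map_vlpi() calls */
	int unmap_calls;	/* modelled its_unmap_vlpi() calls */
	int warnings;		/* times the WARN_ON condition was true */
	int still_mapped;	/* host irqs left mapped after teardown */
};

/* Stand-in for the ITS state: which host irqs are currently mapped. */
static bool host_mapped[NR_HOST_IRQS];

/* Assumption: every call maps the host irq and overwrites host_irq. */
static void model_set_forwarding(struct model_vlpi *irq, int virq,
				 struct model_stats *st)
{
	host_mapped[virq] = true;
	st->map_calls++;
	irq->hw = true;
	irq->host_irq = virq;
	irq->vlpi_count++;
}

/* Mirrors the guard quoted in the thread: unmap only while irq->hw is set. */
static void model_unset_forwarding(struct model_vlpi *irq, int virq,
				   struct model_stats *st)
{
	if (!(irq->hw && irq->host_irq == virq))
		st->warnings++;
	if (irq->hw) {
		irq->vlpi_count--;
		irq->hw = false;
		host_mapped[virq] = false;
		st->unmap_calls++;
	}
}

/* Map six host irqs onto one vLPI, then tear all six down again. */
static struct model_stats model_guest_reboot(void)
{
	struct model_vlpi irq = { 0 };
	struct model_stats st = { 0 };
	int virq;

	for (virq = 0; virq < NR_HOST_IRQS; virq++)
		host_mapped[virq] = false;
	for (virq = 0; virq < NR_HOST_IRQS; virq++)
		model_set_forwarding(&irq, virq, &st);
	for (virq = 0; virq < NR_HOST_IRQS; virq++)
		model_unset_forwarding(&irq, virq, &st);
	for (virq = 0; virq < NR_HOST_IRQS; virq++)
		if (host_mapped[virq])
			st.still_mapped++;

	printf("map=%d unmap=%d warn=%d still_mapped=%d\n",
	       st.map_calls, st.unmap_calls, st.warnings, st.still_mapped);
	return st;
}
```

Calling model_guest_reboot() from a small main() walks through the
sequence: six map calls, one unmap call, and five host irqs left
mapped in the model. This only illustrates the bookkeeping; it says
nothing about whether mapping several host irqs to one vLPI is valid
in the first place.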

Looking forward to your reply.

Thanks,
Kunkun Jiang
> So it seems to me that your HW and SW are doing things that are not
> expected at all.
>
> M.
>