On Tue, 14 Apr 2020 11:03:47 +0800
Zenghui Yu <[email protected]> wrote:
Hi Zenghui,
> It's likely that the vcpu fails to handle all virtual interrupts if
> userspace decides to destroy it, leaving the pending ones in the
> ap_list. If an unhandled one is an LPI, its vgic_irq structure will
> eventually be leaked because of the extra refcount increment in
> vgic_queue_irq_unlock().
>
> This was detected by kmemleak on almost every guest destroy, the
> backtrace is as follows:
>
> unreferenced object 0xffff80725aed5500 (size 128):
> comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s)
> hex dump (first 32 bytes):
> 00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff ...........sm...
> c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff .a... ..(.U.l...
> backtrace:
> [<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418
> [<0000000069c7dabb>] vgic_add_lpi+0x88/0x418
> [<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588
> [<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198
> [<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80
> [<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108
> [<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188
> [<000000008f79d288>] __kvm_io_bus_write+0x134/0x240
> [<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150
> [<0000000078197602>] io_mem_abort+0x484/0x7b8
> [<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58
> [<00000000e0d0cd65>] handle_exit+0x24c/0x770
> [<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988
> [<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0
> [<000000003271e317>] do_vfs_ioctl+0x160/0xcd8
> [<00000000e7f39607>] ksys_ioctl+0x98/0xd8
>
> Fix it by retiring all pending LPIs in the ap_list on the destroy path.
>
> p.s. I can also reproduce it on a normal guest shutdown. It is because
> userspace still sends LPIs to the vcpu (through the KVM_SIGNAL_MSI ioctl)
> while the guest is being shut down and is unable to handle them. A little
> strange, though, and I haven't dug further...
What userspace are you using? You'd hope that the VMM would stop
processing I/Os when destroying the guest. But we still need to handle
it anyway, and I think this fix makes sense.
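
For anyone following along, here is a minimal userspace sketch of the leak described in the commit message. It is a toy model, not kernel code: the `toy_*` names, the singly linked list, and the `live_irqs` counter are all made up for illustration. The only assumptions taken from the patch are that queueing on the ap_list takes an extra reference and that the destroy path never drops it unless the pending LPIs are flushed first.

```c
#include <assert.h>
#include <stdlib.h>

#define MIN_LPI 8192	/* GICv3 LPI INTIDs start at 8192 */

/* Toy stand-in for struct vgic_irq; hypothetical, not the kernel layout. */
struct toy_irq {
	unsigned int intid;
	int refcount;
	struct toy_irq *next;	/* singly linked "ap_list" */
};

struct toy_vcpu {
	struct toy_irq *ap_list;
};

static int live_irqs;	/* objects allocated and not yet freed */

static struct toy_irq *toy_add_lpi(unsigned int intid)
{
	struct toy_irq *irq = calloc(1, sizeof(*irq));

	assert(irq);
	irq->intid = intid;
	irq->refcount = 1;	/* reference held by the LPI mapping */
	live_irqs++;
	return irq;
}

static void toy_put(struct toy_irq *irq)
{
	assert(irq->refcount > 0);
	if (--irq->refcount == 0) {
		free(irq);
		live_irqs--;
	}
}

/* Queueing takes an extra reference on behalf of the ap_list. */
static void toy_queue(struct toy_vcpu *vcpu, struct toy_irq *irq)
{
	irq->refcount++;
	irq->next = vcpu->ap_list;
	vcpu->ap_list = irq;
}

/* Unlink every LPI and drop the reference the list was holding. */
static void toy_flush_pending_lpis(struct toy_vcpu *vcpu)
{
	struct toy_irq **pp = &vcpu->ap_list;

	while (*pp) {
		struct toy_irq *irq = *pp;

		if (irq->intid >= MIN_LPI) {
			*pp = irq->next;
			irq->next = NULL;
			toy_put(irq);
		} else {
			pp = &irq->next;
		}
	}
}

/* Destroy path: optionally flush, then forget the list head. */
static void toy_vcpu_destroy(struct toy_vcpu *vcpu, int flush)
{
	if (flush)
		toy_flush_pending_lpis(vcpu);
	vcpu->ap_list = NULL;	/* plays the role of INIT_LIST_HEAD() */
}

/* Number of objects leaked by one add/queue/destroy/unmap cycle. */
int toy_leak_count(int flush)
{
	struct toy_vcpu vcpu = { .ap_list = NULL };
	int baseline = live_irqs;
	struct toy_irq *lpi = toy_add_lpi(MIN_LPI);

	toy_queue(&vcpu, lpi);		/* pending, never handled */
	toy_vcpu_destroy(&vcpu, flush);
	toy_put(lpi);			/* mapping teardown drops its reference */
	return live_irqs - baseline;
}
```

In this model, `toy_leak_count(0)` leaves one object behind because the list's reference is never dropped, while `toy_leak_count(1)` frees it. That matches the kmemleak report as I understand it.
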
>
> Signed-off-by: Zenghui Yu <[email protected]>
> ---
> virt/kvm/arm/vgic/vgic-init.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index a963b9d766b7..53ec9b9d9bc4 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>
> +	/*
> +	 * Retire all pending LPIs on this vcpu anyway as we're
> +	 * going to destroy it.
> +	 */
> +	vgic_flush_pending_lpis(vcpu);
> +
>  	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
>  }
>
I guess that at this stage, the INIT_LIST_HEAD() is superfluous, right?
Otherwise, looks good. If you agree with the above, I can fix that
locally, no need to resend this patch.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
Hi Marc,
On 2020/4/14 18:54, Marc Zyngier wrote:
> On Tue, 14 Apr 2020 11:03:47 +0800
> Zenghui Yu <[email protected]> wrote:
>
> Hi Zenghui,
>
>> It's likely that the vcpu fails to handle all virtual interrupts if
>> userspace decides to destroy it, leaving the pending ones in the
>> ap_list. If an unhandled one is an LPI, its vgic_irq structure will
>> eventually be leaked because of the extra refcount increment in
>> vgic_queue_irq_unlock().
>>
>> This was detected by kmemleak on almost every guest destroy, the
>> backtrace is as follows:
>>
>> unreferenced object 0xffff80725aed5500 (size 128):
>> comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s)
>> hex dump (first 32 bytes):
>> 00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff ...........sm...
>> c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff .a... ..(.U.l...
>> backtrace:
>> [<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418
>> [<0000000069c7dabb>] vgic_add_lpi+0x88/0x418
>> [<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588
>> [<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198
>> [<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80
>> [<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108
>> [<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188
>> [<000000008f79d288>] __kvm_io_bus_write+0x134/0x240
>> [<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150
>> [<0000000078197602>] io_mem_abort+0x484/0x7b8
>> [<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58
>> [<00000000e0d0cd65>] handle_exit+0x24c/0x770
>> [<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988
>> [<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0
>> [<000000003271e317>] do_vfs_ioctl+0x160/0xcd8
>> [<00000000e7f39607>] ksys_ioctl+0x98/0xd8
>>
>> Fix it by retiring all pending LPIs in the ap_list on the destroy path.
>>
>> p.s. I can also reproduce it on a normal guest shutdown. It is because
>> userspace still sends LPIs to the vcpu (through the KVM_SIGNAL_MSI ioctl)
>> while the guest is being shut down and is unable to handle them. A little
>> strange, though, and I haven't dug further...
>
> What userspace are you using? You'd hope that the VMM would stop
> processing I/Os when destroying the guest. But we still need to handle
> it anyway, and I think this fix makes sense.
I'm using QEMU (master) for debugging. It looks like an interrupt
corresponding to a virtio device configuration change, triggered after
all other devices had freed their IRQs. Not sure if that's expected.
>>
>> Signed-off-by: Zenghui Yu <[email protected]>
>> ---
>> virt/kvm/arm/vgic/vgic-init.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>> index a963b9d766b7..53ec9b9d9bc4 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>
>> +	/*
>> +	 * Retire all pending LPIs on this vcpu anyway as we're
>> +	 * going to destroy it.
>> +	 */
>> +	vgic_flush_pending_lpis(vcpu);
>> +
>>  	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
>>  }
>>
>
> I guess that at this stage, the INIT_LIST_HEAD() is superfluous, right?
I was just thinking that ap_list_head may still not be empty at this
point (besides LPIs, there may be other active or pending interrupts on
it), so I left the INIT_LIST_HEAD() unchanged.
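
To illustrate what I mean, here is a small userspace sketch (a toy model only; the `mini_*` names and the singly linked list are hypothetical and not the vgic code). It assumes the flush unlinks only entries whose INTID is in the LPI range (>= 8192), so any SGI/PPI/SPI that was still queued stays on the list afterwards.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_LPI 8192	/* GICv3 LPI INTIDs start at 8192 */

/* Hypothetical list node; stands in for an entry on the ap_list. */
struct mini_irq {
	unsigned int intid;
	struct mini_irq *next;
};

/* Unlink only LPIs; return how many entries are still on the list. */
size_t mini_flush_lpis(struct mini_irq **head)
{
	size_t remaining = 0;
	struct mini_irq **pp = head;

	assert(head != NULL);
	while (*pp) {
		struct mini_irq *irq = *pp;

		if (irq->intid >= MIN_LPI) {
			*pp = irq->next;	/* drop the LPI from the list */
			irq->next = NULL;
		} else {
			remaining++;		/* non-LPI entries are kept */
			pp = &irq->next;
		}
	}
	return remaining;
}

/* A PPI, an LPI and an SPI are queued; only the LPI is flushed. */
size_t mini_remaining_after_flush(void)
{
	struct mini_irq spi = { .intid = 40, .next = NULL };
	struct mini_irq lpi = { .intid = 8200, .next = &spi };
	struct mini_irq ppi = { .intid = 27, .next = &lpi };
	struct mini_irq *head = &ppi;

	return mini_flush_lpis(&head);
}
```

With one PPI, one LPI and one SPI queued, two entries are still on the list after the flush, which is the case where re-initialising the head still changes something.
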
> Otherwise, looks good. If you agree with the above, I can fix that
> locally, no need to resend this patch.
Thanks,
Zenghui