2024-02-28 18:30:02

by Peter Delevoryas

Subject: [q&a] Status of IOMMU virtualization for nested virtualization (userspace PCI drivers in VMs)

Hey guys,

I’m having a little trouble reading between the lines on various docs, mailing list threads, KVM presentations, GitHub forks, etc., so I figured I’d just ask:

What is the status of IOMMU virtualization, like in the case where I want a VM guest to have a virtual IOMMU?

I found this great presentation from KVM Forum 2021: [1]

1. I’m using -device intel-iommu right now. This has performance implications, and large DMA transfers hit the vfio_iommu_type1 dma_entry_limit on the host because of how the mappings are made. (Rough sketch of my current invocation and the host-side limit below, after item 3.)

2. -device virtio-iommu is an improvement, but it doesn’t seem compatible with -device vfio-pci? I was only able to test this with cloud-hypervisor, and it has a better vfio mapping pattern (avoids hitting dma_entry_limit).

3. -object iommufd [2] I haven’t tried this quite yet, planning to: if it’s using iommufd, and I have all the right kernel features in the guest and host, I assume it’s implementing the passthrough mode that AMD has described in their talk? Because I imagine that would be the best solution for me, I’m just having trouble understanding if it’s actually related or orthogonal. I see AMD has -device amd-viommu here [3]; is that ever going to be upstreamed, or is that what -object iommufd is abstracting? I also found this mailing list submission [4], and the context and changes there imply it’s all about the same thing (exposing IOMMU virtualization to the guest).
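
For reference on item 1, this is roughly the shape of what I’m running today, plus the host-side limit I keep hitting (the BDF and the new limit value are just placeholders; I believe the default is 65535 and that a raised value only applies to vfio containers opened after the write):

qemu-system-x86_64 \
    -machine q35,accel=kvm,kernel-irqchip=split \
    -device intel-iommu,intremap=on,caching-mode=on \
    -device vfio-pci,host=0000:01:00.0 \
    ...

# stopgap on the host: raise the per-container mapping limit
echo 1000000 > /sys/module/vfio_iommu_type1/parameters/dma_entry_limit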

Thanks!
Peter

[1] https://static.sched.com/hosted_files/kvmforum2021/da/vIOMMU%20KVM%20Forum%202021%20-%20v4.pdf
[2] https://www.qemu.org/docs/master/devel/vfio-iommufd.html
[3] https://github.com/AMDESE/qemu/commit/ee056455c411ee3369a47c65ba8a54783b5d2814
[4] https://lore.kernel.org/lkml/[email protected]/



2024-02-28 19:41:16

by Alex Williamson

Subject: Re: [q&a] Status of IOMMU virtualization for nested virtualization (userspace PCI drivers in VMs)

On Wed, 28 Feb 2024 10:29:32 -0800
Peter Delevoryas <[email protected]> wrote:

> Hey guys,
>
> I’m having a little trouble reading between the lines on various
> docs, mailing list threads, KVM presentations, GitHub forks, etc., so
> I figured I’d just ask:
>
> What is the status of IOMMU virtualization, like in the case where I
> want a VM guest to have a virtual IOMMU?

It works fine for simply nested assignment scenarios, i.e. guest
userspace drivers or nested VMs.

> I found this great presentation from KVM Forum 2021: [1]
>
> 1. I’m using -device intel-iommu right now. This has performance
> implications and large DMA transfers hit the vfio_iommu_type1
> dma_entry_limit on the host because of how the mappings are made.

Hugepages for the guest and mappings within the guest should help both
the mapping performance and DMA entry limit. In general the type1 vfio
IOMMU backend is not optimized for dynamic mapping, so performance-wise
your best bet is still to design the userspace driver for static DMA
buffers.
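
For example, something along these lines backs guest RAM with
hugepages (sizes and paths are placeholders, and the guest still
needs to use hugepage mappings on its side):

-m 16G \
-object memory-backend-file,id=mem0,size=16G,mem-path=/dev/hugepages,share=on,prealloc=on \
-machine q35,accel=kvm,memory-backend=mem0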

> 2. -device virtio-iommu is an improvement, but it doesn’t seem
> compatible with -device vfio-pci? I was only able to test this with
> cloud-hypervisor, and it has a better vfio mapping pattern (avoids
> hitting dma_entry_limit).

AFAIK it's just growing pains; it should work, but it's still working
through bugs.

> 3. -object iommufd [2] I haven’t tried this quite yet, planning to:
> if it’s using iommufd, and I have all the right kernel features in
> the guest and host, I assume it’s implementing the passthrough mode
> that AMD has described in their talk? Because I imagine that would be
> the best solution for me, I’m just having trouble understanding if
> it’s actually related or orthogonal.

For now iommufd provides a similar DMA mapping interface to type1, but
it does remove the DMA entry limit and improves locked page accounting.
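
The command line side is roughly what the QEMU doc you linked shows
(BDF is a placeholder):

-object iommufd,id=iommufd0 \
-device vfio-pci,host=0000:02:00.0,iommufd=iommufd0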

To really see a performance improvement relative to dynamic mappings,
you'll need nesting support in the IOMMU, which is under active
development. From this aspect you will want iommufd since similar
features will not be provided by type1. Thanks,

Alex


2024-02-29 19:54:32

by Peter Delevoryas

Subject: Re: [q&a] Status of IOMMU virtualization for nested virtualization (userspace PCI drivers in VMs)



> On Feb 28, 2024, at 11:38 AM, Alex Williamson <[email protected]> wrote:
>
> On Wed, 28 Feb 2024 10:29:32 -0800
> Peter Delevoryas <[email protected]> wrote:
>
>> Hey guys,
>>
>> I’m having a little trouble reading between the lines on various
>> docs, mailing list threads, KVM presentations, GitHub forks, etc., so
>> I figured I’d just ask:
>>
>> What is the status of IOMMU virtualization, like in the case where I
>> want a VM guest to have a virtual IOMMU?
>
> It works fine for simply nested assignment scenarios, i.e. guest
> userspace drivers or nested VMs.
>
>> I found this great presentation from KVM Forum 2021: [1]
>>
>> 1. I’m using -device intel-iommu right now. This has performance
>> implications and large DMA transfers hit the vfio_iommu_type1
>> dma_entry_limit on the host because of how the mappings are made.
>
> Hugepages for the guest and mappings within the guest should help both
> the mapping performance and DMA entry limit. In general the type1 vfio
> IOMMU backend is not optimized for dynamic mapping, so performance-wise
> your best bet is still to design the userspace driver for static DMA
> buffers.

Yep, huge pages definitely help; I’ll probably switch to allocating them at boot for better guarantees.
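
Something like this on the host is what I have in mind (sizes are just an example), plus pointing the guest memory backend at the 1G hugetlbfs mount:

# host kernel command line: reserve 1G hugepages at boot
default_hugepagesz=1G hugepagesz=1G hugepages=16

# mount point for QEMU’s mem-path
mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G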

>
>> 2. -device virtio-iommu is an improvement, but it doesn’t seem
>> compatible with -device vfio-pci? I was only able to test this with
>> cloud-hypervisor, and it has a better vfio mapping pattern (avoids
>> hitting dma_entry_limit).
>
> AFAIK it's just growing pains; it should work, but it's still working
> through bugs.

Oh really?? OK: I might even be configuring things incorrectly, or
maybe I need to upgrade from QEMU 7.1 to 8. I was relying on whatever
libvirt does by default, which seems to just be:

-device virtio-iommu -device vfio-pci,host=<bdf>

But maybe I need some other options?
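
For reference, this is the full shape of what I’d try by hand (BDF is
a placeholder, and I’m not sure whether the device should be spelled
virtio-iommu or virtio-iommu-pci on my QEMU version):

qemu-system-x86_64 \
    -machine q35,accel=kvm \
    -device virtio-iommu-pci \
    -device vfio-pci,host=0000:01:00.0 \
    ...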

>
>> 3. -object iommufd [2] I haven’t tried this quite yet, planning to:
>> if it’s using iommufd, and I have all the right kernel features in
>> the guest and host, I assume it’s implementing the passthrough mode
>> that AMD has described in their talk? Because I imagine that would be
>> the best solution for me, I’m just having trouble understanding if
>> it’s actually related or orthogonal.
>
> For now iommufd provides a similar DMA mapping interface to type1, but
> it does remove the DMA entry limit and improves locked page accounting.
>
> To really see a performance improvement relative to dynamic mappings,
> you'll need nesting support in the IOMMU, which is under active
> development. From this aspect you will want iommufd since similar
> features will not be provided by type1. Thanks,

I see, thanks! That’s great to hear.

>
> Alex
>