2023-09-18 16:53:51

by Robin Murphy

Subject: Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map

On 2023-09-18 12:51, Niklas Schnelle wrote:
> Pull out the sync operation from viommu_map_pages() by implementing
> ops->iotlb_sync_map. This allows the common IOMMU code to map multiple
> elements of an sg with a single sync (see iommu_map_sg()). Furthermore,
> it is also a requirement for IOMMU_CAP_DEFERRED_FLUSH.

Is it really a requirement? Deferred flush only deals with unmapping. Or
are you just trying to say that it's not too worthwhile to try doing
more for unmapping performance while obvious mapping performance is
still left on the table?

> Link: https://lore.kernel.org/lkml/[email protected]/
> Signed-off-by: Niklas Schnelle <[email protected]>
> ---
> drivers/iommu/virtio-iommu.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 17dcd826f5c2..3649586f0e5c 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
> int ret;
> unsigned long flags;
>
> + /*
> + * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> + * is initialized e.g. via iommu_create_device_direct_mappings()
> + */
> + if (!viommu)
> + return 0;

Minor nit: I'd be inclined to make that check explicitly in the places
where it definitely is expected, rather than allowing *any* sync to
silently do nothing if called incorrectly. Plus then they could use
vdomain->nr_endpoints for consistency with the equivalent checks
elsewhere (it did take me a moment to figure out how we could get to
.iotlb_sync_map with a NULL viommu without viommu_map_pages() blowing up
first...)
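For illustration only, the alternative being suggested could be modelled like this as a standalone sketch (not kernel code; the struct fields and counters here are simplified stand-ins for the real driver structures), contrasting the v2 guard inside viommu_sync_req() with an explicit vdomain->nr_endpoints check at the .iotlb_sync_map call site:

```c
#include <assert.h>
#include <stddef.h>

/* Standalone model, not kernel code: field names are simplified
 * stand-ins for the real virtio-iommu driver structures. */
struct viommu_dev {
	int sync_count;              /* counts issued syncs, for illustration */
};

struct viommu_domain {
	struct viommu_dev *viommu;   /* NULL until the domain is finalized */
	unsigned int nr_endpoints;   /* 0 while no endpoint is attached */
};

/* v2-patch style: the guard lives inside the sync path itself, so
 * *any* caller with a NULL viommu silently becomes a no-op. */
static int viommu_sync_req(struct viommu_dev *viommu)
{
	if (!viommu)
		return 0;
	viommu->sync_count++;        /* stands in for the real sync request */
	return 0;
}

/* Suggested alternative: check explicitly at the call site where the
 * early call is known to happen, reusing nr_endpoints for consistency
 * with the equivalent checks elsewhere in the driver. */
static int viommu_iotlb_sync_map(struct viommu_domain *vdomain)
{
	if (!vdomain->nr_endpoints)
		return 0;
	return viommu_sync_req(vdomain->viommu);
}
```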

Thanks,
Robin.

> spin_lock_irqsave(&viommu->request_lock, flags);
> ret = __viommu_sync_req(viommu);
> if (ret)
> @@ -843,7 +849,7 @@ static int viommu_map_pages(struct iommu_domain *domain, unsigned long iova,
> .flags = cpu_to_le32(flags),
> };
>
> - ret = viommu_send_req_sync(vdomain->viommu, &map, sizeof(map));
> + ret = viommu_add_req(vdomain->viommu, &map, sizeof(map));
> if (ret) {
> viommu_del_mappings(vdomain, iova, end);
> return ret;
> @@ -912,6 +918,14 @@ static void viommu_iotlb_sync(struct iommu_domain *domain,
> viommu_sync_req(vdomain->viommu);
> }
>
> +static int viommu_iotlb_sync_map(struct iommu_domain *domain,
> + unsigned long iova, size_t size)
> +{
> + struct viommu_domain *vdomain = to_viommu_domain(domain);
> +
> + return viommu_sync_req(vdomain->viommu);
> +}
> +
> static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
> {
> struct iommu_resv_region *entry, *new_entry, *msi = NULL;
> @@ -1058,6 +1072,7 @@ static struct iommu_ops viommu_ops = {
> .unmap_pages = viommu_unmap_pages,
> .iova_to_phys = viommu_iova_to_phys,
> .iotlb_sync = viommu_iotlb_sync,
> + .iotlb_sync_map = viommu_iotlb_sync_map,
> .free = viommu_domain_free,
> }
> };
>


2023-09-19 17:05:42

by Jean-Philippe Brucker

Subject: Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map

On Mon, Sep 18, 2023 at 05:37:47PM +0100, Robin Murphy wrote:
> > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> > index 17dcd826f5c2..3649586f0e5c 100644
> > --- a/drivers/iommu/virtio-iommu.c
> > +++ b/drivers/iommu/virtio-iommu.c
> > @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
> > int ret;
> > unsigned long flags;
> > + /*
> > + * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> > + * is initialized e.g. via iommu_create_device_direct_mappings()
> > + */
> > + if (!viommu)
> > + return 0;
>
> Minor nit: I'd be inclined to make that check explicitly in the places where
> it definitely is expected, rather than allowing *any* sync to silently do
> nothing if called incorrectly. Plus then they could use
> vdomain->nr_endpoints for consistency with the equivalent checks elsewhere
> (it did take me a moment to figure out how we could get to .iotlb_sync_map
> with a NULL viommu without viommu_map_pages() blowing up first...)

They're not strictly equivalent: this check works around a temporary issue
with the IOMMU core, which calls map/unmap before the domain is finalized.
Once we merge domain_alloc() and finalize(), then this check disappears,
but we still need to test nr_endpoints in map/unmap to handle detached
domains (and we still need to fix the synchronization of nr_endpoints
against attach/detach). That's why I preferred doing this on viommu and
keeping it in one place.
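A rough standalone sketch of the detached-domain point (again with hypothetical field names, not the real driver code): even once the finalize issue is gone, the map path still wants to record the mapping unconditionally so it can be replayed on attach, while only queueing a request to the device when an endpoint is actually attached:

```c
#include <assert.h>
#include <stddef.h>

/* Standalone model of a detached-domain-aware map path; the counters
 * are simplified stand-ins (the real driver keeps an interval tree of
 * mappings and a request queue on the virtio device). */
struct viommu_domain {
	unsigned int nr_endpoints;   /* 0 == detached */
	unsigned int nr_mappings;    /* mappings recorded in the domain */
	unsigned int nr_requests;    /* MAP requests queued to the device */
};

static int viommu_map_pages(struct viommu_domain *vdomain)
{
	/* Always record the mapping so it can be replayed on attach */
	vdomain->nr_mappings++;

	/* Only talk to the device once an endpoint is attached */
	if (!vdomain->nr_endpoints)
		return 0;
	vdomain->nr_requests++;
	return 0;
}
```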

Thanks,
Jean

2023-09-19 21:15:27

by Niklas Schnelle

Subject: Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map

On Mon, 2023-09-18 at 17:37 +0100, Robin Murphy wrote:
> On 2023-09-18 12:51, Niklas Schnelle wrote:
> > Pull out the sync operation from viommu_map_pages() by implementing
> > ops->iotlb_sync_map. This allows the common IOMMU code to map multiple
> > elements of an sg with a single sync (see iommu_map_sg()). Furthermore,
> > it is also a requirement for IOMMU_CAP_DEFERRED_FLUSH.
>
> Is it really a requirement? Deferred flush only deals with unmapping. Or
> are you just trying to say that it's not too worthwhile to try doing
> more for unmapping performance while obvious mapping performance is
> still left on the table?
>

You're right, there is no hard requirement. I somehow thought that
iommu_create_device_direct_mappings() relied on it because it does
flush_iotlb_all() and iommu_map(), but that doesn't seem to be true. If
you want I can resend with the last sentence removed.

> > Link: https://lore.kernel.org/lkml/[email protected]/
> > Signed-off-by: Niklas Schnelle <[email protected]>
> > ---
> > drivers/iommu/virtio-iommu.c | 17 ++++++++++++++++-
> > 1 file changed, 16 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> > index 17dcd826f5c2..3649586f0e5c 100644
> > --- a/drivers/iommu/virtio-iommu.c
> > +++ b/drivers/iommu/virtio-iommu.c
> > @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev *viommu)
> > int ret;
> > unsigned long flags;
> >
> > + /*
> > + * .iotlb_sync_map and .flush_iotlb_all may be called before the viommu
> > + * is initialized e.g. via iommu_create_device_direct_mappings()
> > + */
> > + if (!viommu)
> > + return 0;
>
> Minor nit: I'd be inclined to make that check explicitly in the places
> where it definitely is expected, rather than allowing *any* sync to
> silently do nothing if called incorrectly. Plus then they could use
> vdomain->nr_endpoints for consistency with the equivalent checks
> elsewhere (it did take me a moment to figure out how we could get to
> .iotlb_sync_map with a NULL viommu without viommu_map_pages() blowing up
> first...)
>
> Thanks,
> Robin.

That's what I had in v1. I think this is a matter of taste, and Jean-
Philippe pointed me to moving the check into viommu_sync_req(). I added
a comment because it really is not entirely obvious.

>
> > spin_lock_irqsave(&viommu->request_lock, flags);
> > ret = __viommu_sync_req(viommu);
> > if (ret)
> > @@ -843,7 +849,7 @@ static int viommu_map_pages(struct iommu_domain *domain, unsigned long iova,
> > .flags = cpu_to_le32(flags),
> > };
> >
> > - ret = viommu_send_req_sync(vdomain->viommu, &map, sizeof(map));
> > + ret = viommu_add_req(vdomain->viommu, &map, sizeof(map));
> > if (ret) {
> > viommu_del_mappings(vdomain, iova, end);
> > return ret;
> > @@ -912,6 +918,14 @@ static void viommu_iotlb_sync(struct iommu_domain *domain,
> > viommu_sync_req(vdomain->viommu);
> > }
> >
> > +static int viommu_iotlb_sync_map(struct iommu_domain *domain,
> > + unsigned long iova, size_t size)
> > +{
> > + struct viommu_domain *vdomain = to_viommu_domain(domain);
> > +
> > + return viommu_sync_req(vdomain->viommu);
> > +}
> > +
> > static void viommu_get_resv_regions(struct device *dev, struct list_head *head)
> > {
> > struct iommu_resv_region *entry, *new_entry, *msi = NULL;
> > @@ -1058,6 +1072,7 @@ static struct iommu_ops viommu_ops = {
> > .unmap_pages = viommu_unmap_pages,
> > .iova_to_phys = viommu_iova_to_phys,
> > .iotlb_sync = viommu_iotlb_sync,
> > + .iotlb_sync_map = viommu_iotlb_sync_map,
> > .free = viommu_domain_free,
> > }
> > };
> >