2020-11-18 11:25:43

by Eric Auger

Subject: [PATCH v13 02/15] iommu: Introduce bind/unbind_guest_msi

On ARM, MSIs are translated by the SMMU. An IOVA is allocated
for each MSI doorbell. If both the host and the guest are exposed
with SMMUs, we end up with 2 different IOVAs, one allocated by each.
The guest allocates an IOVA (gIOVA) to map onto the guest MSI
doorbell (gDB). The host allocates another IOVA (hIOVA) to map
onto the physical doorbell (hDB).

So we end up with 2 untied mappings:
         S1            S2
gIOVA    ->    gDB
              hIOVA    ->    hDB

Currently the PCI device is programmed by the host with hIOVA
as MSI doorbell. So this does not work.

This patch introduces an API to pass gIOVA/gDB to the host so
that gIOVA can be reused by the host instead of re-allocating
a new IOVA. So the goal is to create the following nested mapping:

         S1           S2
gIOVA    ->    gDB    ->    hDB

and program the PCI device with gIOVA MSI doorbell.
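
As an illustration only (not part of this patch), the nested mapping
above can be modelled in userspace as two chained lookups. This is a
minimal sketch: struct mapping, translate() and nested_translate() are
hypothetical names, and each stage is reduced to a single linear range
rather than real page tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One translation entry: [in, in + size) maps to [out, out + size). */
struct mapping {
	uint64_t in;
	uint64_t out;
	uint64_t size;
};

/* Translate addr through a single stage; returns false if unmapped. */
bool translate(const struct mapping *m, uint64_t addr, uint64_t *res)
{
	assert(m != NULL && res != NULL);

	if (addr < m->in || addr - m->in >= m->size)
		return false;

	*res = m->out + (addr - m->in);
	return true;
}

/*
 * Nested walk: stage 1 (guest-owned, gIOVA -> gDB) is applied first,
 * then stage 2 (host-owned, gDB -> hDB).
 */
bool nested_translate(const struct mapping *s1, const struct mapping *s2,
		      uint64_t giova, uint64_t *hdb)
{
	uint64_t gdb;

	return translate(s1, giova, &gdb) && translate(s2, gdb, hdb);
}
```

With S1 and S2 chained this way, a device programmed with the gIOVA
reaches the physical doorbell, whereas the two untied mappings shown
earlier give the device no path from hIOVA through the guest's stage 1.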

In case we have several devices attached to this nested domain
(devices belonging to the same group), they cannot be isolated
on guest side either. So they should also end up in the same domain
on guest side. We will enforce that all the devices attached to
the host iommu domain use the same physical doorbell and similarly
a single virtual doorbell mapping gets registered (1 single
virtual doorbell is used on guest as well).

Signed-off-by: Eric Auger <[email protected]>

---
v7 -> v8:
- dummy iommu_unbind_guest_msi turned into a void function

v6 -> v7:
- remove the device handle parameter.
- Add comments saying there can only be a single MSI binding
  registered per iommu_domain

v5 -> v6:
- fix compile issue when IOMMU_API is not set

v3 -> v4:
- add unbind

v2 -> v3:
- add a struct device handle
---
drivers/iommu/iommu.c | 37 +++++++++++++++++++++++++++++++++++++
include/linux/iommu.h | 20 ++++++++++++++++++++
2 files changed, 57 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 978fe34378fb..0b1f458b444f 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2252,6 +2252,43 @@ static void __iommu_detach_device(struct iommu_domain *domain,
trace_detach_device_from_domain(dev);
}

+/**
+ * iommu_bind_guest_msi - Passes the stage1 GIOVA/GPA mapping of a
+ * virtual doorbell
+ *
+ * @domain: iommu domain the stage 1 mapping will be attached to
+ * @giova: iova allocated by the guest
+ * @gpa: guest physical address of the virtual doorbell
+ * @size: granule size used for the mapping
+ *
+ * The associated IOVA can be reused by the host to create a nested
+ * stage2 binding mapping translating into the physical doorbell used
+ * by the devices attached to the domain.
+ *
+ * All devices within the domain must share the same physical doorbell.
+ * A single MSI GIOVA/GPA mapping can be attached to an iommu_domain.
+ */
+
+int iommu_bind_guest_msi(struct iommu_domain *domain,
+			 dma_addr_t giova, phys_addr_t gpa, size_t size)
+{
+	if (unlikely(!domain->ops->bind_guest_msi))
+		return -ENODEV;
+
+	return domain->ops->bind_guest_msi(domain, giova, gpa, size);
+}
+EXPORT_SYMBOL_GPL(iommu_bind_guest_msi);
+
+void iommu_unbind_guest_msi(struct iommu_domain *domain,
+			    dma_addr_t iova)
+{
+	if (unlikely(!domain->ops->unbind_guest_msi))
+		return;
+
+	domain->ops->unbind_guest_msi(domain, iova);
+}
+EXPORT_SYMBOL_GPL(iommu_unbind_guest_msi);
+
+
void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
{
struct iommu_group *group;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 464fcbecf841..35819bff03bc 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -225,6 +225,8 @@ struct iommu_iotlb_gather {
* @sva_unbind_gpasid: unbind guest pasid and mm
* @attach_pasid_table: attach a pasid table
* @detach_pasid_table: detach the pasid table
+ * @bind_guest_msi: provides a stage1 giova/gpa MSI doorbell mapping
+ * @unbind_guest_msi: withdraw a stage1 giova/gpa MSI doorbell mapping
* @def_domain_type: device default domain type, return value:
* - IOMMU_DOMAIN_IDENTITY: must use an identity domain
* - IOMMU_DOMAIN_DMA: must use a dma domain
@@ -305,6 +307,10 @@ struct iommu_ops {

int (*def_domain_type)(struct device *dev);

+ int (*bind_guest_msi)(struct iommu_domain *domain,
+ dma_addr_t giova, phys_addr_t gpa, size_t size);
+ void (*unbind_guest_msi)(struct iommu_domain *domain, dma_addr_t giova);
+
unsigned long pgsize_bitmap;
struct module *owner;
};
@@ -444,6 +450,10 @@ extern int iommu_attach_pasid_table(struct iommu_domain *domain,
extern int iommu_uapi_attach_pasid_table(struct iommu_domain *domain,
void __user *udata);
extern void iommu_detach_pasid_table(struct iommu_domain *domain);
+extern int iommu_bind_guest_msi(struct iommu_domain *domain,
+ dma_addr_t giova, phys_addr_t gpa, size_t size);
+extern void iommu_unbind_guest_msi(struct iommu_domain *domain,
+ dma_addr_t giova);
extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
@@ -1087,6 +1097,16 @@ static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
{
return NULL;
}
+
+static inline
+int iommu_bind_guest_msi(struct iommu_domain *domain,
+ dma_addr_t giova, phys_addr_t gpa, size_t size)
+{
+ return -ENODEV;
+}
+static inline
+void iommu_unbind_guest_msi(struct iommu_domain *domain, dma_addr_t giova) {}
+
#endif /* CONFIG_IOMMU_API */

/**
--
2.21.3
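
For readers who want to experiment with the dispatch pattern used by
the wrappers in the diff above, here is a userspace sketch. It is an
assumption-laden model, not kernel code: demo_ops, demo_domain,
drv_bind and drv_unbind are hypothetical names, and the -EBUSY returned
on a second binding is an assumed choice, since the patch leaves the
enforcement of "one binding per domain" to the driver callback.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_domain;

/* Optional callbacks, standing in for the two new iommu_ops members. */
struct demo_ops {
	int (*bind_guest_msi)(struct demo_domain *d, uint64_t giova,
			      uint64_t gpa, size_t size);
	void (*unbind_guest_msi)(struct demo_domain *d, uint64_t giova);
};

struct demo_domain {
	const struct demo_ops *ops;
	bool bound;		/* at most one MSI binding per domain */
	uint64_t giova;
	uint64_t gpa;
	size_t size;
};

/* Core wrappers: same shape as the patch, -ENODEV when unimplemented. */
int demo_bind_guest_msi(struct demo_domain *d, uint64_t giova,
			uint64_t gpa, size_t size)
{
	assert(d != NULL && d->ops != NULL);

	if (!d->ops->bind_guest_msi)
		return -ENODEV;

	return d->ops->bind_guest_msi(d, giova, gpa, size);
}

void demo_unbind_guest_msi(struct demo_domain *d, uint64_t giova)
{
	assert(d != NULL && d->ops != NULL);

	if (!d->ops->unbind_guest_msi)
		return;

	d->ops->unbind_guest_msi(d, giova);
}

/* Hypothetical driver callbacks enforcing the single-binding rule. */
int drv_bind(struct demo_domain *d, uint64_t giova, uint64_t gpa, size_t size)
{
	if (d->bound)
		return -EBUSY;	/* assumed error code, not from the patch */

	d->giova = giova;
	d->gpa = gpa;
	d->size = size;
	d->bound = true;
	return 0;
}

void drv_unbind(struct demo_domain *d, uint64_t giova)
{
	/* Only drop the binding that was registered for this gIOVA. */
	if (d->bound && d->giova == giova)
		d->bound = false;
}
```

A domain whose ops leave both callbacks NULL gets -ENODEV from the bind
wrapper and a silent no-op from the unbind wrapper, which mirrors the
behaviour of the stubs provided when CONFIG_IOMMU_API is not set.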


2021-02-01 11:54:57

by Keqian Zhu

Subject: Re: [PATCH v13 02/15] iommu: Introduce bind/unbind_guest_msi

Hi Eric,

On 2020/11/18 19:21, Eric Auger wrote:
> On ARM, MSI are translated by the SMMU. An IOVA is allocated
> for each MSI doorbell. If both the host and the guest are exposed
> with SMMUs, we end up with 2 different IOVAs allocated by each.
> guest allocates an IOVA (gIOVA) to map onto the guest MSI
> doorbell (gDB). The Host allocates another IOVA (hIOVA) to map
> onto the physical doorbell (hDB).
>
> So we end up with 2 untied mappings:
> S1 S2
> gIOVA -> gDB
> hIOVA -> hDB
>
> Currently the PCI device is programmed by the host with hIOVA
> as MSI doorbell. So this does not work.
>
> This patch introduces an API to pass gIOVA/gDB to the host so
> that gIOVA can be reused by the host instead of re-allocating
> a new IOVA. So the goal is to create the following nested mapping:
Can the gDB be reused under non-nested mode?

>
> S1 S2
> gIOVA -> gDB -> hDB
>
> and program the PCI device with gIOVA MSI doorbell.
>
> In case we have several devices attached to this nested domain
> (devices belonging to the same group), they cannot be isolated
> on guest side either. So they should also end up in the same domain
> on guest side. We will enforce that all the devices attached to
> the host iommu domain use the same physical doorbell and similarly
> a single virtual doorbell mapping gets registered (1 single
> virtual doorbell is used on guest as well).
>
[...]

> + *
> + * The associated IOVA can be reused by the host to create a nested
> + * stage2 binding mapping translating into the physical doorbell used
> + * by the devices attached to the domain.
> + *
> + * All devices within the domain must share the same physical doorbell.
> + * A single MSI GIOVA/GPA mapping can be attached to an iommu_domain.
> + */
> +
> +int iommu_bind_guest_msi(struct iommu_domain *domain,
> + dma_addr_t giova, phys_addr_t gpa, size_t size)
> +{
> + if (unlikely(!domain->ops->bind_guest_msi))
> + return -ENODEV;
> +
> + return domain->ops->bind_guest_msi(domain, giova, gpa, size);
> +}
> +EXPORT_SYMBOL_GPL(iommu_bind_guest_msi);
> +
> +void iommu_unbind_guest_msi(struct iommu_domain *domain,
> + dma_addr_t iova)
nit: s/iova/giova

> +{
> + if (unlikely(!domain->ops->unbind_guest_msi))
> + return;
> +
> + domain->ops->unbind_guest_msi(domain, iova);
> +}
> +EXPORT_SYMBOL_GPL(iommu_unbind_guest_msi);
> +
[...]

Thanks,
Keqian

2021-02-12 08:59:00

by Eric Auger

Subject: Re: [PATCH v13 02/15] iommu: Introduce bind/unbind_guest_msi

Hi Keqian,

On 2/1/21 12:52 PM, Keqian Zhu wrote:
> Hi Eric,
>
> On 2020/11/18 19:21, Eric Auger wrote:
>> On ARM, MSI are translated by the SMMU. An IOVA is allocated
>> for each MSI doorbell. If both the host and the guest are exposed
>> with SMMUs, we end up with 2 different IOVAs allocated by each.
>> guest allocates an IOVA (gIOVA) to map onto the guest MSI
>> doorbell (gDB). The Host allocates another IOVA (hIOVA) to map
>> onto the physical doorbell (hDB).
>>
>> So we end up with 2 untied mappings:
>> S1 S2
>> gIOVA -> gDB
>> hIOVA -> hDB
>>
>> Currently the PCI device is programmed by the host with hIOVA
>> as MSI doorbell. So this does not work.
>>
>> This patch introduces an API to pass gIOVA/gDB to the host so
>> that gIOVA can be reused by the host instead of re-allocating
>> a new IOVA. So the goal is to create the following nested mapping:
> Does the gDB can be reused under non-nested mode?

Under non-nested mode, the hIOVA is allocated within the MSI reserved
region exposed by the SMMU driver, [0x8000000, 0x80fffff]. See
iommu_dma_prepare_msi/iommu_dma_get_msi_page in dma-iommu.c. This hIOVA
is programmed into the physical device so that the physical SMMU
translates it into the physical doorbell (hDB = host physical ITS
doorbell). The gDB is not used at pIOMMU programming level. It is only
used when setting up the KVM irq route.
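
To make the window arithmetic concrete, here is a small standalone
sketch that checks whether a range falls inside that 1 MiB MSI window
starting at 0x8000000. The DEMO_* macros and the helper name are
hypothetical, chosen for this example only. They mirror the bounds
quoted here and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Window bounds as quoted in this mail: base 0x8000000, length 1 MiB. */
#define DEMO_MSI_IOVA_BASE	0x8000000ULL
#define DEMO_MSI_IOVA_LENGTH	0x100000ULL

/* True if [iova, iova + size) lies entirely inside the MSI window. */
bool demo_iova_in_msi_window(uint64_t iova, uint64_t size)
{
	if (size == 0 || size > DEMO_MSI_IOVA_LENGTH)
		return false;

	if (iova < DEMO_MSI_IOVA_BASE)
		return false;

	/* Subtract rather than add so the comparison cannot overflow. */
	return iova - DEMO_MSI_IOVA_BASE <= DEMO_MSI_IOVA_LENGTH - size;
}
```

In nested mode the gIOVA chosen by the guest is not expected to satisfy
such a check, since the guest allocates it from its own IOVA space.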

Hope this answers your question.

>
>>
>> S1 S2
>> gIOVA -> gDB -> hDB
>>
>> and program the PCI device with gIOVA MSI doorbell.
>>
>> In case we have several devices attached to this nested domain
>> (devices belonging to the same group), they cannot be isolated
>> on guest side either. So they should also end up in the same domain
>> on guest side. We will enforce that all the devices attached to
>> the host iommu domain use the same physical doorbell and similarly
>> a single virtual doorbell mapping gets registered (1 single
>> virtual doorbell is used on guest as well).
>>
> [...]
>
>> + *
>> + * The associated IOVA can be reused by the host to create a nested
>> + * stage2 binding mapping translating into the physical doorbell used
>> + * by the devices attached to the domain.
>> + *
>> + * All devices within the domain must share the same physical doorbell.
>> + * A single MSI GIOVA/GPA mapping can be attached to an iommu_domain.
>> + */
>> +
>> +int iommu_bind_guest_msi(struct iommu_domain *domain,
>> + dma_addr_t giova, phys_addr_t gpa, size_t size)
>> +{
>> + if (unlikely(!domain->ops->bind_guest_msi))
>> + return -ENODEV;
>> +
>> + return domain->ops->bind_guest_msi(domain, giova, gpa, size);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_bind_guest_msi);
>> +
>> +void iommu_unbind_guest_msi(struct iommu_domain *domain,
>> + dma_addr_t iova)
> nit: s/iova/giova
sure
>
>> +{
>> + if (unlikely(!domain->ops->unbind_guest_msi))
>> + return;
>> +
>> + domain->ops->unbind_guest_msi(domain, iova);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_unbind_guest_msi);
>> +
> [...]
>
> Thanks,
> Keqian
>

Thanks

Eric

2021-02-18 12:16:05

by Eric Auger

Subject: Re: [PATCH v13 02/15] iommu: Introduce bind/unbind_guest_msi

Hi Keqian,

On 2/18/21 9:43 AM, Keqian Zhu wrote:
> Hi Eric,
>
> On 2021/2/12 16:55, Auger Eric wrote:
>> Hi Keqian,
>>
>> On 2/1/21 12:52 PM, Keqian Zhu wrote:
>>> Hi Eric,
>>>
>>> On 2020/11/18 19:21, Eric Auger wrote:
>>>> On ARM, MSI are translated by the SMMU. An IOVA is allocated
>>>> for each MSI doorbell. If both the host and the guest are exposed
>>>> with SMMUs, we end up with 2 different IOVAs allocated by each.
>>>> guest allocates an IOVA (gIOVA) to map onto the guest MSI
>>>> doorbell (gDB). The Host allocates another IOVA (hIOVA) to map
>>>> onto the physical doorbell (hDB).
>>>>
>>>> So we end up with 2 untied mappings:
>>>> S1 S2
>>>> gIOVA -> gDB
>>>> hIOVA -> hDB
>>>>
>>>> Currently the PCI device is programmed by the host with hIOVA
>>>> as MSI doorbell. So this does not work.
>>>>
>>>> This patch introduces an API to pass gIOVA/gDB to the host so
>>>> that gIOVA can be reused by the host instead of re-allocating
>>>> a new IOVA. So the goal is to create the following nested mapping:
>>> Does the gDB can be reused under non-nested mode?
>>
>> Under non nested mode the hIOVA is allocated within the MSI reserved
>> region exposed by the SMMU driver, [0x8000000, 80fffff]. see
>> iommu_dma_prepare_msi/iommu_dma_get_msi_page in dma_iommu.c. this hIOVA
>> is programmed in the physical device so that the physical SMMU
>> translates it into the physical doorbell (hDB = host physical ITS
> So, AFAIU, under non-nested mode, at smmu side, we reuse the workflow of non-virtualization scenario.
Without virtualization, the host kernel also transparently allocates an
iova to map the doorbell. With standard passthrough without vIOMMU, the
iova window is different (MSI RESV region).

Thanks

Eric
>
>> doorbell). The gDB is not used at pIOMMU programming level. It is only
>> used when setting up the KVM irq route.
>>
>> Hope this answers your question.
> Thanks for your explanation!
>>
>
> Thanks,
> Keqian
>
>>>
>>>>
>>>> S1 S2
>>>> gIOVA -> gDB -> hDB
>>>>
>>>> and program the PCI device with gIOVA MSI doorbell.
>>>>
>>>> In case we have several devices attached to this nested domain
>>>> (devices belonging to the same group), they cannot be isolated
>>>> on guest side either. So they should also end up in the same domain
>>>> on guest side. We will enforce that all the devices attached to
>>>> the host iommu domain use the same physical doorbell and similarly
>>>> a single virtual doorbell mapping gets registered (1 single
>>>> virtual doorbell is used on guest as well).
>>>>
>>> [...]
>>>
>>>> + *
>>>> + * The associated IOVA can be reused by the host to create a nested
>>>> + * stage2 binding mapping translating into the physical doorbell used
>>>> + * by the devices attached to the domain.
>>>> + *
>>>> + * All devices within the domain must share the same physical doorbell.
>>>> + * A single MSI GIOVA/GPA mapping can be attached to an iommu_domain.
>>>> + */
>>>> +
>>>> +int iommu_bind_guest_msi(struct iommu_domain *domain,
>>>> + dma_addr_t giova, phys_addr_t gpa, size_t size)
>>>> +{
>>>> + if (unlikely(!domain->ops->bind_guest_msi))
>>>> + return -ENODEV;
>>>> +
>>>> + return domain->ops->bind_guest_msi(domain, giova, gpa, size);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(iommu_bind_guest_msi);
>>>> +
>>>> +void iommu_unbind_guest_msi(struct iommu_domain *domain,
>>>> + dma_addr_t iova)
>>> nit: s/iova/giova
>> sure
>>>
>>>> +{
>>>> + if (unlikely(!domain->ops->unbind_guest_msi))
>>>> + return;
>>>> +
>>>> + domain->ops->unbind_guest_msi(domain, iova);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(iommu_unbind_guest_msi);
>>>> +
>>> [...]
>>>
>>> Thanks,
>>> Keqian
>>>
>>
>> Thanks
>>
>> Eric
>>
>> .
>>
>

2021-02-18 12:19:53

by Keqian Zhu

Subject: Re: [PATCH v13 02/15] iommu: Introduce bind/unbind_guest_msi

Hi Eric,

On 2021/2/12 16:55, Auger Eric wrote:
> Hi Keqian,
>
> On 2/1/21 12:52 PM, Keqian Zhu wrote:
>> Hi Eric,
>>
>> On 2020/11/18 19:21, Eric Auger wrote:
>>> On ARM, MSI are translated by the SMMU. An IOVA is allocated
>>> for each MSI doorbell. If both the host and the guest are exposed
>>> with SMMUs, we end up with 2 different IOVAs allocated by each.
>>> guest allocates an IOVA (gIOVA) to map onto the guest MSI
>>> doorbell (gDB). The Host allocates another IOVA (hIOVA) to map
>>> onto the physical doorbell (hDB).
>>>
>>> So we end up with 2 untied mappings:
>>> S1 S2
>>> gIOVA -> gDB
>>> hIOVA -> hDB
>>>
>>> Currently the PCI device is programmed by the host with hIOVA
>>> as MSI doorbell. So this does not work.
>>>
>>> This patch introduces an API to pass gIOVA/gDB to the host so
>>> that gIOVA can be reused by the host instead of re-allocating
>>> a new IOVA. So the goal is to create the following nested mapping:
>> Does the gDB can be reused under non-nested mode?
>
> Under non nested mode the hIOVA is allocated within the MSI reserved
> region exposed by the SMMU driver, [0x8000000, 80fffff]. see
> iommu_dma_prepare_msi/iommu_dma_get_msi_page in dma_iommu.c. this hIOVA
> is programmed in the physical device so that the physical SMMU
> translates it into the physical doorbell (hDB = host physical ITS
So, AFAIU, under non-nested mode, the SMMU side reuses the workflow of the non-virtualization scenario.

> doorbell). The gDB is not used at pIOMMU programming level. It is only
> used when setting up the KVM irq route.
>
> Hope this answers your question.
Thanks for your explanation!
>

Thanks,
Keqian

>>
>>>
>>> S1 S2
>>> gIOVA -> gDB -> hDB
>>>
>>> and program the PCI device with gIOVA MSI doorbell.
>>>
>>> In case we have several devices attached to this nested domain
>>> (devices belonging to the same group), they cannot be isolated
>>> on guest side either. So they should also end up in the same domain
>>> on guest side. We will enforce that all the devices attached to
>>> the host iommu domain use the same physical doorbell and similarly
>>> a single virtual doorbell mapping gets registered (1 single
>>> virtual doorbell is used on guest as well).
>>>
>> [...]
>>
>>> + *
>>> + * The associated IOVA can be reused by the host to create a nested
>>> + * stage2 binding mapping translating into the physical doorbell used
>>> + * by the devices attached to the domain.
>>> + *
>>> + * All devices within the domain must share the same physical doorbell.
>>> + * A single MSI GIOVA/GPA mapping can be attached to an iommu_domain.
>>> + */
>>> +
>>> +int iommu_bind_guest_msi(struct iommu_domain *domain,
>>> + dma_addr_t giova, phys_addr_t gpa, size_t size)
>>> +{
>>> + if (unlikely(!domain->ops->bind_guest_msi))
>>> + return -ENODEV;
>>> +
>>> + return domain->ops->bind_guest_msi(domain, giova, gpa, size);
>>> +}
>>> +EXPORT_SYMBOL_GPL(iommu_bind_guest_msi);
>>> +
>>> +void iommu_unbind_guest_msi(struct iommu_domain *domain,
>>> + dma_addr_t iova)
>> nit: s/iova/giova
> sure
>>
>>> +{
>>> + if (unlikely(!domain->ops->unbind_guest_msi))
>>> + return;
>>> +
>>> + domain->ops->unbind_guest_msi(domain, iova);
>>> +}
>>> +EXPORT_SYMBOL_GPL(iommu_unbind_guest_msi);
>>> +
>> [...]
>>
>> Thanks,
>> Keqian
>>
>
> Thanks
>
> Eric
>
> .
>