2020-03-20 23:23:28

by Jacob Pan

[permalink] [raw]
Subject: [PATCH V10 00/11] Nested Shared Virtual Address (SVA) VT-d support

Shared virtual address (SVA), a.k.a, Shared virtual memory (SVM) on Intel
platforms allow address space sharing between device DMA and applications.
SVA can reduce programming complexity and enhance security.
This series is intended to enable SVA virtualization, i.e. enable use of SVA
within a guest user application.

This is the remaining portion of the original patchset that is based on
Joerg's x86/vt-d branch. The preparatory and cleanup patches are merged here.
(git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git)

Only IOMMU portion of the changes are included in this series. Additional
support is needed in VFIO and QEMU (will be submitted separately) to complete
this functionality.

To make incremental changes and reduce the size of each patchset. This series
does not inlcude support for page request services.

In VT-d implementation, PASID table is per device and maintained in the host.
Guest PASID table is shadowed in VMM where virtual IOMMU is emulated.

.-------------. .---------------------------.
| vIOMMU | | Guest process CR3, FL only|
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush -
'-------------' |
| | V
| | CR3 in GPA
'-------------'
Guest
------| Shadow |--------------------------|--------
v v v
Host
.-------------. .----------------------.
| pIOMMU | | Bind FL for GVA-GPA |
| | '----------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.------------------------------.
| | |SL for GPA-HPA, default domain|
| | '------------------------------'
'-------------'
Where:
- FL = First level/stage one page tables
- SL = Second level/stage two page tables

This is the remaining VT-d only portion of V5 since the uAPIs and IOASID common
code have been applied to Joerg's IOMMU core branch.
(https://lkml.org/lkml/2019/10/2/833)

The complete set with VFIO patches are here:
https://github.com/jacobpan/linux.git:siov_sva

The complete nested SVA upstream patches are divided into three phases:
1. Common APIs and PCI device direct assignment
2. Page Request Services (PRS) support
3. Mediated device assignment

With this set and the accompanied VFIO code, we will achieve phase #1.

Thanks,

Jacob

ChangeLog:
- v10
- Addressed Eric's review in v7 and v9. Most fixes are in 3/10 and
6/10. Extra condition checks and consolidation of duplicated codes.

- v9
- Addressed Baolu's comments for v8 for IOTLB flush consolidation,
bug fixes
- Removed IOASID notifier code which will be submitted separately
to address PASID life cycle management with multiple users.

- v8
- Extracted cleanup patches from V7 and accepted into maintainer's
tree (https://lkml.org/lkml/2019/12/2/514).
- Added IOASID notifier and VT-d handler for termination of PASID
IOMMU context upon free. This will ensure success of VFIO IOASID
free API regardless PASID is in use.
(https://lore.kernel.org/linux-iommu/[email protected]/)

- V7
- Respect vIOMMU PASID range in virtual command PASID/IOASID allocator
- Caching virtual command capabilities to avoid runtime checks that
could cause vmexits.

- V6
- Rebased on top of Joerg's core branch
(git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git core)
- Adapt to new uAPIs and IOASID allocators

- V5
Rebased on v5.3-rc4 which has some of the IOMMU fault APIs merged.
Addressed v4 review comments from Eric Auger, Baolu Lu, and
Jonathan Cameron. Specific changes are as follows:
- Refined custom IOASID allocator to support multiple vIOMMU, hotplug
cases.
- Extracted vendor data from IOMMU guest PASID bind data, for VT-d
will support all necessary guest PASID entry fields for PASID
bind.
- Support non-identity host-guest PASID mapping
- Exception handling in various cases

- V4
- Redesigned IOASID allocator such that it can support custom
allocators with shared helper functions. Use separate XArray
to store IOASIDs per allocator. Took advice from Eric Auger to
have default allocator use the generic allocator structure.
Combined into one patch in that the default allocator is just
"another" allocator now. Can be built as a module in case of
driver use without IOMMU.
- Extended bind guest PASID data to support SMMU and non-identity
guest to host PASID mapping https://lkml.org/lkml/2019/5/21/802
- Rebased on Jean's sva/api common tree, new patches starts with
[PATCH v4 10/22]

- V3
- Addressed thorough review comments from Eric Auger (Thank you!)
- Moved IOASID allocator from driver core to IOMMU code per
suggestion by Christoph Hellwig
(https://lkml.org/lkml/2019/4/26/462)
- Rebased on top of Jean's SVA API branch and Eric's v7[1]
(git://linux-arm.org/linux-jpb.git sva/api)
- All IOMMU APIs are unmodified (except the new bind guest PASID
call in patch 9/16)

- V2
- Rebased on Joerg's IOMMU x86/vt-d branch v5.1-rc4
- Integrated with Eric Auger's new v7 series for common APIs
(https://github.com/eauger/linux/tree/v5.1-rc3-2stage-v7)
- Addressed review comments from Andy Shevchenko and Alex Williamson on
IOASID custom allocator.
- Support multiple custom IOASID allocators (vIOMMUs) and dynamic
registration.


Jacob Pan (10):
iommu/vt-d: Move domain helper to header
iommu/uapi: Define a mask for bind data
iommu/vt-d: Add a helper function to skip agaw
iommu/vt-d: Use helper function to skip agaw for SL
iommu/vt-d: Add nested translation helper function
iommu/vt-d: Add bind guest PASID support
iommu/vt-d: Support flushing more translation cache types
iommu/vt-d: Add svm/sva invalidate function
iommu/vt-d: Cache virtual command capability register
iommu/vt-d: Add custom allocator for IOASID

Lu Baolu (1):
iommu/vt-d: Enlightened PASID allocation

drivers/iommu/dmar.c | 37 +++++
drivers/iommu/intel-iommu.c | 276 +++++++++++++++++++++++++++++++++++-
drivers/iommu/intel-pasid.c | 336 ++++++++++++++++++++++++++++++++++++++++++--
drivers/iommu/intel-pasid.h | 25 +++-
drivers/iommu/intel-svm.c | 224 +++++++++++++++++++++++++++++
include/linux/intel-iommu.h | 45 +++++-
include/linux/intel-svm.h | 17 +++
include/uapi/linux/iommu.h | 5 +-
8 files changed, 938 insertions(+), 27 deletions(-)

--
2.7.4


2020-03-20 23:23:35

by Jacob Pan

[permalink] [raw]
Subject: [PATCH V10 03/11] iommu/vt-d: Add a helper function to skip agaw

Signed-off-by: Jacob Pan <[email protected]>
---
drivers/iommu/intel-pasid.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 22b30f10b396..191508c7c03e 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -500,6 +500,28 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
}

/*
+ * Skip top levels of page tables for iommu which has less agaw
+ * than default. Unnecessary for PT mode.
+ */
+static inline int iommu_skip_agaw(struct dmar_domain *domain,
+ struct intel_iommu *iommu,
+ struct dma_pte **pgd)
+{
+ int agaw;
+
+ for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
+ *pgd = phys_to_virt(dma_pte_addr(*pgd));
+ if (!dma_pte_present(*pgd)) {
+ return -EINVAL;
+ }
+ }
+ pr_debug_ratelimited("%s: pgd: %llx, agaw %d d_agaw %d\n", __func__, (u64)*pgd,
+ iommu->agaw, domain->agaw);
+
+ return agaw;
+}
+
+/*
* Set up the scalable mode pasid entry for second only translation type.
*/
int intel_pasid_setup_second_level(struct intel_iommu *iommu,
--
2.7.4

2020-03-20 23:23:41

by Jacob Pan

[permalink] [raw]
Subject: [PATCH V10 04/11] iommu/vt-d: Use helper function to skip agaw for SL

Signed-off-by: Jacob Pan <[email protected]>
---
drivers/iommu/intel-pasid.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 191508c7c03e..9bdb7ee228b6 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -544,17 +544,11 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
return -EINVAL;
}

- /*
- * Skip top levels of page tables for iommu which has less agaw
- * than default. Unnecessary for PT mode.
- */
pgd = domain->pgd;
- for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
- pgd = phys_to_virt(dma_pte_addr(pgd));
- if (!dma_pte_present(pgd)) {
- dev_err(dev, "Invalid domain page table\n");
- return -EINVAL;
- }
+ agaw = iommu_skip_agaw(domain, iommu, &pgd);
+ if (agaw < 0) {
+ dev_err(dev, "Invalid domain page table\n");
+ return -EINVAL;
}

pgd_val = virt_to_phys(pgd);
--
2.7.4

2020-03-20 23:23:58

by Jacob Pan

[permalink] [raw]
Subject: [PATCH V10 02/11] iommu/uapi: Define a mask for bind data

Memory type related flags can be grouped together for one simple check.

---
v9 renamed from EMT to MTS since these are memory type support flags.
---

Signed-off-by: Jacob Pan <[email protected]>
---
include/uapi/linux/iommu.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 4ad3496e5c43..d7bcbc5f79b0 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -284,7 +284,10 @@ struct iommu_gpasid_bind_data_vtd {
__u32 pat;
__u32 emt;
};
-
+#define IOMMU_SVA_VTD_GPASID_MTS_MASK (IOMMU_SVA_VTD_GPASID_CD | \
+ IOMMU_SVA_VTD_GPASID_EMTE | \
+ IOMMU_SVA_VTD_GPASID_PCD | \
+ IOMMU_SVA_VTD_GPASID_PWT)
/**
* struct iommu_gpasid_bind_data - Information about device and guest PASID binding
* @version: Version of this data structure
--
2.7.4

2020-03-20 23:24:20

by Jacob Pan

[permalink] [raw]
Subject: [PATCH V10 09/11] iommu/vt-d: Cache virtual command capability register

Virtual command registers are used in the guest only, to prevent
vmexit cost, we cache the capability and store it during initialization.

Signed-off-by: Jacob Pan <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>

---
v7 Reviewed by Eric & Baolu
---
---
drivers/iommu/dmar.c | 1 +
include/linux/intel-iommu.h | 5 +++++
2 files changed, 6 insertions(+)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 4d6b7b5b37ee..3b36491c8bbb 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -963,6 +963,7 @@ static int map_iommu(struct intel_iommu *iommu, u64 phys_addr)
warn_invalid_dmar(phys_addr, " returns all ones");
goto unmap;
}
+ iommu->vccap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);

/* the registers might be more than one page */
map_size = max_t(int, ecap_max_iotlb_offset(iommu->ecap),
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 43539713b3b3..ccbf164fb711 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -194,6 +194,9 @@
#define ecap_max_handle_mask(e) ((e >> 20) & 0xf)
#define ecap_sc_support(e) ((e >> 7) & 0x1) /* Snooping Control */

+/* Virtual command interface capabilities */
+#define vccap_pasid(v) ((v & DMA_VCS_PAS)) /* PASID allocation */
+
/* IOTLB_REG */
#define DMA_TLB_FLUSH_GRANU_OFFSET 60
#define DMA_TLB_GLOBAL_FLUSH (((u64)1) << 60)
@@ -287,6 +290,7 @@

/* PRS_REG */
#define DMA_PRS_PPR ((u32)1)
+#define DMA_VCS_PAS ((u64)1)

#define IOMMU_WAIT_OP(iommu, offset, op, cond, sts) \
do { \
@@ -537,6 +541,7 @@ struct intel_iommu {
u64 reg_size; /* size of hw register set */
u64 cap;
u64 ecap;
+ u64 vccap;
u32 gcmd; /* Holds TE, EAFL. Don't need SRTP, SFL, WBF */
raw_spinlock_t register_lock; /* protect register handling */
int seq_id; /* sequence id of the iommu */
--
2.7.4

2020-03-20 23:24:42

by Jacob Pan

[permalink] [raw]
Subject: [PATCH V10 01/11] iommu/vt-d: Move domain helper to header

Move domain helper to header to be used by SVA code.

Signed-off-by: Jacob Pan <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
---
drivers/iommu/intel-iommu.c | 6 ------
include/linux/intel-iommu.h | 6 ++++++
2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4be549478691..e599b2537b1c 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -446,12 +446,6 @@ static void init_translation_status(struct intel_iommu *iommu)
iommu->flags |= VTD_FLAG_TRANS_PRE_ENABLED;
}

-/* Convert generic 'struct iommu_domain to private struct dmar_domain */
-static struct dmar_domain *to_dmar_domain(struct iommu_domain *dom)
-{
- return container_of(dom, struct dmar_domain, domain);
-}
-
static int __init intel_iommu_setup(char *str)
{
if (!str)
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 980234ae0312..ed7171d2ae1f 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -595,6 +595,12 @@ static inline void __iommu_flush_cache(
clflush_cache_range(addr, size);
}

+/* Convert generic struct iommu_domain to private struct dmar_domain */
+static inline struct dmar_domain *to_dmar_domain(struct iommu_domain *dom)
+{
+ return container_of(dom, struct dmar_domain, domain);
+}
+
/*
* 0: readable
* 1: writable
--
2.7.4

2020-03-22 01:31:21

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH V10 02/11] iommu/uapi: Define a mask for bind data

On 2020/3/21 7:27, Jacob Pan wrote:
> Memory type related flags can be grouped together for one simple check.
>
> ---
> v9 renamed from EMT to MTS since these are memory type support flags.
> ---
>
> Signed-off-by: Jacob Pan <[email protected]>
> ---
> include/uapi/linux/iommu.h | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 4ad3496e5c43..d7bcbc5f79b0 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -284,7 +284,10 @@ struct iommu_gpasid_bind_data_vtd {
> __u32 pat;
> __u32 emt;
> };
> -
> +#define IOMMU_SVA_VTD_GPASID_MTS_MASK (IOMMU_SVA_VTD_GPASID_CD | \
> + IOMMU_SVA_VTD_GPASID_EMTE | \
> + IOMMU_SVA_VTD_GPASID_PCD | \
> + IOMMU_SVA_VTD_GPASID_PWT)

As name implies, can this move to intel-iommu.h?

Best regards,
baolu

> /**
> * struct iommu_gpasid_bind_data - Information about device and guest PASID binding
> * @version: Version of this data structure
>

2020-03-23 19:33:04

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH V10 02/11] iommu/uapi: Define a mask for bind data

On Sun, 22 Mar 2020 09:29:32 +0800
Lu Baolu <[email protected]> wrote:

> On 2020/3/21 7:27, Jacob Pan wrote:
> > Memory type related flags can be grouped together for one simple
> > check.
> >
> > ---
> > v9 renamed from EMT to MTS since these are memory type support
> > flags. ---
> >
> > Signed-off-by: Jacob Pan <[email protected]>
> > ---
> > include/uapi/linux/iommu.h | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > index 4ad3496e5c43..d7bcbc5f79b0 100644
> > --- a/include/uapi/linux/iommu.h
> > +++ b/include/uapi/linux/iommu.h
> > @@ -284,7 +284,10 @@ struct iommu_gpasid_bind_data_vtd {
> > __u32 pat;
> > __u32 emt;
> > };
> > -
> > +#define IOMMU_SVA_VTD_GPASID_MTS_MASK
> > (IOMMU_SVA_VTD_GPASID_CD | \
> > + IOMMU_SVA_VTD_GPASID_EMTE
> > | \
> > + IOMMU_SVA_VTD_GPASID_PCD
> > | \
> > +
> > IOMMU_SVA_VTD_GPASID_PWT)
>
> As name implies, can this move to intel-iommu.h?
>
I also thought about this but the masks are in vendor specific part of
the UAPI.

> Best regards,
> baolu
>
> > /**
> > * struct iommu_gpasid_bind_data - Information about device and
> > guest PASID binding
> > * @version: Version of this data structure
> >

[Jacob Pan]

2020-03-24 01:51:11

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH V10 02/11] iommu/uapi: Define a mask for bind data

On 2020/3/24 3:37, Jacob Pan wrote:
> On Sun, 22 Mar 2020 09:29:32 +0800> Lu Baolu<[email protected]> wrote:
>
>> On 2020/3/21 7:27, Jacob Pan wrote:
>>> Memory type related flags can be grouped together for one simple
>>> check.
>>>
>>> ---
>>> v9 renamed from EMT to MTS since these are memory type support
>>> flags. ---
>>>
>>> Signed-off-by: Jacob Pan<[email protected]>
>>> ---
>>> include/uapi/linux/iommu.h | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
>>> index 4ad3496e5c43..d7bcbc5f79b0 100644
>>> --- a/include/uapi/linux/iommu.h
>>> +++ b/include/uapi/linux/iommu.h
>>> @@ -284,7 +284,10 @@ struct iommu_gpasid_bind_data_vtd {
>>> __u32 pat;
>>> __u32 emt;
>>> };
>>> -
>>> +#define IOMMU_SVA_VTD_GPASID_MTS_MASK
>>> (IOMMU_SVA_VTD_GPASID_CD | \
>>> + IOMMU_SVA_VTD_GPASID_EMTE
>>> | \
>>> + IOMMU_SVA_VTD_GPASID_PCD
>>> | \
>>> +
>>> IOMMU_SVA_VTD_GPASID_PWT)
>> As name implies, can this move to intel-iommu.h?
>>
> I also thought about this but the masks are in vendor specific part of
> the UAPI.
>

I looked through this patch series. It looks good to me. I will do some
code style cleanup and take it to v5.7. I am not the right person to
decide whether include/uapi/linux/iommu.h is the right place for this,
so I will move it to Intel IOMMU driver for now.

Best regards,
baolu

2020-03-27 11:50:27

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH V10 01/11] iommu/vt-d: Move domain helper to header

> From: Jacob Pan <[email protected]>
> Sent: Saturday, March 21, 2020 7:28 AM
>
> Move domain helper to header to be used by SVA code.
>
> Signed-off-by: Jacob Pan <[email protected]>
> Reviewed-by: Eric Auger <[email protected]>
> ---
> drivers/iommu/intel-iommu.c | 6 ------
> include/linux/intel-iommu.h | 6 ++++++
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 4be549478691..e599b2537b1c 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -446,12 +446,6 @@ static void init_translation_status(struct
> intel_iommu *iommu)
> iommu->flags |= VTD_FLAG_TRANS_PRE_ENABLED;
> }
>
> -/* Convert generic 'struct iommu_domain to private struct dmar_domain */
> -static struct dmar_domain *to_dmar_domain(struct iommu_domain *dom)
> -{
> - return container_of(dom, struct dmar_domain, domain);
> -}
> -
> static int __init intel_iommu_setup(char *str)
> {
> if (!str)
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 980234ae0312..ed7171d2ae1f 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -595,6 +595,12 @@ static inline void __iommu_flush_cache(
> clflush_cache_range(addr, size);
> }
>
> +/* Convert generic struct iommu_domain to private struct dmar_domain */
> +static inline struct dmar_domain *to_dmar_domain(struct iommu_domain
> *dom)
> +{
> + return container_of(dom, struct dmar_domain, domain);
> +}
> +
> /*
> * 0: readable
> * 1: writable
> --
> 2.7.4

Reviewed-by: Kevin Tian <[email protected]>

2020-03-27 11:50:48

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH V10 02/11] iommu/uapi: Define a mask for bind data

> From: Jacob Pan <[email protected]>
> Sent: Saturday, March 21, 2020 7:28 AM
>
> Memory type related flags can be grouped together for one simple check.
>
> ---
> v9 renamed from EMT to MTS since these are memory type support flags.
> ---
>
> Signed-off-by: Jacob Pan <[email protected]>
> ---
> include/uapi/linux/iommu.h | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 4ad3496e5c43..d7bcbc5f79b0 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -284,7 +284,10 @@ struct iommu_gpasid_bind_data_vtd {
> __u32 pat;
> __u32 emt;
> };
> -
> +#define IOMMU_SVA_VTD_GPASID_MTS_MASK
> (IOMMU_SVA_VTD_GPASID_CD | \
> + IOMMU_SVA_VTD_GPASID_EMTE | \
> + IOMMU_SVA_VTD_GPASID_PCD | \
> + IOMMU_SVA_VTD_GPASID_PWT)
> /**
> * struct iommu_gpasid_bind_data - Information about device and guest
> PASID binding
> * @version: Version of this data structure
> --
> 2.7.4

Reviewed-by: Kevin Tian <[email protected]>

2020-03-27 11:54:54

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH V10 03/11] iommu/vt-d: Add a helper function to skip agaw

> From: Jacob Pan <[email protected]>
> Sent: Saturday, March 21, 2020 7:28 AM
>
> Signed-off-by: Jacob Pan <[email protected]>

could you elaborate in which scenario this helper function is required?

> ---
> drivers/iommu/intel-pasid.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 22b30f10b396..191508c7c03e 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -500,6 +500,28 @@ int intel_pasid_setup_first_level(struct intel_iommu
> *iommu,
> }
>
> /*
> + * Skip top levels of page tables for iommu which has less agaw
> + * than default. Unnecessary for PT mode.
> + */
> +static inline int iommu_skip_agaw(struct dmar_domain *domain,
> + struct intel_iommu *iommu,
> + struct dma_pte **pgd)
> +{
> + int agaw;
> +
> + for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
> + *pgd = phys_to_virt(dma_pte_addr(*pgd));
> + if (!dma_pte_present(*pgd)) {
> + return -EINVAL;
> + }
> + }
> + pr_debug_ratelimited("%s: pgd: %llx, agaw %d d_agaw %d\n",
> __func__, (u64)*pgd,
> + iommu->agaw, domain->agaw);
> +
> + return agaw;
> +}
> +
> +/*
> * Set up the scalable mode pasid entry for second only translation type.
> */
> int intel_pasid_setup_second_level(struct intel_iommu *iommu,
> --
> 2.7.4

2020-03-27 11:56:26

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH V10 04/11] iommu/vt-d: Use helper function to skip agaw for SL

> From: Jacob Pan <[email protected]>
> Sent: Saturday, March 21, 2020 7:28 AM
>
> Signed-off-by: Jacob Pan <[email protected]>
> ---
> drivers/iommu/intel-pasid.c | 14 ++++----------
> 1 file changed, 4 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 191508c7c03e..9bdb7ee228b6 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -544,17 +544,11 @@ int intel_pasid_setup_second_level(struct
> intel_iommu *iommu,
> return -EINVAL;
> }
>
> - /*
> - * Skip top levels of page tables for iommu which has less agaw
> - * than default. Unnecessary for PT mode.
> - */
> pgd = domain->pgd;
> - for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
> - pgd = phys_to_virt(dma_pte_addr(pgd));
> - if (!dma_pte_present(pgd)) {
> - dev_err(dev, "Invalid domain page table\n");
> - return -EINVAL;
> - }
> + agaw = iommu_skip_agaw(domain, iommu, &pgd);
> + if (agaw < 0) {
> + dev_err(dev, "Invalid domain page table\n");
> + return -EINVAL;
> }

ok, I see how it is used. possibly combine last and this one together since
it's mostly moving code...

>
> pgd_val = virt_to_phys(pgd);
> --
> 2.7.4

2020-03-27 14:14:01

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH V10 02/11] iommu/uapi: Define a mask for bind data

Hi Jacob,

On 3/21/20 12:27 AM, Jacob Pan wrote:
> Memory type related flags can be grouped together for one simple check.
>
> ---
> v9 renamed from EMT to MTS since these are memory type support flags.
> ---
>
> Signed-off-by: Jacob Pan <[email protected]>
Reviewed-by: Eric Auger <[email protected]>

Thanks

Eric

> ---
> include/uapi/linux/iommu.h | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 4ad3496e5c43..d7bcbc5f79b0 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -284,7 +284,10 @@ struct iommu_gpasid_bind_data_vtd {
> __u32 pat;
> __u32 emt;
> };
> -
> +#define IOMMU_SVA_VTD_GPASID_MTS_MASK (IOMMU_SVA_VTD_GPASID_CD | \
> + IOMMU_SVA_VTD_GPASID_EMTE | \
> + IOMMU_SVA_VTD_GPASID_PCD | \
> + IOMMU_SVA_VTD_GPASID_PWT)
> /**
> * struct iommu_gpasid_bind_data - Information about device and guest PASID binding
> * @version: Version of this data structure
>

2020-03-27 16:07:01

by Eric Auger

[permalink] [raw]
Subject: Re: [PATCH V10 04/11] iommu/vt-d: Use helper function to skip agaw for SL

Hi Jacob,

On 3/27/20 12:55 PM, Tian, Kevin wrote:
>> From: Jacob Pan <[email protected]>
>> Sent: Saturday, March 21, 2020 7:28 AM
>>
>> Signed-off-by: Jacob Pan <[email protected]>
>> ---
>> drivers/iommu/intel-pasid.c | 14 ++++----------
>> 1 file changed, 4 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
>> index 191508c7c03e..9bdb7ee228b6 100644
>> --- a/drivers/iommu/intel-pasid.c
>> +++ b/drivers/iommu/intel-pasid.c
>> @@ -544,17 +544,11 @@ int intel_pasid_setup_second_level(struct
>> intel_iommu *iommu,
>> return -EINVAL;
>> }
>>
>> - /*
>> - * Skip top levels of page tables for iommu which has less agaw
>> - * than default. Unnecessary for PT mode.
>> - */
>> pgd = domain->pgd;
>> - for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
>> - pgd = phys_to_virt(dma_pte_addr(pgd));
>> - if (!dma_pte_present(pgd)) {
>> - dev_err(dev, "Invalid domain page table\n");
>> - return -EINVAL;
>> - }
>> + agaw = iommu_skip_agaw(domain, iommu, &pgd);
>> + if (agaw < 0) {
>> + dev_err(dev, "Invalid domain page table\n");
is the dev_err() really requested. I see in domain_setup_first_level(),
there is none.
>> + return -EINVAL;
>> }
>
> ok, I see how it is used. possibly combine last and this one together since
> it's mostly moving code...

I tend to agree with Kevin. May be better squash the 2 patches. Also not
sure the inline of iommu_skip_agaw() is meaningful then. Also Add commit
messages on the resulting patch.

Note domain_setup_first_level() also could use the helper while we are
it (if declaration moved to common helper). Only the error code differs
in case !dma_pte_present(pgd), ie. -ENOMEM. May be good to align.

Otherwise those stuff may be done in a fixup patch.

Thanks

Eric
>
>>
>> pgd_val = virt_to_phys(pgd);
>> --
>> 2.7.4
>

2020-03-28 10:05:55

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH V10 09/11] iommu/vt-d: Cache virtual command capability register

> From: Jacob Pan <[email protected]>
> Sent: Saturday, March 21, 2020 7:28 AM
>
> Virtual command registers are used in the guest only, to prevent
> vmexit cost, we cache the capability and store it during initialization.
>
> Signed-off-by: Jacob Pan <[email protected]>
> Reviewed-by: Eric Auger <[email protected]>
> Reviewed-by: Lu Baolu <[email protected]>
>
> ---
> v7 Reviewed by Eric & Baolu
> ---
> ---
> drivers/iommu/dmar.c | 1 +
> include/linux/intel-iommu.h | 5 +++++
> 2 files changed, 6 insertions(+)
>
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index 4d6b7b5b37ee..3b36491c8bbb 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -963,6 +963,7 @@ static int map_iommu(struct intel_iommu *iommu,
> u64 phys_addr)
> warn_invalid_dmar(phys_addr, " returns all ones");
> goto unmap;
> }
> + iommu->vccap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
>
> /* the registers might be more than one page */
> map_size = max_t(int, ecap_max_iotlb_offset(iommu->ecap),
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 43539713b3b3..ccbf164fb711 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -194,6 +194,9 @@
> #define ecap_max_handle_mask(e) ((e >> 20) & 0xf)
> #define ecap_sc_support(e) ((e >> 7) & 0x1) /* Snooping Control */
>
> +/* Virtual command interface capabilities */

capabilities -> capability

Reviewed-by: Kevin Tian <[email protected]>

> +#define vccap_pasid(v) ((v & DMA_VCS_PAS)) /* PASID
> allocation */
> +
> /* IOTLB_REG */
> #define DMA_TLB_FLUSH_GRANU_OFFSET 60
> #define DMA_TLB_GLOBAL_FLUSH (((u64)1) << 60)
> @@ -287,6 +290,7 @@
>
> /* PRS_REG */
> #define DMA_PRS_PPR ((u32)1)
> +#define DMA_VCS_PAS ((u64)1)
>
> #define IOMMU_WAIT_OP(iommu, offset, op, cond, sts)
> \
> do { \
> @@ -537,6 +541,7 @@ struct intel_iommu {
> u64 reg_size; /* size of hw register set */
> u64 cap;
> u64 ecap;
> + u64 vccap;
> u32 gcmd; /* Holds TE, EAFL. Don't need SRTP, SFL, WBF
> */
> raw_spinlock_t register_lock; /* protect register handling */
> int seq_id; /* sequence id of the iommu */
> --
> 2.7.4

2020-03-29 07:23:01

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH V10 03/11] iommu/vt-d: Add a helper function to skip agaw

On 2020/3/27 19:53, Tian, Kevin wrote:
>> From: Jacob Pan <[email protected]>
>> Sent: Saturday, March 21, 2020 7:28 AM
>>
>> Signed-off-by: Jacob Pan <[email protected]>
>
> could you elaborate in which scenario this helper function is required?

I added below commit message:

An Intel iommu domain uses 5-level page table by default. If the
iommu that the domain tries to attach supports less page levels,
the top level page tables should be skipped. Add a helper to do
this so that it could be used in other places.

Best regards,
baolu

>
>> ---
>> drivers/iommu/intel-pasid.c | 22 ++++++++++++++++++++++
>> 1 file changed, 22 insertions(+)
>>
>> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
>> index 22b30f10b396..191508c7c03e 100644
>> --- a/drivers/iommu/intel-pasid.c
>> +++ b/drivers/iommu/intel-pasid.c
>> @@ -500,6 +500,28 @@ int intel_pasid_setup_first_level(struct intel_iommu
>> *iommu,
>> }
>>
>> /*
>> + * Skip top levels of page tables for iommu which has less agaw
>> + * than default. Unnecessary for PT mode.
>> + */
>> +static inline int iommu_skip_agaw(struct dmar_domain *domain,
>> + struct intel_iommu *iommu,
>> + struct dma_pte **pgd)
>> +{
>> + int agaw;
>> +
>> + for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
>> + *pgd = phys_to_virt(dma_pte_addr(*pgd));
>> + if (!dma_pte_present(*pgd)) {
>> + return -EINVAL;
>> + }
>> + }
>> + pr_debug_ratelimited("%s: pgd: %llx, agaw %d d_agaw %d\n",
>> __func__, (u64)*pgd,
>> + iommu->agaw, domain->agaw);
>> +
>> + return agaw;
>> +}
>> +
>> +/*
>> * Set up the scalable mode pasid entry for second only translation type.
>> */
>> int intel_pasid_setup_second_level(struct intel_iommu *iommu,
>> --
>> 2.7.4
>

2020-03-29 07:39:02

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH V10 04/11] iommu/vt-d: Use helper function to skip agaw for SL

On 2020/3/28 0:05, Auger Eric wrote:
> Hi Jacob,
>
> On 3/27/20 12:55 PM, Tian, Kevin wrote:
>>> From: Jacob Pan<[email protected]>
>>> Sent: Saturday, March 21, 2020 7:28 AM
>>>
>>> Signed-off-by: Jacob Pan<[email protected]>
>>> ---
>>> drivers/iommu/intel-pasid.c | 14 ++++----------
>>> 1 file changed, 4 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
>>> index 191508c7c03e..9bdb7ee228b6 100644
>>> --- a/drivers/iommu/intel-pasid.c
>>> +++ b/drivers/iommu/intel-pasid.c
>>> @@ -544,17 +544,11 @@ int intel_pasid_setup_second_level(struct
>>> intel_iommu *iommu,
>>> return -EINVAL;
>>> }
>>>
>>> - /*
>>> - * Skip top levels of page tables for iommu which has less agaw
>>> - * than default. Unnecessary for PT mode.
>>> - */
>>> pgd = domain->pgd;
>>> - for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
>>> - pgd = phys_to_virt(dma_pte_addr(pgd));
>>> - if (!dma_pte_present(pgd)) {
>>> - dev_err(dev, "Invalid domain page table\n");
>>> - return -EINVAL;
>>> - }
>>> + agaw = iommu_skip_agaw(domain, iommu, &pgd);
>>> + if (agaw < 0) {
>>> + dev_err(dev, "Invalid domain page table\n");
> is the dev_err() really requested. I see in domain_setup_first_level(),
> there is none.
>>> + return -EINVAL;
>>> }
>> ok, I see how it is used. possibly combine last and this one together since
>> it's mostly moving code...
> I tend to agree with Kevin. May be better squash the 2 patches. Also not
> sure the inline of iommu_skip_agaw() is meaningful then. Also Add commit
> messages on the resulting patch.
>
> Note domain_setup_first_level() also could use the helper while we are
> it (if declaration moved to common helper). Only the error code differs
> in case !dma_pte_present(pgd), ie. -ENOMEM. May be good to align.
>
> Otherwise those stuff may be done in a fixup patch.

Agreed. Will squash these 2 patches with a meaningful commit message. As
for using this helper in other files, like domain_setup_first_level(),
we need more review and test efforts, hence it's better to put it in a
followup patch.

Best regards,
baolu

2020-03-30 17:47:42

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH V10 03/11] iommu/vt-d: Add a helper function to skip agaw

On Sun, 29 Mar 2020 15:20:55 +0800
Lu Baolu <[email protected]> wrote:

> On 2020/3/27 19:53, Tian, Kevin wrote:
> >> From: Jacob Pan <[email protected]>
> >> Sent: Saturday, March 21, 2020 7:28 AM
> >>
> >> Signed-off-by: Jacob Pan <[email protected]>
> >
> > could you elaborate in which scenario this helper function is
> > required?
>
> I added below commit message:
>
> An Intel iommu domain uses 5-level page table by default. If the
> iommu that the domain tries to attach supports less page levels,
> the top level page tables should be skipped. Add a helper to do
> this so that it could be used in other places.
>
Thanks Baolu,
I will also add this to my v11, it might save you some time :)


> Best regards,
> baolu
>
> >
> >> ---
> >> drivers/iommu/intel-pasid.c | 22 ++++++++++++++++++++++
> >> 1 file changed, 22 insertions(+)
> >>
> >> diff --git a/drivers/iommu/intel-pasid.c
> >> b/drivers/iommu/intel-pasid.c index 22b30f10b396..191508c7c03e
> >> 100644 --- a/drivers/iommu/intel-pasid.c
> >> +++ b/drivers/iommu/intel-pasid.c
> >> @@ -500,6 +500,28 @@ int intel_pasid_setup_first_level(struct
> >> intel_iommu *iommu,
> >> }
> >>
> >> /*
> >> + * Skip top levels of page tables for iommu which has less agaw
> >> + * than default. Unnecessary for PT mode.
> >> + */
> >> +static inline int iommu_skip_agaw(struct dmar_domain *domain,
> >> + struct intel_iommu *iommu,
> >> + struct dma_pte **pgd)
> >> +{
> >> + int agaw;
> >> +
> >> + for (agaw = domain->agaw; agaw > iommu->agaw; agaw--) {
> >> + *pgd = phys_to_virt(dma_pte_addr(*pgd));
> >> + if (!dma_pte_present(*pgd)) {
> >> + return -EINVAL;
> >> + }
> >> + }
> >> + pr_debug_ratelimited("%s: pgd: %llx, agaw %d d_agaw %d\n",
> >> __func__, (u64)*pgd,
> >> + iommu->agaw, domain->agaw);
> >> +
> >> + return agaw;
> >> +}
> >> +
> >> +/*
> >> * Set up the scalable mode pasid entry for second only
> >> translation type. */
> >> int intel_pasid_setup_second_level(struct intel_iommu *iommu,
> >> --
> >> 2.7.4
> >

[Jacob Pan]

2020-03-31 22:30:49

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH V10 09/11] iommu/vt-d: Cache virtual command capability register

On Sat, 28 Mar 2020 10:04:38 +0000
"Tian, Kevin" <[email protected]> wrote:

> > From: Jacob Pan <[email protected]>
> > Sent: Saturday, March 21, 2020 7:28 AM
> >
> > Virtual command registers are used in the guest only, to prevent
> > vmexit cost, we cache the capability and store it during
> > initialization.
> >
> > Signed-off-by: Jacob Pan <[email protected]>
> > Reviewed-by: Eric Auger <[email protected]>
> > Reviewed-by: Lu Baolu <[email protected]>
> >
> > ---
> > v7 Reviewed by Eric & Baolu
> > ---
> > ---
> > drivers/iommu/dmar.c | 1 +
> > include/linux/intel-iommu.h | 5 +++++
> > 2 files changed, 6 insertions(+)
> >
> > diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> > index 4d6b7b5b37ee..3b36491c8bbb 100644
> > --- a/drivers/iommu/dmar.c
> > +++ b/drivers/iommu/dmar.c
> > @@ -963,6 +963,7 @@ static int map_iommu(struct intel_iommu *iommu,
> > u64 phys_addr)
> > warn_invalid_dmar(phys_addr, " returns all ones");
> > goto unmap;
> > }
> > + iommu->vccap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
> >
> > /* the registers might be more than one page */
> > map_size = max_t(int, ecap_max_iotlb_offset(iommu->ecap),
> > diff --git a/include/linux/intel-iommu.h
> > b/include/linux/intel-iommu.h index 43539713b3b3..ccbf164fb711
> > 100644 --- a/include/linux/intel-iommu.h
> > +++ b/include/linux/intel-iommu.h
> > @@ -194,6 +194,9 @@
> > #define ecap_max_handle_mask(e) ((e >> 20) & 0xf)
> > #define ecap_sc_support(e) ((e >> 7) & 0x1) /* Snooping
> > Control */
> >
> > +/* Virtual command interface capabilities */
>
> capabilities -> capability
Will do, I was thinking the future :)

Thanks,

Jacob
>
> Reviewed-by: Kevin Tian <[email protected]>
>
> > +#define vccap_pasid(v) ((v & DMA_VCS_PAS)) /* PASID
> > allocation */
> > +
> > /* IOTLB_REG */
> > #define DMA_TLB_FLUSH_GRANU_OFFSET 60
> > #define DMA_TLB_GLOBAL_FLUSH (((u64)1) << 60)
> > @@ -287,6 +290,7 @@
> >
> > /* PRS_REG */
> > #define DMA_PRS_PPR ((u32)1)
> > +#define DMA_VCS_PAS ((u64)1)
> >
> > #define IOMMU_WAIT_OP(iommu, offset, op, cond, sts)
> > \
> > do
> > {
> > \ @@ -537,6 +541,7 @@ struct intel_iommu { u64
> > reg_size; /* size of hw register set */ u64 cap;
> > u64 ecap;
> > + u64 vccap;
> > u32 gcmd; /* Holds TE, EAFL. Don't need
> > SRTP, SFL, WBF */
> > raw_spinlock_t register_lock; /* protect register
> > handling */ int seq_id; /* sequence id of the
> > iommu */ --
> > 2.7.4
>

[Jacob Pan]