This series is mostly extracted from the nested SVA/SVM series, based
on the review comments on v7:
https://lkml.org/lkml/2019/10/24/852
It also adds a few important fixes for native use of SVA. The new
nested SVA code will be submitted separately as a smaller set. This
series is based on the core branch of the IOMMU tree staged for v5.5,
where the common APIs for vSVA were applied:
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git core
Changelog:
V3
- Squashed 1/10 & 2/10
- Deleted "8/10 Fix PASID cache flush" from this series
- Addressed reviews from Eric Auger and Baolu
V2
- Coding style fixes based on Baolu's input, no functional change
- Added Acked-by tags.
Thanks,
Jacob
Jacob Pan (8):
iommu/vt-d: Fix CPU and IOMMU SVM feature matching checks
iommu/vt-d: Match CPU and IOMMU paging mode
iommu/vt-d: Reject SVM bind for failed capability check
iommu/vt-d: Avoid duplicated code for PASID setup
iommu/vt-d: Fix off-by-one in PASID allocation
iommu/vt-d: Replace Intel specific PASID allocator with IOASID
iommu/vt-d: Avoid sending invalid page response
iommu/vt-d: Misc macro clean up for SVM
drivers/iommu/Kconfig | 1 +
drivers/iommu/intel-iommu.c | 23 +++----
drivers/iommu/intel-pasid.c | 96 ++++++++------------------
drivers/iommu/intel-svm.c | 162 +++++++++++++++++++++++++-------------------
include/linux/intel-iommu.h | 5 +-
5 files changed, 134 insertions(+), 153 deletions(-)
--
2.7.4
Shared Virtual Memory (SVM) relies on a collective set of hardware
features detected at runtime, and there are requirements for matching
CPU and IOMMU capabilities.
The current code checks the CPU and IOMMU feature sets for SVM support,
but the result is never stored or used. Therefore, SVM can still be used
even when these checks fail. The consequences can be:
1. The CPU uses 5-level paging mode for 57-bit virtual addresses, but
the IOMMU can only support 4-level paging mode with 48-bit addresses for
DMA.
2. 1GB pages are used by the CPU, but the IOMMU does not support them.
VT-d may generate unrecoverable faults.
The best solution to these problems is to prevent them in the first
place.
This patch consolidates the code for checking PASID support and CPU vs.
IOMMU paging mode compatibility, and provides a specific error message
for each failed check. On sane hardware configurations, these error
messages should never appear in the kernel log.
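For illustration, a minimal sketch of the flow this enables, together
with the bind-time check added later in this series (the function names
are from these patches; the surrounding driver code is elided):
	/* At initialization, once per DMAR unit: */
	for_each_active_iommu(iommu, drhd)
		intel_svm_check(iommu);	/* sets VTD_FLAG_SVM_CAPABLE, or logs why not */
	/* Later, when a driver binds an mm: */
	if (!intel_svm_capable(iommu))
		return -ENOTSUPP;	/* the mismatch was already reported at init */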
Signed-off-by: Jacob Pan <[email protected]>
Acked-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel-iommu.c | 10 ++--------
drivers/iommu/intel-svm.c | 40 +++++++++++++++++++++++++++-------------
include/linux/intel-iommu.h | 5 ++++-
3 files changed, 33 insertions(+), 22 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3f974919d3bd..d598168e410d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3289,10 +3289,7 @@ static int __init init_dmars(void)
if (!ecap_pass_through(iommu->ecap))
hw_pass_through = 0;
-#ifdef CONFIG_INTEL_IOMMU_SVM
- if (pasid_supported(iommu))
- intel_svm_init(iommu);
-#endif
+ intel_svm_check(iommu);
}
/*
@@ -4471,10 +4468,7 @@ static int intel_iommu_add(struct dmar_drhd_unit *dmaru)
if (ret)
goto out;
-#ifdef CONFIG_INTEL_IOMMU_SVM
- if (pasid_supported(iommu))
- intel_svm_init(iommu);
-#endif
+ intel_svm_check(iommu);
if (dmaru->ignored) {
/*
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 9b159132405d..716c543488f6 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -23,19 +23,6 @@
static irqreturn_t prq_event_thread(int irq, void *d);
-int intel_svm_init(struct intel_iommu *iommu)
-{
- if (cpu_feature_enabled(X86_FEATURE_GBPAGES) &&
- !cap_fl1gp_support(iommu->cap))
- return -EINVAL;
-
- if (cpu_feature_enabled(X86_FEATURE_LA57) &&
- !cap_5lp_support(iommu->cap))
- return -EINVAL;
-
- return 0;
-}
-
#define PRQ_ORDER 0
int intel_svm_enable_prq(struct intel_iommu *iommu)
@@ -99,6 +86,33 @@ int intel_svm_finish_prq(struct intel_iommu *iommu)
return 0;
}
+static inline bool intel_svm_capable(struct intel_iommu *iommu)
+{
+ return iommu->flags & VTD_FLAG_SVM_CAPABLE;
+}
+
+void intel_svm_check(struct intel_iommu *iommu)
+{
+ if (!pasid_supported(iommu))
+ return;
+
+ if (cpu_feature_enabled(X86_FEATURE_GBPAGES) &&
+ !cap_fl1gp_support(iommu->cap)) {
+ pr_err("%s SVM disabled, incompatible 1GB page capability\n",
+ iommu->name);
+ return;
+ }
+
+ if (cpu_feature_enabled(X86_FEATURE_LA57) &&
+ !cap_5lp_support(iommu->cap)) {
+ pr_err("%s SVM disabled, incompatible paging mode\n",
+ iommu->name);
+ return;
+ }
+
+ iommu->flags |= VTD_FLAG_SVM_CAPABLE;
+}
+
static void intel_flush_svm_range_dev (struct intel_svm *svm, struct intel_svm_dev *sdev,
unsigned long address, unsigned long pages, int ih)
{
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index ed11ef594378..7dcfa1c4a844 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -433,6 +433,7 @@ enum {
#define VTD_FLAG_TRANS_PRE_ENABLED (1 << 0)
#define VTD_FLAG_IRQ_REMAP_PRE_ENABLED (1 << 1)
+#define VTD_FLAG_SVM_CAPABLE (1 << 2)
extern int intel_iommu_sm;
@@ -656,7 +657,7 @@ void iommu_flush_write_buffer(struct intel_iommu *iommu);
int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct device *dev);
#ifdef CONFIG_INTEL_IOMMU_SVM
-int intel_svm_init(struct intel_iommu *iommu);
+extern void intel_svm_check(struct intel_iommu *iommu);
extern int intel_svm_enable_prq(struct intel_iommu *iommu);
extern int intel_svm_finish_prq(struct intel_iommu *iommu);
@@ -684,6 +685,8 @@ struct intel_svm {
};
extern struct intel_iommu *intel_svm_device_to_iommu(struct device *dev);
+#else
+static inline void intel_svm_check(struct intel_iommu *iommu) {}
#endif
#ifdef CONFIG_INTEL_IOMMU_DEBUGFS
--
2.7.4
When setting up first level page tables for sharing with the CPU, we
need to ensure that the IOMMU supports no fewer paging levels than the
CPU does.
It is not adequate, as in the current code, to set up 5-level paging in
the PASID entry First Level Paging Mode (FLPM) field based solely on the
CPU feature.
Currently, intel_pasid_setup_first_level() is only used by the native
SVM code, which already checks that the paging modes match. However,
future use of this helper function may not be limited to native SVM:
https://lkml.org/lkml/2019/11/18/1037
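For reference, the relationship being enforced (assumed from the CPU
and VT-d documentation, not spelled out in the patch):
	/*
	 * CPU LA57 off -> 4-level paging, 48-bit VA: any first-level-capable
	 *                 IOMMU can mirror the CPU page table.
	 * CPU LA57 on  -> 5-level paging, 57-bit VA: requires
	 *                 cap_5lp_support() so that FLPM can be set to 1
	 *                 (5-level first-level translation).
	 */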
Fixes: 437f35e1cd4c8 ("iommu/vt-d: Add first level page table interface")
Signed-off-by: Jacob Pan <[email protected]>
Acked-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel-pasid.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 040a445be300..e7cb0b8a7332 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -499,8 +499,16 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
}
#ifdef CONFIG_X86
- if (cpu_feature_enabled(X86_FEATURE_LA57))
- pasid_set_flpm(pte, 1);
+ /* Both CPU and IOMMU paging mode need to match */
+ if (cpu_feature_enabled(X86_FEATURE_LA57)) {
+ if (cap_5lp_support(iommu->cap)) {
+ pasid_set_flpm(pte, 1);
+ } else {
+ pr_err("VT-d has no 5-level paging support for CPU\n");
+ pasid_clear_entry(pte);
+ return -EINVAL;
+ }
+ }
#endif /* CONFIG_X86 */
pasid_set_domain_id(pte, did);
--
2.7.4
Make use of the generic IOASID code to manage PASID allocation,
freeing, and lookup, replacing the Intel-specific code.
The IOASID allocator is inclusive of both the start and end of the
allocation range. The current code is based on the IDR, which is
exclusive of the end of the allocation range. This patch fixes the
off-by-one error in intel_svm_bind_mm, where pasid_max - 1 is now used
as the end of the allocation range.
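The difference in range semantics, sketched (illustrative only; both
calls below cover the same PASID space):
	/* IDR: the end argument is exclusive, i.e. [PASID_MIN, pasid_max) */
	int id = idr_alloc(&pasid_idr, svm, PASID_MIN, pasid_max, GFP_KERNEL);
	/* IOASID: the max argument is inclusive, so pass pasid_max - 1 */
	ioasid_t pasid = ioasid_alloc(NULL, PASID_MIN, pasid_max - 1, svm);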
Signed-off-by: Jacob Pan <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>
Acked-by: Lu Baolu <[email protected]>
---
drivers/iommu/Kconfig | 1 +
drivers/iommu/intel-iommu.c | 13 +++++++------
drivers/iommu/intel-pasid.c | 36 ------------------------------------
drivers/iommu/intel-svm.c | 39 +++++++++++++++++++++++++--------------
4 files changed, 33 insertions(+), 56 deletions(-)
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index fd50ddffffbf..43ce450a40d3 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -212,6 +212,7 @@ config INTEL_IOMMU_SVM
depends on INTEL_IOMMU && X86
select PCI_PASID
select MMU_NOTIFIER
+ select IOASID
help
Shared Virtual Memory (SVM) provides a facility for devices
to access DMA resources through process address space by
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index d598168e410d..a84f0caa33a0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5238,7 +5238,7 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,
domain->auxd_refcnt--;
if (!domain->auxd_refcnt && domain->default_pasid > 0)
- intel_pasid_free_id(domain->default_pasid);
+ ioasid_free(domain->default_pasid);
}
static int aux_domain_add_dev(struct dmar_domain *domain,
@@ -5256,10 +5256,11 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
if (domain->default_pasid <= 0) {
int pasid;
- pasid = intel_pasid_alloc_id(domain, PASID_MIN,
- pci_max_pasids(to_pci_dev(dev)),
- GFP_KERNEL);
- if (pasid <= 0) {
+ /* No private data needed for the default pasid */
+ pasid = ioasid_alloc(NULL, PASID_MIN,
+ pci_max_pasids(to_pci_dev(dev)) - 1,
+ NULL);
+ if (pasid == INVALID_IOASID) {
pr_err("Can't allocate default pasid\n");
return -ENODEV;
}
@@ -5295,7 +5296,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
spin_unlock(&iommu->lock);
spin_unlock_irqrestore(&device_domain_lock, flags);
if (!domain->auxd_refcnt && domain->default_pasid > 0)
- intel_pasid_free_id(domain->default_pasid);
+ ioasid_free(domain->default_pasid);
return ret;
}
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index 732bfee228df..3cb569e76642 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -26,42 +26,6 @@
*/
static DEFINE_SPINLOCK(pasid_lock);
u32 intel_pasid_max_id = PASID_MAX;
-static DEFINE_IDR(pasid_idr);
-
-int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp)
-{
- int ret, min, max;
-
- min = max_t(int, start, PASID_MIN);
- max = min_t(int, end, intel_pasid_max_id);
-
- WARN_ON(in_interrupt());
- idr_preload(gfp);
- spin_lock(&pasid_lock);
- ret = idr_alloc(&pasid_idr, ptr, min, max, GFP_ATOMIC);
- spin_unlock(&pasid_lock);
- idr_preload_end();
-
- return ret;
-}
-
-void intel_pasid_free_id(int pasid)
-{
- spin_lock(&pasid_lock);
- idr_remove(&pasid_idr, pasid);
- spin_unlock(&pasid_lock);
-}
-
-void *intel_pasid_lookup_id(int pasid)
-{
- void *p;
-
- spin_lock(&pasid_lock);
- p = idr_find(&pasid_idr, pasid);
- spin_unlock(&pasid_lock);
-
- return p;
-}
/*
* Per device pasid table management:
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index e90d0b914afe..71b85b892c56 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -17,6 +17,7 @@
#include <linux/dmar.h>
#include <linux/interrupt.h>
#include <linux/mm_types.h>
+#include <linux/ioasid.h>
#include <asm/page.h>
#include "intel-pasid.h"
@@ -335,16 +336,15 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
if (pasid_max > intel_pasid_max_id)
pasid_max = intel_pasid_max_id;
- /* Do not use PASID 0 in caching mode (virtualised IOMMU) */
- ret = intel_pasid_alloc_id(svm,
- !!cap_caching_mode(iommu->cap),
- pasid_max, GFP_KERNEL);
- if (ret < 0) {
+ /* Do not use PASID 0, reserved for RID to PASID */
+ svm->pasid = ioasid_alloc(NULL, PASID_MIN,
+ pasid_max - 1, svm);
+ if (svm->pasid == INVALID_IOASID) {
kfree(svm);
kfree(sdev);
+ ret = -ENOSPC;
goto out;
}
- svm->pasid = ret;
svm->notifier.ops = &intel_mmuops;
svm->mm = mm;
svm->flags = flags;
@@ -354,7 +354,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
if (mm) {
ret = mmu_notifier_register(&svm->notifier, mm);
if (ret) {
- intel_pasid_free_id(svm->pasid);
+ ioasid_free(svm->pasid);
kfree(svm);
kfree(sdev);
goto out;
@@ -370,7 +370,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
if (ret) {
if (mm)
mmu_notifier_unregister(&svm->notifier, mm);
- intel_pasid_free_id(svm->pasid);
+ ioasid_free(svm->pasid);
kfree(svm);
kfree(sdev);
goto out;
@@ -418,7 +418,12 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
	if (!iommu)
		goto out;
-	svm = intel_pasid_lookup_id(pasid);
+	svm = ioasid_find(NULL, pasid, NULL);
	if (!svm)
		goto out;
+	if (IS_ERR(svm)) {
+		ret = PTR_ERR(svm);
+		goto out;
+	}
+
@@ -440,7 +448,7 @@ int intel_svm_unbind_mm(struct device *dev, int pasid)
kfree_rcu(sdev, rcu);
if (list_empty(&svm->devs)) {
- intel_pasid_free_id(svm->pasid);
+ ioasid_free(svm->pasid);
if (svm->mm)
mmu_notifier_unregister(&svm->notifier, svm->mm);
@@ -475,10 +483,14 @@ int intel_svm_is_pasid_valid(struct device *dev, int pasid)
if (!iommu)
goto out;
- svm = intel_pasid_lookup_id(pasid);
+ svm = ioasid_find(NULL, pasid, NULL);
if (!svm)
goto out;
+ if (IS_ERR(svm)) {
+ ret = PTR_ERR(svm);
+ goto out;
+ }
/* init_mm is used in this case */
if (!svm->mm)
ret = 1;
@@ -585,13 +597,12 @@ static irqreturn_t prq_event_thread(int irq, void *d)
if (!svm || svm->pasid != req->pasid) {
rcu_read_lock();
- svm = intel_pasid_lookup_id(req->pasid);
+ svm = ioasid_find(NULL, req->pasid, NULL);
/* It *can't* go away, because the driver is not permitted
* to unbind the mm while any page faults are outstanding.
* So we only need RCU to protect the internal idr code. */
rcu_read_unlock();
-
- if (!svm) {
+ if (IS_ERR_OR_NULL(svm)) {
pr_err("%s: Page request for invalid PASID %d: %08llx %08llx\n",
iommu->name, req->pasid, ((unsigned long long *)req)[0],
((unsigned long long *)req)[1]);
--
2.7.4
After setting up each PASID entry, the related translation caches must
be flushed. Combine the duplicated flush code into one function, which
is less error prone.
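In short, the policy the new helper encodes (a summary of the code
below, assuming the usual meaning of these capability bits):
	/*
	 * - !ecap_coherent(ecap): the IOMMU does not snoop the CPU cache,
	 *   so clflush the PASID entry after writing it;
	 * - cap_caching_mode(cap): running under a virtual IOMMU, so the
	 *   PASID cache and PASID-based IOTLB must be invalidated explicitly;
	 * - otherwise: flushing the write buffer is sufficient.
	 */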
Signed-off-by: Jacob Pan <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>
Acked-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel-pasid.c | 48 +++++++++++++++++----------------------------
1 file changed, 18 insertions(+), 30 deletions(-)
diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
index e7cb0b8a7332..732bfee228df 100644
--- a/drivers/iommu/intel-pasid.c
+++ b/drivers/iommu/intel-pasid.c
@@ -465,6 +465,21 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu,
devtlb_invalidation_with_pasid(iommu, dev, pasid);
}
+static void pasid_flush_caches(struct intel_iommu *iommu,
+ struct pasid_entry *pte,
+ int pasid, u16 did)
+{
+ if (!ecap_coherent(iommu->ecap))
+ clflush_cache_range(pte, sizeof(*pte));
+
+ if (cap_caching_mode(iommu->cap)) {
+ pasid_cache_invalidation_with_pasid(iommu, did, pasid);
+ iotlb_invalidation_with_pasid(iommu, did, pasid);
+ } else {
+ iommu_flush_write_buffer(iommu);
+ }
+}
+
/*
* Set up the scalable mode pasid table entry for first only
* translation type.
@@ -518,16 +533,7 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
/* Setup Present and PASID Granular Transfer Type: */
pasid_set_translation_type(pte, 1);
pasid_set_present(pte);
-
- if (!ecap_coherent(iommu->ecap))
- clflush_cache_range(pte, sizeof(*pte));
-
- if (cap_caching_mode(iommu->cap)) {
- pasid_cache_invalidation_with_pasid(iommu, did, pasid);
- iotlb_invalidation_with_pasid(iommu, did, pasid);
- } else {
- iommu_flush_write_buffer(iommu);
- }
+ pasid_flush_caches(iommu, pte, pasid, did);
return 0;
}
@@ -591,16 +597,7 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
*/
pasid_set_sre(pte);
pasid_set_present(pte);
-
- if (!ecap_coherent(iommu->ecap))
- clflush_cache_range(pte, sizeof(*pte));
-
- if (cap_caching_mode(iommu->cap)) {
- pasid_cache_invalidation_with_pasid(iommu, did, pasid);
- iotlb_invalidation_with_pasid(iommu, did, pasid);
- } else {
- iommu_flush_write_buffer(iommu);
- }
+ pasid_flush_caches(iommu, pte, pasid, did);
return 0;
}
@@ -634,16 +631,7 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
*/
pasid_set_sre(pte);
pasid_set_present(pte);
-
- if (!ecap_coherent(iommu->ecap))
- clflush_cache_range(pte, sizeof(*pte));
-
- if (cap_caching_mode(iommu->cap)) {
- pasid_cache_invalidation_with_pasid(iommu, did, pasid);
- iotlb_invalidation_with_pasid(iommu, did, pasid);
- } else {
- iommu_flush_write_buffer(iommu);
- }
+ pasid_flush_caches(iommu, pte, pasid, did);
return 0;
}
--
2.7.4
Add a check during SVM bind to ensure that the CPU and IOMMU hardware
capability requirements are met.
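A hypothetical caller would see the new failure mode like this (a
sketch; the warning string is made up):
	ret = intel_svm_bind_mm(dev, &pasid, 0, NULL);
	if (ret == -ENOTSUPP)
		dev_warn(dev, "SVM not usable: CPU/IOMMU capability mismatch\n");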
Signed-off-by: Jacob Pan <[email protected]>
Acked-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel-svm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 716c543488f6..74df10a39dfc 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -238,6 +238,9 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
if (!iommu || dmar_disabled)
return -EINVAL;
+ if (!intel_svm_capable(iommu))
+ return -ENOTSUPP;
+
if (dev_is_pci(dev)) {
pasid_max = pci_max_pasids(to_pci_dev(dev));
if (pasid_max < 0)
--
2.7.4
Hi Jacob,
On 11/19/19 6:56 PM, Jacob Pan wrote:
> Shared Virtual Memory (SVM) relies on a collective set of hardware
> features detected at runtime, and there are requirements for matching
> CPU and IOMMU capabilities.
>
> The current code checks the CPU and IOMMU feature sets for SVM support,
> but the result is never stored or used. Therefore, SVM can still be
> used even when these checks fail. The consequences can be:
> 1. The CPU uses 5-level paging mode for 57-bit virtual addresses, but
> the IOMMU can only support 4-level paging mode with 48-bit addresses
> for DMA.
> 2. 1GB pages are used by the CPU, but the IOMMU does not support them.
> VT-d may generate unrecoverable faults.
>
> The best solution to these problems is to prevent them in the first
> place.
>
> This patch consolidates the code for checking PASID support and CPU
> vs. IOMMU paging mode compatibility, and provides a specific error
> message for each failed check. On sane hardware configurations, these
> error messages should never appear in the kernel log.
>
> Signed-off-by: Jacob Pan <[email protected]>
> Acked-by: Lu Baolu <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Thanks
Eric
Hi Jacob,
On 11/19/19 6:56 PM, Jacob Pan wrote:
> Add a check during SVM bind to ensure that the CPU and IOMMU hardware
> capability requirements are met.
>
> Signed-off-by: Jacob Pan <[email protected]>
> Acked-by: Lu Baolu <[email protected]>
I sent my R-b on v2:
Reviewed-by: Eric Auger <[email protected]>
Thanks
Eric
Hi Jacob,
On 11/19/19 6:56 PM, Jacob Pan wrote:
> After setting up each PASID entry, the related translation caches must
> be flushed. Combine the duplicated flush code into one function, which
> is less error prone.
>
> Signed-off-by: Jacob Pan <[email protected]>
> Reviewed-by: Lu Baolu <[email protected]>
> Acked-by: Lu Baolu <[email protected]>
I sent my R-b already on v2:
Reviewed-by: Eric Auger <[email protected]>
Eric
Hi Jacob,
On 11/19/19 6:56 PM, Jacob Pan wrote:
> Make use of the generic IOASID code to manage PASID allocation,
> freeing, and lookup, replacing the Intel-specific code.
> The IOASID allocator is inclusive of both the start and end of the
> allocation range. The current code is based on the IDR, which is
> exclusive of the end of the allocation range. This patch fixes the
> off-by-one error in intel_svm_bind_mm, where pasid_max - 1 is now used
> as the end of the allocation range.
This no longer applies, as the off-by-one is handled in 5/8.
Besides
Reviewed-by: Eric Auger <[email protected]>
Thanks
Eric
Hi Jacob,
On 11/19/19 6:56 PM, Jacob Pan wrote:
> When setting up first level page tables for sharing with the CPU, we
> need to ensure that the IOMMU supports no fewer paging levels than the
> CPU does.
>
> It is not adequate, as in the current code, to set up 5-level paging
> in the PASID entry First Level Paging Mode (FLPM) field based solely
> on the CPU feature.
>
> Currently, intel_pasid_setup_first_level() is only used by the native
> SVM code, which already checks that the paging modes match. However,
> future use of this helper function may not be limited to native SVM:
> https://lkml.org/lkml/2019/11/18/1037
>
> Fixes: 437f35e1cd4c8 ("iommu/vt-d: Add first level page table interface")
> Signed-off-by: Jacob Pan <[email protected]>
> Acked-by: Lu Baolu <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Thanks
Eric