On Mon, 2016-09-19 at 14:18 +0200, Joerg Roedel wrote:
> [Cc'ing David]
>
> On Mon, Sep 12, 2016 at 10:49:11AM +0800, Xunlei Pang wrote:
> >
> > According to the vt-d spec, the size of pasid (state) entry is 8B
> > which equals 3 in power of 2, the number of pasid (state) entries
> > is (ecap_pss + 1) in power of 2.
> >
> > Thus the right size of pasid (state) table in power of 2 should be
> > ecap_pss(iommu->ecap) plus "1+3=4" other than 7.
> >
> > Signed-off-by: Xunlei Pang <[email protected]>
> > ---
> > drivers/iommu/intel-svm.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> > index 8ebb353..cfa75c2 100644
> > --- a/drivers/iommu/intel-svm.c
> > +++ b/drivers/iommu/intel-svm.c
> > @@ -39,7 +39,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
> > struct page *pages;
> > int order;
> >
> > - order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
> > + order = ecap_pss(iommu->ecap) + 4 - PAGE_SHIFT;
> > if (order < 0)
> > order = 0;
>
> The patch seems to be correct, but I'll let David comment on it first.
Yes, that looks correct. I think we may also need to limit it, because
full 20-bit PASID support means we'll attempt an order 11 allocation.
But that's certainly correct for now
Acked-by: David Woodhouse <[email protected]>
--
dwmw2
On Wed, 2016-10-12 at 13:17 +0100, David Woodhouse wrote:
> Yes, that looks correct. I think we may also need to limit it, because
> full 20-bit PASID support means we'll attempt an order 11 allocation.
> But that's certainly correct for now
Actually, not quite correct. You fixed the allocation but not the free.
And Mika had reported that even the *correct* allocation was failing
because it was too large. So I've done it differently (untested)...
-----
Subject: [PATCH] iommu/vt-d: Fix PASID table allocation
Somehow I ended up with an off-by-three error in calculating the size of
the PASID and PASID State tables, which triggers allocations failures as
those tables unfortunately have to be physically contiguous.
In fact, even the *correct* maximum size of 8MiB is problematic and is
wont to lead to allocation failures. Since I have extracted a promise
that this *will* be fixed in hardware, I'm happy to limit it on the
current hardware to a maximum of 0x20000 PASIDs, which gives us 1MiB
tables — still not ideal, but better than before.
Reported by Mika Kuoppala <[email protected]> and also by
Xunlei Pang <[email protected]> who submitted a simpler patch to fix
only the allocation (and not the free) to the "correct" limit... which
was still problematic.
Signed-off-by: David Woodhouse <[email protected]>
---
drivers/iommu/intel-svm.c | 28 +++++++++++++++++-----------
include/linux/intel-iommu.h | 1 +
2 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 8ebb353..cb72e00 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -39,10 +39,18 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
struct page *pages;
int order;
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
- if (order < 0)
- order = 0;
-
+ /* Start at 2 because it's defined as 1^(1+PSS) */
+ iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
+
+ /* Eventually I'm promised we will get a multi-level PASID table
+ * and it won't have to be physically contiguous. Until then,
+ * limit the size because 8MiB contiguous allocations can be hard
+ * to come by. The limit of 0x20000, which is 1MiB for each of
+ * the PASID and PASID-state tables, is somewhat arbitrary. */
+ if (iommu->pasid_max > 0x20000)
+ iommu->pasid_max = 0x20000;
+
+ order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
if (!pages) {
pr_warn("IOMMU: %s: Failed to allocate PASID table\n",
@@ -53,6 +61,8 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
pr_info("%s: Allocated order %d PASID table.\n", iommu->name, order);
if (ecap_dis(iommu->ecap)) {
+ /* Just making it explicit... */
+ BUILD_BUG_ON(sizeof(struct pasid_entry) != sizeof(struct pasid_state_entry));
pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
if (pages)
iommu->pasid_state_table = page_address(pages);
@@ -68,11 +78,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
{
- int order;
-
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
- if (order < 0)
- order = 0;
+ int order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
if (iommu->pasid_table) {
free_pages((unsigned long)iommu->pasid_table, order);
@@ -371,8 +377,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
}
svm->iommu = iommu;
- if (pasid_max > 2 << ecap_pss(iommu->ecap))
- pasid_max = 2 << ecap_pss(iommu->ecap);
+ if (pasid_max > iommu->pasid_max)
+ pasid_max = iommu->pasid_max;
/* Do not use PASID 0 in caching mode (virtualised IOMMU) */
ret = idr_alloc(&iommu->pasid_idr, svm,
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 2d9b6500..d49e26c 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -429,6 +429,7 @@ struct intel_iommu {
struct page_req_dsc *prq;
unsigned char prq_name[16]; /* Name for PRQ interrupt */
struct idr pasid_idr;
+ u32 pasid_max;
#endif
struct q_inval *qi; /* Queued invalidation info */
u32 *iommu_state; /* Store iommu states between suspend and resume.*/
--
2.5.5
--
dwmw2
On 2016/10/30 at 20:18, David Woodhouse wrote:
> On Wed, 2016-10-12 at 13:17 +0100, David Woodhouse wrote:
>> Yes, that looks correct. I think we may also need to limit it, because
>> full 20-bit PASID support means we'll attempt an order 11 allocation.
>> But that's certainly correct for now
> Actually, not quite correct. You fixed the allocation but not the free.
> And Mika had reported that even the *correct* allocation was failing
> because it was too large. So I've done it differently (untested)...
Yes, your fix looks correct.
Thanks,
Xunlei
> -----
> Subject: [PATCH] iommu/vt-d: Fix PASID table allocation
>
> Somehow I ended up with an off-by-three error in calculating the size of
> the PASID and PASID State tables, which triggers allocations failures as
> those tables unfortunately have to be physically contiguous.
>
> In fact, even the *correct* maximum size of 8MiB is problematic and is
> wont to lead to allocation failures. Since I have extracted a promise
> that this *will* be fixed in hardware, I'm happy to limit it on the
> current hardware to a maximum of 0x20000 PASIDs, which gives us 1MiB
> tables — still not ideal, but better than before.
>
> Reported by Mika Kuoppala <[email protected]> and also by
> Xunlei Pang <[email protected]> who submitted a simpler patch to fix
> only the allocation (and not the free) to the "correct" limit... which
> was still problematic.
>
> Signed-off-by: David Woodhouse <[email protected]>
> ---
> drivers/iommu/intel-svm.c | 28 +++++++++++++++++-----------
> include/linux/intel-iommu.h | 1 +
> 2 files changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 8ebb353..cb72e00 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -39,10 +39,18 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
> struct page *pages;
> int order;
>
> - order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
> - if (order < 0)
> - order = 0;
> -
> + /* Start at 2 because it's defined as 1^(1+PSS) */
> + iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
> +
> + /* Eventually I'm promised we will get a multi-level PASID table
> + * and it won't have to be physically contiguous. Until then,
> + * limit the size because 8MiB contiguous allocations can be hard
> + * to come by. The limit of 0x20000, which is 1MiB for each of
> + * the PASID and PASID-state tables, is somewhat arbitrary. */
> + if (iommu->pasid_max > 0x20000)
> + iommu->pasid_max = 0x20000;
> +
> + order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
> pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
> if (!pages) {
> pr_warn("IOMMU: %s: Failed to allocate PASID table\n",
> @@ -53,6 +61,8 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
> pr_info("%s: Allocated order %d PASID table.\n", iommu->name, order);
>
> if (ecap_dis(iommu->ecap)) {
> + /* Just making it explicit... */
> + BUILD_BUG_ON(sizeof(struct pasid_entry) != sizeof(struct pasid_state_entry));
> pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
> if (pages)
> iommu->pasid_state_table = page_address(pages);
> @@ -68,11 +78,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
>
> int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
> {
> - int order;
> -
> - order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
> - if (order < 0)
> - order = 0;
> + int order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
>
> if (iommu->pasid_table) {
> free_pages((unsigned long)iommu->pasid_table, order);
> @@ -371,8 +377,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
> }
> svm->iommu = iommu;
>
> - if (pasid_max > 2 << ecap_pss(iommu->ecap))
> - pasid_max = 2 << ecap_pss(iommu->ecap);
> + if (pasid_max > iommu->pasid_max)
> + pasid_max = iommu->pasid_max;
>
> /* Do not use PASID 0 in caching mode (virtualised IOMMU) */
> ret = idr_alloc(&iommu->pasid_idr, svm,
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2d9b6500..d49e26c 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -429,6 +429,7 @@ struct intel_iommu {
> struct page_req_dsc *prq;
> unsigned char prq_name[16]; /* Name for PRQ interrupt */
> struct idr pasid_idr;
> + u32 pasid_max;
> #endif
> struct q_inval *qi; /* Queued invalidation info */
> u32 *iommu_state; /* Store iommu states between suspend and resume.*/
> --
> 2.5.5
>
Hi David,
On Sun, Oct 30, 2016 at 06:18:22AM -0600, David Woodhouse wrote:
> + /* Start at 2 because it's defined as 1^(1+PSS) */
This probably means 2^(1+PSS), right?
> + iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
Otherwise the patch looks good. Do you want to send it upstream yourself
or should I pick it up?
Joerg
On Thu, 2016-11-10 at 11:45 +0100, Joerg Roedel wrote:
> Hi David,
>
> On Sun, Oct 30, 2016 at 06:18:22AM -0600, David Woodhouse wrote:
> > + /* Start at 2 because it's defined as 1^(1+PSS) */
>
> This probably means 2^(1+PSS), right?
Or 1 << (1+PSS). Yeah.
> > + iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
>
> Otherwise the patch looks good. Do you want to send it upstream yourself
> or should I pick it up?
I'll let it sit in -next for a day or two more once I've fixed the
above, then ask Linus to pull it. Thanks.
--
dwmw2