2022-03-03 21:49:11

by Robin Murphy

Subject: [PATCH] iommu/iova: Improve 32-bit free space estimate

For various reasons based on the allocator behaviour and typical
use-cases at the time, when the max32_alloc_size optimisation was
introduced it seemed reasonable to couple the reset of the tracked
size to the update of cached32_node upon freeing a relevant IOVA.
However, since subsequent optimisations focused on helping genuine
32-bit devices make best use of even more limited address spaces, it
is now a lot more likely for cached32_node to be anywhere in a "full"
32-bit address space, and as such more likely for space to become
available from IOVAs below that node being freed.

At this point, the short-cut in __cached_rbnode_delete_update() really
doesn't hold up any more, and we need to fix the logic to reliably
provide the expected behaviour. We still want cached32_node to only move
upwards, but we should reset the allocation size if *any* 32-bit space
has become available.

Reported-by: Yunfei Wang <[email protected]>
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/iommu/iova.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index b28c9435b898..170e0f33040e 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -95,10 +95,11 @@ __cached_rbnode_delete_update(struct iova_domain *iovad, struct iova *free)
 	cached_iova = to_iova(iovad->cached32_node);
 	if (free == cached_iova ||
 	    (free->pfn_hi < iovad->dma_32bit_pfn &&
-	     free->pfn_lo >= cached_iova->pfn_lo)) {
+	     free->pfn_lo >= cached_iova->pfn_lo))
 		iovad->cached32_node = rb_next(&free->node);
+
+	if (free->pfn_lo < iovad->dma_32bit_pfn)
 		iovad->max32_alloc_size = iovad->dma_32bit_pfn;
-	}

 	cached_iova = to_iova(iovad->cached_node);
 	if (free->pfn_lo >= cached_iova->pfn_lo)
--
2.28.0.dirty
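
For readers following the thread, the behavioural change can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: struct model_iova, struct model_domain, alloc32_fails_fast() and still_fails_after_free() are hypothetical names invented here, the rbtree is left out entirely (cached32_node is represented only by the pfn_lo of the entry it points at), and the fast-fail check is assumed to compare the requested size against max32_alloc_size for allocations limited to the 32-bit space.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel structures (hypothetical, no rbtree). */
struct model_iova {
	unsigned long pfn_lo;
	unsigned long pfn_hi;
};

struct model_domain {
	unsigned long dma_32bit_pfn;    /* first PFN above the 32-bit space */
	unsigned long max32_alloc_size; /* sizes >= this are assumed to fail */
	unsigned long cached32_pfn_lo;  /* pfn_lo of the entry cached32_node points at */
};

/* Assumed shape of the allocator's early-exit test for 32-bit requests. */
static bool alloc32_fails_fast(const struct model_domain *d,
			       unsigned long size, unsigned long limit_pfn)
{
	return limit_pfn <= d->dma_32bit_pfn && size >= d->max32_alloc_size;
}

/* Before the patch: the size is reset only when cached32_node is updated. */
static void delete_update_old(struct model_domain *d,
			      const struct model_iova *free)
{
	if (free->pfn_hi < d->dma_32bit_pfn &&
	    free->pfn_lo >= d->cached32_pfn_lo) {
		/* kernel: cached32_node = rb_next(&free->node); not modelled */
		d->max32_alloc_size = d->dma_32bit_pfn;
	}
}

/* After the patch: freeing any IOVA that starts in 32-bit space resets it. */
static void delete_update_new(struct model_domain *d,
			      const struct model_iova *free)
{
	/* The cached32_node update keeps its old condition; not modelled. */
	if (free->pfn_lo < d->dma_32bit_pfn)
		d->max32_alloc_size = d->dma_32bit_pfn;
}

/*
 * Scenario: a 16-page allocation failed earlier, cached32_node sits in the
 * middle of the space, and a 256-page IOVA *below* it is then freed.
 * Returns whether a retry of the 16-page allocation is still rejected early.
 */
static bool still_fails_after_free(bool use_new_logic)
{
	struct model_domain d = {
		.dma_32bit_pfn = 1UL << 20,  /* 4 GiB with 4 KiB pages */
		.max32_alloc_size = 16,
		.cached32_pfn_lo = 0x80000,
	};
	struct model_iova freed = { .pfn_lo = 0x1000, .pfn_hi = 0x10ff };

	/* Precondition: the stale estimate rejects the request up front. */
	assert(alloc32_fails_fast(&d, 16, d.dma_32bit_pfn));

	if (use_new_logic)
		delete_update_new(&d, &freed);
	else
		delete_update_old(&d, &freed);

	return alloc32_fails_fast(&d, 16, d.dma_32bit_pfn);
}
```

In this model, still_fails_after_free(false) keeps rejecting the retry because the freed range lies below the cached node, while still_fails_after_free(true) lets the allocator search again. That matches the commit message's point that space freed below cached32_node should also reset the estimate.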


2022-03-04 08:39:43

by Miles Chen

Subject: Re: [PATCH] iommu/iova: Improve 32-bit free space estimate

Hi Robin,

> For various reasons based on the allocator behaviour and typical
> use-cases at the time, when the max32_alloc_size optimisation was
> introduced it seemed reasonable to couple the reset of the tracked
> size to the update of cached32_node upon freeing a relevant IOVA.
> However, since subsequent optimisations focused on helping genuine
> 32-bit devices make best use of even more limited address spaces, it
> is now a lot more likely for cached32_node to be anywhere in a "full"
> 32-bit address space, and as such more likely for space to become
> available from IOVAs below that node being freed.
>
> At this point, the short-cut in __cached_rbnode_delete_update() really
> doesn't hold up any more, and we need to fix the logic to reliably
> provide the expected behaviour. We still want cached32_node to only move
> upwards, but we should reset the allocation size if *any* 32-bit space
> has become available.
>
> Reported-by: Yunfei Wang <[email protected]>
> Signed-off-by: Robin Murphy <[email protected]>

Would you mind adding:

Cc: <[email protected]>

to this patch? I checked and I think the patch can be applied to
5.4 and later.

thanks,
Miles

2022-03-04 14:18:54

by Joerg Roedel

Subject: Re: [PATCH] iommu/iova: Improve 32-bit free space estimate

On Fri, Mar 04, 2022 at 07:36:46AM +0800, Miles Chen wrote:
> Hi Robin,
>
> > For various reasons based on the allocator behaviour and typical
> > use-cases at the time, when the max32_alloc_size optimisation was
> > introduced it seemed reasonable to couple the reset of the tracked
> > size to the update of cached32_node upon freeing a relevant IOVA.
> > However, since subsequent optimisations focused on helping genuine
> > 32-bit devices make best use of even more limited address spaces, it
> > is now a lot more likely for cached32_node to be anywhere in a "full"
> > 32-bit address space, and as such more likely for space to become
> > available from IOVAs below that node being freed.
> >
> > At this point, the short-cut in __cached_rbnode_delete_update() really
> > doesn't hold up any more, and we need to fix the logic to reliably
> > provide the expected behaviour. We still want cached32_node to only move
> > upwards, but we should reset the allocation size if *any* 32-bit space
> > has become available.
> >
> > Reported-by: Yunfei Wang <[email protected]>
> > Signed-off-by: Robin Murphy <[email protected]>
>
> Would you mind adding:
>
> Cc: <[email protected]>

Applied without stable tag for now. If needed, please consider
re-sending it for stable when this patch is merged upstream.

Regards,

Joerg

2022-03-04 18:31:15

by Robin Murphy

Subject: Re: [PATCH] iommu/iova: Improve 32-bit free space estimate

On 2022-03-04 09:41, Joerg Roedel wrote:
> On Fri, Mar 04, 2022 at 07:36:46AM +0800, Miles Chen wrote:
>> Hi Robin,
>>
>>> For various reasons based on the allocator behaviour and typical
>>> use-cases at the time, when the max32_alloc_size optimisation was
>>> introduced it seemed reasonable to couple the reset of the tracked
>>> size to the update of cached32_node upon freeing a relevant IOVA.
>>> However, since subsequent optimisations focused on helping genuine
>>> 32-bit devices make best use of even more limited address spaces, it
>>> is now a lot more likely for cached32_node to be anywhere in a "full"
>>> 32-bit address space, and as such more likely for space to become
>>> available from IOVAs below that node being freed.
>>>
>>> At this point, the short-cut in __cached_rbnode_delete_update() really
>>> doesn't hold up any more, and we need to fix the logic to reliably
>>> provide the expected behaviour. We still want cached32_node to only move
>>> upwards, but we should reset the allocation size if *any* 32-bit space
>>> has become available.
>>>
>>> Reported-by: Yunfei Wang <[email protected]>
>>> Signed-off-by: Robin Murphy <[email protected]>
>>
>> Would you mind adding:
>>
>> Cc: <[email protected]>
>
> Applied without stable tag for now. If needed, please consider
> re-sending it for stable when this patch is merged upstream.

Yeah, having figured out the history, I ended up with the opinion that
it was a missed corner-case optimisation opportunity, rather than an
actual error with respect to intent or implementation, so I
intentionally left that out. Plus figuring out an exact Fixes tag might
be tricky - as above I reckon it probably only started to become
significant somewhere around 5.11 or so.

All of these various levels of retry mechanisms are only a best-effort
thing, and ultimately if you're making large allocations from a small
space there are always going to be *some* circumstances that still
manage to defeat them. Over time, we've made them try harder, but the
fact that we haven't yet made them try hard enough to work well for a
particular use-case does not constitute a bug. However as Joerg says,
anyone's welcome to make a case to Greg to backport a mainline commit if
it's a low-risk change with significant benefit to real-world stable
kernel users.

Thanks all!

Robin.

2022-03-04 19:08:28

by Miles Chen

Subject: Re: [PATCH] iommu/iova: Improve 32-bit free space estimate

> For various reasons based on the allocator behaviour and typical
> use-cases at the time, when the max32_alloc_size optimisation was
> introduced it seemed reasonable to couple the reset of the tracked
> size to the update of cached32_node upon freeing a relevant IOVA.
> However, since subsequent optimisations focused on helping genuine
> 32-bit devices make best use of even more limited address spaces, it
> is now a lot more likely for cached32_node to be anywhere in a "full"
> 32-bit address space, and as such more likely for space to become
> available from IOVAs below that node being freed.
>
> At this point, the short-cut in __cached_rbnode_delete_update() really
> doesn't hold up any more, and we need to fix the logic to reliably
> provide the expected behaviour. We still want cached32_node to only move
> upwards, but we should reset the allocation size if *any* 32-bit space
> has become available.
>
> Reported-by: Yunfei Wang <[email protected]>
> Signed-off-by: Robin Murphy <[email protected]>

Reviewed-by: Miles Chen <[email protected]>

2022-03-05 01:23:26

by Miles Chen

Subject: Re: [PATCH] iommu/iova: Improve 32-bit free space estimate

Hi Joerg, Robin,

> Applied without stable tag for now. If needed, please consider
> re-sending it for stable when this patch is merged upstream.

> Yeah, having figured out the history, I ended up with the opinion that
> it was a missed corner-case optimisation opportunity, rather than an
> actual error with respect to intent or implementation, so I
> intentionally left that out. Plus figuring out an exact Fixes tag might
> be tricky - as above I reckon it probably only started to become
> significant somewhere around 5.11 or so.
>
> All of these various levels of retry mechanisms are only a best-effort
> thing, and ultimately if you're making large allocations from a small
> space there are always going to be *some* circumstances that still
> manage to defeat them. Over time, we've made them try harder, but the
> fact that we haven't yet made them try hard enough to work well for a
> particular use-case does not constitute a bug. However as Joerg says,
> anyone's welcome to make a case to Greg to backport a mainline commit if
> it's a low-risk change with significant benefit to real-world stable
> kernel users.

Got it, thank you.
We will try to push it to the Android LTS trees we need.

Thanks,
Miles

>
> Thanks all!
>
> Robin.