Hi all,
This is version two of the patches I posted last week:
https://lore.kernel.org/r/[email protected]
Many thanks to Petr and Christoph for the discussion on that.
Changes since v1 include:
- Fix swiotlb_alloc() to honour the alignment requirements of
  dma_alloc_coherent(). This is a new patch, and I think it has been
  broken forever (in practice, the alignment tops out at PAGE_SIZE).
  I've left swiotlb_map() alone, so it doesn't necessarily return
  page-aligned DMA addresses, but I think that's ok.
- Avoid updating 'alloc_align_mask' and instead compute the 'stride'
  directly, avoiding a superfluous alignment requirement for mapping
  requests larger than a page (see the sketch after this list).
- Use get_max_slots() instead of open-coding the same logic.
- Remove the extra 'goto' in swiotlb_search_pool_area() and collapse
the conditionals instead.
- Reword the warning message printed when swiotlb_alloc() receives a
  non-page-aligned allocation size.
- Annotate the non-page-aligned case with unlikely().
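
For reference, the stride change above boils down to something like the
following sketch in swiotlb_search_pool_area() (illustrative only, not
the literal hunk from patch #1; get_max_slots() is the existing helper
in kernel/dma/swiotlb.c):

	/*
	 * Sketch: derive the search stride from the alignment masks,
	 * instead of OR-ing extra bits into 'alloc_align_mask'.
	 */
	stride = get_max_slots(max(alloc_align_mask, iotlb_align_mask));

	/*
	 * Requests of at least a page only need page-aligned slots;
	 * express that via the stride rather than the address checks,
	 * so no extra alignment is imposed on the returned address.
	 */
	if (alloc_size >= PAGE_SIZE)
		stride = max(stride, get_max_slots(PAGE_SIZE - 1));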
Cheers,
Will
Cc: [email protected]
Cc: Christoph Hellwig <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Petr Tesarik <[email protected]>
Cc: Dexuan Cui <[email protected]>
--->8
Will Deacon (3):
swiotlb: Fix allocation alignment requirement when searching slots
swiotlb: Enforce page alignment in swiotlb_alloc()
swiotlb: Honour dma_alloc_coherent() alignment in swiotlb_alloc()
kernel/dma/swiotlb.c | 39 ++++++++++++++++++++++++---------------
1 file changed, 24 insertions(+), 15 deletions(-)
--
2.43.0.429.g432eaa2c6b-goog
core-api/dma-api-howto.rst states the following properties of
dma_alloc_coherent():
| The CPU virtual address and the DMA address are both guaranteed to
| be aligned to the smallest PAGE_SIZE order which is greater than or
| equal to the requested size.
However, swiotlb_alloc() passes zero for the 'alloc_align_mask'
parameter of swiotlb_find_slots() and so this property is not upheld.
Instead, allocations larger than a page are aligned only to PAGE_SIZE.
Calculate the mask corresponding to the page order suitable for holding
the allocation and pass that to swiotlb_find_slots().
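
For example, with 4 KiB pages (PAGE_SHIFT == 12), the new mask works out
as follows (a worked example, not part of the patch):

	size = 4096  -> get_order(size) = 0 -> align = (1 << 12) - 1 = 0x0fff
	size = 8192  -> get_order(size) = 1 -> align = (1 << 13) - 1 = 0x1fff
	size = 12288 -> get_order(size) = 2 -> align = (1 << 14) - 1 = 0x3fff

so a 12 KiB allocation is aligned to 16 KiB, the smallest page order
able to hold it, as the documentation requires.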
Cc: Christoph Hellwig <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Petr Tesarik <[email protected]>
Cc: Dexuan Cui <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
kernel/dma/swiotlb.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 4485f216e620..8ec37006ac70 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -1632,12 +1632,14 @@ struct page *swiotlb_alloc(struct device *dev, size_t size)
 	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
 	struct io_tlb_pool *pool;
 	phys_addr_t tlb_addr;
+	unsigned int align;
 	int index;
 
 	if (!mem)
 		return NULL;
 
-	index = swiotlb_find_slots(dev, 0, size, 0, &pool);
+	align = (1 << (get_order(size) + PAGE_SHIFT)) - 1;
+	index = swiotlb_find_slots(dev, 0, size, align, &pool);
 	if (index == -1)
 		return NULL;
--
2.43.0.429.g432eaa2c6b-goog
On 31/01/2024 12:25 pm, Will Deacon wrote:
> core-api/dma-api-howto.rst states the following properties of
> dma_alloc_coherent():
>
> | The CPU virtual address and the DMA address are both guaranteed to
> | be aligned to the smallest PAGE_SIZE order which is greater than or
> | equal to the requested size.
>
> However, swiotlb_alloc() passes zero for the 'alloc_align_mask'
> parameter of swiotlb_find_slots() and so this property is not upheld.
> Instead, allocations larger than a page are aligned only to PAGE_SIZE.
>
> Calculate the mask corresponding to the page order suitable for holding
> the allocation and pass that to swiotlb_find_slots().
I guess this goes back to at least e81e99bacc9f ("swiotlb: Support
aligned swiotlb buffers") when the explicit argument was added - not
sure what we do about 5.15 LTS though (unless the answer is to not care...)
As before, though, how much of patch #1 is needed if this comes first?
Cheers,
Robin.
> Cc: Christoph Hellwig <[email protected]>
> Cc: Marek Szyprowski <[email protected]>
> Cc: Robin Murphy <[email protected]>
> Cc: Petr Tesarik <[email protected]>
> Cc: Dexuan Cui <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>
> ---
> kernel/dma/swiotlb.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 4485f216e620..8ec37006ac70 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -1632,12 +1632,14 @@ struct page *swiotlb_alloc(struct device *dev, size_t size)
>  	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
>  	struct io_tlb_pool *pool;
>  	phys_addr_t tlb_addr;
> +	unsigned int align;
>  	int index;
> 
>  	if (!mem)
>  		return NULL;
> 
> -	index = swiotlb_find_slots(dev, 0, size, 0, &pool);
> +	align = (1 << (get_order(size) + PAGE_SHIFT)) - 1;
> +	index = swiotlb_find_slots(dev, 0, size, align, &pool);
>  	if (index == -1)
>  		return NULL;
>
On Wed, Jan 31, 2024 at 04:03:38PM +0000, Robin Murphy wrote:
> On 31/01/2024 12:25 pm, Will Deacon wrote:
> > core-api/dma-api-howto.rst states the following properties of
> > dma_alloc_coherent():
> >
> > | The CPU virtual address and the DMA address are both guaranteed to
> > | be aligned to the smallest PAGE_SIZE order which is greater than or
> > | equal to the requested size.
> >
> > However, swiotlb_alloc() passes zero for the 'alloc_align_mask'
> > parameter of swiotlb_find_slots() and so this property is not upheld.
> > Instead, allocations larger than a page are aligned only to PAGE_SIZE.
> >
> > Calculate the mask corresponding to the page order suitable for holding
> > the allocation and pass that to swiotlb_find_slots().
>
> I guess this goes back to at least e81e99bacc9f ("swiotlb: Support aligned
> swiotlb buffers") when the explicit argument was added - not sure what we do
> about 5.15 LTS though (unless the answer is to not care...)
Thanks. I'll add the Fixes: tag but, to be honest, if we backport the first
patch then I'm not hugely fussed about this one in -stable kernels, simply
because I spotted it by inspection rather than through a real failure.
> As before, though, how much of patch #1 is needed if this comes first?
See my reply over there, but I think we need all of this.
Will