2023-06-26 13:08:51

by Petr Tesarik

Subject: [PATCH v1 0/2] swiotlb: Fix a couple of bugs in sizing areas

From: Petr Tesarik <[email protected]>

While reworking the dynamic SWIOTLB implementation, I ran into some
locking issues with the current implementation. I believe the bugs
are serious enough to be fixed separately.

Petr Tesarik (2):
swiotlb: Always set the number of areas before allocating the pool
swiotlb: Reduce the number of areas to match actual memory pool size

kernel/dma/swiotlb.c | 46 +++++++++++++++++++++++++++++++++-----------
1 file changed, 35 insertions(+), 11 deletions(-)

--
2.25.1



2023-06-26 13:17:46

by Petr Tesarik

Subject: [PATCH v1 1/2] swiotlb: Always set the number of areas before allocating the pool

From: Petr Tesarik <[email protected]>

The number of areas defaults to the number of possible CPUs. However, the
total number of slots may have to be increased after adjusting the number
of areas. Consequently, the number of areas must be determined before
allocating the memory pool. This is even explained with a comment in
swiotlb_init_remap(), but swiotlb_init_late() adjusts the number of areas
after slots are already allocated. The areas may end up being smaller than
IO_TLB_SEGSIZE, which breaks per-area locking.
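
To make the ordering requirement concrete, here is a hypothetical, heavily
simplified user-space sketch (adjust_nareas() below only mimics the idea that
the pool size depends on the area count; the real swiotlb_adjust_nareas()
uses different rounding and is not shown here):

#include <stdio.h>

#define IO_TLB_SEGSIZE  128     /* minimum slots per area */

static unsigned long default_nslabs = 1024;     /* hypothetical initial pool size */
static unsigned int  default_nareas;

/* Simplified stand-in: make sure the pool holds at least IO_TLB_SEGSIZE
 * slots per area, growing it if necessary. */
static void adjust_nareas(unsigned int nareas)
{
        default_nareas = nareas;
        if (default_nslabs < (unsigned long)nareas * IO_TLB_SEGSIZE)
                default_nslabs = (unsigned long)nareas * IO_TLB_SEGSIZE;
}

int main(void)
{
        /* Wrong order: size the pool first, then adjust the areas. */
        unsigned long allocated = default_nslabs;       /* pool of 1024 slots */
        adjust_nareas(16);                              /* would now need 2048 */
        printf("slots per area: %lu (minimum %d)\n",
               allocated / default_nareas, IO_TLB_SEGSIZE);     /* 64 < 128 */
        return 0;
}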

While fixing swiotlb_init_late(), move all relevant comments before the
definition of swiotlb_adjust_nareas() and convert them to kernel-doc.

Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock")
Signed-off-by: Petr Tesarik <[email protected]>
---
kernel/dma/swiotlb.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index af2e304c672c..16f53d8c51bc 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -115,9 +115,16 @@ static bool round_up_default_nslabs(void)
return true;
}

+/**
+ * swiotlb_adjust_nareas() - adjust the number of areas and slots
+ * @nareas: Desired number of areas. Zero is treated as 1.
+ *
+ * Adjust the default number of areas in a memory pool.
+ * The default size of the memory pool may also change to meet minimum area
+ * size requirements.
+ */
static void swiotlb_adjust_nareas(unsigned int nareas)
{
- /* use a single area when non is specified */
if (!nareas)
nareas = 1;
else if (!is_power_of_2(nareas))
@@ -298,10 +305,6 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
if (swiotlb_force_disable)
return;

- /*
- * default_nslabs maybe changed when adjust area number.
- * So allocate bounce buffer after adjusting area number.
- */
if (!default_nareas)
swiotlb_adjust_nareas(num_possible_cpus());

@@ -363,6 +366,9 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
if (swiotlb_force_disable)
return 0;

+ if (!default_nareas)
+ swiotlb_adjust_nareas(num_possible_cpus());
+
retry:
order = get_order(nslabs << IO_TLB_SHIFT);
nslabs = SLABS_PER_PAGE << order;
@@ -397,9 +403,6 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
(PAGE_SIZE << order) >> 20);
}

- if (!default_nareas)
- swiotlb_adjust_nareas(num_possible_cpus());
-
area_order = get_order(array_size(sizeof(*mem->areas),
default_nareas));
mem->areas = (struct io_tlb_area *)
--
2.25.1


2023-06-26 13:17:50

by Petr Tesarik

Subject: [PATCH v1 2/2] swiotlb: Reduce the number of areas to match actual memory pool size

From: Petr Tesarik <[email protected]>

Although the desired size of the SWIOTLB memory pool is increased in
swiotlb_adjust_nareas() to match the number of areas, the actual allocation
may be smaller, which may require reducing the number of areas.

For example, Xen uses swiotlb_init_late(), which in turn uses the page
allocator. On x86, page size is 4 KiB and MAX_ORDER is 10 (1024 pages),
resulting in a maximum memory pool size of 4 MiB. This corresponds to 2048
slots of 2 KiB each. The minimum area size is 128 (IO_TLB_SEGSIZE),
allowing at most 2048 / 128 = 16 areas.
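
For illustration, the same arithmetic as a minimal, hypothetical standalone
sketch (constants taken from the x86 example above; the program itself is not
part of the patch):

#include <stdio.h>

#define PAGE_SHIFT      12      /* 4 KiB pages */
#define MAX_ORDER       10      /* largest allocation: 2^10 = 1024 pages */
#define IO_TLB_SHIFT    11      /* 2 KiB per slot */
#define IO_TLB_SEGSIZE  128     /* minimum slots per area */

int main(void)
{
        unsigned long pool_bytes = (1UL << MAX_ORDER) << PAGE_SHIFT;
        unsigned long nslots     = pool_bytes >> IO_TLB_SHIFT;
        unsigned long max_areas  = nslots / IO_TLB_SEGSIZE;

        printf("pool %lu MiB, %lu slots, at most %lu areas\n",
               pool_bytes >> 20, nslots, max_areas);    /* 4 MiB, 2048, 16 */
        return 0;
}

With such a pool, the limit_nareas() helper introduced below caps a request
for 32 areas to 16, while a request for 8 areas passes through unchanged.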

If num_possible_cpus() is greater than the maximum number of areas, areas
are smaller than IO_TLB_SEGSIZE and contiguous groups of free slots will
span multiple areas. When allocating and freeing slots, only one area will
be properly locked, causing race conditions on the unlocked slots and
ultimately data corruption, kernel hangs and crashes.

Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock")
Signed-off-by: Petr Tesarik <[email protected]>
---
kernel/dma/swiotlb.c | 27 ++++++++++++++++++++++++---
1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 16f53d8c51bc..079df5ad38d0 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -138,6 +138,23 @@ static void swiotlb_adjust_nareas(unsigned int nareas)
(default_nslabs << IO_TLB_SHIFT) >> 20);
}

+/**
+ * limit_nareas() - get the maximum number of areas for a given memory pool size
+ * @nareas: Desired number of areas.
+ * @nslots: Total number of slots in the memory pool.
+ *
+ * Limit the number of areas to the maximum possible number of areas in
+ * a memory pool of the given size.
+ *
+ * Return: Maximum possible number of areas.
+ */
+static unsigned int limit_nareas(unsigned int nareas, unsigned long nslots)
+{
+ if (nslots < nareas * IO_TLB_SEGSIZE)
+ nareas = nslots / IO_TLB_SEGSIZE;
+ return nareas;
+}
+
static int __init
setup_io_tlb_npages(char *str)
{
@@ -297,6 +314,7 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
{
struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long nslabs;
+ unsigned int nareas;
size_t alloc_size;
void *tlb;

@@ -309,10 +327,12 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
swiotlb_adjust_nareas(num_possible_cpus());

nslabs = default_nslabs;
+ nareas = limit_nareas(default_nareas, nslabs);
while ((tlb = swiotlb_memblock_alloc(nslabs, flags, remap)) == NULL) {
if (nslabs <= IO_TLB_MIN_SLABS)
return;
nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
+ nareas = limit_nareas(nareas, nslabs);
}

if (default_nslabs != nslabs) {
@@ -358,6 +378,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
{
struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
+ unsigned int nareas;
unsigned char *vstart = NULL;
unsigned int order, area_order;
bool retried = false;
@@ -403,8 +424,8 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
(PAGE_SIZE << order) >> 20);
}

- area_order = get_order(array_size(sizeof(*mem->areas),
- default_nareas));
+ nareas = limit_nareas(default_nareas, nslabs);
+ area_order = get_order(array_size(sizeof(*mem->areas), nareas));
mem->areas = (struct io_tlb_area *)
__get_free_pages(GFP_KERNEL | __GFP_ZERO, area_order);
if (!mem->areas)
@@ -418,7 +439,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
set_memory_decrypted((unsigned long)vstart,
(nslabs << IO_TLB_SHIFT) >> PAGE_SHIFT);
swiotlb_init_io_tlb_mem(mem, virt_to_phys(vstart), nslabs, 0, true,
- default_nareas);
+ nareas);

swiotlb_print_info();
return 0;
--
2.25.1


2023-06-26 14:15:59

by Petr Tesařík

Subject: Re: [PATCH v1 0/2] swiotlb: Fix a couple of bugs in sizing areas

On Mon, 26 Jun 2023 15:01:02 +0200
Petr Tesarik <[email protected]> wrote:

> From: Petr Tesarik <[email protected]>
>
> While reworking the dynamic SWIOTLB implementation, I ran into some
> locking issues with the current implementation. I believe the bugs
> are serious enough to be fixed separately.

As an aside (and not directly related to the bugfixes themselves), I
wonder why the area size cannot be always equal to IO_TLB_SEGSIZE. Of
course, we would (usually) end up with more areas, but that should be
a good thing, shouldn't it? The area structure is quite small, so it
cannot be because of memory consumption concerns. The overhead of
taking an uncontended spinlock should also be negligible.

Am I missing something important here?

Petr T

2023-06-26 14:16:34

by Roberto Sassu

Subject: Re: [PATCH v1 0/2] swiotlb: Fix a couple of bugs in sizing areas

On Mon, 2023-06-26 at 15:01 +0200, Petr Tesarik wrote:
> From: Petr Tesarik <[email protected]>
>
> While reworking the dynamic SWIOTLB implementation, I ran into some
> locking issues with the current implementation. I believe the bugs
> are serious enough to be fixed separately.

Thanks, excellent work!

Reviewed-by: Roberto Sassu <[email protected]>

Roberto

> Petr Tesarik (2):
> swiotlb: Always set the number of areas before allocating the pool
> swiotlb: Reduce the number of areas to match actual memory pool size
>
> kernel/dma/swiotlb.c | 46 +++++++++++++++++++++++++++++++++-----------
> 1 file changed, 35 insertions(+), 11 deletions(-)
>


2023-06-29 05:22:11

by Christoph Hellwig

Subject: Re: [PATCH v1 0/2] swiotlb: Fix a couple of bugs in sizing areas

On Mon, Jun 26, 2023 at 04:07:25PM +0200, Petr Tesařík wrote:
> As an aside (and not directly related to the bugfixes themselves), I
> wonder why the area size cannot be always equal to IO_TLB_SEGSIZE. Of
> course, we would (usually) end up with more areas, but that should be
> a good thing, shouldn't it? The area structure is quite small, so it
> cannot be because of memory consumption concerns. The overhead of
> taking an uncontended spinlock should also be negligible.

What would be the benefit of this?

2023-06-29 05:30:05

by Christoph Hellwig

Subject: Re: [PATCH v1 0/2] swiotlb: Fix a couple of bugs in sizing areas

Thanks,

I've applied this to the dma-mapping for-next tree with a trivial
cleanup.

2023-06-29 05:53:14

by Petr Tesařík

Subject: Re: [PATCH v1 0/2] swiotlb: Fix a couple of bugs in sizing areas

On Thu, 29 Jun 2023 07:12:38 +0200
Christoph Hellwig <[email protected]> wrote:

> On Mon, Jun 26, 2023 at 04:07:25PM +0200, Petr Tesařík wrote:
> > As an aside (and not directly related to the bugfixes themselves), I
> > wonder why the area size cannot be always equal to IO_TLB_SEGSIZE. Of
> > course, we would (usually) end up with more areas, but that should be
> > a good thing, shouldn't it? The area structure is quite small, so it
> > cannot be because of memory consumption concerns. The overhead of
> > taking an uncontended spinlock should also be negligible.
>
> What would be the benefit of this?

For me, as a newcomer to this code, the existence of areas _and_
segments was confusing, especially since segments are not represented by
a data structure. It took me more than one day to realize that slots
are grouped into segments for allocation, but changes to the slot
metadata are protected by area spinlocks. In my case, a segment crossed
an area boundary, so the area spinlock protected only half of the
allocated slots.
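
As a hypothetical illustration of such a crossing (numbers made up, assuming
IO_TLB_SEGSIZE of 128): with 2048 slots split into 32 areas, each area covers
only 64 slots, so the two ends of a single segment map to different area
locks:

#include <stdio.h>

#define IO_TLB_SEGSIZE  128

int main(void)
{
        unsigned long nslots = 2048, nareas = 32;       /* hypothetical pool */
        unsigned long area_nslabs = nslots / nareas;    /* 64 < IO_TLB_SEGSIZE */
        unsigned long first = 0, last = IO_TLB_SEGSIZE - 1;    /* one segment */

        /* The same segment spans two areas, i.e. two different spinlocks. */
        printf("slot %lu -> area %lu\n", first, first / area_nslabs);   /* area 0 */
        printf("slot %lu -> area %lu\n", last, last / area_nslabs);     /* area 1 */
        return 0;
}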

I could also ask differently: What is the benefit of grouping slots
into segments, and then again grouping segments into areas?

Petr T