2023-03-25 18:00:28

by Michael Kelley (LINUX)

Subject: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

swiotlb currently reports the total number of slabs and the instantaneous
in-use slabs in debugfs. But with increased usage of swiotlb for all I/O
in Confidential Computing (coco) VMs, it has become difficult to know
how much memory to allocate for swiotlb bounce buffers, either via the
automatic algorithm in the kernel or by specifying a value on the
kernel boot line. The current automatic algorithm generously allocates
swiotlb bounce buffer memory, and may be wasting significant memory in
many use cases.

To support a better understanding of swiotlb usage, add tracking of
the high water mark usage of swiotlb bounce buffer memory. Report the
high water mark in debugfs along with the other swiotlb metrics. Allow
the high water mark to be reset to zero at runtime by writing to it.

Since a global in-use slab count is added alongside the existing
per-area in-use count, the mem_used() function that sums across all
areas is no longer needed. Remove it and replace it with the global
in-use count.

Signed-off-by: Michael Kelley <[email protected]>

Changes in v2:
* Only reset the high water mark to zero when the specified new value
is zero, to prevent confusion about the ability to reset to some
other value [Dexuan Cui]

---
kernel/dma/swiotlb.c | 49 +++++++++++++++++++++++++++++++++++++------------
1 file changed, 37 insertions(+), 12 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f9f0279..3e50639 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -76,6 +76,9 @@ struct io_tlb_slot {
static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
static unsigned long default_nareas;

+static atomic_long_t total_used = ATOMIC_LONG_INIT(0);
+static atomic_long_t used_hiwater = ATOMIC_LONG_INIT(0);
+
/**
* struct io_tlb_area - IO TLB memory area descriptor
*
@@ -587,6 +590,7 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
unsigned long flags;
unsigned int slot_base;
unsigned int slot_index;
+ unsigned long old_hiwater, new_used;

BUG_ON(!nslots);
BUG_ON(area_index >= mem->nareas);
@@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
area->index = wrap_area_index(mem, index + nslots);
area->used += nslots;
spin_unlock_irqrestore(&area->lock, flags);
+
+ new_used = atomic_long_add_return(nslots, &total_used);
+ old_hiwater = atomic_long_read(&used_hiwater);
+ do {
+ if (new_used <= old_hiwater)
+ break;
+ } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
+
return slot_index;
}

@@ -681,16 +693,6 @@ static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
return -1;
}

-static unsigned long mem_used(struct io_tlb_mem *mem)
-{
- int i;
- unsigned long used = 0;
-
- for (i = 0; i < mem->nareas; i++)
- used += mem->areas[i].used;
- return used;
-}
-
phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
size_t mapping_size, size_t alloc_size,
unsigned int alloc_align_mask, enum dma_data_direction dir,
@@ -723,7 +725,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
if (!(attrs & DMA_ATTR_NO_WARN))
dev_warn_ratelimited(dev,
"swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
- alloc_size, mem->nslabs, mem_used(mem));
+ alloc_size, mem->nslabs, atomic_long_read(&total_used));
return (phys_addr_t)DMA_MAPPING_ERROR;
}

@@ -791,6 +793,8 @@ static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
mem->slots[i].list = ++count;
area->used -= nslots;
spin_unlock_irqrestore(&area->lock, flags);
+
+ atomic_long_sub(nslots, &total_used);
}

/*
@@ -887,10 +891,29 @@ bool is_swiotlb_active(struct device *dev)

static int io_tlb_used_get(void *data, u64 *val)
{
- *val = mem_used(&io_tlb_default_mem);
+ *val = (u64)atomic_long_read(&total_used);
return 0;
}
+
+static int io_tlb_hiwater_get(void *data, u64 *val)
+{
+ *val = (u64)atomic_long_read(&used_hiwater);
+ return 0;
+}
+
+static int io_tlb_hiwater_set(void *data, u64 val)
+{
+ /* Only allow setting to zero */
+ if (val != 0)
+ return -EINVAL;
+
+ atomic_long_set(&used_hiwater, val);
+ return 0;
+}
+
DEFINE_DEBUGFS_ATTRIBUTE(fops_io_tlb_used, io_tlb_used_get, NULL, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(fops_io_tlb_hiwater, io_tlb_hiwater_get,
+ io_tlb_hiwater_set, "%llu\n");

static void swiotlb_create_debugfs_files(struct io_tlb_mem *mem,
const char *dirname)
@@ -902,6 +925,8 @@ static void swiotlb_create_debugfs_files(struct io_tlb_mem *mem,
debugfs_create_ulong("io_tlb_nslabs", 0400, mem->debugfs, &mem->nslabs);
debugfs_create_file("io_tlb_used", 0400, mem->debugfs, NULL,
&fops_io_tlb_used);
+ debugfs_create_file("io_tlb_used_hiwater", 0600, mem->debugfs, NULL,
+ &fops_io_tlb_hiwater);
}

static int __init __maybe_unused swiotlb_create_default_debugfs(void)
--
1.8.3.1


2023-03-25 21:57:23

by Dexuan Cui

Subject: RE: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

> From: Michael Kelley (LINUX) <[email protected]>
> Sent: Saturday, March 25, 2023 10:53 AM

LGTM

Reviewed-by: Dexuan Cui <[email protected]>

2023-03-28 01:44:05

by Christoph Hellwig

Subject: Re: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

On Sat, Mar 25, 2023 at 10:53:10AM -0700, Michael Kelley wrote:
> @@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
> area->index = wrap_area_index(mem, index + nslots);
> area->used += nslots;
> spin_unlock_irqrestore(&area->lock, flags);
> +
> + new_used = atomic_long_add_return(nslots, &total_used);
> + old_hiwater = atomic_long_read(&used_hiwater);
> + do {
> + if (new_used <= old_hiwater)
> + break;
> + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
> +
> return slot_index;

Hmm, so we're right in the swiotlb hot path here and add two new global
atomics?

> static int io_tlb_used_get(void *data, u64 *val)
> {
> - *val = mem_used(&io_tlb_default_mem);
> + *val = (u64)atomic_long_read(&total_used);
> return 0;
> }
> +
> +static int io_tlb_hiwater_get(void *data, u64 *val)
> +{
> + *val = (u64)atomic_long_read(&used_hiwater);

I can't see how these casts would be needed.

2023-03-28 13:13:51

by Michael Kelley (LINUX)

Subject: RE: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

From: Christoph Hellwig <[email protected]> Sent: Monday, March 27, 2023 6:34 PM
>
> On Sat, Mar 25, 2023 at 10:53:10AM -0700, Michael Kelley wrote:
> > @@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int
> area_index,
> > area->index = wrap_area_index(mem, index + nslots);
> > area->used += nslots;
> > spin_unlock_irqrestore(&area->lock, flags);
> > +
> > + new_used = atomic_long_add_return(nslots, &total_used);
> > + old_hiwater = atomic_long_read(&used_hiwater);
> > + do {
> > + if (new_used <= old_hiwater)
> > + break;
> > + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
> > +
> > return slot_index;
>
> Hmm, so we're right in the swiotlb hot path here and add two new global
> atomics?

It's only one global atomic, except when the high water mark needs to be
bumped. That results in an initial transient of doing the second global
atomic, but then it won't be done unless there's a spike in usage or the
high water mark is manually reset to zero. Of course, there's a similar
global atomic subtract when the slots are released.

Perhaps this accounting should go under #ifdef CONFIG_DEBUGFS? Or
even add a swiotlb-specific debugfs config option to cover all the swiotlb
debugfs code. From Petr Tesarik's earlier comments, it sounds like there
is interest in additional accounting, such as for fragmentation.

>
> > static int io_tlb_used_get(void *data, u64 *val)
> > {
> > - *val = mem_used(&io_tlb_default_mem);
> > + *val = (u64)atomic_long_read(&total_used);
> > return 0;
> > }
> > +
> > +static int io_tlb_hiwater_get(void *data, u64 *val)
> > +{
> > + *val = (u64)atomic_long_read(&used_hiwater);
>
> I can't see how these casts would be needed.

OK. Will drop the casts in the next version.

Michael

2023-03-28 14:03:10

by Petr Tesařík

Subject: Re: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

On Tue, 28 Mar 2023 13:12:13 +0000
"Michael Kelley (LINUX)" <[email protected]> wrote:

> From: Christoph Hellwig <[email protected]> Sent: Monday, March 27, 2023 6:34 PM
> >
> > On Sat, Mar 25, 2023 at 10:53:10AM -0700, Michael Kelley wrote:
> > > @@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int
> > area_index,
> > > area->index = wrap_area_index(mem, index + nslots);
> > > area->used += nslots;
> > > spin_unlock_irqrestore(&area->lock, flags);
> > > +
> > > + new_used = atomic_long_add_return(nslots, &total_used);
> > > + old_hiwater = atomic_long_read(&used_hiwater);
> > > + do {
> > > + if (new_used <= old_hiwater)
> > > + break;
> > > + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
> > > +
> > > return slot_index;
> >
> > Hmm, so we're right in the swiotlb hot path here and add two new global
> > atomics?
>
> It's only one global atomic, except when the high water mark needs to be
> bumped. That results in an initial transient of doing the second global
> atomic, but then it won't be done unless there's a spike in usage or the
> high water mark is manually reset to zero. Of course, there's a similar
> global atomic subtract when the slots are released.
>
> Perhaps this accounting should go under #ifdef CONFIG_DEBUGFS? Or
> even add a swiotlb-specific debugfs config option to cover all the swiotlb
> debugfs code. From Petr Tesarik's earlier comments, it sounds like there
> is interest in additional accounting, such as for fragmentation.

For my purposes, it does not have to be 100% accurate. I don't really
mind if it is off by a few slots because of a race window, so we could
(for instance):

- update a local variable and set the atomic after the loop,
- or make it a per-cpu to reduce CPU cache bouncing,
- or just about anything that is less heavy-weight than an atomic
CMPXCHG in the inner loop of a slot search.

Just my two cents,
Petr T

2023-03-28 14:12:14

by Petr Tesařík

Subject: Re: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

On Tue, 28 Mar 2023 15:50:17 +0200
Petr Tesařík <[email protected]> wrote:

> On Tue, 28 Mar 2023 13:12:13 +0000
> "Michael Kelley (LINUX)" <[email protected]> wrote:
>
> > From: Christoph Hellwig <[email protected]> Sent: Monday, March 27, 2023 6:34 PM
> > >
> > > On Sat, Mar 25, 2023 at 10:53:10AM -0700, Michael Kelley wrote:
> > > > @@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int
> > > area_index,
> > > > area->index = wrap_area_index(mem, index + nslots);
> > > > area->used += nslots;
> > > > spin_unlock_irqrestore(&area->lock, flags);
> > > > +
> > > > + new_used = atomic_long_add_return(nslots, &total_used);
> > > > + old_hiwater = atomic_long_read(&used_hiwater);
> > > > + do {
> > > > + if (new_used <= old_hiwater)
> > > > + break;
> > > > + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
> > > > +
> > > > return slot_index;
> > >
> > > Hmm, so we're right in the swiotlb hot path here and add two new global
> > > atomics?
> >
> > It's only one global atomic, except when the high water mark needs to be
> > bumped. That results in an initial transient of doing the second global
> > atomic, but then it won't be done unless there's a spike in usage or the
> > high water mark is manually reset to zero. Of course, there's a similar
> > global atomic subtract when the slots are released.
> >
> > Perhaps this accounting should go under #ifdef CONFIG_DEBUGFS? Or
> > even add a swiotlb-specific debugfs config option to cover all the swiotlb
> > debugfs code. From Petr Tesarik's earlier comments, it sounds like there
> > is interest in additional accounting, such as for fragmentation.
>
> For my purposes, it does not have to be 100% accurate.

Actually, why are these variables global? There can be multiple
io_tlb_mem instances in the system (one SWIOTLB and multiple restricted
DMA pools). Tracking the usage of restricted DMA pools might be useful,
but summing them up with the SWIOTLB not so much. AFAICS the watermark
should be added to struct io_tlb_mem.

Petr T

2023-03-28 14:36:05

by Michael Kelley (LINUX)

Subject: RE: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

From: Petr Tesařík <[email protected]> Sent: Tuesday, March 28, 2023 6:50 AM
>
> On Tue, 28 Mar 2023 13:12:13 +0000
> "Michael Kelley (LINUX)" <[email protected]> wrote:
>
> > From: Christoph Hellwig <[email protected]> Sent: Monday, March 27, 2023 6:34
> PM
> > >
> > > On Sat, Mar 25, 2023 at 10:53:10AM -0700, Michael Kelley wrote:
> > > > @@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int
> > > area_index,
> > > > area->index = wrap_area_index(mem, index + nslots);
> > > > area->used += nslots;
> > > > spin_unlock_irqrestore(&area->lock, flags);
> > > > +
> > > > + new_used = atomic_long_add_return(nslots, &total_used);
> > > > + old_hiwater = atomic_long_read(&used_hiwater);
> > > > + do {
> > > > + if (new_used <= old_hiwater)
> > > > + break;
> > > > + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
> > > > +
> > > > return slot_index;
> > >
> > > Hmm, so we're right in the swiotlb hot path here and add two new global
> > > atomics?
> >
> > It's only one global atomic, except when the high water mark needs to be
> > bumped. That results in an initial transient of doing the second global
> > atomic, but then it won't be done unless there's a spike in usage or the
> > high water mark is manually reset to zero. Of course, there's a similar
> > global atomic subtract when the slots are released.
> >
> > Perhaps this accounting should go under #ifdef CONFIG_DEBUGFS? Or
> > even add a swiotlb-specific debugfs config option to cover all the swiotlb
> > debugfs code. From Petr Tesarik's earlier comments, it sounds like there
> > is interest in additional accounting, such as for fragmentation.
>
> For my purposes, it does not have to be 100% accurate. I don't really
> mind if it is off by a few slots because of a race window, so we could
> (for instance):
>
> - update a local variable and set the atomic after the loop,
> - or make it a per-cpu to reduce CPU cache bouncing,
> - or just about anything that is less heavy-weight than an atomic
> CMPXCHG in the inner loop of a slot search.
>

Perhaps I'm missing your point, but there's no loop here. The atomic
add is done once per successful slot allocation. If swiotlb_do_find_slots()
doesn't find any slots for the current area, it exits at the "not_found" label
and the atomic add isn't done.

In the case where the high water mark is bumped, the try_cmpxchg()
is in a loop only to deal with a race condition where another CPU updates
the high water mark first. The try_cmpxchg() should only rarely be
executed more than once, and again, only when the high water mark
changes.

I thought about tracking the high water mark on a per-CPU basis or
per-area basis, but I don't think the resulting data is useful. Adding up
the individual high water marks likely significantly over-estimates the
true high water mark. Is there a clever way to make this useful that I'm
not thinking about?

Tracking the global high water mark using non-atomic add and subtract
could be done, with reduced accuracy. But I wanted to be tracking the
high water mark over an extended time period (days, or even weeks),
and I don't have a feel for how much accuracy would be lost from
non-atomic arithmetic on the global high water mark. It would still
be a shared cache line.

Regarding your other email about non-default io_tlb_mem instances,
my patch just extends what is already reported in debugfs, which
is only for the default io_tlb_mem. The non-default instances seemed
to me to be fairly niche cases that weren't worth the additional
complexity, but maybe I'm wrong about that.

Michael

2023-03-28 15:12:07

by Petr Tesařík

Subject: Re: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

On Tue, 28 Mar 2023 14:29:03 +0000
"Michael Kelley (LINUX)" <[email protected]> wrote:

> From: Petr Tesařík <[email protected]> Sent: Tuesday, March 28, 2023 6:50 AM
> >
> > On Tue, 28 Mar 2023 13:12:13 +0000
> > "Michael Kelley (LINUX)" <[email protected]> wrote:
> >
> > > From: Christoph Hellwig <[email protected]> Sent: Monday, March 27, 2023 6:34
> > PM
> > > >
> > > > On Sat, Mar 25, 2023 at 10:53:10AM -0700, Michael Kelley wrote:
> > > > > @@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int
> > > > area_index,
> > > > > area->index = wrap_area_index(mem, index + nslots);
> > > > > area->used += nslots;
> > > > > spin_unlock_irqrestore(&area->lock, flags);
> > > > > +
> > > > > + new_used = atomic_long_add_return(nslots, &total_used);
> > > > > + old_hiwater = atomic_long_read(&used_hiwater);
> > > > > + do {
> > > > > + if (new_used <= old_hiwater)
> > > > > + break;
> > > > > + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
> > > > > +
> > > > > return slot_index;
> > > >
> > > > Hmm, so we're right in the swiotlb hot path here and add two new global
> > > > atomics?
>[...]
> > For my purposes, it does not have to be 100% accurate. I don't really
> > mind if it is off by a few slots because of a race window, so we could
> > (for instance):
> >
> > - update a local variable and set the atomic after the loop,
> > - or make it a per-cpu to reduce CPU cache bouncing,
> > - or just about anything that is less heavy-weight than an atomic
> > CMPXCHG in the inner loop of a slot search.
> >
>
> Perhaps I'm missing your point, but there's no loop here. The atomic
> add is done once per successful slot allocation. If swiotlb_do_find_slots()
> doesn't find any slots for the current area, it exits at the "not_found" label
> and the atomic add isn't done.

My bad. I read the patch too quickly and thought that the update was
done for each searched area. I stand corrected here.

>[...]
> I thought about tracking the high water mark on a per-CPU basis or
> per-area basis, but I don't think the resulting data is useful. Adding up
> the individual high water marks likely significantly over-estimates the
> true high water mark. Is there a clever way to make this useful that I'm
> not thinking about?

No, not that I'm aware of. Min/max cannot be easily split.

>[...]
> Regarding your other email about non-default io_tlb_mem instances,
> my patch just extends what is already reported in debugfs, which
> is only for the default io_tlb_mem. The non-default instances seemed
> to me to be fairly niche cases that weren't worth the additional
> complexity, but maybe I'm wrong about that.

What I mean is that the values currently reported in debugfs only refer
to io_tlb_default_mem. Since restricted DMA pools also use
swiotlb_find_slots() and swiotlb_release_slots(), the global counters
now get updated both for io_tlb_default_mem and all restricted DMA
pools.

In short, this hunk is a change in behaviour:

static int io_tlb_used_get(void *data, u64 *val)
{
- *val = mem_used(&io_tlb_default_mem);
+ *val = (u64)atomic_long_read(&total_used);
return 0;
}

Before the change, it shows the number of used slots in the default
SWIOTLB, after the change it shows the total number of used slots in
the SWIOTLB and all restricted DMA pools.

Petr T

2023-03-28 15:28:55

by Michael Kelley (LINUX)

Subject: RE: [PATCH v2 1/1] swiotlb: Track and report io_tlb_used high water mark in debugfs

From: Petr Tesařík <[email protected]> Sent: Tuesday, March 28, 2023 8:08 AM
>
> On Tue, 28 Mar 2023 14:29:03 +0000
> "Michael Kelley (LINUX)" <[email protected]> wrote:
>
> > From: Petr Tesařík <[email protected]> Sent: Tuesday, March 28, 2023 6:50 AM
> > >
> > > On Tue, 28 Mar 2023 13:12:13 +0000
> > > "Michael Kelley (LINUX)" <[email protected]> wrote:
> > >
> > > > From: Christoph Hellwig <[email protected]> Sent: Monday, March 27, 2023
> 6:34
> > > PM
> > > > >
> > > > > On Sat, Mar 25, 2023 at 10:53:10AM -0700, Michael Kelley wrote:
> > > > > > @@ -659,6 +663,14 @@ static int swiotlb_do_find_slots(struct device *dev, int
> > > > > area_index,
> > > > > > area->index = wrap_area_index(mem, index + nslots);
> > > > > > area->used += nslots;
> > > > > > spin_unlock_irqrestore(&area->lock, flags);
> > > > > > +
> > > > > > + new_used = atomic_long_add_return(nslots, &total_used);
> > > > > > + old_hiwater = atomic_long_read(&used_hiwater);
> > > > > > + do {
> > > > > > + if (new_used <= old_hiwater)
> > > > > > + break;
> > > > > > + } while (!atomic_long_try_cmpxchg(&used_hiwater, &old_hiwater, new_used));
> > > > > > +
> > > > > > return slot_index;
> > > > >
> > > > > Hmm, so we're right in the swiotlb hot path here and add two new global
> > > > > atomics?
> >[...]
> > > For my purposes, it does not have to be 100% accurate. I don't really
> > > mind if it is off by a few slots because of a race window, so we could
> > > (for instance):
> > >
> > > - update a local variable and set the atomic after the loop,
> > > - or make it a per-cpu to reduce CPU cache bouncing,
> > > - or just about anything that is less heavy-weight than an atomic
> > > CMPXCHG in the inner loop of a slot search.
> > >
> >
> > Perhaps I'm missing your point, but there's no loop here. The atomic
> > add is done once per successful slot allocation. If swiotlb_do_find_slots()
> > doesn't find any slots for the current area, it exits at the "not_found" label
> > and the atomic add isn't done.
>
> My bad. I read the patch too quickly and thought that the update was
> done for each searched area. I stand corrected here.
>
> >[...]
> > I thought about tracking the high water mark on a per-CPU basis or
> > per-area basis, but I don't think the resulting data is useful. Adding up
> > the individual high water marks likely significantly over-estimates the
> > true high water mark. Is there a clever way to make this useful that I'm
> > not thinking about?
>
> No, not that I'm aware of. Min/max cannot be easily split.
>
> >[...]
> > Regarding your other email about non-default io_tlb_mem instances,
> > my patch just extends what is already reported in debugfs, which
> > is only for the default io_tlb_mem. The non-default instances seemed
> > to me to be fairly niche cases that weren't worth the additional
> > complexity, but maybe I'm wrong about that.
>
> What I mean is that the values currently reported in debugfs only refer
> to io_tlb_default_mem. Since restricted DMA pools also use
> swiotlb_find_slots() and swiotlb_release_slots(), the global counters
> now get updated both for io_tlb_default_mem and all restricted DMA
> pools.
>
> In short, this hunk is a change in behaviour:
>
> static int io_tlb_used_get(void *data, u64 *val)
> {
> - *val = mem_used(&io_tlb_default_mem);
> + *val = (u64)atomic_long_read(&total_used);
> return 0;
> }
>
> Before the change, it shows the number of used slots in the default
> SWIOTLB, after the change it shows the total number of used slots in
> the SWIOTLB and all restricted DMA pools.
>

Got it -- I understand your point now. You are right. I'll fix in the next version.

Michael