2022-06-09 01:43:29

by Dongli Zhang

Subject: [PATCH RFC v1 0/7] swiotlb: extra 64-bit buffer for dev->dma_io_tlb_mem

Hello,

I previously sent out a patchset for a 64-bit swiotlb buffer, and people
thought it was the same as Restricted DMA. However, the 64-bit buffer is
still not supported.

https://lore.kernel.org/all/[email protected]/

This RFC introduces an extra swiotlb buffer with the SWIOTLB_ANY flag,
to support 64-bit swiotlb.

The core ideas are:

1. Create an extra io_tlb_mem with SWIOTLB_ANY flags.

2. The dev->dma_io_tlb_mem is set to either default or the extra io_tlb_mem,
depending on dma mask.


Would you please help with the questions below in this RFC?

- Is it fine to create the extra io_tlb_mem?

- Which one is better: to create a separate variable for the extra
io_tlb_mem, or make it an array of two io_tlb_mem?

- Should I set dev->dma_io_tlb_mem in each driver (e.g., the virtio driver
as in this patchset) based on the value of
min_not_zero(*dev->dma_mask, dev->bus_dma_limit), or at a higher level
(e.g., post pci driver)?
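
To make the last question concrete, here is a user-space sketch of the
per-device selection. It is only an illustration: struct device_model,
select_io_tlb_mem() and the rule "use the extra buffer only when the
effective limit covers all 64 bits" are assumptions of this example, not
code from the patchset. min_not_zero_u64() mimics the kernel's
min_not_zero() macro.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel structures; only the fields the sketch needs. */
struct io_tlb_mem {
	const char *name;
	unsigned long nslabs;
};

struct device_model {
	uint64_t dma_mask;       /* models *dev->dma_mask */
	uint64_t bus_dma_limit;  /* models dev->bus_dma_limit, 0 means "none" */
	struct io_tlb_mem *dma_io_tlb_mem;
};

static struct io_tlb_mem io_tlb_default_mem = { "default", 327680 };
static struct io_tlb_mem io_tlb_high_mem = { "high", 3145728 };

/* Same semantics as the kernel's min_not_zero(): ignore a zero operand. */
static uint64_t min_not_zero_u64(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * Hypothetical policy: a device gets the extra buffer only if that buffer
 * is active and the device can address the full 64-bit range.
 */
static void select_io_tlb_mem(struct device_model *dev, bool high_active)
{
	uint64_t limit = min_not_zero_u64(dev->dma_mask, dev->bus_dma_limit);

	if (high_active && limit == UINT64_MAX)
		dev->dma_io_tlb_mem = &io_tlb_high_mem;
	else
		dev->dma_io_tlb_mem = &io_tlb_default_mem;
}
```

Whether this check lives in each driver or in common code does not change
the logic; only the call site moves.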


This patchset demonstrates that the idea works. Since this is just an
RFC, I have only tested virtio-blk on qemu-7.0 with swiotlb forced. It has
not been tested in an AMD SEV environment.

qemu-system-x86_64 -cpu host -name debug-threads=on \
-smp 8 -m 16G -machine q35,accel=kvm -vnc :5 -hda boot.img \
-kernel mainline-linux/arch/x86_64/boot/bzImage \
-append "root=/dev/sda1 init=/sbin/init text console=ttyS0 loglevel=7 swiotlb=327680,3145728,force" \
-device virtio-blk-pci,id=vblk0,num-queues=8,drive=drive0,disable-legacy=on,iommu_platform=true \
-drive file=test.raw,if=none,id=drive0,cache=none \
-net nic -net user,hostfwd=tcp::5025-:22 -serial stdio


The kernel command line "swiotlb=327680,3145728,force" allocates 640MB for
the default swiotlb and 6GB for the extra swiotlb.
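
For reference, the two numbers in swiotlb= are slab counts, and the sizes
in the boot log below follow from the slab size. A minimal sketch of that
arithmetic, assuming the kernel's 2KB slab (IO_TLB_SHIFT is 11);
nslabs_to_mb() is a name made up for this example:

```c
#define IO_TLB_SHIFT 11	/* log2 of the 2KB swiotlb slab size */

/* Convert a slab count to megabytes, as swiotlb_print_info() does. */
static unsigned long long nslabs_to_mb(unsigned long long nslabs)
{
	return (nslabs << IO_TLB_SHIFT) >> 20;
}
```

With this, 327680 slabs come out as 640MB and 3145728 slabs as 6144MB.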

[ 2.826676] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 2.826693] software IO TLB: default mapped [mem 0x0000000037000000-0x000000005f000000] (640MB)
[ 2.826697] software IO TLB: high mapped [mem 0x00000002edc80000-0x000000046dc80000] (6144MB)

The highmem swiotlb is being used by virtio-blk.

$ cat /sys/kernel/debug/swiotlb/swiotlb-hi/io_tlb_nslabs
3145728
$ cat /sys/kernel/debug/swiotlb/swiotlb-hi/io_tlb_used
8960


Dongli Zhang (7):
swiotlb: introduce the highmem swiotlb buffer
swiotlb: change the signature of remap function
swiotlb-xen: support highmem for xen specific code
swiotlb: to implement io_tlb_high_mem
swiotlb: add interface to set dev->dma_io_tlb_mem
virtio: use io_tlb_high_mem if it is active
swiotlb: fix the slot_addr() overflow

arch/powerpc/kernel/dma-swiotlb.c | 8 +-
arch/x86/include/asm/xen/swiotlb-xen.h | 2 +-
arch/x86/kernel/pci-dma.c | 5 +-
drivers/virtio/virtio.c | 8 ++
drivers/xen/swiotlb-xen.c | 16 +++-
include/linux/swiotlb.h | 14 ++-
kernel/dma/swiotlb.c | 136 +++++++++++++++++++++-------
7 files changed, 145 insertions(+), 44 deletions(-)

Thank you very much for your feedback and suggestions!

Dongli Zhang



2022-06-09 01:58:34

by Dongli Zhang

Subject: [PATCH RFC v1 4/7] swiotlb: to implement io_tlb_high_mem

This patch is to implement the extra 'io_tlb_high_mem'. In the future, the
device drivers may choose to use either 'io_tlb_default_mem' or
'io_tlb_high_mem' as dev->dma_io_tlb_mem.

The highmem buffer is regarded as active if
(high_nslabs && io_tlb_high_mem.nslabs) returns true.

Cc: Konrad Wilk <[email protected]>
Cc: Joe Jin <[email protected]>
Signed-off-by: Dongli Zhang <[email protected]>
---
arch/powerpc/kernel/dma-swiotlb.c | 8 ++-
arch/x86/kernel/pci-dma.c | 5 +-
include/linux/swiotlb.h | 2 +-
kernel/dma/swiotlb.c | 103 +++++++++++++++++++++---------
4 files changed, 84 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index ba256c37bcc0..f18694881264 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -20,9 +20,11 @@ void __init swiotlb_detect_4g(void)

static int __init check_swiotlb_enabled(void)
{
- if (ppc_swiotlb_enable)
- swiotlb_print_info();
- else
+ if (ppc_swiotlb_enable) {
+ swiotlb_print_info(false);
+ if (swiotlb_high_active())
+ swiotlb_print_info(true);
+ } else
swiotlb_exit();

return 0;
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index 30bbe4abb5d6..1504b349b312 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -196,7 +196,10 @@ static int __init pci_iommu_init(void)
/* An IOMMU turned us off. */
if (x86_swiotlb_enable) {
pr_info("PCI-DMA: Using software bounce buffering for IO (SWIOTLB)\n");
- swiotlb_print_info();
+
+ swiotlb_print_info(false);
+ if (swiotlb_high_active())
+ swiotlb_print_info(true);
} else {
swiotlb_exit();
}
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index e61c074c55eb..8196bf961aab 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -166,7 +166,7 @@ static inline void swiotlb_adjust_size(unsigned long size)
#endif /* CONFIG_SWIOTLB */

extern bool swiotlb_high_active(void);
-extern void swiotlb_print_info(void);
+extern void swiotlb_print_info(bool high);

#ifdef CONFIG_DMA_RESTRICTED_POOL
struct page *swiotlb_alloc(struct device *dev, size_t size);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 7988883ca7f9..ff82b281ce01 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -101,6 +101,21 @@ setup_io_tlb_npages(char *str)
}
early_param("swiotlb", setup_io_tlb_npages);

+static struct io_tlb_mem *io_tlb_mem_get(bool high)
+{
+ return high ? &io_tlb_high_mem : &io_tlb_default_mem;
+}
+
+static unsigned long nslabs_get(bool high)
+{
+ return high ? high_nslabs : default_nslabs;
+}
+
+static char *swiotlb_name_get(bool high)
+{
+ return high ? "high" : "default";
+}
+
bool swiotlb_high_active(void)
{
return high_nslabs && io_tlb_high_mem.nslabs;
@@ -133,17 +148,18 @@ void __init swiotlb_adjust_size(unsigned long size)
pr_info("SWIOTLB bounce buffer size adjusted to %luMB", size >> 20);
}

-void swiotlb_print_info(void)
+void swiotlb_print_info(bool high)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
+ struct io_tlb_mem *mem = io_tlb_mem_get(high);

if (!mem->nslabs) {
pr_warn("No low mem\n");
return;
}

- pr_info("mapped [mem %pa-%pa] (%luMB)\n", &mem->start, &mem->end,
- (mem->nslabs << IO_TLB_SHIFT) >> 20);
+ pr_info("%s mapped [mem %pa-%pa] (%luMB)\n",
+ swiotlb_name_get(high), &mem->start, &mem->end,
+ (mem->nslabs << IO_TLB_SHIFT) >> 20);
}

static inline unsigned long io_tlb_offset(unsigned long val)
@@ -184,15 +200,9 @@ static void *swiotlb_mem_remap(struct io_tlb_mem *mem, unsigned long bytes)
}
#endif

-/*
- * Early SWIOTLB allocation may be too early to allow an architecture to
- * perform the desired operations. This function allows the architecture to
- * call SWIOTLB when the operations are possible. It needs to be called
- * before the SWIOTLB memory is used.
- */
-void __init swiotlb_update_mem_attributes(void)
+static void __init __swiotlb_update_mem_attributes(bool high)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
+ struct io_tlb_mem *mem = io_tlb_mem_get(high);
void *vaddr;
unsigned long bytes;

@@ -207,6 +217,19 @@ void __init swiotlb_update_mem_attributes(void)
mem->vaddr = vaddr;
}

+/*
+ * Early SWIOTLB allocation may be too early to allow an architecture to
+ * perform the desired operations. This function allows the architecture to
+ * call SWIOTLB when the operations are possible. It needs to be called
+ * before the SWIOTLB memory is used.
+ */
+void __init swiotlb_update_mem_attributes(void)
+{
+ __swiotlb_update_mem_attributes(false);
+ if (swiotlb_high_active())
+ __swiotlb_update_mem_attributes(true);
+}
+
static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
unsigned long nslabs, unsigned int flags, bool late_alloc)
{
@@ -240,15 +263,13 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
return;
}

-/*
- * Statically reserve bounce buffer space and initialize bounce buffer data
- * structures for the software IO TLB used to implement the DMA API.
- */
-void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
- int (*remap)(void *tlb, unsigned long nslabs, bool high))
+static void __init
+__swiotlb_init_remap(bool addressing_limit, unsigned int flags,
+ int (*remap)(void *tlb, unsigned long nslabs, bool high),
+ bool high)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
- unsigned long nslabs = default_nslabs;
+ struct io_tlb_mem *mem = io_tlb_mem_get(high);
+ unsigned long nslabs = nslabs_get(high);
size_t alloc_size;
size_t bytes;
void *tlb;
@@ -274,7 +295,7 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
return;
}

- if (remap && remap(tlb, nslabs, false) < 0) {
+ if (remap && remap(tlb, nslabs, high) < 0) {
memblock_free(tlb, PAGE_ALIGN(bytes));

nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
@@ -293,7 +314,20 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, flags, false);

if (flags & SWIOTLB_VERBOSE)
- swiotlb_print_info();
+ swiotlb_print_info(high);
+}
+
+/*
+ * Statically reserve bounce buffer space and initialize bounce buffer data
+ * structures for the software IO TLB used to implement the DMA API.
+ */
+void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
+ int (*remap)(void *tlb, unsigned long nslabs, bool high))
+{
+ __swiotlb_init_remap(addressing_limit, flags, remap, false);
+ if (high_nslabs)
+ __swiotlb_init_remap(addressing_limit, flags | SWIOTLB_ANY,
+ remap, true);
}

void __init swiotlb_init(bool addressing_limit, unsigned int flags)
@@ -364,23 +398,20 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
(nslabs << IO_TLB_SHIFT) >> PAGE_SHIFT);
swiotlb_init_io_tlb_mem(mem, virt_to_phys(vstart), nslabs, 0, true);

- swiotlb_print_info();
+ swiotlb_print_info(false);
return 0;
}

-void __init swiotlb_exit(void)
+static void __init __swiotlb_exit(bool high)
{
- struct io_tlb_mem *mem = &io_tlb_default_mem;
+ struct io_tlb_mem *mem = io_tlb_mem_get(high);
unsigned long tbl_vaddr;
size_t tbl_size, slots_size;

- if (swiotlb_force_bounce)
- return;
-
if (!mem->nslabs)
return;

- pr_info("tearing down default memory pool\n");
+ pr_info("tearing down %s memory pool\n", swiotlb_name_get(high));
tbl_vaddr = (unsigned long)phys_to_virt(mem->start);
tbl_size = PAGE_ALIGN(mem->end - mem->start);
slots_size = PAGE_ALIGN(array_size(sizeof(*mem->slots), mem->nslabs));
@@ -397,6 +428,16 @@ void __init swiotlb_exit(void)
memset(mem, 0, sizeof(*mem));
}

+void __init swiotlb_exit(void)
+{
+ if (swiotlb_force_bounce)
+ return;
+
+ __swiotlb_exit(false);
+ if (swiotlb_high_active())
+ __swiotlb_exit(true);
+}
+
/*
* Return the offset into a iotlb slot required to keep the device happy.
*/
@@ -786,6 +827,10 @@ static void swiotlb_create_debugfs_files(struct io_tlb_mem *mem,
static int __init __maybe_unused swiotlb_create_default_debugfs(void)
{
swiotlb_create_debugfs_files(&io_tlb_default_mem, "swiotlb");
+
+ if (swiotlb_high_active())
+ swiotlb_create_debugfs_files(&io_tlb_high_mem, "swiotlb-hi");
+
return 0;
}

--
2.17.1

2022-06-09 05:25:26

by Christoph Hellwig

Subject: Re: [PATCH RFC v1 4/7] swiotlb: to implement io_tlb_high_mem

All this really needs to be hidden under the hood.

2022-06-10 22:22:38

by Dongli Zhang

Subject: Re: [PATCH RFC v1 4/7] swiotlb: to implement io_tlb_high_mem

Hi Christoph,

On 6/8/22 10:05 PM, Christoph Hellwig wrote:
> All this really needs to be hidden under the hood.
>

Since this patch file has 200+ lines, would you please help clarify what does
'this' indicate?

The idea of this patch:

1. Convert the functions that initialize swiotlb into callees. Each callee
accepts 'true' or 'false' as an argument to indicate whether it operates on
the default or the new swiotlb buffer, e.g., the body of swiotlb_init_remap()
moves into __swiotlb_init_remap().

2. The caller decides whether to call the callee to create the extra buffer.

Do you mean the callee is still *too high level* and all the
decision/allocation/initialization should be done at *lower level functions*?

E.g., if I re-work "struct io_tlb_mem" to make the 64-bit buffer the second
array in io_tlb_mem->slots[32_or_64][index], the user will only see the
default 'io_tlb_default_mem' and will not be able to tell whether a slot was
allocated from the 32-bit or the 64-bit buffer.

Thank you very much for the feedback.

Dongli Zhang

2022-06-13 06:36:00

by Christoph Hellwig

Subject: Re: [PATCH RFC v1 4/7] swiotlb: to implement io_tlb_high_mem

On Fri, Jun 10, 2022 at 02:56:08PM -0700, Dongli Zhang wrote:
> Since this patch file has 200+ lines, would you please help clarify what does
> 'this' indicate?

This indicates that any choice between different swiotlb pools needs to
be hidden inside of swiotlb. The dma mapping API already provides
swiotlb the addressability requirement for the device. Similarly we
already have a SWIOTLB_ANY flag that switches to a 64-bit buffer
by default, which we can change to, or replace with a flag that
allocates an additional buffer that is not addressing limited.
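
One way to read this suggestion is that swiotlb itself picks a pool from
the addressing limit it is given, so drivers never touch
dev->dma_io_tlb_mem. A rough user-space sketch, assuming the pool ranges
from the boot log in the cover letter and an exclusive end address;
struct pool_model and pool_for_limit() are hypothetical names, not
existing swiotlb interfaces:

```c
#include <stddef.h>
#include <stdint.h>

/* Models only the address range of an io_tlb_mem; end is exclusive. */
struct pool_model {
	const char *name;
	uint64_t start;
	uint64_t end;
};

/* The unrestricted pool comes first so capable devices leave low memory free. */
static const struct pool_model pools[] = {
	{ "high",    0x2edc80000ULL, 0x46dc80000ULL },
	{ "default", 0x037000000ULL, 0x05f000000ULL },
};

/*
 * Return the first pool whose last byte the device can address, or NULL
 * if no pool fits. dma_limit plays the role of
 * min_not_zero(*dev->dma_mask, dev->bus_dma_limit).
 */
static const struct pool_model *pool_for_limit(uint64_t dma_limit)
{
	size_t i;

	for (i = 0; i < sizeof(pools) / sizeof(pools[0]); i++) {
		if (pools[i].end - 1 <= dma_limit)
			return &pools[i];
	}
	return NULL;
}
```

With these ranges a device limited to 32 bits lands in the default pool and
a device with a full 64-bit mask lands in the high pool, without any change
on the driver side.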